What I Learned From 12 Weeks of Open Source With MLH
The week before I started my first software engineering internship, I saw a post on LinkedIn that caught my eye. It went something like:
“The solution to almost all technical tasks a junior developer will be asked to do exists on the internet. The job of the junior dev is simply to find that solution, understand it, and implement it. This should be comforting to you.”
At the time I thought this was correct.
Within a week of starting my internship I changed my mind.
Some things would be much more difficult if someone hadn’t already tried them, failed, and written about how to succeed, but there will always be so many things that nobody has written about yet.
I had the privilege of participating in a fellowship with Major League Hacking this year. The MLH fellowship aims to build vital technology skills in computer science students around the world. As a part of this fellowship at MLH I was paired with a pod of other student developers, two mentors / project maintainers, and an open source project that I was expected to contribute to. For me and one other student, this project was Hack. Here’s the repo if you want to take a peek at it. Hack is a programming language built by Facebook in 2014 on top of PHP. The project is open source, but is still maintained by Facebook.
The challenges I encountered were many, but I was so lucky to have the support of amazing mentors and brilliant peers.
Here are the things I tried, what I learned, and how I failed.
Navigating a massive codebase
The first thing I noticed when diving into the codebase for the first time was its scale. I had never worked on a project with more than a few thousand lines of code across a few dozen files. The Hack project has somewhere around 74,012,058 lines of code across 134,484 files (I counted).
The first task I was given for the project seemed straightforward: find a specific poorly worded error message and rewrite it. This ended up being a good challenge for me because I had no idea where to start.
I quickly became close friends with the VS Code search tool (Cmd + Shift + F). Being able to rapidly search the entire workspace for a code snippet was invaluable to me. If you aren’t a VS Code user, then “git grep 'search text'” does the same thing for tracked files in a git repo. This type of searching might seem obvious to you, but at the time it just didn’t occur to me.
Another thing that was helpful to me was the “jump to definition” feature from the OCaml IDE plugin I used. I wouldn’t have gotten anything done without it, seriously. Cmd-clicking a type opened up the file where that type was defined, and it gave me easy access to all of the functions I had at my disposal for that type. Again, this might seem trivial to you, but to me, it felt like a superpower.
Working in a massive codebase can be challenging, but it has benefits too. One is that almost every common problem in the project has already been solved somewhere. Finding that code can be tricky, but if you have a rough idea of what you’re looking for, you can often find a snippet elsewhere in the project that solves your current problem. On multiple occasions I asked for help doing something, and my maintainer responded with a code snippet from elsewhere in the project that did exactly what I needed.
OCaml? Like the animal?
My assignment within the Hack language project was to write syntax quick fixes. These quick fixes are essentially code suggestions that appear for developers when they make silly mistakes.
One example of a quick fix I wrote involves creating an instance of a class. When a new instance of a class is created, it is done with the ‘new’ keyword. If the developer forgets to put ‘new’ before the class name, they’ll be prompted to “quick fix” the code. If they do so, the keyword ‘new’ will automatically be added.
To my chagrin, the side of the project that makes these syntax quick fixes is written in OCaml.
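To give a feel for what a quick fix is, here is a minimal sketch in OCaml. It is not the actual Hack implementation: the edit and quick_fix types, the function names, and the hard-coded offset are all made up for illustration. The idea is just that a quick fix is a title plus a list of text edits, and the "missing new" fix is a single edit that inserts "new " where the class name starts.

```ocaml
(* Hypothetical types, not the ones used in the Hack codebase.
   An edit replaces [length] characters starting at [start] with [new_text]. *)
type edit = { start : int; length : int; new_text : string }

type quick_fix = { title : string; edits : edit list }

(* Build the fix: insert "new " at the offset where the class name begins. *)
let missing_new_fix ~class_name_offset =
  { title = "Insert `new`";
    edits = [ { start = class_name_offset; length = 0; new_text = "new " } ] }

(* Apply one edit to a source string. *)
let apply_edit source { start; length; new_text } =
  let before = String.sub source 0 start in
  let after_pos = start + length in
  let after = String.sub source after_pos (String.length source - after_pos) in
  before ^ new_text ^ after

(* Apply every edit in order. This is only safe here because the fix has a
   single edit; multiple edits would need their offsets adjusted. *)
let apply_fix source fix = List.fold_left apply_edit source fix.edits

let () =
  let source = "$x = Foo();" in
  (* In a real tool the offset would come from the parser; 5 is assumed here. *)
  let fix = missing_new_fix ~class_name_offset:5 in
  Printf.printf "%s: %s\n" fix.title (apply_fix source fix)
  (* prints: Insert `new`: $x = new Foo(); *)
```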
I had never heard of OCaml before my first day working on the project, nor had I ever studied a functional programming language before. I had worked with a few object-oriented programming languages in the past and was familiar with their style of syntax. The phrase ‘out of the frying pan and into the fire’ comes to mind.
OCaml syntax and patterns are significantly different from many other widely used languages. My first instinct when I started to get confused with the weird syntax was to consult the internet. Instead of finding a solution to my confusion, I found a Quora question thread.
The question basically asked why a school would ever teach OCaml over something more well known. The top answer was from a guy who had previously asked his professor this same question. The professor answered that it was because you couldn’t search up questions about OCaml on the internet like you could other languages. Basically, OCaml isn’t mainstream enough* to find significant help on the internet. This is less true today, but this answer held pretty true for me.
The maintainers of the project suggested my project partner and I read a book, Real World OCaml. Being a primarily self-taught programmer I had never read a programming language book and was convinced it was unnecessary. I was so convinced that I spent the first week panicking and struggling to understand any part of the project that I needed to make contributions to. After that first week, I asked my project partner, who was in the same boat as me, what she had done to be able to understand OCaml so well. She told me she read the book. The next day I broke down and read the book. It basically pulled back the curtains. I had no massive confusion from then on, and when I did, I just referenced the book. If someone smarter than you says to read a book, read the dang book.
*In France, OCaml is commonly taught at universities across the country. OCaml was created at INRIA, a French national research institute.
If All Tests Pass The First Time, Be Suspicious
I struggled with testing my code for the majority of my time working on the project. I ran the tests, and still my code failed on deploy. One time I forgot to git add my tests entirely.
Don’t be like me. Please. Testing is only as hard as you make it. Towards the end of my time on the project I realized that I had spent the entire time running the wrong set of tests. I wish I was joking.
I could pass this off as just a silly mistake, but it brought up something really important. I told my maintainers about my mistake, and they told me to write my tests first, watch them fail, and then write the code to fix them.
Though I learned about test-driven development in school, I had completely neglected to follow testing best practices in any way. I know some people aren’t big fans of test-driven development, but it would have saved me headaches on about a dozen occasions. Looking back on all of the mistakes I could have prevented and time I could have saved with this one cheap trick is sickening. So until I know better, I think TDD is gonna ride shotgun with me.
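Here is a rough sketch of that write-the-test-first loop in plain OCaml, using nothing but assert. The function names and the error message text are invented for the example; this is not how the Hack test suite works. The point is the order: the test exists first, it fails against a stub, and only then does the real code show up.

```ocaml
(* Step 1: write the test before the implementation exists.
   It takes the function under test as an argument so it can be run
   against both the stub and the real implementation. *)
let test_missing_new_message message_for =
  assert (message_for "Foo" = "Did you mean `new Foo()`?")

(* Step 2: a stub that lets the test compile. Run the test and watch it fail. *)
let stub_message_for _class_name = ""

(* Step 3: the real implementation, written to make the test pass. *)
let message_for class_name =
  Printf.sprintf "Did you mean `new %s()`?" class_name

let () =
  (* If the stub passed, the test would not be checking anything. *)
  (match test_missing_new_message stub_message_for with
   | () -> print_endline "stub passed: be suspicious of this test"
   | exception Assert_failure _ -> print_endline "stub failed, as expected");
  test_missing_new_message message_for;
  print_endline "real implementation passed"
```

Seeing the failure first is what tells you the test is actually wired up and running, which is exactly the thing I got wrong.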
Go be better than I was
I hope you found something interesting here, or at least had a good laugh at some of the silly mistakes I made. Most of all, I hope this helps at least one person succeed where I failed.