I joined the company in April last year as part of a team of Data Engineers. The team consisted of three interns and three Data Engineers collaborating through the version control system Git. We didn’t have much structure in place for peer reviewing the code that went into production. We did use pull requests on GitHub at the time, but our Git repositories had all the issues you might expect at an early-stage startup: inessential files committed to the repositories (such as .DS_Store), hardcoded paths to local files, uninformative commit messages, and commits that did not form logical units, to name a few.
Since then, we have come a long way. Our team has grown from 6 to 14 engineers, and we have significantly improved our code review process. We ensure that at least two engineers see every piece of code before it goes into production. Our Git commits are more logical and are always accompanied by more meaningful and detailed commit messages than before. Along the way, one of the decisions we had to make was choosing the right Git workflow, specifically whether to “merge” pull requests or to use “rebase and merge” instead.
In this article, I first describe the “rebase” workflow that we follow in the Data Engineering team at zeotap. Then, I argue that the “rebase” workflow builds better team dynamics than the “merge” workflow. While the “rebase” workflow is trickier to follow, it has several benefits. For instance, it forces developers to make logical commits, which results in better code readability and code ownership. It also fits well with agile software development by ensuring frequent commits and pull requests.
At zeotap, every developer works on a particular part or feature of a project, building commits in their own branch. Once a logical portion of the feature is complete, the developer raises a pull request on GitHub. At least two peer engineers then review this pull request before it can be merged into the master/sprint branch.
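To make the branch-then-pull-request flow concrete, here is a minimal sketch of what it could look like on the command line. It is an illustration under assumptions, not a description of our exact commands: the branch name `feature/dedup-step`, the file names, and the commit messages are all hypothetical, the script works in a throwaway repository, and the review step itself happens on GitHub rather than in the script. The local `git rebase` followed by a fast-forward merge only approximates what “rebase and merge” produces.

```shell
#!/bin/sh
set -e

# Throwaway repository so the sketch does not touch real work.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # name the initial branch "master"
git config user.name "Example Dev"        # placeholder identity (assumption)
git config user.email "dev@example.com"

# Starting state of master.
echo "base" > pipeline.txt
git add pipeline.txt
git commit -q -m "Add pipeline skeleton"

# The developer builds logical commits in their own branch (hypothetical name).
git checkout -q -b feature/dedup-step
echo "dedup" > dedup.txt
git add dedup.txt
git commit -q -m "Add deduplication step"

# Meanwhile, someone else's work lands on master.
git checkout -q master
echo "logging" > logging.txt
git add logging.txt
git commit -q -m "Add logging config"

# Before the pull request is merged: replay the feature commits on top of master.
git checkout -q feature/dedup-step
git rebase -q master

# Once the reviewers have approved (on GitHub, outside this script),
# master is fast-forwarded, so no merge commit is created.
git checkout -q master
git merge -q --ff-only feature/dedup-step

# Show the resulting history, newest commit first.
git log --format=%s
```

The final `git log` lists the three commits in a single line of history, with the feature commit sitting on top of the latest master commit and no merge commit in between.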