Engineering Efficiency Explained | Developer Experience
Don Brown
November 15th, 2021
Top ten tips for pain-free pull requests
So my first step for you, if you're trying to improve your pull request process, is to measure how long it takes to do a pull request. There are a number of tools out there to measure how long your pull request takes. For example, here's one that I found on a quick Googling. It's called the Pull Request Analytics for Bitbucket. So if you're using Bitbucket, you can use this. You will get some graphs and start measuring how long it takes a pull request.
But, if you want a more general tool that measures this and other things, well, as you'd expect, I'm going to tell you, Sleuth is a great option. So with Sleuth, in particular this change lead time, it's going to be looking at how long you're encoding, how long you're waiting for someone to review, and how long you're reviewing. Very simple. Just find out how long your pull requests are taking. Then you can break it down further. And that's tip number two, focus on the biggest bucket.
So, before you can focus on the biggest bucket, you need to know what the biggest bucket is. First is, when you open a pull request, how long until the first person reviews it. We'll call that review lag time. Second is this number, how long once someone has done a review, does it take until it is done? And actually, that could probably be split into two, which is how long does it take until it's ready to be merged, and how long does it take to actually merge? And based on where the slowdown is, it depends on what kind of changes you're going to make to your process to fix those slowdowns.
So, for example, I'm going to give you the next two tips are going to be, what if waiting for the review is what's taking so long? This is your big bucket. As you can see for us, that's not the big bucket. But let's say that for your organization, this is the big bucket, which is review lag time.
Only assign one person to the pull request. A common problem of pull request is, you want people to review it. And you would think, "I'm just going to put everybody on the review." In a team of 15, that basically means that no one's going to do it. If everyone's responsible, no one's responsible. Now, it's not always one person. It might be two people. But the point is, you just want to narrow that down? You want to create responsibility. If I know that you're waiting on me, I'm going to be much more motivated to do it, than if you're waiting on anybody in my 15-person team.
What you end up happening is one or two developers who feel the guilt like, "Ah, I need to review the code." And they're the one or two developers that are always doing the review. The rest of the people rarely do the review. And so what you're doing is you're burning out that person or couple of people that is responsible and always doing the review, and you're letting the rest of them off the hook. That's not what you want. So by assigning one or two people to review is a great way to create that personal accountability.
Another tip though is to consider multiple things when you are going to assign a person to review. And the two things that I've found to be useful, and in fact, actually, I first heard about this from Google, is technical expertise and domain expertise. In Google, if you wanted to make any kind of change request, you needed to have two people on the review. First, someone who is an expert in that service, someone who knew the product, the domain. But the second person needed on that review was someone who is an expert in the language. When you're having someone review your code, you want someone who's going to be an expert in the domain, like, "Hey, that button doesn't make any sense." And someone who's an expert in the language to say, "Hey, the way that you wrote that code doesn't make any sense."
There's a great way that I found to do this in GitHub. And that is the code owners file. So, in a code owners file, if you don't know it, GitHub has a way that you say that this group owns this directory. So this is a great way to set up those domain experts.
The second set of tips is, how you make the pull request period itself be as short as possible. Going back to my team, you'll see that, that's actually where we spend the most time. So if I wanted to optimize this process, I could optimize coding time. That one's an interesting one. Or the other biggest chunk is going to be review time. And that's why this analysis, again, is so important. Because if I'm focusing on, if I'm thinking, man, we need to get people to assign reviews, and that's the problem, then I would create assign review. I would create a process. I would create tracking, all these stuff. But even though I have to look at the metrics, I would see, actually, it's not that big a deal for us. We need to figure out how do we improve the review time. And so, if that's your case, then one of the best tips I can tell you for improving your code review time or you pull request review time is to automate the small stuff.
It drives me crazy to no end to see a pull request and a bunch of people who have commented on all these pull requests and say things like, "Oh, you need to add a space here. This should be capitalized." Or my least favorite, "Oh, this import order is wrong." Almost every language, Python, Java, many of them have imports at the top of the file, and some people really want them ordered in a certain way. And that's fine to want these things. You shouldn't waste developer time on it. That is one of the dumbest things you can have a developer do, which is make comments about code formatting. This is 2021. We have better ways to do this.
For example, I like to use Black. And when I say, I like to use Black, I want to be clear, I don't like Black. I don't think they do a good job of formatting. I don't think they format the way that I like it to format. However, the value of having a consistency in your code base, that you don't have to discuss, fight about it and deal with pull requests, is way higher than getting my perfect formatting the way I want. Definitely recommend Black.
The other one that Andy points out is isort, which sorts your imports. The one that I'll recommend for that one, although isort's good as well is, Andy writes code, our very own Twitch streamer, he writes this, reorder-python-imports. I find this one really good. It's exactly what I like. So I use this one. I use Black. I use Pylint, which will do things like, is this variable not being used? Is this [type signature back to the call 00:07:19], stuff like that. And then there's also mypie. If you're using Python, and you're typing, mypie is awesome. Almost every modern language nowadays has an automatic formatter, has a linter, has things to reorder you imports. Use them. Stop arguing about it on pull request. You're just wasting everybody's time.
This is one you're not going to want to hear. In fact, few developers want to hear, "Oh, you should probably write a tech spec for that. No, don't get started coding it. Don't just build it. You should sit down and design it. You should sit down and write some diagrams, write some words." It's the worst. It slows you down. It's a pain in the ass. But, it is way better than the alternative, which is going into a pull request and having the top comment saying, "Why'd you do it this way? We should do it totally different." Because by the time the pull request is usually created, the code has already been written. You already spent a day or hours or days on it. And that's not the moment you want to hear that you did the whole thing wrong.
So, what I recommend doing is, for anything that has any impact to an API, anything that touches database schema, anything that touches something would be hard to reverse later and/or would take a long time to implement, do a tech spec for it. Tech specs don't have to be long. A common strategy I do, so if I look at a pull request, and if it has more than five or 10 comments, something's wrong. And that something could be multiple things, which leads me to my next tip.
You want to keep your pull request small. If you have a pull request which has 10 plus comments, often, a cause, it is a giant pull request, which means one of two things. One, it means that nobody reviewed it because it's just too much work. They looked at it and said, "Ah, screw it." So either they don't review it, or they review it too late, because they put it off, put it off and say, "Ah, I need to do that pull request. Oh my God, it's going to take so long." So they'll put it off, causing your lag time to go huge.
My next tip is to take it offline. What do I mean by that? One of the biggest causes for giant pull requests is where you have two or more developers going back and forth, back and forth, back and forth, arguing, debating, point, counterpoint, point, counterpoint. The pull request is not the forum to do that. My rule is, if there's ever more than a message, a comment and a reply, and another message at that point, you should take it offline. And what I mean by offline is, if you're in the same office, take those two developers, put them in a room and tell them to have it out.
Don't have it all on the pull request. It makes it take way too long. It's annoying to read as another reviewer. It's clutter. It's noise. And often, it can have some heat to it. And you don't want heat. You want pull requests to be things where you're discussing, how do we get the best code out of this process? Not, am I right? Are you right? Am I better than you? Are you wrong? Whatever. You don't want that. You want it to be about the code to have the best possible code and get that out as quick as possible.
All right. So now we've talked about how to make the lag time better, so get your pull requests reviewed quicker. We talked about how to make the pull requests reviewed time smaller. Now let's talk about how to make the period between pull request is ready and when it's deployed, how do we make that quicker? And there's a couple of suggestions I have for you. Automate your deployment. It sounds pretty basic, but I'm always surprised how many people don't do this. If you're trying to optimize just your pull request, and you still have a really manual deployment process that you do every six months, it kind of doesn't matter. That's really not going to get you anywhere.
You need to be thinking holistically. You need to think about, from change to production. If it's hard to get any changes into production, then as a developer, I'm not going to make those little changes. I'm going to focus on the one task I'm supposed to do and not do anything else, because it's too painful. But if I make it easy to deploy, then I'm going to be more motivated to make all those little changes. And so, in the end, my tech debt of a project is going to be way lower than it would be otherwise. And as you know, in a large source project, or maybe you don't but I'll tell you, in a large project, tech debt, what once was a two-day task, becomes a nine-month to 12-month task. So the more you can tackle things early, the better off you're going to be in the long run. And again, the best way to do that is to make deployments as painless as possible, fully automated.
Last point is automate merging. So, once you've automated deployment, actually, this is a more advanced tip, you can think about automate your merging. So basically what happens is, once your pull request is ready, GitHub itself will automatically merge it for you. GitHub can actually, as a new setting, it just came out a few weeks ago, where it can automatically merge pull requests that have met all the conditions you set. So you can set up all the rules on when it can be automatically merged, what type of merge, things like that. And then GitHub will merge it once all your CI checks go green, once all your approvals come in. You don't have to deal with it.
There's a lot of things it doesn't take into account. So if you want a more advanced example of it, I recommend Mergify. I don't use it myself. I don't think our team is big enough where we're blocked. It's mostly useful when you have a lot of developers trying to deploy at once, and you single-file deploy, and you need to create a queue. Then these kind of tools can be really useful. If it's just one dev, eh, actually it still might be useful. It depends. It depends on you and your team, and if it makes sense.
Those are some tips that I have found. The takeaway, if you remember nothing else from this conversation, is pull requests don't have to suck. Pull request, for me anyways, will never be something I look forward to, never be something I am delighted by. However, they can not suck as much. And in fact you can, eh, I take it back. You can make them delightful, in the sense of, you can make them not get in your way to deploy code. You want to have that testing done, so that you don't break production. Because remember, if you go fast and break production, that's a negative. If you don't do anything and production's positive, healthy, that's also negative. You want to find that balance between going fast and not breaking production, and pull request is a really valuable way to do it.