Real World Stories | Stories from the Trenches
Don Brown
August 18th, 2021
Deploying a monolith vs. microservices at Confluence
I've worked on the Atlassian Confluence team. Confluence is a collaboration tool, a wiki, which was really popular in 2005. Not so popular anymore, but basically it's the idea of a Google Docs, where multiple people can collaborate on a document and then share that with the rest of the team. Confluence was interesting because, traditionally, it's a big monolith. It's a big code base, millions of lines of code, tons of developers. Last I heard, they have 200 developers on it. I know compared to some teams that's not that big, but for Atlassian at the time, and even now, that's a good-size team.
A lot of people are making changes, but because it's a single monolith, a single code base that needs to be released together, it's naturally very slow to work on: you can't deploy every single change, because a deploy can take hours. In fact, with their current deploy process, as I understand it, it takes about a day or two. With all the different environments soaking (they want to send it to dogfood and soak it there for a while and whatnot), it takes a while to get a change out. As a result, each batch of changes is pretty large. So even if they ship a release once a day, it's going to have a bunch of different changes in it.
When you're a developer on the Confluence monolith, you're pretty disconnected from production. In fact, you're even really disconnected from the deployments themselves. As far as you're concerned, you develop something, you merge it into master, the main branch, and at some point that gets deployed to customers. You're not even really sure where or when or how, and you're not even really sure that you care, because it's disconnected. It's kind of like if I put you on a task and I said, "Well, you're going to have to wait 15 minutes for this one step to be done," what are you going to do? You're going to alt tab over and start reading the news or Reddit or Twitter or Instagram or do something. And then an hour goes by and then you think, "Oh, crap. I was doing something. What was that?" And then you go back to your task.
So just inserting a delay of five, 10, 15 minutes is enough for us to shift focus, and then it takes hours or even days to get back into it. And so naturally, if you're a developer on the monolith, where everything takes a while, you're very disconnected from the changes. In contrast, I was part of a team that had taken a piece of the monolith, broken it off into a microservice, and was working just on that. What was fun about our project is we were able to ship every single change individually, so if I made a change, I could ship it out to production myself without anyone else being involved. If that broke, I knew exactly what happened, because I'm the one who made the change. The change was small, low risk, all those wonderful things.
And so it was really interesting to be in the same general team as Confluence, but have such a different development experience from the majority of the Confluence developers. So much so that it motivated a whole decomposition effort to take this monolith and break it into pieces, because the monolith developers would see the experience the microservice developers got to have and say, "I want that. I want to be able to make a change and ship it out immediately. This deploying something who knows when, in some big set of changes, where something fails and it all gets pulled back, I want out of it. I want to be able to deliver my change." So that was pretty exciting, to see the old way and the new way completely side by side, and really get a pretty accurate feel for which is the future. And I think the future is smaller teams delivering software, owning it end to end.