Continuous deployment vs. continuous delivery. Here are some ways to not lose track of the ultimate goal: continuous improvement
Accelerating releases all the way to continuous deployment, that is, releasing more often, was never simply about speed or velocity. Those were just great selling points to justify investing in such modern practices to a less tech-savvy or more business-minded stakeholder.
After all, time to market is a great way to pitch investing valuable engineering efforts in technical processes. Even that pitch is becoming less and less necessary. Today, most organizations have internalized the necessity of investing in the dev pipeline, and thanks to today’s deployment tooling and more established best practices, the investment required is no longer that large.
There are many benefits to being able to deploy often. Code that is left in development branches, tucked in a drawer far away from reality, users, and real-world feedback will tend to fester.
Assumptions on what users want, how they use the system, and how the code will perform start piling up, take wrong evolutionary turns, and eventually lead to disappointment, lack of traction, or many months’ worth of technical debt. Deploying code faster does not inherently solve any of those issues. At its worst, it just means you’re continuously hurling features over the fence.
A Tale of Two Developers
Jim is a great developer. He is, in fact, the very model of a modern full-stack generalist. He speaks five programming/scripting languages fluently; he’s very well acquainted with matters related to DevOps and how to automate code deployments, and he’s also a huge advocate of agile, testing… the works.
He writes his code using the latest microservices technology and continuously deploys to the Kubernetes cluster using a plethora of strategies. There are GitHub Actions workflows that launch every time someone blurts out even a minor code change, and the best of APMs is set up to monitor the production environment.
All of that is commendable and indicates just how much we’ve evolved and matured as an industry. However, there is one thing that is not included in Jim’s otherwise masterpiece pipeline of moving code seamlessly to production: learning.
While all of the right systems are in place, the feedback loop is minimal unless something catastrophic happens. The focus of Jim and his team is on the steps required to move the next piece of code down the pipeline, and the next one. They have optimized for bandwidth, velocity, and speed of deployment. They have even managed to keep the pipeline stable and reduce downtime drastically.
They have maximized the cadence of their releases and are pushing more features out each iteration, but they have not optimized for continuous improvement.
It is easy to concentrate on just one KPI, speed, and deliver the wrong code at a higher velocity. Feedback will eventually arrive, along with the technical debt check.
Our other made-up developer is Jullian. He is not as fluent as Jim; he tries to automate the entire release pipeline but still has a few technical hurdles to resolve. However, Jullian has invested much of his time in two questions that Jim has not yet considered: What does he want to measure when he rolls something into production? How does he include that measurement in the development process in such a way that it is not hiding behind a dead dashboard that people have to remember to check?
Jim and Jullian both check in their changes, merge their PRs, and get their code deployed to production after going through various testing and validation hoops. However, when Jim finishes merging his code, he races on to the next feature in the backlog (he even wins points for that; this is what the organization is measuring, after all). He will not think about what was already committed unless some catastrophic failure occurs.
Jullian also starts the next item in his backlog, but after a few minutes, the Slack bot informs him that his code was used for the first time in production by a few of the alpha/canary users. As usage grows, he gets updates: did his new query affect overall performance? How does it scale with concurrent usage? He also learns that the number of deadlocks increased somewhat since merging his code. Looking at the trend, he realizes more work needs to be done to separate the read/write properties of the operation and improve the query bandwidth.
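The kind of feedback Jullian receives can be sketched as a comparison between two windows of production metrics, one before the merge and one after. This is only an illustration under stated assumptions: the `MetricsSnapshot` fields, the `feedback_messages` function, and the 10% latency tolerance are all hypothetical, and the step of actually posting the messages to Slack is left out.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MetricsSnapshot:
    """Hypothetical production metrics for one operation over a time window."""
    invocations: int
    avg_latency_ms: float
    deadlocks: int


def feedback_messages(before: MetricsSnapshot, after: MetricsSnapshot,
                      latency_tolerance: float = 0.10) -> List[str]:
    """Compare two windows and return the messages a feedback bot might post."""
    messages = []
    if before.invocations == 0 and after.invocations > 0:
        messages.append(
            f"Your change was used in production for the first time "
            f"({after.invocations} calls).")
    if after.invocations == 0:
        messages.append("Your change has not been invoked in production yet.")
    if before.avg_latency_ms > 0:
        # Relative change in average latency between the two windows.
        change = (after.avg_latency_ms - before.avg_latency_ms) / before.avg_latency_ms
        if change > latency_tolerance:
            messages.append(f"Average latency rose {change:.0%} since the merge.")
    if after.deadlocks > before.deadlocks:
        messages.append(
            f"Deadlocks increased from {before.deadlocks} to {after.deadlocks} "
            f"since the merge.")
    return messages


# Made-up numbers for illustration only.
before = MetricsSnapshot(invocations=1200, avg_latency_ms=40.0, deadlocks=2)
after = MetricsSnapshot(invocations=1500, avg_latency_ms=50.0, deadlocks=5)
messages = feedback_messages(before, after)
for line in messages:
    print(line)
```

The point of the sketch is that the comparison runs on its own and pushes the result to the developer, instead of waiting behind a dashboard that someone has to remember to open.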
Processes Will Do What They Are Optimized to Achieve
I am reminded of a video I watched recently by Gil Tene, in which he discusses the ‘great lie of the 99th percentile’ in performance testing. Basically, he was wondering why systems always seem to scale more or less predictably up to the 99th percentile, after which performance starts degrading exponentially. What is so important about the number 9?
The answer: this is simply what engineers were optimizing for! They were asked to optimize for the 99th percentile, and therefore nothing constrained how much lag the system could show past that point, even if one percent of the users would wait long minutes.
I was reminded of this example because it is important to think about what we want to optimize for in our development process. As in the example above, if all we optimize for is speed and short-term stability, we will not create systems that stay aligned with usage over time. We will find ourselves accruing technical debt and missing out on an opportunity to really enjoy the benefits of continuous deployment.
And Jim? In this example, the code he checked in was never used in real life. No one noticed any issues, but no one bothered to check whether the infra code change was ever invoked in production either. It wasn’t, and it took a few months before that was discovered.
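To make the percentile point concrete, here is a small sketch with made-up latency numbers. It is not taken from the talk: the `percentile` helper uses the simple nearest-rank definition, and the sample data is an assumption chosen so that exactly one percent of requests are very slow.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


# Hypothetical latencies in milliseconds:
# 990 fast requests and 10 requests that take a minute and a half.
latencies_ms = [20] * 990 + [90_000] * 10

p50 = percentile(latencies_ms, 50)
p99 = percentile(latencies_ms, 99)
worst = max(latencies_ms)

print(f"p50={p50} ms, p99={p99} ms, max={worst} ms")
```

With this data the 50th and 99th percentiles are both 20 ms, so a team that only watches the 99th percentile sees a healthy system while one request in a hundred waits 90 seconds.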
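A check for code that is deployed but never invoked can be sketched in a few lines. The in-process counter below is only a stand-in: in practice the usage data would more likely come from traces or another observability backend, and the names `track_usage`, `never_invoked`, and the two example functions are hypothetical.

```python
import functools
from collections import Counter

# In-process stand-ins for real telemetry storage (an assumption of this sketch).
invocation_counts = Counter()
tracked_functions = set()


def track_usage(func):
    """Register func and count every call made to it."""
    tracked_functions.add(func.__qualname__)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        invocation_counts[func.__qualname__] += 1
        return func(*args, **kwargs)

    return wrapper


def never_invoked():
    """Return the tracked functions that have not been called so far."""
    return sorted(name for name in tracked_functions
                  if invocation_counts[name] == 0)


@track_usage
def resize_connection_pool(size):
    return size


@track_usage
def run_report_query():
    return "ok"


run_report_query()
unused = never_invoked()
print(unused)
```

Here only `run_report_query` is called, so `resize_connection_pool` is reported as never invoked. A report like this, surfaced shortly after a release, would have flagged Jim's change within days instead of months.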
What Can You Do to Ensure Continuous Feedback and Learning?
Granted, this was a somewhat contrived example. There are many other occasions when Jim’s foresight in automating deployment and testing will save him from the trouble that might befall Jullian. However, we also need to look at the long game. This is where we need to optimize for scale, not just in terms of complexity or usage, but in terms of supporting a code base that developers will collaborate on for years and that will inevitably have issues, bugs, and misalignments we need to watch out for.
So, how can we improve the development process and ensure the code continually improves?
- Observability. If you can’t observe it, you can’t measure it and you definitely can’t fix it. Look into OpenTelemetry and the different ecosystem tools that can make it relevant for dev. I’ve written about it more here with some concrete examples in the follow-up article.
- Adjust the process to accommodate feedback. Developers should hold periodic meetings following a major feature release to discuss and measure feedback. Numbers can win arguments on architecture and design. How is the code being used? What else should we be looking for to validate it?
- Beware of the dashboard. It can be your best friend but also your biggest bias. A dashboard represents exactly what it aims to measure and nothing more. Just like a huge pile of green CI tests, it can lead to a false sense of confidence. I have seen companies put too much emphasis on specific metrics while completely ignoring others. Why? Because they weren’t in the dashboard.
- Allocate time to explore the data. Any data scientist will tell you they need data in a sandbox to start tuning their insights. As with exploratory testing, you’ll be surprised at the kinds of insights you find. Concentrate on areas of focus like usage, performance, or errors.
A long time ago, The Mythical Man-Month brought forth the proposition that adding people to a late project doesn’t make it ship earlier, shattering years of management dogma. I would add that being ‘fast’ or ‘shipping earlier’ is only a part of the equation.
Even if adding more tools, more developers, and more technology would make a project faster, would the end result be better as well? ‘Fast’ is an important metric because it gets us feedback earlier, but what if we don’t collect the feedback? What would happen if the project ships two weeks earlier but doesn’t work as intended? Who said the finish line is drawn at ‘launch’?
What Does the Future Look Like?
The future looks bright for observability and feedback tools. From a technical perspective, OpenTelemetry is democratizing observability data, which allows ecosystem and open source tools to appear that answer developers’ need to include such feedback in their design.
Developers are taking more responsibility for more of the pipeline, and so are open to receiving more metrics and information to support that ownership and improve their code.
Continuous deployment, feature flagging tools, and pipeline automation from code to prod were a required first step to even allow collecting meaningful feedback. With that out of the way, though, many organizations are already thinking about the next step: automating the flow of information in the opposite direction, from prod back to the code.
Want to Connect? You can reach me, Roni Dover, on Twitter at @doppleware. Follow our project for continuous feedback at https://github.com/digma-ai/digma.