DTAP is dead

Do you know why the daughter always chops off 1” from the steak before putting it in the pan? Because her mother did it. Do you know why the mother did it? Because the pan was too small.

We do the same in software development. Some things are so common and have been used for so many years that we no longer think about them. One of those things is DTAP.

DTAP stands for Development, Test, Acceptance, and Production. It’s a standard way of splitting up your environments that’s been around for decades.

I visit a lot of customers to consult on their DevOps pipelines. Almost all of them have the same pipeline: four sequential stages, one for each of the DTAP environments.

The DTAP principle is followed religiously and each and every pipeline has those exact environments. I’ve even heard someone say: ‘DevOps is automating your DTAP’. The horror!

And let’s not even talk about this one:

If that’s your environment, please reach out on Twitter and let me help you.

Looking beyond DTAP

DTAP has its origins in an on-premises world. Requesting servers (physical or virtual) takes time, and once they are provisioned you pay a fixed amount of money for them. There are variations: some teams say that the D(evelopment) environment is their own PC, others add an extra environment for Education (training users) or Backup.

Fortunately, we now have the Cloud. It doesn’t matter if it’s private, public or hybrid. As long as you can easily request and destroy environments, you can do whatever you want and add as many letters to DTAP as you like. And since you only pay for what you use, you shift from fixed upfront costs to a flexible model. This way you can save money, especially on environments that get destroyed once you’re finished. The automation of the Cloud also helps when it comes to speed: creating an environment in the Cloud is way faster than the on-premises processes that most companies have to suffer through.

A simple step: parallel environments

Having the ability to create environments on demand allows you to be more flexible in your release pipeline. Let’s start with something simple. If you want to run performance tests in parallel with some exploratory testing, just add some parallelization to your pipeline.

The Performance Test stage creates a new environment on demand, runs the load tests against it and then destroys the environment. Both Test and Performance Test have to succeed before continuing to Acceptance. If you add containers to the mix this becomes even easier: you can run multiple environments on a single Kubernetes cluster and remove them once you’re done.
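
To make that concrete, here is a minimal sketch of what such a throwaway Performance Test stage could run. It assumes the Azure CLI is installed and authenticated on the agent; the resource group name, the Bicep template and the load-test command are placeholders I made up, not part of any real pipeline.

```typescript
// Sketch of an on-demand "Performance Test" stage: create a short-lived
// environment, test against it, and always destroy it afterwards.
// Assumes the Azure CLI is installed and logged in on the build agent.
import { execSync } from "node:child_process";

const resourceGroup = `perf-test-${Date.now()}`; // throwaway environment name (placeholder)

function run(command: string): void {
  console.log(`> ${command}`);
  execSync(command, { stdio: "inherit" });
}

try {
  // 1. Create a short-lived environment.
  run(`az group create --name ${resourceGroup} --location westeurope`);

  // 2. Deploy the application into it (template path is hypothetical).
  run(`az deployment group create --resource-group ${resourceGroup} --template-file infra/main.bicep`);

  // 3. Run the load tests against the fresh environment (placeholder command and URL).
  run(`npm run load-test -- --target https://${resourceGroup}.example.com`);
} finally {
  // 4. Destroy the environment so you only pay for the minutes you used it.
  run(`az group delete --name ${resourceGroup} --yes --no-wait`);
}
```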

Azure DevOps already nudges you in this direction by calling each box a Stage, not an environment. Stages are flexible and not directly tied to any physical or virtual representation of your environments.

Progressive deployments with Feature flags

Another popular DevOps technique is using Feature Flags to turn features on or off at runtime. This can be a simple if-statement that checks a value in your configuration file, or a more complex implementation that enables features based on user, location or a certain percentage of all traffic.
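
As a rough illustration, such a flag check can be as small as the sketch below. The flag names, rollout percentages and user ids are invented; a real implementation would read them from your configuration system.

```typescript
// Minimal feature-flag sketch: a flag can be hard on/off, or rolled out to a
// percentage of users. Flag names and numbers below are made up.
interface FlagConfig {
  enabled: boolean;
  rolloutPercentage?: number; // 0-100; omit for a plain on/off flag
}

const flags: Record<string, FlagConfig> = {
  "new-search": { enabled: true, rolloutPercentage: 10 },
  "new-timeline": { enabled: false },
};

// Deterministic hash so the same user always gets the same answer.
function bucket(userId: string): number {
  let hash = 0;
  for (const char of userId) {
    hash = (hash * 31 + char.charCodeAt(0)) % 100;
  }
  return hash; // 0-99
}

export function isEnabled(flagName: string, userId: string): boolean {
  const flag = flags[flagName];
  if (!flag || !flag.enabled) return false;
  if (flag.rolloutPercentage === undefined) return true;
  return bucket(userId) < flag.rolloutPercentage;
}

// The "simple if-statement" then looks like this:
if (isEnabled("new-search", "user-42")) {
  // show the new search experience
}
```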

Feature Flags allow you to work on a single branch of your code (avoiding merge work) and to run experiments: enable functionality for a subset of your users and monitor usage, performance and errors to decide whether you want to widen the rollout or turn the feature off.

Testing in Production is an often-heard term in this context. Instead of relying on a test or acceptance environment that doesn’t completely mirror production, you expose functionality to a small subset of users in the actual production environment. If something goes wrong, your (automated) monitoring picks up on it and disables the feature. You can even run the backend code of your new feature while keeping the frontend UI disabled. That way you test all paths through the code, but users won’t notice any degradation if you turn the feature off. Examples of this are the new timeline on Facebook or the new navigation UI in Azure DevOps.
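
The “monitoring disables the feature” part can be automated with a simple kill switch, sketched below. The flag store, threshold and error-rate query are stand-ins I invented for your real configuration and monitoring systems.

```typescript
// Sketch of an automated kill switch: if the error rate for a feature exceeds
// a threshold, the feature flag is turned off for everyone.
const errorRateThreshold = 0.05; // turn the feature off above 5% errors (arbitrary)

const flagStore = new Map<string, boolean>([["new-timeline", true]]);

function currentErrorRate(feature: string): number {
  // In reality this would query your monitoring system for recent telemetry.
  console.log(`Querying error rate for ${feature}...`);
  return 0.08; // dummy value for the sketch
}

function evaluateKillSwitch(feature: string): void {
  if (!flagStore.get(feature)) return; // already off
  const errorRate = currentErrorRate(feature);
  if (errorRate > errorRateThreshold) {
    flagStore.set(feature, false);
    console.log(`Error rate ${errorRate} for ${feature} exceeds ${errorRateThreshold}; feature disabled.`);
  }
}

// Run the check periodically, e.g. every minute.
setInterval(() => evaluateKillSwitch("new-timeline"), 60_000);
```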

With A/B testing you show two different versions of your application to different groups of users. You then monitor usage and use the data to decide which version of your application is the most successful. This can range from something small, like a textual change in your shopping cart, to a larger new feature, like a new search approach. By monitoring the two (or more!) versions for statistically significant differences you can influence your backlog and the direction of your product. You can even ask users in-product for their opinion (NPS is often used for this) to get even more data.
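
If you want to see what “statistically significant” can look like in practice, here is a small sketch using a pooled two-proportion z-test on invented conversion numbers. This is one common approach, not something the article prescribes.

```typescript
// Compare conversion rates of two variants with a pooled two-proportion z-test.
interface VariantResult {
  visitors: number;
  conversions: number;
}

function zScore(a: VariantResult, b: VariantResult): number {
  const pA = a.conversions / a.visitors;
  const pB = b.conversions / b.visitors;
  const pooled = (a.conversions + b.conversions) / (a.visitors + b.visitors);
  const standardError = Math.sqrt(
    pooled * (1 - pooled) * (1 / a.visitors + 1 / b.visitors)
  );
  return (pA - pB) / standardError;
}

// Invented numbers: variant B is the new shopping cart text.
const variantA: VariantResult = { visitors: 5000, conversions: 400 };
const variantB: VariantResult = { visitors: 5000, conversions: 460 };

const z = zScore(variantB, variantA);
// |z| > 1.96 corresponds to roughly 95% confidence for a two-sided test.
console.log(`z = ${z.toFixed(2)}, significant: ${Math.abs(z) > 1.96}`);
```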

LaunchDarkly is a great place to start if you want to experiment with feature flags.
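
For a first impression, here is a rough getting-started sketch with the LaunchDarkly Node server SDK (launchdarkly-node-server-sdk). The SDK key, flag key and user key are placeholders, and the exact context shape depends on your SDK version, so check the LaunchDarkly docs for the current details.

```typescript
// Evaluate a single LaunchDarkly flag for one user (sketch, placeholder keys).
import * as LaunchDarkly from "launchdarkly-node-server-sdk";

async function main(): Promise<void> {
  const client = LaunchDarkly.init("sdk-key-from-your-dashboard");
  await client.waitForInitialization();

  // The last argument is the fallback value used when the flag
  // (or LaunchDarkly itself) is unavailable.
  const showNewSearch = await client.variation(
    "new-search",
    { key: "user-42" },
    false
  );

  console.log(`new-search enabled for user-42: ${showNewSearch}`);
  client.close();
}

main().catch(console.error);
```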

Ring based deployments

Feature Flags offer fine-grained control over feature availability, but only after the code has been deployed. Doing the actual deployment in phases is what ring-based deployments are for: you split your users into groups (rings) and deploy your code to one group at a time. Different groups of users can even end up on different environments, for example if you have multiple data centers.
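
A simple way to split users into rings is a deterministic bucket per user or tenant, sketched below. The ring names and sizes are invented; in practice ring membership is often decided per organization, scale unit or data center rather than per individual user.

```typescript
// Assign a tenant to a ring with a deterministic hash bucket (0-99).
const rings = [
  { name: "ring-0", percentage: 1 },  // the team itself, early adopters
  { name: "ring-1", percentage: 9 },  // small public slice
  { name: "ring-2", percentage: 90 }, // everyone else
];

function assignRing(tenantId: string): string {
  let bucket = 0;
  for (const char of tenantId) {
    bucket = (bucket * 31 + char.charCodeAt(0)) % 100;
  }
  let threshold = 0;
  for (const ring of rings) {
    threshold += ring.percentage;
    if (bucket < threshold) return ring.name;
  }
  return rings[rings.length - 1].name;
}

console.log(assignRing("fabrikam")); // same tenant, same ring, every time
```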

That’s the deployment model used by Azure DevOps Services. Being a cloud service that’s online 24/7 and deployed all over the world, the ring-based deployment model allows the team to gradually roll out changes and monitor their success without affecting all users at once.

Ring 0 is where the team itself lives and where the Microsoft MVPs have their accounts. Ring 0 is the first to get the new bits, but also the one that could potentially break. The other rings expose new functionality to customers. Between each ring there is a wait period during which the team monitors for problems. If the deployment isn’t cancelled, the next ring is deployed. This is why the release notes always mention that a deployment will take a couple of weeks, and why new functionality sometimes isn’t visible in your organization yet.
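
The rollout loop itself can be as simple as the sketch below: deploy one ring, wait a bake period while monitoring, and only continue when the ring looks healthy. The functions deployTo and isHealthy are stand-ins for your real deployment and monitoring calls, and the one-day bake time is an arbitrary example.

```typescript
// Progressive, ring-based rollout: deploy, bake, check health, continue or abort.
const ringsInOrder = ["ring-0", "ring-1", "ring-2"];
const bakeTimeMs = 24 * 60 * 60 * 1000; // e.g. wait a day between rings

async function deployTo(ring: string): Promise<void> {
  console.log(`Deploying new bits to ${ring}...`);
  // call your deployment tooling here
}

async function isHealthy(ring: string): Promise<boolean> {
  console.log(`Checking error rates, latency and alerts for ${ring}...`);
  return true; // query your monitoring system here
}

const sleep = (ms: number) => new Promise((resolve) => setTimeout(resolve, ms));

async function rollout(): Promise<void> {
  for (const ring of ringsInOrder) {
    await deployTo(ring);
    await sleep(bakeTimeMs); // bake period between rings
    if (!(await isHealthy(ring))) {
      console.log(`Problems detected in ${ring}, cancelling the rollout.`);
      return; // later rings never get the bad bits
    }
  }
  console.log("All rings deployed.");
}

rollout().catch(console.error);
```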

Release Gates

The monitoring between stages can be done with Release Gates. A Release Gate sits between stages and monitors some external condition. As long as the condition is not met, the gate stays closed. Once the gate opens, the release continues. Out of the box you get the following Release Gates in Azure DevOps:

- Invoke Azure Function
- Invoke REST API
- Query Azure Monitor alerts
- Query work items

Invoking an Azure Function is the most extensible option. In your Azure Function you can run whatever checks you want, such as integrating with third-party applications like ServiceNow. Once the Azure Function returns a successful result, the pipeline continues.
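
As an illustration, here is a rough sketch of an HTTP-triggered Azure Function (Node.js v3 programming model) that a Release Gate could invoke. The ServiceNow check is only hinted at with a hard-coded value, and how the gate interprets the response depends on the success criteria you configure on the gate itself.

```typescript
// HTTP-triggered Azure Function that a Release Gate can call.
// Returns whether the release may continue (placeholder logic).
import { AzureFunction, Context, HttpRequest } from "@azure/functions";

const releaseGate: AzureFunction = async function (
  context: Context,
  req: HttpRequest
): Promise<void> {
  const releaseId = req.query.releaseId ?? "unknown";

  // Replace this with your real check, e.g. asking ServiceNow whether the
  // change ticket for this release has been approved.
  const changeApproved = true;

  context.res = changeApproved
    ? { status: 200, body: { status: "approved", releaseId } }
    : { status: 400, body: { status: "pending", releaseId } };
};

export default releaseGate;
```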

Conclusion

Traditional DTAP can still be useful, but I find it too restrictive. If you can easily create and destroy environments, why not add stages on demand to speed up your pipeline? Combine that with feature flags, ring-based deployments and release gates, and your pipelines can be way more flexible than a purely sequential DTAP pipeline.

So please, take a second look at your release pipeline and see if DTAP is really what you want or if there are better ways to deploy your application and make your users happy.

Wouter de Kort works as a lead architect and consultant. He helps organizations stay on the cutting edge of software development. Wouter focuses on DevOps. He loves solving complex problems and helping other developers to grow. Wouter authored the book DevOps on the Microsoft stack and a couple of other books. Wouter is a Microsoft MVP and an ALM Ranger. You can find him on Twitter (@wouterdekort), on his blog at wouterdekort.com and at the various conferences where Wouter speaks.


9 Responses

  1. I would love to see more about implementing ring based deployments. All of the articles I’ve read about it, for the last 2-3 years, are at the same high level as yours…

    • Hi Sam,

      I’m currently working on a demo project using GitHub and Azure DevOps to showcase these kinds of DevOps concepts in a working demo. Ring based deployments are definitely on my backlog! Once I have something to show I’ll post an update on my blog.

      Thanks!
      Wouter

  2. I love your analogy, a great tension builder and run-up to the point you are trying to make. The title was a bit click-baity, misdirection in a day and age where we fear content no longer has merit on its own but is ruled by the number of clicks an article generates, it seems. Indifferent of the actual result of that click, be it setting the right parameters for the context, supplying truthful information or sketching a “new world order” like yours. And you make some solid points which hit home with me, and I love that incentive to look beyond our own horizon if we are willing to stand on each other’s shoulders once in a while.

    Readers with less knowledge of the subject might take away the wrong idea though, as you are trying to stay as close as possible to your initial statement that DTAP is dead, as that is what generates the traffic. Yet what you are saying is: DTAP as a classical locally hosted infra solution has passed its sell-by date and the new kid on the block is IaC, IaaS. It is more flexible, it provides opportunities to approach DTAP in an even more fine-grained way, yet without the multitude of environments to host, as you scale up and down based on the needs of your development stage (assuming a CMP is used to minimize cost).

    Yet the process and procedures defined by DTAP, namely using separate environments (or instances, or containers, or whatever unit of measure you want to apply to the process), are still being applied, simply because governance and compliance requirements set by the expectations of your customer (availability, capacity, accessibility, security, to name a few) require you to come up with an absolute answer to not screw that up. And even in our new agile paradigm, designing from a UX perspective, user story or epic, scrum team ownership of the service or traceability of artifacts, you end up with the need to stay out of each other’s lanes during the progression of a build and secure repeatability and traceability of your best practices throughout the lifecycle of your service. When you reverse engineer it that way, you will end up with some need for stage autonomy to prevent contamination of your results. Be it development, testing (system, load, regression, etc.), acceptance testing with your customer, and ultimately the go-live: that process we call DTAP. And clearly from your article, it is still alive and kicking. And rightfully so.

    So DTAP is not dead, only the hardware side of things has shifted away from locally hosted to IaaS. Even if you run this against a PaaS option, on the SP side they will have some sort of IaaS to host their PaaS.

  3. I’m also not sure what you mean by saying it is dead, while other systems are based on the same principles.

    So, are you saying “testing” in itself is dead? Well, good luck.
