Note that this post is part of a series where I am ‘live blogging’ my way through the Ministry of Testing’s 30 Days of Agile Testing challenge.
What actions do we take when there is a red build? Well, what a timely question! I’ve just spent the last couple of days trying to figure out a few different build issues. The story illustrates one set of responses to a red build, but it also shows that there isn’t just one answer to the question of what we do when there is a red build.
On Thursday I noticed that one of the builds I use to check a lower-level package was red. It wasn’t just one or two tests failing. Every single test was failing. Clearly something was going badly wrong. I spent some time digging into it and finally realized that part of the package wasn’t getting extracted correctly during the setup. After some more time (and frustration), I figured out that the issue boiled down to the regression VM using an older version of 7zip. Apparently the build machine creating the package had been updated to a newer version, and so the old version on the machine I was using couldn’t properly extract the package. I updated the version of 7zip and re-ran the build. Everything was passing, so I posted an artifact to get picked up in the final build process. Everything is good now, right?
Wrong.
Friday morning I came in to find that instead of the build picking up an artifact from Thursday (as it should have), it had picked one up from May?!?! Stop the presses! More sleuthing required! We stopped the build from progressing and started digging into it. The problem ended up being another machine that had the wrong version of 7zip installed. This machine had also not cleaned up properly at some point and so had an old file hanging around that it could (and did) use. We fixed the 7zip version and updated the scripts to make sure they were correctly cleaning things up and now *touch wood* everything is running smoothly again.
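One cheap guard against this kind of stale-file surprise is to check how old an artifact is before using it. The snippet below is only a minimal sketch of that idea, not the actual cleanup scripts from this story: the function name `is_fresh` and the 24-hour default are assumptions made up for illustration.

```python
import os
import time


def is_fresh(path, max_age_hours=24.0, now=None):
    """Return True if the file at `path` was modified within `max_age_hours`.

    Hypothetical helper: the name and the 24-hour default are illustrative
    assumptions, not part of any real build pipeline.
    """
    # Allow the caller to pass `now` so the check is easy to test.
    now = time.time() if now is None else now
    age_seconds = now - os.path.getmtime(path)
    return age_seconds <= max_age_hours * 3600
```

A pipeline step could call something like this on the artifact it is about to pick up and fail loudly when the answer is False, rather than quietly using a file left over from months ago.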
The point of this story is to show that the things we do to deal with red builds vary. Normally we wouldn’t stop all other work and focus all our energy on fixing the build, but in this case the red build was of the ‘Nothing Works!’ category and so the steps taken were more drastic. In the ‘normal’ day-to-day red build, where a test or two is failing, our approach would be different. We would look into it and follow up, but if the issue was small enough we would let the build pipeline continue and just follow up with a fix. Or if we caught the issue early enough, we might just quickly revert a change and things could continue on as expected. The approach to a red build can’t be strictly prescribed and often requires exploration to figure out.
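If you wanted to write down the rough first pass described above, it might look something like the sketch below. Treat it as an illustration only: the function name, the `recent_change_known` flag, and the cutoff of two failing tests are all assumptions made for the example, and the fallback case is where the real exploration happens.

```python
def triage_red_build(failed, total, recent_change_known=False):
    """Suggest a first response to a build result.

    Hypothetical heuristic: the cutoff of two failures and the wording of
    the suggestions are illustrative assumptions, not a fixed policy.
    """
    if total <= 0 or not 0 <= failed <= total:
        raise ValueError("need total > 0 and 0 <= failed <= total")
    if failed == 0:
        return "green: nothing to do"
    if failed == total:
        # The 'Nothing Works!' category: drastic steps are justified.
        return "stop the pipeline and investigate"
    if recent_change_known:
        # Caught early enough that a quick revert gets things moving.
        return "revert the change and continue"
    if failed <= 2:
        # A test or two failing: keep going, but follow up.
        return "let the pipeline continue and follow up with a fix"
    # Anything else needs a human to look before deciding.
    return "investigate before deciding"
```

For example, `triage_red_build(40, 40)` suggests stopping the pipeline, while `triage_red_build(1, 40)` suggests letting it continue and following up with a fix.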
The lesson? Even when it comes to red builds, the context matters!