January 27, 2005 - A Bad Day With Continuous Integration

Yesterday we had a problem. Just before 6 pm someone checked in some changes that broke our unit tests...

Unfortunately that unit test failure prevented our agitation tests from running, and thus we didn't learn that a little after 5 pm someone else had committed a change that broke agitation. That in turn resulted in our big nightly agitation hanging, and now it will be several more hours before we get those test results.

By our standards, not having our nightly results is a big pain, a real disruption of our daily workflow, but take a look at what happened. The unit test failures caught a real bug, and we were notified within 10 minutes of the code being committed. Given the time of day it wasn't fixed promptly, but the failure of the nightly test meant that we caught the other bug the next morning, about 16 hours after it was checked in. What was the alternative?

The purpose of automated continuous integration is to reduce the cost of failure, and the result is that a bad day with CI is better than a good day without it.

Posted by Jeffrey Fredrick at January 27, 2005 12:35 PM

Well, I suppose it would have been even better if he had caught the problem before he checked in. Was the problem truly an integration problem? Or was it just something that was revealed because the full suite of unit tests is run by developers? Using a system like ClearCase's UCM allows for easily performing tests in the integration context (e.g. seeing all your co-workers' changes without bringing them into your own workspace). But that obviously doesn't address the problem of having a suite of unit tests that takes a long time to run.

Posted by: code_poet on March 25, 2005 08:31 AM

Yes, it would have been better if he had run the unit tests and caught the problem before checking in, and I'm a big fan of doing that. But in my experience people are human and will occasionally have lapses where they don't do something that they know they should, and Murphy's Law says that will be the time when something breaks. So yes, by all means run unit tests before checking in, but have an automated system running as a safety net as well rather than counting on your developers to be perfect.

Posted by: Jeffrey Fredrick on August 18, 2005 08:45 AM
