The eXperiment started from scratch, all code written test-first and in pairs. With all the pairing, testing, and refactoring I had developed a lot of confidence in our code, but then we got to a point where we needed to use some of the code from Agitator. This caused me quite a bit of concern. Actually, concern is probably too concrete a word; I felt a great deal of unease. My instincts are typically in line with the motto Joel attributes to the Excel team: "Find the dependencies -- and eliminate them." But in this case there was no way around it. There was something we needed, Agitator had it, and it was ridiculous to think of rewriting it from scratch. Then as we moved to actually integrate the code, my unease condensed into a concern, and the concern had a name: JClass.
JClass is a large, complex, and heavily used class, reported as the riskiest by the Agitator dashboard. Here we had developed a nice clean codebase and I wasn't excited about opening the door to who-knows-what sort of complications. So I went to Mark -- author of the feature we needed and also the author of JClass -- and asked if there was any way we could refactor JClass out of the equation. He took my request with good grace and spent some time poking around the code before reporting his negative finding. Then he asked why I wanted to avoid JClass. When I shared my concerns he laughed and said, "Sure it is a risky class. But it is also the most heavily tested class we have!" A glance at the dashboard showed he was right. My concern was allayed, we moved ahead on the integration, and so far everything is going swimmingly. I can't think of anything else that Mark could have said that would have been more reassuring.
From my own experience, I've found unit-tested code to be much more reliable than code that wasn't unit tested. In a previous life as a QA engineer I worked on a release where some of the code was written by a sub-team using XP, while the remainder was written in the "normal" way. My subjective experience was that the code from the XP team mostly worked from the moment it was first delivered, and was actually worth spending my time to test. The "normal" code, on the other hand, as likely as not didn't work except in the imagination of the engineers who wrote it, and I'd waste a lot of time showing them how their stuff was fundamentally broken.
Eventually I helped said company roll out unit testing across the whole development organization. A year after rollout, I ran a report against the version control system that counted checkins in the test directory per engineer. The results ranged from a low of about 70 checkins to a high of around 700 checkins, with most engineers in the 200-300 range. I think it is unsurprising that the developers with the strongest reputations on the team were on the high end while the developers with the poorest reputations were on the low end.
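The report I ran could be reproduced with a short script. This is just a sketch with made-up data: the post doesn't say which version control system was in use or how the log was formatted, so the (author, path) pair format and the "test/" directory name here are my own assumptions.

```python
from collections import Counter

def tally_test_checkins(log_entries):
    """Count checkins per engineer for files under the test directory.

    `log_entries` is assumed to be a list of (author, path) pairs
    extracted from the version control log -- a hypothetical format,
    since the original report's VCS and log layout aren't given.
    """
    counts = Counter()
    for author, path in log_entries:
        if path.startswith("test/"):
            counts[author] += 1
    return counts

# Illustrative made-up data:
log = [
    ("alice", "test/FooTest.java"),
    ("alice", "src/Foo.java"),
    ("bob", "test/BarTest.java"),
]
print(tally_test_checkins(log))  # Counter({'alice': 1, 'bob': 1})
```

Sorting the resulting counts from high to low gives the per-engineer spread described above.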
This correlation has carried over into my open source work as well. As a committer on CruiseControl, I review lots of submissions, and I've found that those that come with unit tests are consistently of higher quality than those that don't. I have enough faith in this correlation that a submission that comes with a good unit test will often be committed without a deep inspection of the code.
This correlation between developer tests and code quality seems obvious to me, both in theory and in practice. So why isn't this considered floor zero in software development? Why aren't QA people marching in the streets demanding developer tests before wasting their time on stuff that flat-out doesn't work? Why doesn't every contract include a developer-testing provision, and why don't component vendors advertise their developer testing metrics?
And what is it going to take for this industry to change, to graduate from alchemy to chemistry?
Posted by Jeffrey Fredrick at March 15, 2004 04:05 PM