I'll start off with some self-congratulatory back-patting -- our predictions worked out just as we predicted. As a reminder, our hypothesis was:
If anything, the reality was better than we expected. Looking at the project metrics, I see that our coverage (in terms of TCI) was about 77% based only on our JUnit tests. Interestingly enough, to accomplish this we created about 20% more test code than code under test. When we added agitation our coverage was extremely high: around 80% for the entire project, and reaching 100% for a significant portion of our classes from the first click of "go." After configuring some factories to produce test data, we currently have a TCI of about 94%. The ease with which we were able to get such good coverage is clearly a consequence of using TDD; as predicted, writing tests for the code made it more testable. Another XP practice deserves some credit as well: Refactor Mercilessly. Constant refactoring kept the code simple and predictable, which made agitation quite a bit simpler than it might have been. As a result, a significant fraction of Agitator's observations could be transformed directly into assertions.
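To make those last two points concrete, here is a minimal sketch (not our production code, and every name in it is made up) of what I mean by a test-data factory and by promoting one of Agitator's observations into a plain JUnit assertion. How a factory like this actually gets registered with Agitator is a configuration detail I'm glossing over.

```java
import junit.framework.TestCase;

// Hypothetical domain class whose constructor rejects junk input.
class Order {
    private final int quantity;

    Order(int quantity) {
        if (quantity <= 0) {
            throw new IllegalArgumentException("quantity must be positive");
        }
        this.quantity = quantity;
    }

    int getQuantity() {
        return quantity;
    }
}

// Hypothetical factory: the sort of helper a tool like Agitator can be
// pointed at when it can't invent valid instances on its own.
class OrderFactory {
    static Order createSampleOrder() {
        return new Order(3);
    }
}

// An observation noticed during agitation ("getQuantity() is always > 0")
// promoted into an ordinary JUnit (3.x) test so it runs on every build.
public class OrderTest extends TestCase {
    public void testQuantityStaysPositive() {
        Order order = OrderFactory.createSampleOrder();
        assertTrue("quantity should be positive", order.getQuantity() > 0);
    }
}
```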
Of course with all the extra testing we were able to accomplish with Agitator, we expected to find some bugs that escaped our manually-created JUnit tests. This certainly happened, and the most dramatic instance was when Agitator surfaced two places in the code that were generating StackOverflowErrors. These bugs had escaped not only our unit and acceptance tests but a bit of exploratory testing as well. While it is possible we would have caught them before shipping without Agitator, I think it is more likely that Agitator saved us at least a few support calls.
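The real offenders were project-specific, but the shape of the bug will look familiar. Here is a contrived sketch (none of this is our actual code) of the sort of legal-but-unanticipated object graph that sends an innocent-looking method into unbounded recursion -- exactly the kind of input hand-written tests rarely set up and agitation stumbles into quickly.

```java
// Contrived illustration of a StackOverflowError of the kind agitation
// tends to surface: two objects that may legally refer to each other,
// plus a method that recurses through the cycle.
class Employee {
    private final String name;
    private Employee manager;

    Employee(String name) {
        this.name = name;
    }

    void setManager(Employee manager) {
        this.manager = manager;
    }

    // Harmless-looking, but recurses forever when the management
    // chain contains a cycle.
    public String toString() {
        return manager == null ? name : name + " (reports to " + manager + ")";
    }
}

class CycleDemo {
    public static void main(String[] args) {
        Employee alice = new Employee("Alice");
        Employee bob = new Employee("Bob");
        alice.setManager(bob);
        bob.setManager(alice);      // legal, but creates a cycle
        System.out.println(alice);  // throws StackOverflowError
    }
}
```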
Finally there is the "softer side of testing" -- the psychological impact on the customer and the effect that has on the value of the code. We stored our stories on the wiki and had a public dashboard that charted the progress of our testing. Watching story after story get ticked off, combined with a dashboard showing our tests growing in lockstep with our codebase, provided very persuasive evidence that our progress was real and not a snow-job with a second shoe waiting to drop later in the cycle.
Despite our predictions playing out as we had hoped, there were a couple of surprises that stand out as significant.
The most important, in my mind, is something Kevin and I have each written about before: Agitator Driven Refactoring (ADR). Michael Feathers, in The Bar is Higher Now, lays out a challenge:
“You think your design is good? Pick a class, any class, and try to instantiate it in a test harness. I used to think that my earlier designs were good until I started to apply that test. We can talk about coupling and encapsulation and all those nice pretty things, but put your money where your mouth is. Can you make this class work outside the application? Can you get it to the point where you can tinker with it, in real time, build it alone in less than a second, and add tests to find out what it really does in certain situations? Not what you think might happen, not what you hope might happen, but what it really does?”
This is where Agitator -- and by extension ADR -- really shines. Agitator is merciless: it cares nothing for your assumptions about what should happen or what sort of inputs you expected. If you care about reaching the bar that Michael has put out there, then Agitator is the fastest way to get there and to stay there.
There was another surprise that had a big impact on our effort, but not in a good way. We spent time early on getting an acceptance testing framework up and running, but in the process we made what turned out to be a strategic blunder: we didn't apply once-and-only-once across our two sets of tests. As a result we had a decent framework for testing our UI on the acceptance test side, while the unit tests for the same UI were done in a very ad-hoc manner. Worse, because the unit tests and acceptance tests exercised the UI differently, we frequently made changes during development that broke the acceptance testing framework. As a result our acceptance testing effort suffered, with a significant amount of time spent keeping the framework up to date with changes in the UI. Eventually we decided to address the problem and refactor both the unit tests and the acceptance tests to use a new, more flexible testing framework. The new framework had a few benefits. First, it dramatically reduced duplication across our unit tests. Second, it made it easier to do TDD on our UI. And finally, because the unit tests and acceptance tests now share a common infrastructure, we no longer break the acceptance tests when modifying the unit tests -- updating the framework for the unit tests serves the acceptance tests as well.
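Our UI and the framework itself were project-specific, so the sketch below uses entirely hypothetical names, but it shows the shape of the fix: a single driver that is the only code allowed to poke the UI, with both the unit tests and the acceptance tests going through it.

```java
import java.awt.event.ActionEvent;
import java.awt.event.ActionListener;
import javax.swing.JButton;
import javax.swing.JLabel;
import javax.swing.JTextField;
import junit.framework.TestCase;

// Every name here is made up; the point is the shape, not the details.

// Stand-in for one of the real screens.
class OrderScreen {
    final JTextField quantityField = new JTextField();
    final JLabel statusLabel = new JLabel();
    final JButton submitButton = new JButton("Submit");

    OrderScreen() {
        submitButton.addActionListener(new ActionListener() {
            public void actionPerformed(ActionEvent e) {
                int quantity = Integer.parseInt(quantityField.getText());
                statusLabel.setText(quantity > 0 ? "accepted" : "rejected");
            }
        });
    }
}

// The shared piece: the only code that knows how to drive the screen.
// Both the unit tests and the acceptance tests go through this driver,
// so a change to the screen is absorbed here once, not in two suites.
class OrderScreenDriver {
    private final OrderScreen screen;

    OrderScreenDriver(OrderScreen screen) {
        this.screen = screen;
    }

    void submitQuantity(int quantity) {
        screen.quantityField.setText(String.valueOf(quantity));
        screen.submitButton.doClick();
    }

    String status() {
        return screen.statusLabel.getText();
    }
}

// A unit test written against the driver; an acceptance test would use
// the same driver against the fully assembled application.
public class OrderScreenTest extends TestCase {
    public void testRejectsZeroQuantity() {
        OrderScreenDriver driver = new OrderScreenDriver(new OrderScreen());
        driver.submitQuantity(0);
        assertEquals("rejected", driver.status());
    }
}
```

The payoff is exactly the once-and-only-once we skipped the first time around: when a screen changes, the driver changes with it, and both test suites keep running.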
Posted by Jeffrey Fredrick at May 14, 2004 05:11 PM
Does Agitar have any plans to make that GUI testing application public?
Posted by: Chidambaram Danus on April 19, 2005 12:48 AM