July 16, 2004 - Testing HTML Pages

I started this article intending to talk about a technique we developed for testing Velocity templates but realized that there was enough background material for a separate article on testing html. So, this entry describes how we developed a harness for checking the output from the Management Dashboard. A second entry will talk about how we adapted the harness for testing Velocity templates.

When we started developing the Management Dashboard, we wanted to have a suite of acceptance tests that would be developed in tandem with the coding of the actual stories. The focus of these tests was to describe the detailed requirements of each story as well as do what I like to call "testing around the edges".

Using XPath

I had written tests like this before, so I volunteered to help get the acceptance testing effort started. I have been successful on previous projects using JUnit with XPath to check the details of the output so that's where we started. XPath is an expression language for retrieving values from an XML document declaratively. XPath saves you from the potentially error-prone trouble of walking through an XML DOM programmatically.

For example, this expression

//table[2]/tr[1]/td[3]

will retrieve the contents of the third cell of the first row of the second table on a page. XPath expressions can get very complex, but I have found that they rarely get much more complex than the example above when I use them for testing html. There are several XPath processors available. We use the one in Apache's Xalan. You can read more about XPath at http://www.w3.org/TR/xpath.
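To make the idea concrete, here is a minimal sketch of evaluating an expression of that shape against a small page. It uses the javax.xml.xpath API that ships with the JDK rather than Xalan's XPathAPI, so that it runs without extra jars. The class name, the textAt helper and the sample markup are all invented for the illustration, and the markup has to be well-formed XML for the parser to accept it.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

public class XPathCellExample {

    // A tiny, well-formed sample page made up for this illustration.
    static final String PAGE =
        "<html><body>"
        + "<table><tr><td>ignored</td></tr></table>"
        + "<table>"
        + "<tr><td>first</td><td>second</td><td>25%</td></tr>"
        + "<tr><td>a</td><td>b</td><td>c</td></tr>"
        + "</table>"
        + "</body></html>";

    // Returns the string value of the first node matched by the expression
    // (an empty string if nothing matches).
    static String textAt(String markup, String expression) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            return xpath.evaluate(expression, new InputSource(new StringReader(markup)));
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("Could not evaluate [" + expression + "]", e);
        }
    }

    public static void main(String[] args) {
        // Third cell of the first row of the second table.
        System.out.println(textAt(PAGE, "//table[2]/tr[1]/td[3]")); // prints 25%
    }
}
```

The test never walks the DOM by hand. The expression says where the value lives and the processor does the navigation.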

Writing the Tests

The first few tests we wrote looked something like this:

public void testSummaryDetails() {

  IndexPageHarness page = new IndexPageHarness("test/results/index.html");
  // ...
}

This is somewhat different from the style I use when I am writing unit tests. With unit tests, I go in very small steps alternating frequently between code and test. I use Extract Method and Extract Class to create just enough harness to keep the tests readable and remove duplication. By contrast, when I am writing acceptance tests, I like to write tests that reflect the steps that an actual user would follow.

We wrote a whole bunch of tests up front to describe the story in terms that a user would understand. We ignored all the squiggly red lines that told us that IndexPageHarness did not exist and tried not to worry about the details of how it would be implemented. We also tried to keep in mind that the acceptance tests would be taking the place of several of the artifacts - functional spec, test plan, test design - that one might deliver on a non-agile project. In short, we tried to keep the tests readable.

Here's what the test above does:

  1. Run an Ant script (sample-code.xml)
  2. Check the results. The index page should show :
    • Number of target classes = 4
    • Number of test classes = 1
    • Classes with test points = 25%

Implementing the Harness

Once all the tests for one story were complete, we went back and used the magic of IntelliJ to create the skeleton for the harness and made all the assertion methods call Assert.fail("Not written yet"). As the programmers on the team got around to implementing the story, we went back and filled in the details of each assertion to make it pass - or fail, if the developer's understanding of the requirement was different from ours.
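A skeleton of that kind might look something like the sketch below. Only assertNumberOfTargetClasses appears in our real harness as shown in this article. The class name and the other two method names are made up for the illustration, and a plain AssertionError stands in for Assert.fail so that the sketch compiles without JUnit on the classpath.

```java
public class IndexPageHarnessSkeleton {

    // Path of the generated page; unused until the assertions are filled in.
    private final String path;

    public IndexPageHarnessSkeleton(String path) {
        this.path = path;
    }

    public void assertNumberOfTargetClasses(int expected) {
        notWrittenYet();
    }

    // Hypothetical name, chosen to match the second value on the index page.
    public void assertNumberOfTestClasses(int expected) {
        notWrittenYet();
    }

    // Hypothetical name, chosen to match the third value on the index page.
    public void assertClassesWithTestPoints(String expected) {
        notWrittenYet();
    }

    // Stand-in for junit.framework.Assert.fail("Not written yet").
    private static void notWrittenYet() {
        throw new AssertionError("Not written yet");
    }
}
```

Every test fails at first, which is the point: the suite compiles, reads like the story, and shows exactly which assertions still have to be implemented.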

The assertions use xpath to access the actual values in the resulting html. Here are the details of one of the assertion methods from our test :

public void assertNumberOfTargetClasses(int expected) {
  page.assertXPathEquals("Number of Target Classes", expected, "//table[2]/tr[3]/td[2]");
}

In the first few methods, we hard-coded the xpath and called the methods in junit.framework.Assert and org.apache.xpath.XPathAPI directly. But we refactored mercilessly to eliminate duplication and ended up with a harness that came into being just in time for the tests that needed it.

The IndexPageHarness hides all of the details of reading and parsing the correct page and we were able to Extract Superclass to get a framework that had methods like these :

// Compares the text found at the given xpath with the expected value.
public void assertXPath(String message, String expected, String xpath) {
  try {
    assertEquals(message, expected, document.getXPathText(xpath));
  } catch (TransformerException e) {
    throw new RuntimeException(e);
  }
}

// Returns the value of the single node selected by the xpath.
public String getXPathText(String xpath) throws TransformerException {
  return getNode(xpath).getNodeValue();
}

// Selects a single node, failing loudly if the xpath matches nothing.
public Node getNode(String xpath) throws TransformerException {
  Node node = XPathAPI.selectSingleNode(getDocumentRoot(), xpath);
  if (node == null) {
    throw new NoSuchElementException("Node not found [" + xpath + "]");
  }
  return node;
}
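
For readers who want to try the idea without our code base, here is a self-contained sketch of the same three methods. It is not our actual harness: it swaps Xalan's XPathAPI for the javax.xml.xpath API in the JDK, throws a plain AssertionError instead of calling JUnit, and parses a string rather than a file. The class name and the sample markup are invented. It also reads the text with getTextContent(), so that an expression selecting an element such as a td yields the text inside it.

```java
import java.io.StringReader;
import java.util.NoSuchElementException;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class PageHarnessSketch {

    private final Document document;

    // Parses well-formed markup once; every assertion reuses the same DOM.
    public PageHarnessSketch(String markup) {
        try {
            document = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(markup)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse page", e);
        }
    }

    // Fails with a readable message when the text at the xpath differs.
    public void assertXPath(String message, String expected, String xpath) {
        String actual = getXPathText(xpath);
        if (!expected.equals(actual)) {
            throw new AssertionError(
                message + " expected:<" + expected + "> but was:<" + actual + ">");
        }
    }

    // Returns the trimmed text content of the single node the xpath selects.
    public String getXPathText(String xpath) {
        return getNode(xpath).getTextContent().trim();
    }

    // Selects one node, failing loudly if the xpath matches nothing.
    public Node getNode(String xpath) {
        Node node;
        try {
            node = (Node) XPathFactory.newInstance().newXPath()
                .evaluate(xpath, document, XPathConstants.NODE);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad XPath [" + xpath + "]", e);
        }
        if (node == null) {
            throw new NoSuchElementException("Node not found [" + xpath + "]");
        }
        return node;
    }

    public static void main(String[] args) {
        PageHarnessSketch page = new PageHarnessSketch(
            "<html><body><table>"
            + "<tr><td>Number of target classes</td><td>4</td></tr>"
            + "</table></body></html>");
        page.assertXPath("Number of Target Classes", "4", "//table[1]/tr[1]/td[2]");
        System.out.println("ok");
    }
}
```

Throwing NoSuchElementException for a missing node, rather than letting a null slip through, means a changed layout shows up as "Node not found" with the offending xpath instead of as a confusing NullPointerException.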

I came to this technique for building test harnesses incrementally after seeing many test automation efforts fail because the testing team got so carried away with adding bells and whistles to their harness that they never got around to writing any actual tests. It also feels more productive to be making continuous progress towards the goal of having a comprehensive test suite when the costs of building the harness are distributed across the whole testing project.

Why JUnit ?

About now, many of you are probably wondering why we didn't use FIT or Ruby or some higher level testing technique for acceptance testing. The simple answer is that we all know Java and we feel more productive using the tools that we are familiar with. The tests will be read by non-programmers but it doesn't take too much effort to keep the tests readable if you follow a few simple rules like avoiding conditions and iteration in the actual tests - it's better to hide the complex details away in the harness.

More Tests

This approach got us a long way towards having a complete harness for testing the output from the Management Dashboard. The point of the harness was not to check that the generated html was correct - there are too many ways that html can be wrong to check that. The point was to check that the values embedded in the html were correct. Once the harness was in place, it allowed us to concentrate on writing more tests. They all had the same basic shape:

  1. Run an Ant script
  2. Check the results

and the harness helped to keep all the arcane details of html and xpath out of the tests.

Maintaining the Harness

We were glad we had the tests when, some way into the project, we decided to use Velocity templates to generate the output. But problems started to creep in when later stories changed the layout of the html and caused the tests we had already written to fail. We also realized that the acceptance tests overlapped a lot with the unit tests.

The next article will describe what we did to address these problems and also address the additional challenges that we encountered as we tried to keep our Velocity templates in sync with the model code.

Posted by Kevin Lawrence at July 16, 2004 05:01 PM
