JUS - Journal of Usability Studies
An international peer-reviewed journal

Conducting Iterative Usability Testing on a Web Site: Challenges and Benefits

Jennifer C. Romano Bergstrom, Erica L. Olmsted-Hawala, Jennifer M. Chen, and Elizabeth D. Murphy

Journal of Usability Studies, Volume 7, Issue 1, November 2011, pp. 9 - 30

Discussion

Participant accuracy in completing tasks increased successively with each iterative test from Iteration 1 to Iteration 3, for a total increase of 34% in accuracy across all three iterations, but then accuracy dropped from Iteration 3 to Iteration 4.  Participant self-reported satisfaction also increased successively from Iteration 1 to Iteration 3 and decreased in Iteration 4.  See Table 4 for accuracy and satisfaction scores across all four iterations and Table 2 for accuracy scores for the tasks that were repeated across iterations.  Although some issues raised in earlier iterations were resolved, new issues emerged.  The iterative process allowed the teams to identify these instances and work to correct them.

There is a lack of empirical support for many of the recommendations from well-known experts about iterative testing.  This paper contributes to the “usability body of knowledge” by demonstrating empirical support for a practice that is often recommended yet seldom implemented (cf. the RITE method, in which changes can be made following each session, as soon as a usability problem is identified; Medlock et al., 2005).  By repeating some tasks across iterations, we were able to evaluate whether there were continual improvements from iteration to iteration.  With design changes from one iteration to the next, we were able to assess whether participants were successful with the new design or whether the changes and additional available functionality had caused new problems.  With each successive usability test from Iteration 1 to Iteration 3, we saw incremental gains in user performance and satisfaction, yet in Iteration 3 we also found that a design recommendation had caused a new, unforeseen problem.  Participants encountered a high-priority issue in Iteration 4 that we were not able to assess in earlier iterations because the functionality was not yet available.  In Iteration 4, we were also unable to evaluate solutions for the usability issue we had encountered in Iteration 3: participants were all stymied early on in the site by difficulties with the geography overlay and so could not get to deeper pages.  This highlights the value of continued iterative testing: testing should continue after modifications are made.  It also shows that most of the value of an iteration can be lost if a “show-stopper” issue is introduced into it.

We believe that it was important to start with paper prototypes.  Paper is a medium that is easy to manipulate and to change.  When creating a working relationship with the developers, it helped that they had not yet created the back end of the application (i.e., nothing had been hard coded, and no application actually existed yet); an existing back end often weds developers to the design.  We involved the designers and developers each step of the way by encouraging them to attend sessions, to think of solutions to the usability issues, and to comment on and revise our ideas for recommended improvements.  At the end of each session, we discussed the usability issues and possible fixes with the observers and thus got them into the mindset of anticipating modifications to their design when they were still willing to make changes.  This had a lasting impact throughout the entire iterative cycle.

It is likewise important for the developers to be partners in the usability testing process.  Our partnership was possible because we involved the developers in task development, invited them to observe usability testing sessions and post-test discussions, and met with the design-and-development team regularly to review usability findings and discuss recommendations.  We had ideas for improvements, but we collaboratively came up with solutions to test.  Each team valued and respected what the other offered.

The AFF prototype changed drastically from Iteration 1 to Iteration 2, but because we started the usability testing early in the design, the process of refining the design was manageable for the designers and developers.  In each step of the process, we worked cooperatively with them, making use of our different skill sets, interests, and visions. 

Although most design teams are accustomed to addressing usability at the end of the product development cycle, we addressed usability throughout the development cycle.  We anticipated that few surprises would occur with the final product.  Developers watched participants interact with their product and were able to see firsthand what worked and what did not.  We got the product into the users’ hands, found out what they needed, and quickly identified usability issues: It was an efficient develop-test-change-retest process.  As part of the process, we were able to find a “show stopper” and recommend changes to fix it, although it would have been more productive to have anticipated this issue.  In future usability tests, we plan on asking for designs of key functionality earlier in the process so we can provide feedback prior to the screens being created for testing.

Some of the challenges associated with the project included the pressures of time and budget considerations.  Even though these were government studies and not subject to market pressures, there were time (the approaching release deadline) and budget (for the design contractors) constraints.  Working in our favor was the fact that the AFF team wanted a usable public site for the dissemination of Census data, and they agreed to a rigorous program of iterative usability testing.  Additional challenges that we faced during the iterative cycle included the need to work under pressure for quick turnaround, as this involved the logistics of recruiting and bringing in participants in short order and producing preliminary reports quickly.  We also faced the challenge of convincing the designers and developers that we had something to offer.  This was where bringing the developers in to view the test sessions, and to review the major observations and findings after the participant completed all the tasks, made a big difference.  We found that watching the participant interact with the prototype firsthand was a very valuable experience for programmers and project managers.

While we were invited into the process early, we found that it was not early enough.  The project manager and a different team came up with the requirements document that the conceptual design was based on.  Usability testing was only thought of after the initial conceptual design had been created in the form of paper prototypes.  We tested the paper conceptual design and it performed poorly.  The AFF team realized that the system could not be built as it was in the time allotted, so they went back to the drawing board.  It then took 6 months to revise the prototype so we could test the new version.  It is very possible that usability staff could have had a role in the requirements-gathering stage, which might have lessened the need to, in the words of the developers, “dramatically scale back” the design.  We recommend future studies examine the impact of usability testing earlier than what we were able to do in this study. 

During this series of tests, some of the team members worked part time at the Census Bureau.  In addition, the Human Factors and Usability Research Group and the team members on this project had multiple ongoing projects.  As a result, the turnaround time in this series of tests, from study completion to meetings with the design team to recap findings, averaged 2½ weeks.  While this series of tests and the turnarounds were not as quick as we would have liked, designers and developers attended the sessions, and we had ongoing, regular discussions with the AFF team about the findings and potential solutions to the problems while they continued to code and work on the back end of the Web site.  Thus, they were a part of the process and did not wait for our documented results to continue with the product.  This series of tests could not have worked without the commitment and collaboration of both the usability team and the AFF team.

Future Directions

Iterative testing was a valuable process in testing the Census Bureau’s new AFF Web site.  In successive, increasing-fidelity iterations, we were able to identify issues, recommend ways to resolve them, quickly turn around design changes, and test the site again.  We obtained continuous feedback through the development of the new Web site because participants were involved in every round of testing.  As measured by participant accuracy and satisfaction across iterations, the usability of the Web site improved considerably but then declined dramatically in the final iteration.  Because the accuracy scores dropped significantly in the last round, the usability team recommended that iterative usability testing continue on the live site until user performance reaches established goals as set forth by the development team.

In our experience, it was important for the AFF team to witness participants struggling with their Web site.  In future tests, we plan on having a sign-in sheet for observers so we can monitor the number of observers, the number of repeat observers, and where they are from (i.e., company, division, group).  In this study, we did not tabulate whether the early attendance from the AFF team led to increased attendance for later sessions, but this is something that we plan to record in future tests.
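As one illustration of how such sign-in data might be tabulated, the following is a minimal sketch in Python.  It is not a tool used in this study: the record layout (session, observer name, affiliation), the function name `summarize_observers`, and all sample values are hypothetical placeholders chosen for the example.

```python
from collections import Counter

# Hypothetical sign-in records: (session_id, observer_name, affiliation).
# All values are illustrative placeholders, not data from this study.
sign_ins = [
    ("session-1", "Observer A", "Design contractor"),
    ("session-1", "Observer B", "AFF team"),
    ("session-2", "Observer A", "Design contractor"),
    ("session-3", "Observer A", "Design contractor"),
    ("session-3", "Observer C", "AFF team"),
]


def summarize_observers(records):
    """Tally unique observers, repeat observers, affiliations, and attendance."""
    sessions_per_observer = Counter()   # how many sessions each observer attended
    attendance_by_session = Counter()   # how many observers attended each session
    affiliation_of = {}                 # last recorded affiliation per observer
    seen = set()

    for session_id, name, affiliation in records:
        if (session_id, name) in seen:
            continue  # ignore a duplicate sign-in within the same session
        seen.add((session_id, name))
        sessions_per_observer[name] += 1
        attendance_by_session[session_id] += 1
        affiliation_of[name] = affiliation

    # A repeat observer is anyone who signed in at more than one session.
    repeat_observers = sorted(
        name for name, count in sessions_per_observer.items() if count > 1
    )

    return {
        "unique_observers": len(sessions_per_observer),
        "repeat_observers": repeat_observers,
        "observers_by_affiliation": dict(Counter(affiliation_of.values())),
        "attendance_by_session": dict(attendance_by_session),
    }


summary = summarize_observers(sign_ins)
for key, value in summary.items():
    print(f"{key}: {value}")
```

Comparing the per-session attendance counts over time would be one way to examine whether early attendance is followed by increased attendance at later sessions.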

In summary, usability researchers on any Web site or software development team should aim to include several iterations in their test plans because, as we have demonstrated here, iterative testing is a useful and productive procedure for identifying usability issues and dealing with them effectively.

 
