
Conducting Iterative Usability Testing on a Web Site: Challenges and Benefits

Jennifer C. Romano Bergstrom, Erica L. Olmsted-Hawala, Jennifer M. Chen, and Elizabeth D. Murphy

Journal of Usability Studies, Volume 7, Issue 1, November 2011, pp. 9 - 30


Iteration 3: Search and Navigation Plus Core Functions Available

Iteration 3 was a medium-fidelity usability test of slightly higher fidelity than Iteration 2: screens were partially clickable (Romano et al., 2010).  In this round, testing evaluated specific aspects of the user interface by examining participants' success and satisfaction on a few selected tasks.  Our aim was to evaluate whether the problematic elements of the interface had been resolved by the changes made after Iteration 2.  We repeated two tasks from Iteration 2 and introduced four new tasks that tested new functionality.

Materials and Testing Procedure

We tested the interface with six novices and seven experts over eight days, and as with previous iterations, members of the AFF team (one to four observers per session) attended all sessions.  In this round of testing, the screens were semi-functional: not all of the buttons and links worked.  The only active buttons and links were those needed to test the elements of interest in this usability test.  We tested the Search Results page (right panel of Figure 4), the Table View page (right panel of Figure 5), and the Map View page (right panel of Figure 6).

The procedure was identical to Iteration 2, except that the eye-tracking machine was down for maintenance, so we did not collect eye-tracking data.  Although we had intended to collect eye-tracking data, we proceeded with testing the new iteration in order to get results and feedback to the design team quickly.  Each session lasted about 30 minutes.


Results

This section highlights accuracy, satisfaction, and some noteworthy findings that emerged during the third round of testing.  Accuracy was higher than in previous iterations: For novice participants, the average accuracy score was 74%; for expert participants, it was 84%.  Accuracy scores ranged from 50% to 100% across participants and from 40% to 100% across tasks.  Satisfaction was also higher than in previous iterations: The average satisfaction score was 6.66 out of 9, with 1 being low and 9 being high.  The average satisfaction score for novice participants was 6.51, and for expert participants, it was 6.78.

As with previous iterations, we examined participants’ behavior and comments, along with accuracy and satisfaction, to assess the usability of the Web site and to infer the likely design elements that caused participants to experience difficulties.

Finding 1: Modify Table caused new usability issues.

Although the new Modify Table label enabled participants to perform certain tasks that participants in Iteration 2 had not been able to perform, participants now went to Modify Table to attempt tasks that were not supported by the functionality available there.  Changing the label from Enable Table Tools to Modify Table on the Table View page (Figure 5) made it clear to participants that they could use that button to modify their table (e.g., remove margins of error from the table).  However, the breadth of the Modify Table label implied that people could use that button to add items, such as additional geographies, to their table.  The label was so clear that participants were drawn to it, but they could not apply all possible modifications to their table using that button.

For the purpose of adding geographies, the developers had intended participants to click on "Back to Search" to return to their search results.  During our previous meetings with the AFF team, we had learned that, from the developers' perspective, it was easier to handle the complexities of the geography details from the original Search page.  As such, the developers said they really wanted this approach to work, even though they, and we, suspected it would not.  This was another example of the real-life constraints of time and money at work: If Back to Search did not work, a complete redesign would have been required very late in the project timeline, and the deadline to release the first 2010 Census results would have been missed.  The Back to Search function was already hard-coded at this point, so we tested it, but as usability testing showed, this concept did not match the participants' mental models.  This round of testing highlighted the button label Back to Search as an example of wording written from the programmer's perspective that did not work for the end user.  However, due to the real costs and schedules involved, some of the results of usability testing had to be sacrificed or deferred to meet the deadline and manage costs.  We recommended placing the options to modify the table and to go back to search on the same line so users could visually associate them.  We also recommended adding a clear, simply labeled "Add Data to Table" button next to the Modify Table button; this new button would function in the same way as Back to Search.

Finding 2: “Colors and Data Classes” worked well for participants. 

In this iteration, participants were able to successfully complete the task that asked them to change the color of the map.  Changing the label from Data Classes to Colors and Data Classes on the Map View page (Figure 6) was effective in making the tab more usable, as participants readily used this option when a task required them to change map colors.  Note that we did not test whether participants understood Data Classes or whether an action verb would have helped; we only know that adding the word "Colors" helped participants change the colors on the maps.  This change satisfied the AFF team, but as usability professionals, we are not entirely convinced that this is the "right" solution or that users will completely understand the functions of the tab.  We felt that including action verbs in the labels was still the best option, but under the time constraints of the impending launch of the new site, we did not test this further.

Plans for Iteration 4

Overall accuracy increased, and specifically, accuracy for the two repeated tasks increased.  While usability improved, new usability issues were discovered.  We met with the AFF team and recapped findings and recommendations from Iteration 3.  Together we discussed design alternatives to resolve problems with the visibility of the Back to Search button.  Design options included changing the Back to Search label to "Change Geographies or Industries" and adding the recommended Add Data to Table button in the Modify Table call-out box.  In the end, the designers chose to use a call-out box that read "Click Back to Search to select other tables or geographies," rather than the Add Data to Table label that we had recommended.  The developers said they wanted to try the design change that would require the least amount of programming because the function had already been hard-coded, and the project deadline was approaching.  We agreed to test this in Iteration 4 with the understanding that if it did not work, we would try alternatives in future testing.

