
Usability Testing with Real Data

Alex Genov, Mark Keavney, and Todd Zazelenchuk

Journal of Usability Studies, Volume 4, Issue 2, February 2009, pp. 85-92

The Benefits of Real Data Testing

While incorporating users' real data into your usability studies requires significant effort, it offers clear benefits that make the extra work worthwhile.

Ecological Validity

The primary benefit of incorporating users' real data in a usability study is increased ecological validity (Brewer, 2000); that is, the study better approximates the real-life situation under investigation. In doing so, real data can also increase the study's external validity; in other words, the results are more likely to generalize beyond the lab.

For example, consider a real data study of a product called Intuit FinanceWorks. The product was embedded in a bank's Web site to let the bank's customers perform personal financial management tasks more commonly done in desktop software such as Quicken (for example, entering a check that had not yet cleared so that the money would be accounted for in the balance). The team had run several fake data usability tests in which participants had no trouble with this concept, but in the real data test many were reluctant to even attempt the task. In this more realistic situation, participants assumed that there would be no way to do the task on their bank's Web site, that they "wouldn't do that here." This finding led the team to redesign the product to increase discoverability and to better educate first-time users about what it could do. If the team had not tested with real data, this problem would have gone undiscovered until after launch.

Another example comes from user testing of Intuit's TurboTax tax preparation software. TurboTax has many screens that lead taxpayers step by step through explanations of how extremely complex and unfamiliar tax concepts may or may not apply to them. For example, based on the income the user enters, the software determines whether the user is eligible for specific tax credits, and then shows the set of screens relevant to that tax situation. If a study participant is using made-up data, including a specific level of income, the participant will see the screens triggered by that income level, which may describe situations unfamiliar to the participant because they do not match his or her actual income. Such a scenario potentially introduces another source of variability into the usability study: the usability of the software becomes confounded with the usability of the study task and its non-real data. In the past, when TurboTax teams ran non-real data usability tests (which have their own advantages, such as standardizing measurements of program-related tasks like navigation), they had difficulty testing how well users understood these kinds of questions in reference to their actual situations. When they tested with real data, they found that many of these questions were not clear to some users, and the team rephrased them and added explanations to make them easier for customers to understand.
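
To make this confound concrete, the following Python sketch illustrates how a pre-set income in fake data can route a participant into interview screens that do not match his or her actual situation. The credit, income threshold, and screen names are hypothetical illustrations, not TurboTax's actual rules.

    # Hypothetical sketch of income-driven screen branching; the credit,
    # threshold, and screen names are illustrative, not TurboTax's rules.
    CREDIT_INCOME_LIMIT = 50_000  # assumed threshold for illustration

    def screens_for(income: float) -> list:
        """Return the interview screens shown for a given income."""
        screens = ["income_summary"]
        if income < CREDIT_INCOME_LIMIT:
            # A low made-up income routes the participant into credit
            # screens that may never apply to his or her real situation.
            screens.append("credit_eligibility_interview")
        else:
            screens.append("credit_not_applicable_notice")
        return screens

    # A participant given fake data of 30,000 sees a tax situation that is
    # not his or her own, confounding task usability with software usability.
    print(screens_for(30_000))  # ['income_summary', 'credit_eligibility_interview']
    print(screens_for(80_000))  # ['income_summary', 'credit_not_applicable_notice']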

Data Issues

The second benefit of real data testing is that a range of real-life data can uncover usability issues that a narrower set of representative fake data would not produce. For example, a team at Intuit had designed an interface for exporting data from an online payroll system to QuickBooks desktop accounting software. The export software was designed to cancel the export and display an error in certain exception cases that we thought would be very rare, such as when an employee name in the online payroll system exactly matched a record in QuickBooks that was not an employee, such as an item in the vendor list. When we usability tested this system with real data, we found that many users had added their employees to the wrong list in QuickBooks, so the export frequently failed. When we discovered this, we made the matching rules more forgiving, allowing the export to go through in these cases. Without real data testing, we would not have discovered this issue until after launch.
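
The rule change can be pictured as relaxing an exact-match check. The Python sketch below is illustrative only; the function, list names, and structure are assumptions, not the actual payroll or QuickBooks export code.

    # Illustrative sketch of the export matching rule, before and after the
    # fix; names and structures are assumed, not Intuit's actual code.
    class ExportError(Exception):
        pass

    def resolve_export_target(name, employees, vendors, forgiving=True):
        """Decide where an exported payroll record for `name` lands."""
        if name in employees:
            return ("employee", name)        # the normal, expected case
        if name in vendors:
            if not forgiving:
                # Original rule: an exact match against a non-employee
                # list cancels the whole export with an error.
                raise ExportError(name + " matches a vendor record")
            # Forgiving rule: accept the mis-filed record so the export
            # can still go through.
            return ("vendor_matched", name)
        return ("new_employee", name)        # no match: create a new record

    # A user who filed an employee under Vendors no longer breaks the export:
    print(resolve_export_target("Pat Lee", employees={"Ana Ruiz"}, vendors={"Pat Lee"}))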

In another, very simple example, a team was designing a purchase process for a Web site. Instead of having participants enter artificial data into a credit card field, the team asked them to enter the actual name and expiration date from their company's credit card. In one instance, a participant entered her company's very long name (35 characters), which exceeded the field limit of the design being tested. Most interesting, this aspect of the design was already in production and was not even a focus of the test. By incorporating users' real data, the team discovered a previously unencountered design flaw that was easily and immediately fixed.
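
The flaw amounts to a simple length check that fake test data never stressed. In the sketch below, the 35-character length comes from the example above, while the 30-character field limit and both names are assumed for illustration; the article does not state the actual field size.

    # Illustrative field-length check; the 30-character limit is assumed.
    NAME_FIELD_LIMIT = 30

    def fits_name_field(name: str) -> bool:
        return len(name) <= NAME_FIELD_LIMIT

    fake_name = "Test User"                            # typical fake test data
    real_name = "Consolidated Amalgamated Widgets Co"  # hypothetical 35-character name
    print(fits_name_field(fake_name))  # True: the flaw goes unnoticed
    print(fits_name_field(real_name))  # False: surfaced only by real data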
