JUS - Journal of Usability Studies
An international peer-reviewed journal

Usability Testing with Real Data

Alex Genov, Mark Keavney, and Todd Zazelenchuk

Journal of Usability Studies, Volume 4, Issue 2, February 2009, pp. 85-92


Security and Privacy

With fictitious data, the concerns about security and privacy of information are usually quite minimal and generally related to permissions for videotaping and subsequent sharing of results. In contrast, when you incorporate users' real data, you may quickly find yourself dealing with institutional security policies and highly sensitive privacy issues that need to be considered with care.

In all the real data studies done at Intuit, extra effort regarding security and privacy was taken to ensure that participants' data were not at risk. For the FinanceWorks and Quicken Health studies, these efforts included file encryption to ensure that users' data was protected at all times and the destruction of all data files within a four-week period following the study. Explicitly communicating to participants that their personal data would be used in the study was highly effective in setting expectations and assuring them that the security and privacy of their data was a priority for the research team.

Efforts also have to be put in place to handle any recordings of a real data study. In a real data study on the Intuit FinanceWorks product, the sessions were recorded using Morae. These recordings, in addition to the participants' banking data, had to be encrypted. The researchers kept records of anyone who had access to or viewed these recordings. In a few cases, the researchers asked for additional consent from the participants after the test was completed to show video clips to a wider audience. Only those video clips for which additional consent was obtained could be shown.

Scenarios, Tasks, and Data Analysis

With fake data, it's possible to define the successful path for each task as the same across all participants and to analyze the data accordingly. With real data, things aren't so simple. Depending on the participant, the same task may involve different amounts and types of data; in some cases, the task may not apply at all or may need to be done multiple times. There is no simple solution to this problem. The best way to analyze the data will depend on your particular application and the questions you're trying to answer. However, two techniques have been helpful to us across multiple real data studies.

One is to construct task templates, generic versions of a task that are then filled in with the specific data for each participant. For example, a task template for a study on an electronic billpay application might be, "Pay your _____ bill that is due on ____." The blanks would then be filled in differently for each participant before their test session, based on the data that we had received, and with an attempt to make the data in the blanks relatively similar across participants (e.g., selecting bills of similar amounts and due dates as much as possible). Thus there is some standardization of tasks, but each participant is actually paying a bill that is specific and familiar to them.
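The template approach can be sketched in a few lines of code. This is a hypothetical illustration, not part of the studies described here: the template wording, participant records, and field names are all invented for the example.

```python
# Hypothetical sketch of filling a generic task template with each
# participant's own data before their session. All values are invented.

TEMPLATE = "Pay your {biller} bill that is due on {due_date}."

# Data gathered from each participant ahead of their session (fabricated),
# chosen to keep amounts and due dates roughly comparable across people.
participants = {
    "P1": {"biller": "electric", "due_date": "June 3"},
    "P2": {"biller": "water", "due_date": "June 5"},
}

def build_task(participant_id):
    """Return the task wording personalized for one participant."""
    return TEMPLATE.format(**participants[participant_id])

print(build_task("P1"))  # Pay your electric bill that is due on June 3.
```

Each participant then reads a task with the same structure and difficulty, but grounded in a bill they actually recognize.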

Another technique that is useful specifically for analyzing the data is to construct average scores for each participant in a given task before calculating success metrics across participants. So for example, if a task is to pay all bills that are due in the next week, in a real data study one participant might have five bills due and another might have only one bill. To avoid overweighting the participants with multiple bills, the best way to create an overall score for task success at paying a bill is usually to construct a composite score for each participant, and then average those composite scores.
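The two-step averaging above can be made concrete with a short sketch. The participant IDs and success values below are invented for illustration; the point is that each participant's own bills are averaged into a single composite score before the scores are averaged across participants, so a participant with five bills does not count five times.

```python
# Hypothetical sketch of the composite-score technique described above.
# 1 = bill paid successfully, 0 = failure; list length varies per person.
results = {
    "P1": [1, 0, 1, 1, 1],  # five bills due this week
    "P2": [1],              # only one bill due
}

def composite_scores(results):
    """Average each participant's success across their own bills."""
    return {pid: sum(r) / len(r) for pid, r in results.items()}

def overall_success(results):
    """Average the per-participant composites, weighting people equally."""
    comps = list(composite_scores(results).values())
    return sum(comps) / len(comps)

# Composite scores: P1 = 0.8, P2 = 1.0, so overall success = 0.9.
# A naive pooled mean over all six bills would give 5/6 ~ 0.83,
# overweighting P1 simply because more bills happened to be due.
print(overall_success(results))  # 0.9
```

The same pattern applies to time-on-task or error counts: compute per-participant summaries first, then aggregate those summaries.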
