Resources: UPA 2005 Idea Markets
Do You Use (and Trust) Self-Reported Usability Measures?
Activator: William Albert, Fidelity Investments
The Activator's Initial Questions
There are two schools of thought on self-reported usability measures. Some usability practitioners ask participants to rate the usability of specific tasks. However, other usability practitioners do not trust self-reported usability measures, and therefore assign usability scores independent of the participants. What are the advantages and disadvantages of each approach? Is one more prevalent than the other, and why?
Questions for Discussion:
- Do you collect and analyze self-reported usability measures as part of your usability testing? If so, what types of usability measures are most useful and why?
- How do you collect and analyze self-reported usability measures?
- Do you trust or have confidence in self-reported usability measures? Have you ever experienced participants who confuse usability with their expectations, desire, or satisfaction? If so, what have you done to address this problem?
- How often do participants rate a task as easy, even after failure? What about the opposite when they rate a task as very hard, but did not experience any problems? Do you do anything to account for this?
- Are there some types of tests (such as automated studies) for which you collect self-reported usability measures, and other types for which you don't? What is the main difference between these types of studies?
- Does the sample size impact the reliability of self-reported usability measures? Is there a minimum sample size that you must have before collecting self-reported usability measures?
- If you don’t use self-reported usability measures, is it because you do not trust the data, or is it for another reason?
- Do you assign usability scores independent of the participant? If so, how do you do this and why, and what are the advantages and disadvantages of this approach over self-reported measures?
- If you have used both self-reported measures, and assigned scores, which one do you prefer and why? Do you communicate the results differently? Does your audience trust or value one type of approach more than the other?
General Participant Comments
- Participants feel there is value in collecting and analyzing self-reported usability measures. However, they consider self-reported measures less important than performance measures such as task success.
- Participants indicated that they use self-reported measures as part of their usability research, but have low (to medium) trust in the data. The degree of trust in the data corresponds to the correlation strength with other measures, most notably task success.
- Participants' trust in the data increases only when they use the data in conjunction with other usability metrics, most notably task success. Participants do not feel comfortable using self-reported measures as the sole usability metric.
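The trust-through-correlation idea above can be made concrete: compare each task's mean ease rating against its success rate and check whether the two move together. The sketch below is illustrative only; the task names, ratings, and success rates are invented, and the correlation function is a plain Pearson coefficient rather than any method the participants described.

```python
# Hypothetical per-task data: (mean self-reported ease rating on a 1-5
# scale, 5 = very easy; task success rate). Values are invented for
# illustration -- they are not from the UPA session.
tasks = {
    "find_account_balance": (4.2, 0.90),
    "transfer_funds":       (3.1, 0.55),
    "update_beneficiary":   (2.4, 0.30),
    "download_statement":   (4.5, 0.85),
}

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

ratings, success = zip(*tasks.values())
r = pearson_r(ratings, success)
print(f"correlation between ease ratings and task success: {r:.2f}")
```

A strong positive correlation here would be one reason to trust the self-reported scores; a weak or negative one would suggest the ratings are measuring something other than task difficulty.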
When to Collect Self-Reported Usability Measures
- Participants indicated that their primary motivation for collecting self-reported usability measures is to prompt users to discuss their experience, specifically ease of use; analyzing the data is actually a secondary motivation. The most valuable aspect of collecting a self-reported measure is being able to dig deeper for qualitative feedback during the test debrief.
- Some participants mentioned that they feel more comfortable collecting self-reported measures for summative tests, or formative tests with a larger sample size.
Analyzing and Reporting Self-Reported Measures
- One participant said that she will only report data based on at least seven users, and then report only tasks that fall below a specific, pre-defined threshold. This is a way to identify tasks that may have poor usability, rather than a way to take relative measures of usability across different tasks.
- Several participants report the data, but do not focus on it. They usually include the data as part of an appendix or simply de-emphasize the data during their presentation of the usability findings.
- Several participants said it is acceptable to use self-reported measures, but it is very important to explore the data, particularly with respect to inconsistent data. They mentioned that it may be helpful to discard data based on inconsistencies they see between a self-reported measure and a performance measure.
- Several participants like to look at the data at an aggregate level as a clue to see what happened. They will focus on tasks which scored low, and try to ascertain the cause of the low scores.
- One participant said she reports the data as a range. She feels it provides a more accurate assessment of usability than reporting the data as a mean.
- Several participants indicated that they trust self-reported measures more at the aggregate (post-test) level than at a post-task level. They feel that the overall usability assessment is more reliable than at the individual task-level. There are too many factors which can add noise at the task level.
- One participant only trusts the data for summative testing. A larger sample size (at least 30) will result in more trustworthy data. Also, it is important to be able to validate the data with other metrics.
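Two of the reporting practices above — flagging only tasks that score below a pre-defined threshold given a minimum sample, and reporting a range rather than a mean — can be sketched together. The task names, rating values, threshold, and scale in this example are all hypothetical; only the minimum of seven users comes from the discussion.

```python
# Illustrative sketch of two reporting practices from the discussion:
# (1) report only tasks whose mean ease rating falls below a pre-defined
#     threshold, and only when there are enough users; (2) present the
#     result as a range rather than a single mean.
MIN_USERS = 7      # minimum sample before reporting, per the discussion
THRESHOLD = 3.0    # cutoff on an assumed 1-5 ease scale (hypothetical)

# Hypothetical per-task ratings, invented for illustration.
ratings_by_task = {
    "transfer_funds":     [2, 3, 2, 4, 3, 2, 3, 2],
    "update_beneficiary": [4, 5, 4, 4, 3, 5, 4],
    "reset_password":     [3, 2],  # too few users to report
}

flagged = []
for task, scores in ratings_by_task.items():
    if len(scores) < MIN_USERS:
        continue  # too small a sample to report at all
    mean = sum(scores) / len(scores)
    if mean < THRESHOLD:
        flagged.append((task, min(scores), max(scores), mean))

for task, lo, hi, mean in flagged:
    # Range first, mean only as supporting detail
    print(f"{task}: rated {lo}-{hi} (mean {mean:.1f}) - possible usability problem")
```

The point of the threshold is triage: the output names candidate problem tasks without inviting fine-grained comparisons between tasks whose scores differ only slightly.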
None of the participants had a “black or white” approach to self-reported measures. None of the participants felt that self-reported measures are so unreliable that they should not be used in any type of usability evaluation. Also, none of the participants felt that self-reported measures offer unquestionably reliable and valid data.
My original assumption that there were "two schools of thought" does not seem well founded. All participants I spoke with perceive self-reported measures as having at least some value, but with some major concerns about the reliability of the data. The best use of self-reported measures is as a way to probe during a usability evaluation, used alongside other usability metrics.