
Rent a Car in Just 0, 60, 240 or 1,217 Seconds? Comparative Usability Measurement, CUE-8

Rolf Molich, Jarinee Chattratichart, Veronica Hinkle, Janne Jul Jensen, Jurek Kirakowski, Jeff Sauro, Tomer Sharon, Brian Traynor

Journal of Usability Studies, Volume 6, Issue 1, November 2010, pp. 8 - 24

Introduction

Traditional usability tests are usually a series of moderated one-on-one sessions that generate both qualitative and quantitative data. In practice, a formative usability test typically focuses on qualitative data whilst a summative test focuses on performance metrics and subjective satisfaction ratings.

Qualitative testing is by far the most widely used approach in usability studies. However, usability practitioners are discovering that they need to accommodate engineers, product managers, and executives who are no longer satisfied with just qualitative data but insist on performance measurements of some type. Quantitative usability data are becoming an industry expectation.

The current literature on quantitative methods aimed at practitioners is limited to a book by Tullis and Albert (2008), a website by Sauro (2009), UsabilityNet, a project funded by the European Union (Bevan, 2006), and a few commercial offerings, for example, Customer Carewords (2009). All base their measures on the ISO 9241-11 (1998) definition of usability. Tullis and Albert's book describes the what, why, and how of measuring user experience from the usability practitioner's viewpoint. Customer Carewords focuses on websites and introduces several additional metrics, such as disaster rate and optimal time. UsabilityNet identifies a subset of resources for Performance Testing and Attitude Questionnaires.

There are several psychometrically designed questionnaires for measuring satisfaction. Two of these are the System Usability Scale (SUS; Brooke, 1996) and the Website Analysis and MeasureMent Inventory (WAMMI) questionnaire (Claridge & Kirakowski, 2009). Many companies use their own questionnaires, but these may not have sufficient reliability and validity. Some instruments have also been developed to assess user mental effort as an alternative to satisfaction, for example, the Subjective Mental Effort Questionnaire (SMEQ; Zijlstra, 1993) and the NASA Task Load Index (NASA-TLX; Hart, 2006).
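To make the scoring of such instruments concrete, the sketch below shows the standard SUS scoring rule from Brooke (1996): odd-numbered items contribute the response minus 1, even-numbered items contribute 5 minus the response, and the sum is scaled by 2.5 to yield a 0-100 score. Python is used purely for illustration; the function name and sample responses are hypothetical and not taken from the CUE-8 materials.

    def sus_score(responses):
        """Compute a SUS score (0-100) from ten 1-5 Likert responses."""
        if len(responses) != 10 or not all(1 <= r <= 5 for r in responses):
            raise ValueError("SUS needs ten responses on a 1-5 scale")
        # index 0 corresponds to item 1 (an odd-numbered item)
        total = sum((r - 1) if i % 2 == 0 else (5 - r)
                    for i, r in enumerate(responses))
        return total * 2.5

    # Hypothetical example: one participant's responses to items 1-10
    print(sus_score([4, 2, 5, 1, 4, 2, 5, 1, 4, 2]))  # -> 85.0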

Sauro’s work and applications are mostly the result of statistical analyses of real-world usability data and go as far as proposing a method for computing a single composite usability metric, the Single Usability Metric (SUM; Sauro & Kindlund, 2005). Tullis and Albert, and Sauro, stress the importance of strict participant screening criteria and of reporting confidence intervals, especially with small sample sizes. However, it is not known to what extent these and other recommended practices have actually been taken up by the industry.
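As an illustration of confidence interval reporting with small samples, the sketch below computes an adjusted-Wald (Agresti-Coull) interval for a task-completion rate, one interval commonly recommended for small-sample usability data. It is a minimal example under that assumption, not code from any of the cited sources.

    import math

    def adjusted_wald_ci(successes, n, z=1.96):
        """Adjusted-Wald (Agresti-Coull) 95% confidence interval for a
        task-completion rate; behaves reasonably even for small n."""
        n_adj = n + z ** 2
        p_adj = (successes + z ** 2 / 2) / n_adj
        margin = z * math.sqrt(p_adj * (1 - p_adj) / n_adj)
        return max(0.0, p_adj - margin), min(1.0, p_adj + margin)

    # Hypothetical example: 9 of 10 participants completed the task
    low, high = adjusted_wald_ci(9, 10)
    print(f"observed rate 90%, 95% CI roughly {low:.0%} to {high:.0%}")

With only 10 participants the interval is wide, which is exactly why Tullis and Albert, and Sauro, urge practitioners to report it alongside the point estimate.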

There is, therefore, a need for information about best practice in usability measurements for practitioners. This formed the basis for the CUE-8 study, the outcomes of which are reported in this paper.

About CUE

This study is the eighth in a series of Comparative Usability Evaluation (CUE) studies conducted in the period from 1998 to 2009. The essential characteristic of a CUE study is that a number of organizations (commercial and academic) involved in usability work agree to evaluate the same product or service and share their evaluation results at a workshop. Previous CUE studies have focused mainly on qualitative usability evaluation methods, such as think-aloud testing, expert reviews, and heuristic inspections. An overview of the eight CUE studies and their results is available at DialogDesign's website (Molich, 2009).

Goals of CUE-8

The main goals of CUE-8 were