upa - home page JUS - Journal of usability studies
An international peer-reviewed journal

How To Specify the Participant Group Size for Usability Studies: A Practitioner’s Guide

Ritch Macefield

Journal of Usability Studies, Volume 5, Issue 1, Nov 2009, pp. 34 - 45

Article Contents


The Broad Issues

When specifying the participant group size for a usability study it is important that we understand the broad issues related to this challenge.

Tensions in commercial contexts

In most commercial contexts there is an inescapable tension in study design between the desire for (more) reliable findings and the budget and time required for a study. Further, commercial practitioners must simply accept that we do not operate in an ideal world and that most study designs will be ultimately constrained by organizational or project realities.

Given this, our goal is not to be parochial advocates fighting for studies that have maximum reliability, whatever the cost. Rather, it should be to work with other stakeholders to reach a study design that is realistic and optimal for the project as a whole, or at least has some benefit to the wider project. This is challenging, not least because authority figures in our discipline have widely differing views as to the degree to which study reliability can (or should) be compromised for the “wider good” and the useful limits of this compromise.

For example, Nielsen (1993) argues that “for usability engineering purposes, one often needs to draw important conclusions on the basis of fairly unreliable data, and one should certainly do so since some data is better than no data” (p. 166). However, others (e.g., Woolrych & Cockton, 2001) may question the wisdom of this advice; they may argue that it is invalid to draw any conclusions from a study that lacks reliability.

Application of research literature

There are different types of usability studies and, similarly, studies take place in a wide variety of contexts. This means that we must be careful when applying any particular research based advice. Similarly, we must be aware that any numeric values presented in this advice are generally means subject to a margin of error and confidence level.

For example, the popular “headline” advice in Nielsen (2000) is that a usability study with five participants will discover over 80% of the problems with an interface, but this does not mean that any one particular study will achieve this figure. To explain, this advice is based on a study by Virzi (1992) and Nielsen (1993). In this study 100 groups of five participants were used to discover problems with an interface. The study did indeed find that the mean percentage of problems discovered across all 100 groups was about 85%. However, this figure has 95% confidence level and a margin of error of ±18.5%. This means that for any one particular group of five there is a 95% chance that the percentage of problems discovered will be in the range of 66.5%-100% and, indeed, some groups of five did identify (virtually) all of the problems; however, one group of five discovered only 55% of the problems.

Similarly, it is understandable that some usability practitioners perceive statistics to be more positivist in nature than is actually the case. In reality, statistics are not free from opinion and often rely on questionable assumptions (e.g., Grosvenor, 1999; Woolrych & Cockton, 2001). Therefore, different statistical methods and associated thinking can easily lead to different conclusions being drawn from the same set of research data.

In summary, usability practitioners should not simply accept and generally apply “headline” figures for participant groups sizes quoted in research articles without question or inquisition.


Previous | Next