JUS - Journal of Usability Studies
An international peer-reviewed journal

Rent a Car in Just 0, 60, 240 or 1,217 Seconds? Comparative Usability Measurement, CUE-8

Rolf Molich, Jarinee Chattratichart, Veronica Hinkle, Janne Jul Jensen, Jurek Kirakowski, Jeff Sauro, Tomer Sharon, Brian Traynor

Journal of Usability Studies, Volume 6, Issue 1, November 2010, pp. 8 - 24


Method

In May 2009, 15 U.S. and European teams independently and simultaneously carried out usability measurements of the Budget.com website (see Figure 1). The measurements were based on a common scenario and instructions (Molich, Kirakowski, Sauro, & Tullis, 2009).

The scenario deliberately did not specify in detail which measures the teams were supposed to collect and report, although the teams were asked to collect time-on-task, task success, and satisfaction data as well as any qualitative data they normally would collect. The anonymous reports from the 15 participating teams are publicly available online (Molich, 2009).

Teams were recruited through a call for participation in a UPA 2009 conference workshop.

After conducting the measurements, the teams submitted anonymous reports in which they are identified only as Team A ... Team P. The teams met for a full-day workshop at the UPA conference.

Figure 1. The Budget.com home page as it appeared in May 2009 when CUE-8 took place.

The following is the common measurement scenario:

The car rental company Budget is planning a major revision of their website, www.Budget.com.

They have signed a contract with an external provider to create the new website. Budget wants to make sure that the usability of the new website is at least as good as the usability of the old one. They want you to provide an independent set of usability measurements for the current website. These measurements will provide a baseline against which the new website could be measured by another provider.

Your measurements must be made in such a way that it will later be possible to verify with reasonable certainty that the new website is at least as good as the old one. The verification, which is not part of CUE-8, will be carried out later by you or by some other contractor.

Budget wants you to measure time on task and satisfaction for ... five key tasks.... Budget has clearly indicated that they are open to additional measurements of parameters that you consider important.

Budget recently has received a number of calls from journalists questioning the statement “Rent a car in just 60 seconds,” which is prominently displayed on their home page. Consequently, they also want you to provide adequate data to confirm or disconfirm this statement. If you disconfirm the statement, please suggest the optimal alternative that your data supports and justify it.

The scenario is realistic but fictitious. The workshop organizers had limited contact with Budget.com, and they had no information on whether Budget was planning a revision of their website.
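One way a team might approach the "60 seconds" question is to compute a confidence interval for the typical task time and check whether 60 seconds falls below it. The following is a minimal sketch, not the method of any CUE-8 team. It assumes that task times are roughly log-normal (so the geometric mean is used as the typical value) and uses a normal approximation, which is too narrow for very small samples. The function names and the sample data are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def geometric_mean_ci(times_s, confidence=0.95):
    """Return (geometric mean, lower bound, upper bound) in seconds.

    Assumption: task times are roughly log-normal, so the interval is
    computed on log-transformed times and converted back.
    """
    logs = [math.log(t) for t in times_s]
    center = mean(logs)
    std_err = stdev(logs) / math.sqrt(len(logs))
    # Normal approximation; a t quantile would give a wider interval
    # for small samples.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return (
        math.exp(center),
        math.exp(center - z * std_err),
        math.exp(center + z * std_err),
    )


def claim_disconfirmed(times_s, claimed_s=60.0, confidence=0.95):
    """True if even the lower bound of the interval exceeds the claim."""
    _, lower, _ = geometric_mean_ci(times_s, confidence)
    return lower > claimed_s


# Hypothetical task 1 times in seconds (not data from the study).
sample_times_s = [150, 210, 185, 320, 240, 275, 198, 410]
typical, lower, upper = geometric_mean_ci(sample_times_s)
print(f"typical {typical:.0f} s, interval {lower:.0f} to {upper:.0f} s")
print("claim disconfirmed:", claim_disconfirmed(sample_times_s))
```

The upper bound of such an interval is also one candidate for an alternative statement that the data supports, although the choice of statistic is a judgment call.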

The measurement tasks were prescribed to ensure that measurements were comparable. The following were the five tasks:

  1. Rent a car: Rent an intermediate size car at Logan Airport in Boston, Massachusetts, from Thursday 11 June 2009 at 09:00 a.m. to Monday 15 June at 3:00 p.m. If asked for a name, use John Smith and the email address john112233@hotmail.com. Do not submit the reservation.
  2. Rental price: Find out how much it costs to rent an economy size car in Myrtle Beach, South Carolina, from Friday 19 June 2009 at 3:00 p.m. to Sunday 21 June at 7:00 p.m.
  3. Opening hours: What are the opening hours of the Budget office in Great Falls, Montana on a Tuesday?
  4. Damage insurance coverage: An unknown person has scratched your rental car seriously. A mechanic has estimated that the repair will cost 2,000 USD. Your rental includes Loss Damage Waiver (LDW). Are you liable for the repair costs? If so, approximately how much are you liable for?
  5. Rental location: Find the address of the Budget rental office that is closest to the Hilton Hotel, 921 SW Sixth Avenue, Portland, Oregon, United States 97204.

Measurement Approaches

Table 1. Key Measurement Approaches

Questionnaire: A=ASQ, M=SMEQ, N=NASA TLX, O=Own, S=SUS, W=WAMMI.
Time measured for: CS=Comprehend and complete task, T=Task completion, U=User defined.
Result verified by: C=Multiple choice, M=Moderator, P=Professional, U=User.

As shown in Table 1, nine teams (A, B, C, E, G, K, N, O, and P) used "classic" moderated testing. They used one-on-one sessions to observe 9 to 22 participants completing the tasks.

Six teams partly or wholly used unmoderated sessions. Teams sent out tasks to participants and used a tool to measure task time. Some teams used multiple-choice questions following each task to get an impression of whether the task had been completed correctly or not.

Four teams (D, F, L, and M) used unmoderated testing exclusively. Teams D, L, and M used a tool to track participant actions, measure task completion time, collect quantitative data, and report results without a moderator in attendance. These teams recruited 14 to 313 participants and asked them to complete the tasks and self-report. Team F used a professional online service (usertesting.com) to recruit and to video record users working from their homes; the team then watched all videos and measured times.

Two teams (H and J) used a hybrid approach. They observed 3 to 7 participants in one-on-one sessions and asked 13 to 57 other participants to carry out the tasks without being observed.

Team G included a comparative analysis of corresponding task times for Avis, Enterprise, and National. They also did keystroke-level modeling to predict error-free task times for experienced users.
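Keystroke-level modeling estimates the time for an experienced user by summing standard times for elementary operators. The sketch below illustrates the idea only. The operator values are commonly cited figures from the KLM literature, and the operator sequence is hypothetical; Team G's actual operators, values, and sequences are not given here.

```python
# Commonly cited KLM operator times in seconds (assumed values, not
# taken from Team G's report).
OPERATOR_SECONDS = {
    "K": 0.28,  # press a key (average typist)
    "P": 1.10,  # point at a target with the mouse
    "B": 0.10,  # press or release a mouse button
    "H": 0.40,  # move hands between keyboard and mouse
    "M": 1.35,  # mental preparation
}


def klm_estimate(sequence):
    """Sum operator times for a sequence such as "MPBBHKKK"."""
    return sum(OPERATOR_SECONDS[op] for op in sequence)


# Hypothetical step: think, point at a location field, click it,
# move hands to the keyboard, and type a three-letter airport code.
step = "MPBB" + "H" + "KKK"
print(f"{klm_estimate(step):.2f} s")
```

A full task estimate would concatenate the sequences for every step of the task, which is why the result represents error-free expert performance rather than the times observed with first-time users.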

Test Tasks

All teams gave all five tasks to users. Most teams presented the tasks in the order suggested by the instructions, even though this was not an explicit requirement. Teams K and O repeated the car rental tasks (tasks 1 and 2) for similar airports after participants had completed the five given tasks. These teams reported a significant decrease in time with repeated usage; task times for the repeated tasks were often less than half of the original times.

Measurements

All teams except one reported time-on-task in seconds. Team A reported time-on-task to the nearest minute. Some teams included time from task start until participants gave up in their time-on-task averages. Some of the teams that used unmoderated testing included time to understand the task in their time-on-task.
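These reporting choices can change the resulting numbers considerably. The following sketch uses hypothetical observations (not data from any team) to show how the mean time-on-task shifts when abandoned attempts are included, and how much detail is lost when times are rounded to the nearest minute. The function names are illustrative.

```python
from statistics import mean

# Hypothetical observations: (seconds elapsed, task completed?)
observations = [(95, True), (140, True), (310, False), (120, True), (480, False)]


def mean_time(observations, include_abandoned):
    """Mean time-on-task, with or without attempts that were given up."""
    times = [
        seconds
        for seconds, completed in observations
        if completed or include_abandoned
    ]
    return mean(times)


def to_nearest_minute(seconds):
    """Round a duration in seconds to whole minutes (halves round up)."""
    return int(seconds / 60 + 0.5)


successful_only = mean_time(observations, include_abandoned=False)
all_attempts = mean_time(observations, include_abandoned=True)
print(f"successful only: {successful_only:.1f} s")
print(f"all attempts:    {all_attempts:.1f} s")
print(f"all attempts, nearest minute: {to_nearest_minute(all_attempts)} min")
```

Because the two averages answer different questions, a report needs to state which definition was used before its times can be compared with those of another team.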

Other metrics reported included
