
The Usability of Computerized Card Sorting: A Comparison of Three Applications by Researchers and End Users

Barbara S. Chaparro, Veronica D. Hinkle, and Shannon K. Riley

Journal of Usability Studies, Volume 4, Issue 1, November 2008, pp. 31-48

Results

The following sections discuss task success, task difficulty ratings, task completion time, satisfaction scores, and preference rankings.

Task success

Success rates for each task by program are presented in Table 2. Participants completed all tasks successfully with two exceptions: users had trouble setting up a card set in OpenSort and finding where to analyze the data in CardZort.

Table 2. Success Rate of Participants.
Task | CardZort | WebSort | OpenSort
Enter items to create a card set | 100% | 100% | 12.50%
Set up and find where to analyze the results | 62.50% | 100% | 100%
Create and download results output | 100% | 100% | 100%

Task difficulty ratings

Mean difficulty scores for each task by program are presented in Table 3 and summarized in Figure 1. A two-way within-subjects ANOVA (task × program) was conducted to compare average difficulty across tasks and applications. Results indicate a significant main effect of application, F(1.1, 28) = 10.54, p = .01, partial η² = .60 (Greenhouse-Geisser correction applied), a significant main effect of task, F(2, 28) = 7.96, p < .01, partial η² = .53, and no interaction. Post-hoc comparisons revealed that participants rated tasks as significantly more difficult with CardZort and OpenSort than with WebSort. In addition, they rated the task of creating the card set as significantly more difficult than the task of downloading the results output.

Table 3. Mean (SD) Task Difficulty for Each Program (1 = Very Easy, 5 = Very Difficult).
Task | CardZort | WebSort | OpenSort
Enter items to create a card set | 3.0 (1.60) | 1.37 (.52) | 3.6 (1.51)
Set up and find where to analyze the results | 2.0 (.93) | 1.12 (.35) | 3.0 (1.31)
Create and download results output | 1.87 (1.13) | 1.0 (0) | 1.7 (1.04)
Average (SD) | 2.29 (.95) | 1.17 (.25) | 2.79 (.67)

Figure 1. Perceived task difficulty across applications.

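The omnibus test reported above is a two-way repeated-measures (within-subjects) ANOVA. As a rough illustration only, the sketch below runs such an analysis with statsmodels' AnovaRM on randomly generated placeholder ratings in long format; the column names, data, and library are assumptions for illustration, not the authors' actual analysis, and AnovaRM reports uncorrected degrees of freedom (a Greenhouse-Geisser correction, as applied in the article, would have to be computed separately).

```python
# Sketch: two-way within-subjects ANOVA (task x program) on difficulty ratings.
# The ratings below are randomly generated placeholders, NOT the study's data.
import numpy as np
import pandas as pd
from statsmodels.stats.anova import AnovaRM

rng = np.random.default_rng(1)
tasks = ["create card set", "set up analysis", "download results"]
programs = ["CardZort", "WebSort", "OpenSort"]

# Long format: one row per participant x task x program observation.
ratings = pd.DataFrame([
    {"participant": p, "task": t, "program": g,
     "difficulty": int(rng.integers(1, 6))}  # placeholder 1-5 rating
    for p in range(8) for t in tasks for g in programs
])

# Uncorrected F tests for the two main effects and the interaction.
result = AnovaRM(ratings, depvar="difficulty", subject="participant",
                 within=["task", "program"]).fit()
print(result)
```
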
Task completion time

Time-on-task was measured in seconds from the start to the end of each task. Figure 2 shows a breakdown of total task time by application. A two-way within-subjects ANOVA (task × program) was conducted to compare times across tasks and applications. Results showed a main effect of program, F(2, 14) = 31.53, p < .01, partial η² = .82, a main effect of task, F(1.06, 15.21) = 42.78, p < .01, η² = .86, and a significant program by task interaction, F(4, 28) = 13.69, p < .01, η² = .66. Post-hoc comparisons revealed that participants took significantly less time overall with WebSort than with OpenSort or CardZort. Examination of the interaction showed that this difference was primarily due to the longer times to create the card set and to set up the data for analysis in CardZort and OpenSort (see Figure 2).

Figure 2. Task completion times across applications.

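The article does not state which post-hoc procedure was used for the pairwise comparisons above. One common option is Bonferroni-corrected paired t-tests on each participant's total time per application; the sketch below shows that approach on randomly generated placeholder times, not the study's measurements.

```python
# Sketch: Bonferroni-corrected paired t-tests on per-participant total task times.
# The times below are randomly generated placeholders, NOT the study's data, and
# the study's actual post-hoc procedure is not specified in this section.
from itertools import combinations
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
times = {  # one total time (seconds) per participant, n = 8
    "CardZort": rng.normal(600, 60, 8),
    "WebSort": rng.normal(300, 40, 8),
    "OpenSort": rng.normal(650, 70, 8),
}

pairs = list(combinations(times, 2))
alpha = 0.05 / len(pairs)  # Bonferroni-adjusted threshold for three comparisons
for a, b in pairs:
    t, p = stats.ttest_rel(times[a], times[b])
    print(f"{a} vs {b}: t = {t:.2f}, p = {p:.4f} (alpha = {alpha:.3f})")
```
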
Satisfaction scores

Satisfaction was measured using the 10-item System Usability Scale (SUS; Brooke, 1996), which is summarized as a total score out of 100. A one-way within-subjects ANOVA revealed significant differences in satisfaction across applications, F(2, 14) = 5.07, p < .05, partial η² = .42. Post-hoc comparisons revealed that participants were more satisfied with WebSort than with OpenSort. Mean satisfaction scores for each application are summarized in Figure 3.

Figure 3. Mean satisfaction scores across applications.

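For reference, a SUS questionnaire is scored by rescaling each of the 10 items to a 0-4 contribution (odd-numbered items: response minus 1; even-numbered items: 5 minus response) and multiplying the sum by 2.5, which yields the total out of 100 (Brooke, 1996). A minimal sketch with a made-up set of responses:

```python
# Sketch: scoring one participant's 10-item SUS questionnaire (Brooke, 1996).
# Responses are on a 1-5 agreement scale; the example values are made up.
def sus_score(responses):
    """Return the SUS total (0-100) for the 10 item responses, in order."""
    assert len(responses) == 10
    contributions = [
        (r - 1) if i % 2 == 0 else (5 - r)  # odd-numbered items: r-1; even: 5-r
        for i, r in enumerate(responses)
    ]
    return sum(contributions) * 2.5

print(sus_score([4, 2, 5, 1, 4, 2, 5, 1, 4, 2]))  # -> 85.0
```
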
Preference rankings

All but one participant chose WebSort as the most preferred card sort application (Figure 4). Preference differences across applications were analyzed using a Friedman's chi-square test, χ²(2, N = 8) = 6.75, p < .05. Post-hoc tests showed that WebSort was preferred over both CardZort and OpenSort (mean ranks = 1.25, 2.37, and 2.37, respectively).

Figure 4. Application preference ranking: Each bar represents the number of participants that chose that application first, second, or third.

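The Friedman test above operates on each participant's ranking of the three applications. A minimal sketch with SciPy, using placeholder rankings rather than the study's data:

```python
# Sketch: Friedman test on preference rankings (1 = most preferred).
# The rankings below are placeholders, NOT the study's data.
from scipy import stats

# One tuple per participant: rank given to CardZort, WebSort, OpenSort.
ranks = [
    (2, 1, 3), (3, 1, 2), (2, 1, 3), (3, 1, 2),
    (2, 1, 3), (3, 1, 2), (2, 1, 3), (1, 2, 3),
]
cardzort, websort, opensort = zip(*ranks)
chi2, p = stats.friedmanchisquare(cardzort, websort, opensort)
print(f"chi-square(2, N = 8) = {chi2:.2f}, p = {p:.4f}")
```
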
Interpretation of Card Sort Results

Task 4 required participants to look at sample results of an open card sort and to interpret them. All three programs offer the standard dendrogram (tree diagram) to display the results. OpenSort offers two additional methods, a Vocabulary Browser and a Similarity Browser (see http://www.themindcanvas.com/demos/ for examples). Users explored all of the methods but reported that the dendrogram provided the best summary. Participants reported that the OpenSort dendrogram had the most professional look and was the easiest to use of the three applications. They liked the use of color to differentiate each cluster of items and the ability to directly manipulate the number of groups. Users found the WebSort dendrogram to appear less professional in its design, to show little differentiation across groups, and to lack instruction as to how the data were analyzed. The CardZort dendrogram was also reported to lack a detailed explanation of how the data were analyzed (i.e., single, average, and complete linkage analyses) and to offer no group name analysis.
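Dendrograms of this kind are produced by hierarchical cluster analysis of a card-by-card distance matrix, where the distance between two cards reflects how rarely participants sorted them into the same group. The sketch below shows the general idea with SciPy and average linkage (one of the linkage methods named above); the cards and sorts are made-up examples, not data from the study, and the specific algorithms used by each application are not documented here.

```python
# Sketch: building a dendrogram from open card sort data with SciPy.
# The cards and sorts below are made-up examples, not the study's data.
import numpy as np
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import linkage, dendrogram
from scipy.spatial.distance import squareform

cards = ["login", "password reset", "pricing", "plans", "contact us"]
# Each sort assigns every card to a group label (one row per participant).
sorts = [
    ["account", "account", "billing", "billing", "help"],
    ["access", "access", "cost", "cost", "support"],
    ["account", "account", "billing", "cost", "support"],
]

# Co-occurrence: how many participants placed cards i and j in the same group.
n = len(cards)
co_occurrence = np.zeros((n, n))
for sort in sorts:
    for i in range(n):
        for j in range(n):
            if sort[i] == sort[j]:
                co_occurrence[i, j] += 1

# Distance: proportion of participants who did NOT group the two cards together.
distance = 1 - co_occurrence / len(sorts)

# Average linkage; 'single' and 'complete' are the other methods mentioned above.
Z = linkage(squareform(distance), method="average")
dendrogram(Z, labels=cards)
plt.show()
```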
