The Usability of Computerized Card Sorting: A Comparison of Three Applications by Researchers and End Users
Barbara S. Chaparro, Veronica D. Hinkle, and Shannon K. Riley
Journal of Usability Studies, Volume 4, Issue 1, November 2008, pp. 31-48
Results
The following sections discuss task success, task difficulty ratings, task completion time, satisfaction scores, and preference rankings.
Task success
Success rates for each task by program are presented in Table 2. Participants completed every task successfully with two exceptions: only 12.5% managed to enter items to create a card set in OpenSort, and only 62.5% found where to analyze the results in CardZort.
| Task | CardZort | WebSort | OpenSort |
|---|---|---|---|
| Enter items to create a card set | 100% | 100% | 12.50% |
| Set up and find where to analyze the results | 62.50% | 100% | 100% |
| Create and download results output | 100% | 100% | 100% |
Task difficulty ratings
Mean difficulty scores for each task by program are presented in Table 3 and summarized in Figure 1. A two-way within-subjects ANOVA (task × program) was conducted to compare average difficulty across tasks and applications. Results indicated a significant main effect of application, F(1.1, 28) = 10.54, p = .01, partial η² = .60 (Greenhouse-Geisser correction applied), a significant main effect of task, F(2, 28) = 7.96, p < .01, partial η² = .53, and no interaction. Post-hoc comparisons revealed that participants rated the tasks in CardZort and OpenSort as significantly more difficult than those in WebSort. In addition, they rated the task of creating the card set as significantly more difficult than the task of downloading the results output.
| Task | CardZort | WebSort | OpenSort |
|---|---|---|---|
| Enter items to create a card set | 3.0 (1.60) | 1.37 (.52) | 3.6 (1.51) |
| Set up and find where to analyze the results | 2.0 (.93) | 1.12 (.35) | 3.0 (1.31) |
| Create and download results output | 1.87 (1.13) | 1.0 (0) | 1.7 (1.04) |
| Average (SD) | 2.29 (.95) | 1.17 (.25) | 2.79 (.67) |

Figure 1. Perceived task difficulty across applications.
Task completion time
Time-on-task was measured in seconds, from the start to the end of each task. Figure 2 shows a breakdown of total task time by application. A two-way within-subjects ANOVA (task × program) was conducted to compare time across tasks and applications. Results showed a main effect of program, F(2, 14) = 31.53, p < .01, partial η² = .82, a main effect of task, F(1.06, 15.21) = 42.78, p < .01, partial η² = .86, and a significant program by task interaction, F(4, 28) = 13.69, p < .01, partial η² = .66. Post-hoc comparisons revealed that participants took significantly less time overall with WebSort than with OpenSort or CardZort. Examination of the interaction showed that this difference was primarily due to the longer time needed to create the card set and to set up the data for analysis in CardZort and OpenSort (see Figure 2).

Figure 2. Task completion times across applications.
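As a quick check on the effect sizes reported for task completion time above, partial η² can be recovered from an F ratio and its degrees of freedom using the standard identity partial η² = (F × df_effect) / (F × df_effect + df_error). The following is a minimal sketch, not the authors' analysis script; the function name is illustrative. It is applied here to the program main effect and the program by task interaction.

```python
def partial_eta_squared(f_value: float, df_effect: float, df_error: float) -> float:
    """Recover partial eta squared from an F ratio and its degrees of freedom."""
    numerator = f_value * df_effect
    return numerator / (numerator + df_error)


# F ratios and degrees of freedom as reported in the text for completion time.
program_effect = partial_eta_squared(31.53, 2, 14)
interaction_effect = partial_eta_squared(13.69, 4, 28)

print(f"program: {program_effect:.2f}")              # reported as .82
print(f"program x task: {interaction_effect:.2f}")   # reported as .66
```

Both values agree with the reported effect sizes to two decimal places.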
Satisfaction scores
Satisfaction was measured using the 10-item System Usability Scale (Brooke, 1996), which yields a total score out of 100. A one-way within-subjects ANOVA revealed significant differences in satisfaction across applications, F(2, 14) = 5.07, p < .05, partial η² = .42. Post-hoc comparisons revealed that participants were more satisfied with WebSort than with OpenSort. Mean satisfaction scores for each application are summarized in Figure 3.

Figure 3. Mean satisfaction scores across applications.
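For readers unfamiliar with how a System Usability Scale total out of 100 is derived, the sketch below follows Brooke's standard scoring rule: each of the 10 items is rated from 1 to 5, odd-numbered items contribute (rating − 1), even-numbered items contribute (5 − rating), and the sum is multiplied by 2.5. The function name and the example responses are hypothetical and are not data from this study.

```python
def sus_score(responses: list[int]) -> float:
    """Convert ten 1-5 SUS item ratings into a total score out of 100."""
    if len(responses) != 10:
        raise ValueError("SUS requires exactly 10 item ratings")
    if any(rating < 1 or rating > 5 for rating in responses):
        raise ValueError("each rating must be between 1 and 5")

    total = 0
    for index, rating in enumerate(responses):
        if index % 2 == 0:
            # Items 1, 3, 5, 7, 9 are positively worded.
            total += rating - 1
        else:
            # Items 2, 4, 6, 8, 10 are negatively worded.
            total += 5 - rating
    return total * 2.5


# Hypothetical responses from one participant (not study data).
example_responses = [4, 2, 5, 1, 4, 2, 5, 1, 4, 2]
print(sus_score(example_responses))  # 85.0
```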
Preference rankings
All but one participant chose WebSort as the most preferred card sort application (Figure 4). Preference differences across applications were analyzed using Friedman's chi-square test, χ²(2, N = 8) = 6.75, p < .05. Post-hoc tests showed that WebSort was preferred over both CardZort and OpenSort (mean rank = 1.25, 2.37, and 2.37, respectively).

Figure 4. Application preference ranking: Each bar represents the number of participants that chose that application first, second, or third.
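The Friedman statistic reported above can be reproduced from the mean ranks alone. The sketch below uses the uncorrected formula χ² = 12N / (k(k + 1)) × Σ(mean rank − (k + 1)/2)², with N = 8 participants and k = 3 applications. It assumes no tie correction was applied and that the reported mean rank of 2.37 stands for 2.375 (a rank sum of 19 over 8 participants), since the three mean ranks must total 6. The function name is illustrative.

```python
def friedman_chi_square(mean_ranks: list[float], n_participants: int) -> float:
    """Friedman chi-square from mean ranks, without a tie correction."""
    k = len(mean_ranks)
    expected_rank = (k + 1) / 2  # mean rank under the null hypothesis
    squared_deviations = sum((rank - expected_rank) ** 2 for rank in mean_ranks)
    return 12 * n_participants / (k * (k + 1)) * squared_deviations


# Mean ranks for WebSort, CardZort, OpenSort (2.37 taken as 2.375, an assumption).
mean_ranks = [1.25, 2.375, 2.375]
statistic = friedman_chi_square(mean_ranks, n_participants=8)
print(f"{statistic:.2f}")  # 6.75
```

Under these assumptions the result matches the reported value of 6.75 with 2 degrees of freedom.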
Interpretation of Card Sort Results
Task 4 required participants to look at sample results of an open card sort and to interpret them. All programs offer the standard dendrogram (tree diagram) to display the results. OpenSort offers two additional methods: a Vocabulary Browser and a Similarity Browser (see http://www.themindcanvas.com/demos/ for examples). Users explored all of the methods but reported that the dendrogram provided the best summary. Participants reported that the OpenSort dendrogram had the most professional look and was the easiest to use of the three applications. They liked the use of color to differentiate each cluster of items and the ability to directly manipulate the number of groups. Users found that the WebSort dendrogram looked less professional in its design, showed little differentiation across groups, and lacked instruction on how the data was analyzed. The CardZort dendrogram was also reported to lack a detailed explanation of how the data was analyzed (i.e., single, average, and complete linkage analyses) and to offer no group name analysis.
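To illustrate what single, average, and complete linkage mean for a card sort dendrogram, the sketch below clusters a toy data set in plain Python. It is not the algorithm of any of the three applications. The card names, the three participants' sorts, and the distance measure (1 minus the fraction of participants who placed two cards in the same group) are all assumptions made for illustration.

```python
from itertools import combinations


def co_occurrence_distance(sorts, cards):
    """Distance = 1 - fraction of participants who grouped two cards together."""
    distances = {}
    for a, b in combinations(cards, 2):
        together = sum(any(a in group and b in group for group in sort) for sort in sorts)
        distances[frozenset((a, b))] = 1 - together / len(sorts)
    return distances


def cluster_distance(c1, c2, distances, linkage):
    """Distance between two clusters under the chosen linkage rule."""
    pairwise = [distances[frozenset((a, b))] for a in c1 for b in c2]
    if linkage == "single":      # closest pair of cards
        return min(pairwise)
    if linkage == "complete":    # farthest pair of cards
        return max(pairwise)
    if linkage == "average":     # mean over all pairs of cards
        return sum(pairwise) / len(pairwise)
    raise ValueError(f"unknown linkage: {linkage}")


def agglomerate(cards, distances, linkage):
    """Repeatedly merge the two closest clusters; return (c1, c2, height) merges."""
    clusters = [(card,) for card in cards]
    merges = []
    while len(clusters) > 1:
        c1, c2 = min(
            combinations(clusters, 2),
            key=lambda pair: cluster_distance(pair[0], pair[1], distances, linkage),
        )
        height = cluster_distance(c1, c2, distances, linkage)
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
        merges.append((c1, c2, height))
    return merges


# Hypothetical open card sort: three participants, four cards (not study data).
cards = ["Apple", "Banana", "Carrot", "Potato"]
sorts = [
    [{"Apple", "Banana"}, {"Carrot", "Potato"}],
    [{"Apple", "Banana", "Carrot"}, {"Potato"}],
    [{"Apple", "Banana"}, {"Carrot"}, {"Potato"}],
]
distances = co_occurrence_distance(sorts, cards)

for linkage in ("single", "average", "complete"):
    final_height = agglomerate(cards, distances, linkage)[-1][2]
    print(f"{linkage}: final merge height = {final_height:.2f}")
```

In this toy example all three rules merge Apple and Banana first at height 0, but the height of the final merge differs by rule. That difference is why a dendrogram is hard to interpret when the application does not state which linkage it used.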
