A Methodology for Measuring Usability Evaluation Skills Using the Constructivist
Theory and the Second Life Virtual World
Journal of Usability Studies, Volume 4, Issue 4, May 2009, pp. 178-188
Results and Discussion
Comments about Second Life were overwhelmingly positive. In fact, reports, diaries, and checklists contained three times as many complimentary or descriptive comments as problems. In short, students could easily identify what worked. Yet they also identified 9 usability issues, including ones also identified by the instructor. Beyond this, 46 issues were presented and discussed in the Blackboard forum, but only 16 appeared in the final reports and diaries. Seven of these issues were determined to be false positives. A false positive is defined here as an issue that students identified as a usability problem but that was not one. Table 2 lists the 9 valid usability-related problems identified by students.
| Identified issues | Description |
|---|---|
| *Lag time | There was a general agreement that lag time would not allow complex tasks to be completed within 12 seconds. Even for simple movements, lag time made response times inappropriate to the tasks. |
| *Avatar maneuverability | According to one student avatar movement is “like working a joystick. It takes practice.” New users ended up in trees and were given no prompts for getting out. One avatar became locked into a backward movement. User skills did not increase considerably between the first and second visits. |
| *Unhelpful sounds | About the sounds, one student stated, “There are sounds for almost everything…Sounds, like constant beeping, are not always helpful.” Sound is often used to differentiate between error messages and background conversation or gestures. A user performing an action correctly may mistake the beeping for something she is doing wrong. |
| *Confusing messages | Though selections within the drop-down menu are brief and familiar, some were confusing or inappropriate, such as the “Wear on” option. “Where,” one student asked, “does one ‘wear’ a unicorn?” |
| *Reversal of actions | When users go back to a previous scene, they cannot change their earlier choices; they are irreversible. According to one student, “There is no undo function as such, but mistakes can be recovered from if their previous selections are remembered.” |
| Lack of white space | Meaningful groups of items are not separated by white space. The text is often small and spaced tightly, making it difficult to read. |
| Misnomer of world menu items | Choices are not always logical. The Edit menu, for example, displays options to go to a friend’s and group’s lists, allowing for direct communication with them. While a user can edit these lists, this is not the main function of these selections. The World menu features selections for chat, navigation, and account information. None of these seem to generate “world” in the mind. |
| Inappropriate metaphors | Sometimes metaphors are not clearly understood for objects that indicate a script (i.e., floating dance machines). |
| Distracting elements | Nonessential elements sometimes distract. Users can control menus by minimizing windows with nonessential items. Novices, however, may find themselves searching for essential menus. At the same time, dialog boxes with background noise may prompt them for a selection in reference to some other task. |
*Instructor-identified usability issues
Table 3 shows results based on the valid issues students identified and those they discussed or explained in their final reports. This table shows that most students could identify all the problems named by the instructor and additional problems as well. Further, for several issues, a number of students could explain the heuristic violated and why it is important.
| Identified issues | # of students identifying issue | # of students explaining |
|---|---|---|
| *Lag time | 9 of 9 | 8 |
| *Avatar maneuverability | 9 of 9 | 9 |
| *Unhelpful sounds | 9 of 9 | 8 |
| *Confusing or unhelpful messages | 7 of 9 | 6 |
| *Reversal of actions | 7 of 9 | 4 |
| Lack of white space | 5 of 9 | 4 |
| Misnomer of world menu items | 2 of 9 | 1 |
| Metaphors as cues | 3 of 9 | 3 |
| Distracting elements | 6 of 9 | 2 |
*Instructor-identified usability issues
Table 4 shows the point breakdown of skills. Students earned more than two thirds of the possible points for the identification tasks (490 of 630), suggesting that identifying the problems was a manageable task for students to master. Explaining the problems, however, was more difficult: students earned just over half of the available points for this task (675 of 1215).
| Points for identification: possible | Points for identification: actual | Points for explanation: possible | Points for explanation: actual |
|---|---|---|---|
| 630 | 490 | 1215 | 675 |
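
The proportions discussed for Table 4 can be checked directly from the point totals. The short Python sketch below is illustrative only: the helper name `share_of_points` is hypothetical and is not part of the study's method. It uses exact fractions so the comparison against the two-thirds and one-half thresholds involves no rounding.

```python
from fractions import Fraction


def share_of_points(actual: int, possible: int) -> Fraction:
    """Return the exact fraction of possible points that were awarded."""
    if possible <= 0:
        raise ValueError("possible points must be positive")
    return Fraction(actual, possible)


# Point totals copied from Table 4.
identification = share_of_points(490, 630)
explanation = share_of_points(675, 1215)

# Compare against the thresholds mentioned in the text.
print(f"Identification: {float(identification):.1%}, "
      f"above two thirds: {identification > Fraction(2, 3)}")
print(f"Explanation: {float(explanation):.1%}, "
      f"above one half: {explanation > Fraction(1, 2)}")
```

Under these totals, identification works out to 7/9 of the possible points and explanation to 5/9.
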
There were, however, some heuristics that a majority of students did not identify in their reports, and that even fewer could explain. The diaries provide clues to why students may not have explained the heuristic. In her diary, one student referenced the problem with the misnomer of the world menu. She wrote, “That [world menu] doesn’t make any sense, the menu, but I guess I’m the only one that cares about that. Picky me.” This was not repeated in the student’s report. The results in Table 5 show that the total number of usability problems listed in the diaries is greater than the number explained in the reports. For example, distracting elements appeared in 8 diaries but were explained by only 2 students. These findings suggest that students thought the issues bothersome but either could not explain how they might affect usability or thought they were not worth mentioning in their report.
| Identified issues | # of students identifying issue | # of students explaining | Issues identified in diaries |
|---|---|---|---|
| *Lag time | 9 | 8 | 9 |
| *Avatar maneuverability | 9 | 9 | 9 |
| *Unhelpful sounds | 9 | 8 | 8 |
| *Confusing or unhelpful messages | 7 | 6 | 3 |
| *Reversal of actions | 7 | 4 | 5 |
| Lack of white space | 5 | 4 | 3 |
| Misnomer of world menu items | 2 | 1 | 3 |
| Metaphors as cues | 3 | 3 | 6 |
| Distracting elements | 6 | 2 | 8 |
*Instructor-identified usability issues
Students also identified 7 problems that were determined to be false positives. Kantner and Rosenbaum (1997) found that 43 percent of issues in their study were false positives. In the current study, the figure was 44 percent (7 of 16 issues). The false positives reported by the inexperienced students in the current study suggest that novice evaluators are prone to applying the wrong heuristic to a particular problem. Table 6 identifies the issues determined to be false positives.
| List of false positives | # of students identifying issue |
|---|---|
| “[It’s] like a game, so no one can really get any work done.” | 1 |
| “Walking backwards does not always reverse walking forward.” | 1 |
| The system performs “…actions too quick to cancel.” | 1 |
| “There are too many unavailable items in the tool bar because a membership is required to use them.” | 1 |
| Expressed sentiments of fear (or the “willies”) or danger from “predators” on Second Life. | 4 |
| Character customization of avatars is “ugly.” | 1 |
| “There is a condescending attitude of many of them [avatars] that make you feel uncomfortable, and the identity of the avatar is uncertain.” | 1 |
| Percentage of false positives of all issues identified | 44% |
| Usability issues identified | 9 |
| False positives | 7 |
| Total issues identified | 16 |
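
The 44 percent figure follows from the counts in the summary rows of Table 6. The Python sketch below reproduces that arithmetic; the function name `false_positive_rate` is a hypothetical helper introduced here for illustration, and the calculation assumes the rate is simply false positives divided by all issues identified.

```python
def false_positive_rate(valid_issues: int, false_positives: int) -> float:
    """Return false positives as a share of all issues identified."""
    total = valid_issues + false_positives
    if total == 0:
        raise ValueError("at least one issue is required")
    return false_positives / total


# Counts copied from Table 6: 9 valid issues and 7 false positives.
rate = false_positive_rate(valid_issues=9, false_positives=7)

# 7 / 16 = 0.4375, which rounds to the 44 percent reported in the table.
print(f"False positive rate: {rate:.0%}")
```
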
De Angeli et al. (2003) identified several sources of false positives: observations that reflected the evaluators’ personal preferences rather than usability problems, misjudgments by the evaluator, system defects due to hardware configuration, and unclear or confusing statements. As shown in Table 6, most false positives were attributed to the personal feelings or preferences of individual students.
Revisiting the research questions, this study suggests that heuristic evaluation skills can be measured using diaries along with a systematic process that includes established usability guidelines, standard instruments to assure soundness, and benchmarks to measure progress. Most students demonstrated at least a minimum level of measurable usability evaluation skill in that they could identify heuristic problems. Fewer demonstrated deeper knowledge in that they could explain the problems and why they were harmful.
The results of this study suggest that diaries provided a means for students to express their opinions and feelings about the usability process, Second Life, and heuristics. The detailed descriptions in the diaries beyond the usability checklist and report suggest that the diaries served as instruments for unstructured and deep reflection.
