Simplifying Game Interfaces: Principles for Hiding UI Complexity

Sometime around 2002, I appointed myself “Ambassador of Video Games” to my parents. They had never really gotten into those games, and they detested the screeching, neon noise that came out of our Nintendo Entertainment System (NES) as we were growing up. We have lived in different states since I went off to school, so I only see my parents a few times a year. When I do see them, I like to show them the prettiest, most interesting video games that have come out since I saw them last, partly to share my enthusiasm for beautiful, engaging things, and partly to gauge whether “normal” people are connecting with video games yet.

Figure 1. Prince of Persia, 2003

I remember the first time they really noticed a video game. I was showing them Prince of Persia: The Sands of Time (Figure 1). They couldn’t play it—the controls were way too complex—but they appreciated the look and sound of the thing in a way they hadn’t before. When they came to stay with us over the Christmas holiday in late 2009, I had a fresh batch of interesting games to show them; among them, thatgamecompany’s Flower (Figure 2).

Flower builds on a decade of experiments and innovations to create a minimalist interface for a complex system. Flower has no visual interface in the traditional sense: no visible score, no health bar, no stats. Instead, the player’s progress and status are displayed within the game’s world as a stream of flower petals in the center of the screen. It is aesthetically pleasing, less intimidating than a traditional detailed overlay, and easier to understand for the majority of people who lack fluency in video game displays. You don’t have to understand how health and a recharging shield meter are interrelated to understand Flower.

Flower is a fully three-dimensional flight and particle simulation set in a lush landscape populated with individual blades of grass. The controls are very easy to understand: the accelerometer inside the PS3 controller acts like a gyroscope, binding the player’s relative orientation in space to the tilt of the controller. The X button is basically a gas pedal, propelling the player forward. Flying through the world of Flower is as easy as tilting the controller a bit and pressing one button.

In spite of its ease, or maybe because of it, Flower has neither written instructions nor a traditional tutorial. The only guidance Flower provides is in simple, iconic prompts to tilt the controller or press the X button. Players learn to play Flower by playing Flower.

Showing the measure of progress as an in-game effect, and reducing the controls or inputs to a simple motion control and a single button, enable an organic learning process. By removing intimidating, insular obstacles and encouraging exploration and discovery, Flower allowed my parents to engage in a culturally significant game on its own interactive grounds for the first time in decades.

Figure 2. Flying through Flower

Teaching players to play while playing is an incredible thing—and one of the central problems in most game designs. Flower employs a common video game technique called input amplification to reduce the number of inputs without sacrificing an interesting level of complexity in the game itself.

Many games also employ a related but separate technique called context sensitivity.

These techniques aren’t necessarily native to games or restricted to games; our computers’ drag-and-drop functionality is a great example of input amplification. This simple user action is interpreted by the software as a series of more complex commands, but the result is intuitive and useful. Drag-and-drop also has context sensitivity: the same action, depending on the circumstances, can move a block of text, open a file, or delete a directory. However, the use of input amplification and context sensitivity in games is much more pervasive and varied.

Input Amplification

Input amplification is so deeply encoded in games that it’s hard to imagine what video games would be like without it. Even in relatively ancient games like the original Super Mario Bros. (1985), an action as simple as pressing the right arrow on the controller triggers a very complex series of reactions in the game system (Figure 3).

Figure 3. Super Mario Bros., 1985

Mario begins to run slowly at first, but soon at full speed. His animations update accordingly, and then the entire world begins to scroll so we can follow Mario on his adventure. New objects appear on the right side of the screen, while other objects disappear off the left side of the screen forever. The system is also checking to see if Mario bumped into anything, and if he did, there is a series of reactions and decisions that will branch off from there as well. All of these events cascade from a single button press in a two-dimensional game that is over twenty-five years old!
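To make that cascade concrete, here is a minimal sketch of a single frame of a side-scroller. It is not how the original game was written; every name and number in it (`World`, `step`, the speeds, the coin positions) is an assumption chosen for illustration. The point is only that one held button feeds acceleration, animation, movement, collision checks, scrolling, and cleanup.

```python
from dataclasses import dataclass, field

MAX_SPEED = 4.0      # assumed top running speed, in pixels per frame
ACCELERATION = 0.5   # assumed speed gained per frame while "right" is held
SCREEN_WIDTH = 256   # assumed width of the visible area, in pixels


@dataclass
class World:
    player_x: float = 0.0
    speed: float = 0.0
    camera_x: float = 0.0
    animation: str = "standing"
    # Hypothetical level data: x positions of collectible coins.
    coins: list = field(default_factory=lambda: [10.0, 300.0])
    events: list = field(default_factory=list)


def step(world: World, right_held: bool) -> None:
    """Advance one frame; a single held button fans out into many updates."""
    # 1. Ramp the speed up gradually instead of jumping to full speed.
    if right_held:
        world.speed = min(MAX_SPEED, world.speed + ACCELERATION)
    else:
        world.speed = 0.0

    # 2. Pick an animation that matches the current speed.
    if world.speed == 0:
        world.animation = "standing"
    elif world.speed < MAX_SPEED:
        world.animation = "walking"
    else:
        world.animation = "running"

    # 3. Move the player.
    old_x = world.player_x
    world.player_x += world.speed

    # 4. Check whether the player bumped into anything along the way.
    for coin_x in list(world.coins):
        if old_x < coin_x <= world.player_x:
            world.coins.remove(coin_x)
            world.events.append(f"collected coin at {coin_x}")

    # 5. Scroll the camera to follow the player; it never scrolls back.
    world.camera_x = max(world.camera_x, world.player_x - SCREEN_WIDTH / 2)

    # 6. Forget objects that have scrolled off the left edge for good.
    world.coins = [c for c in world.coins if c >= world.camera_x]
```

Calling `step(world, True)` once per frame is all the "input" there is; everything else is the system amplifying it.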

In newer games, including Flower, that entire series of events is taken for granted, and it is far more complex, since the game worlds are frequently rendered and explored in all three dimensions. As players approach new areas, those areas are added to the game world, while areas out of sight behind the player are removed for performance reasons. As with Mario, the game checks to see if the player has bumped into anything, and reacts accordingly.

In modern games, these reactions can be even more complex, using timers, the state of the game, and the position of the player to divine the player’s intention. Big budget action games like Sands of Time’s descendant Assassin’s Creed (2007) or Sucker Punch Productions’ inFAMOUS (2009) employ complicated but invisible decision trees to interpret simple player inputs into complex game actions, like climbing the side of a building. On the whole, these games remain too complex for some people to enjoy, but the optimizations allow more experienced players to perform impossibly complicated behaviors easily and with minimal frustration.
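A tiny, invented decision tree shows the flavor of this. It is not taken from Assassin’s Creed or inFAMOUS; the state fields, the action names, and the grace period are all assumptions. It uses the player’s position (near a wall or ledge), the state of the game (on the ground or not), and a timer to decide what one button press probably meant.

```python
from dataclasses import dataclass

# Assumed grace period: a jump pressed this soon after running off a ledge
# is still treated as a jump from the ledge.
GRACE_SECONDS = 0.15


@dataclass
class PlayerState:
    on_ground: bool
    facing_wall: bool
    ledge_within_reach: bool
    seconds_since_left_ground: float


def interpret_press(state: PlayerState) -> str:
    """Guess the intent behind a single button press (hypothetical rules)."""
    # Position: a reachable ledge on the wall ahead suggests climbing.
    if state.facing_wall and state.ledge_within_reach:
        return "climb"
    # Position: a bare wall suggests running up it.
    if state.facing_wall:
        return "wall_run"
    # State: on open ground, the press is an ordinary jump.
    if state.on_ground:
        return "jump"
    # Timer: forgive a press that came a few frames too late.
    if state.seconds_since_left_ground <= GRACE_SECONDS:
        return "jump"
    # Otherwise the player is simply falling; do nothing.
    return "ignore"
```

The player only ever presses one button; the branching stays invisible unless the guess is wrong.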

Context Sensitivity

Imagine a game where, among other things, you can both open doors and turn lamps on and off (common actions in some games). There was a time when these actions would have been unique commands; for example, pressing the A button to open doors, and the B button to toggle the lamps. There are some disadvantages to this approach though, like asking the player to remember two different buttons just to accomplish these simple, similar tasks. These actions are great candidates for being assigned to a shared, context-sensitive “action” button.

One clue that these two actions could have a shared button is how different they are. Players can easily assume that to open a door they should be near a door, and likewise, to turn on a lamp they should stand near a lamp. There are rarely lamps embedded within doors, or vice versa, so the game system is not likely to get confused and accidentally open a door even though you are standing next to a lamp.

The other clue that they can be combined, however, is how similar these actions are in real life. In both cases, we can imagine ourselves reaching out with one arm to either flip a switch or turn a handle. We could even guess, after using the action button to open the door, that pressing the action button near a lamp might activate that object as well, without any instruction or guidance at all. Assigning intuitive context-sensitive behaviors is a funny balance of difference and similarity; the actions must be similar enough for their shared button assignment to make sense, but different enough that when the button is actually pressed there is no confusion between the expected and actual in-game result.
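As a rough sketch of that shared button, assume every interactive object knows how to respond to being “used,” and the game simply picks the nearest one within arm’s reach. The class names, the `REACH` distance, and the returned strings are all made up for this example.

```python
import math
from dataclasses import dataclass

REACH = 1.5  # assumed interaction radius, in arbitrary world units


@dataclass
class Door:
    x: float
    y: float
    is_open: bool = False

    def interact(self) -> str:
        self.is_open = not self.is_open
        return "door opened" if self.is_open else "door closed"


@dataclass
class Lamp:
    x: float
    y: float
    is_on: bool = False

    def interact(self) -> str:
        self.is_on = not self.is_on
        return "lamp on" if self.is_on else "lamp off"


def press_action(player_x, player_y, objects):
    """Resolve one 'action' press against whatever is closest to the player."""
    # Measure the distance from the player to every interactive object.
    distances = [
        (math.hypot(obj.x - player_x, obj.y - player_y), obj) for obj in objects
    ]
    # Keep only the objects the player could plausibly reach.
    in_reach = [(dist, obj) for dist, obj in distances if dist <= REACH]
    if not in_reach:
        return None  # nothing nearby, so the press does nothing
    _, nearest = min(in_reach, key=lambda pair: pair[0])
    return nearest.interact()
```

Because doors and lamps rarely occupy the same spot, the nearest-object rule almost never has to break a tie, which is exactly why the two actions can share a button.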

Let’s imagine a third action: maybe this game is about mobsters, and so there is a second way to open doors: kicking them down. Do we put that on the action button as well? This action is similar to opening a door, but it’s different in that the player is opening the door a different way. Now we have some confusion: when we approach the door and press the action button, what do we expect to happen? Do we open the door calmly or kick it off its hinges? Without more context and information, the game system would have to make an uneducated guess about our intentions. Furthermore, if this is the same button that turns lamps on and off, how could we know, or even guess, that pressing it will trigger a violent attack on an inanimate object?

The balance of difference and similarity here is reversed, and it makes this particular in-game action a poor candidate for mapping to our action button. However, it would make a great candidate for mapping to an “attack” button, something a mobster game would probably have anyway.
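One way to picture the problem is as a lookup from (button, target) to a result. If any pair maps to more than one result, the game has to guess. The sketch below is only an illustration of that check; the binding lists and the `find_conflicts` helper are hypothetical.

```python
from collections import defaultdict


def find_conflicts(bindings):
    """Return {(button, target): [verbs]} for pairs bound to several verbs."""
    verbs_by_context = defaultdict(list)
    for button, target, verb in bindings:
        verbs_by_context[(button, target)].append(verb)
    # A context with more than one verb forces the game to guess.
    return {
        context: verbs
        for context, verbs in verbs_by_context.items()
        if len(verbs) > 1
    }


# Everything on the action button: pressing it at a door is ambiguous.
everything_on_action = [
    ("action", "door", "open"),
    ("action", "lamp", "toggle"),
    ("action", "door", "kick down"),
]

# Kicking moved to the attack button: every pair has exactly one meaning.
kick_on_attack = [
    ("action", "door", "open"),
    ("action", "lamp", "toggle"),
    ("attack", "door", "kick down"),
]
```

Here `find_conflicts(everything_on_action)` reports the door as ambiguous, while `find_conflicts(kick_on_attack)` comes back empty.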

Pitfalls and Layers

For the last couple of years, I have been playing a lot of Street Fighter III and Street Fighter IV, a pair of fighting games from Capcom. Despite their ordinal proximity, these games are separated by a decade: SFIII in 1999 and SFIV in 2008. In many ways these were the formative years for accessibility in video game controls, and even in these hardcore, competitive games, the difference is notable.

In SFIII, the use of input amplification is limited to the basics of Street Fighter that even casual players might know: the motions of the joystick that can change a normal punch into a fireball (quarter-circle forward), or turn a regular kick into a spinning flying kick (quarter-circle back). SFIII also has a context-sensitive “parry” system, in which tapping toward one’s opponent at the precise moment of an attack will both block the attack and briefly stun the opponent. However, because of the strict, direct way SFIII processes joystick motions, the game is generally considered difficult and inaccessible.

SFIV, on the other hand, is credited with reviving the competitive fighting game community, due to its relative accessibility and modern look (Figure 4).

Figure 4. Street Fighter IV has a very modern look.

SFIV is no Flower, but compared to SFIII, the controls are very forgiving. As long as a joystick input is mostly right, SFIV can usually “guess” what you were trying to do and make it happen. Many players still prefer the way SFIII works though, especially since they already understand the basics of the game. SFIII may be strict about the way it interprets joystick motions, but because of that it is much less likely to misinterpret the player’s actions.
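The trade-off can be sketched with two toy input readers. Neither is Capcom’s actual input code; both are assumptions written to show the idea. Directions use the numeric-keypad shorthand common in the fighting game community (2 is down, 3 is down-forward, 6 is forward, 5 is neutral), so a quarter-circle forward is 2, 3, 6.

```python
QCF = [2, 3, 6]  # quarter-circle forward: down, down-forward, forward


def strict_match(history, motion=QCF):
    """True only if the most recent inputs are exactly the motion."""
    return history[-len(motion):] == motion


def lenient_match(history, motion=QCF, window=5):
    """True if the motion is 'mostly right' (a hypothetical forgiving rule).

    The last input must be the motion's final direction, and the motion's
    first direction must appear somewhere in the recent window before it.
    Anything in between, including a missed diagonal, is forgiven.
    """
    recent = history[-window:]
    if not recent or recent[-1] != motion[-1]:
        return False
    return motion[0] in recent[:-1]


clean = [5, 2, 3, 6]      # a textbook quarter-circle
sloppy = [5, 2, 6]        # the diagonal was skipped
crouch_walk = [2, 5, 6]   # crouched, stood up, then walked forward
```

The strict reader accepts only `clean`. The lenient reader accepts `clean` and `sloppy`, which is the forgiveness beginners want, but it also accepts `crouch_walk`, where the player never meant to throw a fireball at all.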

StarCraft II takes a slightly different approach, maintaining two complementary control systems at all times. First, there is a mouse-based control system that employs lots of input amplification and context-sensitive behaviors. For example, depending on the state of the game, a simple mouse click can result in a collection of game units attacking, building, mining, moving, or some combination.

At the same time, there is also a more complicated, but less amplified, keyboard-based control system, where each key is tied to a precise, discrete action in the game. Beginners often start using just the mouse, but as they grow in experience, they gradually use the keyboard more and more, willing to trade the complexity of the input scheme for the speed and precision they get.
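Side by side, the two layers might look something like the sketch below. The unit kinds, target names, and key bindings are placeholders rather than the game’s real rules; the contrast is what matters. The click is interpreted differently for each selected unit, while each key means exactly one thing.

```python
def smart_click(selected_units, target):
    """One click, interpreted per unit. Returns {unit_name: order}.

    `selected_units` maps a unit name to its kind ("worker" or "soldier");
    `target` is what was clicked ("enemy", "minerals", or "ground").
    All of these labels are hypothetical.
    """
    orders = {}
    for name, kind in selected_units.items():
        if target == "enemy":
            orders[name] = "attack"
        elif target == "minerals" and kind == "worker":
            orders[name] = "gather"
        else:
            orders[name] = "move"
    return orders


# The keyboard layer: one key, one precise order, no guessing.
HOTKEYS = {"a": "attack", "g": "gather", "m": "move", "b": "build"}


def hotkey_order(key):
    """Return the single order bound to `key`, or None if it is unbound."""
    return HOTKEYS.get(key)


selection = {"worker_1": "worker", "soldier_1": "soldier"}
```

With that selection, `smart_click(selection, "minerals")` sends the worker to gather and the soldier to move, all from one click, whereas `hotkey_order("a")` always means attack and nothing else.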

Conclusion

Even complex games rely on input amplification and context sensitivity to streamline the way in which they expose the most important game elements to players. For most games, coming up with the right control scheme is one of the hardest but most critical tasks in the design. Correctly amplifying the microscopic motions of thumbs on a game controller is a challenge we must overcome to avoid overwhelming players when they first experience the game. Context sensitivity—that unpredictable balance between differences and similarities—is a great tool for simplifying those controls.

For some games, though, the best way for people to learn is to layer progressively more complex control schemes. Imagine a wedding cake, where the smallest layer at the top is the simplest system with the most amplification, and underneath it, a more complex layer, and underneath that, an even more complex layer that affords even more control. Beginning players won’t feel overwhelmed or intimidated by the simplest layer at the top, while expert players in the lower layers won’t feel suffocated by unnecessary and unpredictable automation. Only then will that game be a space that allows and encourages players to grow and explore everything the game has to offer.
