Editorial

The “Rule of Three” Also Works in Conversation Design

Exploring the aesthetics and functionality of the rule of three in user interfaces

In the raging storm that is the business world, we cling desperately to data, no matter how tenuous our grasp. We all think we understand what the numbers mean, and bring them to bear on our decisions. But we need to be careful. Looking deeper into the research, we often discover the findings don’t say what we thought. As a small example, I’ve been guilty of throwing around “the 80/20 rule” like some big shot, only to find that I had its meaning all wrong.

This is just one of many classic numerical misconceptions. As a conversation designer for Google Assistant, I think a lot about language and how to make dialogue that is as responsive, natural, and effortless as possible. In that role, I’ve come across many of these subtle but glossed-over ideas. I hope this post will steer conversation designers, and maybe designers in other disciplines, away from one classic numerical misconception and toward a simple alternative that has been underappreciated for half a century: a finding that can lead the way to simpler, more helpful designs.

The alluring, but often misused, magical number 7 (plus or minus 2)

One research chestnut that gets rehashed mercilessly in the design world is George Miller’s magical number 7 (plus or minus 2). This seminal research by Miller and others in the field of cognitive psychology led to a deeper understanding of the limits of human working memory, and the strategies we can use to overcome those limits. The findings were elegant and astonishing.

In short (too short to do it justice), Miller and his colleagues were trying to identify the “capacity limits” of working memory. Through a series of experiments, they found limits both on the number of things people can identify correctly after being taught their labels and on the number of items they can recall from a list presented earlier. Coincidentally, both of these limits were around 7. Just as importantly, Miller found several ways people can circumvent these limits, like adding more dimensions of information to each item (e.g., it’s easier to identify a tone if it’s distinguished from others by both loudness and pitch, as opposed to just one dimension or the other) or “chunking” bits of information together (e.g., we memorize a 10-digit US phone number as three pieces of information, not ten).

When I first entered UX in the late ’90s, I saw many references to Miller’s work. Even today, articles pop up regularly (not linking to them because I’m not trying to put anyone “on blast,” as my kids say) that urge designers to minimize the number of chunks of information on a screen so as not to overload working memory. And that’s certainly a good cause. Busy web pages and app screens can be overwhelming. But that’s not the subject of Miller’s work. His questions and methods cannot easily inform visual design because, to put it simply, Miller’s participants had no screens (this issue is also pointed out by information designer Edward Tufte). Screens allow today’s computer users to offload much of their cognitive work. While limits certainly exist to the human capacity for organizing information consistently in one’s visual field, Miller’s work doesn’t address them.

But what about conversation design? Miller’s work could be relevant when designing voice-only interfaces. After all, one challenge of voice is that it’s ephemeral, requiring people to hold items in memory as they interact. Voice designers might ask themselves: How do I take Miller’s findings to heart? If my system is speaking out a list of movie showtimes to users, should I limit them to 7? Should I limit voice menus to 7 items? Plus or minus two? Should I chunk them by putting similar menu items closer to each other in the list?

Here’s my hot take: If you’re asking those questions, something has already gone horribly wrong. You should avoid any situation that has you asking yourself how far you can stretch the user’s memory limits.

Consider the nature of conversations. They are cooperative actions. Cooperation manifests itself in many ways during a conversation (see note 4). Now think back to the hypothetical questions our make-believe conversation designer was asking. Should I limit menu items to 7? Should I limit the movie times to 7? There’s nothing cooperative, nothing polite, about overloading a conversational partner with choices. You would never do that in a chat with a friend or co-worker, and that should be an indication not to do it in your voice designs.

While working on a project many years ago, I spent some time calling Las Vegas concierges and asking them, “What’s fun to do in Vegas if I don’t gamble?” The goal was to understand how they tackled the task of answering that question conversationally. All ten of them used similar strategies. They would say things like this:

[Figure: “Good concierge” example dialogue]

They proposed the most popular attraction, and when that didn’t work, offered two or three other popular attractions. And when that failed, they put some responsibility on me to say what I liked to do. The important point is that absolutely no concierge took this strategy:

[Figure: “Odd concierge” example dialogue]

And neither should your system.

Gail Jefferson and the magical number 3

The good news is that we can look to another number for guidance when designing conversational interfaces. You may have come across references to “the rule of three,” a theory suggesting people respond more positively to information delivered in sets of three in both written and spoken language (e.g. Life, Liberty, and the Pursuit of Happiness). But data suggests the rule of three goes deeper than simple aesthetics.

Gail Jefferson was a sociolinguist who, along with Harvey Sacks and Emanuel Schegloff, founded the field of Conversation Analysis. Their work supported the idea, rejected by Noam Chomsky, that conversation does have structure. In one fascinating article, Jefferson exhaustively reviewed transcriptions of people’s spoken interactions on various topics and found that speakers follow an unspoken rule to generate lists in threes. Jefferson writes:

“…three-partedness appears to have ‘programmatic relevance’ for the construction of lists. That is, roughly, lists not only can and do occur in three parts, but should so occur.”

Not only do speakers follow this rule, but listeners seem to comprehend speech using the same assumption. They expect the speaker to generate lists of three and use that information to decide when to start speaking. When lists don’t show up in threes, interruptions and other complications occur:

“…given two items so far, a recipient can see that a third will occur, and that upon its occurrence, utterance completion can have occurred whereupon it will be his turn to talk.”

Jefferson continues:

“In principle, then, a three-part list can be used to monitor for utterance completion and turn transition.”

In scores of examples, she shows evidence of people uttering, and even struggling to make sure they utter, three-item lists. For example, they say things like “the pie and the whip cream and stuff,” and “Tyrone Power, Clark Gable, and Gary Cooper,” and “she kept lookin’ and lookin’ and lookin’.” Even phrases that replace lists like “blah blah blah” and “yadda yadda yadda” (called “triple-singles”) come in threes.

Of course, writing and speaking also need to accommodate closed sets of two items, like salt and pepper or Harold and Kumar. But when a set holds more things, no matter how many more, three seems to have an ineffable power, an ability to succinctly communicate a sense of “muchness,” as Jefferson puts it.

So what do you do with this information?

These results suggest that when crafting conversations, designers might consider presenting speakable menu options, as well as answers with multiple items, in sets of three. There’s an aesthetic reason, which is that people seem to have a positive response to things grouped in threes, but also a cognitive reason that undergirds the aesthetic one: people use the assumption of three to generate and comprehend spoken language.

If you do find yourself in a tough situation, with 7 plus or minus 2 chunks of information—or way more—to present to a user, ask yourself if there’s a way to scale options down to 3. For example:

Leverage a screen when possible. Could the system’s voice speak three items and let a screen present any more items? Two people discussing what movie time is best for them might use a similar strategy, with the person holding up the phone showing the screen of options to the other person.

Lisa: [looking at phone] Well, if we want to see Zombie Goats 2: The Goatening tonight, there’s a 5 o’clock, a 7, a 9…

Francois: Yeah but remember we’re having dinner with Krishna.

Lisa: Oh right. Well which of these works? [shows phone screen]

Francois: How about that midnight show? It’s Saturday!

We put this principle into action with the Google Assistant. If asked on a smart display about upcoming calendar items, the Google Assistant will speak the first three items, and then offer to share more. All of the calendar items for the day are also on the screen, in case the person using the device prefers to browse them all by touch.

Break down a single menu of many items into multiple menus of no more than three items each. Back in 2001, I attended a colloquium at the lab of the late Cliff Nass at Stanford, and saw his students present their current research. One student (I apologize for being unable to find a reference; please contact me if you know more) found that people experience a series of short lists of options as faster than one giant list of options. For example, being asked to choose from one list of six options felt slow, but choosing from two lists of three options, one after the other, felt faster. So if you’re designing a Conversational Action for the Google Assistant and have multiple features a user can try, see if you can avoid putting them all in one list of options right away. Instead, try to present a maximum of three options, with one or more of those options leading to a few more.
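As a rough illustration of the menu-splitting idea above, here is a minimal Python sketch. The function name `build_menus`, the “more options” label, and the sample features are all hypothetical and not part of any Google Assistant API. The sketch only shows one possible way to keep every spoken menu at three items or fewer, with the last slot of each menu leading to the next one.

```python
from typing import List


def build_menus(options: List[str], max_items: int = 3,
                more_label: str = "more options") -> List[List[str]]:
    """Split a flat option list into menus of at most `max_items` entries.

    Every menu except the last ends with `more_label` (a hypothetical
    prompt), which stands for "take me to the next menu".
    """
    if max_items < 2:
        raise ValueError("max_items must leave room for the 'more' entry")
    menus: List[List[str]] = []
    remaining = list(options)
    # While too many options remain, reserve the last slot for `more_label`.
    while len(remaining) > max_items:
        menus.append(remaining[:max_items - 1] + [more_label])
        remaining = remaining[max_items - 1:]
    if remaining:
        menus.append(remaining)
    return menus


# Six made-up features become three short menus instead of one long list.
features = ["weather", "news", "music", "timers", "jokes", "trivia"]
for menu in build_menus(features):
    print(menu)
```

A flat chain of “more options” is only the simplest arrangement. Grouping related features under named categories would respect the same three-item limit and may feel more natural in a real design.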

And remember, visually impaired users or those who can’t reposition themselves to look at a screen have no choice but to rely on voice, so a three-at-a-time strategy is even more critical for this important population.
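The three-at-a-time strategy can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a description of how the Google Assistant is implemented: the function names, the follow-up question, and the sample calendar entries are invented for the example.

```python
from typing import List, Tuple


def join_spoken(items: List[str]) -> str:
    """Join items the way a speaker would: 'a', 'a and b', 'a, b, and c'."""
    if len(items) <= 2:
        return " and ".join(items)
    return ", ".join(items[:-1]) + ", and " + items[-1]


def next_turn(items: List[str], limit: int = 3) -> Tuple[str, List[str]]:
    """Speak at most `limit` items and return whatever is left for later."""
    spoken, remaining = items[:limit], items[limit:]
    if not spoken:
        return "That's everything.", []
    speech = join_spoken(spoken) + "."
    if remaining:
        # Hypothetical follow-up; the user decides whether to continue.
        speech += " Would you like to hear more?"
    return speech, remaining


# Made-up calendar entries for one day.
events = ["standup at 9", "design review at 11", "lunch at noon",
          "dentist at 3", "gym at 6"]
speech, rest = next_turn(events)
print(speech)   # first three items plus the offer to continue
speech, rest = next_turn(rest)
print(speech)   # the remaining two items, with no further offer
```

Returning the unspoken remainder alongside the speech keeps each turn short while letting the listener control whether the list continues, which is the cooperative behavior the concierges showed.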

Conclusion

As groundbreaking and important as the findings were, it’s probably time for the UX community to leave George Miller’s work alone. The more GUI-focused designers can walk away because Miller’s methods and results don’t map onto what they do. Conversation designers can walk away because the question his work answers is a question we should never be asking.

Instead, conversation designers should slim down lists of any kind to three items, both because of the three-item list’s artful appeal and because of its functional role in conversation. It’s never good to look for easy answers in the world of psychology and design, but in the case of the number three, we may have an elegant exception.

Author’s note: I’m deeply indebted to my colleague Madelaine Plauché for her outsized contribution to this post.

This essay was originally published on Medium.

NOTES

  1. The Magical Number Seven, Plus or Minus Two: Some Limits on Our Capacity for Processing Information (1956). George A. Miller.
  2. The magical number seven, plus or minus two: Not relevant for design (2000). Edward Tufte.
  3. Conversation Design: Speaking the Same Language (2017). James Giangola.
  4. Designing Actions on Google.
  5. List-Construction as a Task and Resource (1990). Gail Jefferson.