Canesta, Inc. is the inventor of revolutionary, low-cost electronic perception technology that enables ordinary electronic devices in consumer, security, industrial, medical, automotive, factory automation, gaming, military, and many other applications to perceive and react to objects or individuals in real time.
In Fall 2008, Canesta approached Kicker Studio to create a demonstration of their latest camera technology for the Consumer Electronics Show 2009. The prototype was to be of an entertainment center controlled by gestures alone, and powered, of course, by a Canesta camera.
Before we began design we needed to understand the technology: its limitations, possibilities, and how it would link up with the interface. The Canesta camera offered a very specific set of benefits and limitations. Within those parameters, we established a safe zone for accurate gesture recognition. In the meantime, we worked with engineers to ensure the interface would seamlessly link into the “gestural library,” which examined data coming in from the camera and recognized it as specific gestures.
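The internals of Canesta’s gestural library aren’t public, but the idea it describes — examining incoming camera data and matching it to specific gestures — can be sketched. Below is a minimal, hypothetical recognizer in Python that classifies a stream of (x, y) hand positions as a circle (by the total angle swept around the path’s centroid) or a wave (by repeated horizontal direction reversals); the thresholds and gesture names are assumptions for illustration.

```python
import math

def classify(points, min_angle=1.8 * math.pi):
    """Classify a sequence of (x, y) hand positions as a gesture.

    Hypothetical sketch: a "circle" is detected when the hand sweeps
    nearly a full turn around the path's centroid; a "wave" is detected
    from repeated horizontal direction reversals.
    """
    if len(points) < 3:
        return "none"

    # Centroid of the path, used as the circle's assumed center.
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)

    # Accumulate the signed angle swept around the centroid.
    swept = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        a0 = math.atan2(y0 - cy, x0 - cx)
        a1 = math.atan2(y1 - cy, x1 - cx)
        d = a1 - a0
        # Unwrap across the +/- pi boundary.
        if d > math.pi:
            d -= 2 * math.pi
        elif d < -math.pi:
            d += 2 * math.pi
        swept += d
    if abs(swept) >= min_angle:
        return "circle"

    # Count horizontal direction reversals for a wave.
    reversals = 0
    prev_dx = 0.0
    for (x0, _), (x1, _) in zip(points, points[1:]):
        dx = x1 - x0
        if dx * prev_dx < 0:
            reversals += 1
        if dx != 0:
            prev_dx = dx
    return "wave" if reversals >= 2 else "none"
```

A real library working from a depth camera would, of course, also segment the hand from the scene and filter noise before any classification step.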
We also set out to understand the activity and context of watching TV. We recorded subjects watching video on their TVs or computers. We looked for the types of casual gestures a user would make, in order to limit the number of accidental triggers caused by non-deliberate gestures. We also noted the types of commands needed specifically for video playback. And we looked for similar patterns of control to reduce the size of the vocabulary for easy retention: for example, changing the volume and changing the channel.
From these investigations, we quickly established a list of metrics to measure success. We wanted to clearly “beat the remote” while creating a fun, engaging experience.
Marketing and Product Strategy
Technology wouldn’t tell the story by itself. We developed scenarios to illustrate how the product could redefine not only the way we relate to our media, but the way we relate to our environments as well. We did this by understanding how the product could be situated within current markets and how it would fit into the product landscape of the next 5 or 10 years.
Defining the Gestural Language
From a practical standpoint (because it required the most development time), our first design task was to define the gestures that would control the entertainment center. This was a combination of brainstorming (making a lot of crazy gestures) and then comparing them to three things: the technical constraints of the camera, the off-limits casual gestures we found in research, and our design principles.
Our first day of brainstorming, we came up with a list of guiding principles. One of them was that users should be able to feel comfortable doing these gestures while on a date. That is: nothing difficult, nothing embarrassing, and nothing that was too cartoon-like. You often see a lot of “Minority Report”-style gestural interfaces, but those are far too tiring, too challenging, and often far too dramatic for a task like watching videos in your living room. You’d soon be begging for your remote back.
We spent several days in a small room with a whiteboard, lots of post-it notes, and much hand waving, which led us to a couple of distinct gesture sets that we wanted to test with an audience.
With a set of research subjects, we did scenario-based prototyping with paper and simulated screens. After watching people attempt our gestural set, we quickly added a new principle to our list: no emphatic gestures. We found that the more elaborate gestures made some users feel like they were “angry” at their TV. We also eliminated a number of gestures that seemed comfortable in our small room but, when put to the test, proved overly tiring.
During a rapid prototyping session we went through several iterations, building on successive prototypes until we hit upon a successful solution centered on circling (“Wax on, Wax off”) gestures, simple waves, and a small number of specific gestures reserved for important actions you don’t want to trigger accidentally (like turning off the TV).
Interaction and Interface Design
In the absence of direct contact (as with a mouse or touchscreen), freeform gestures rely heavily on visual and audio cues to guide the user. Once the gesture set was established, we created a unique user interface that helped strengthen the mental connection between the user’s actions and the response of the interface. We focused on developing visual cues to reinforce the types of movements that would be clear and natural. For example, our gestures relied heavily on circular hand movements, so rather than have items scroll top to bottom in a list, we created “dial”-like lists.
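The mechanics of a dial-like list can be sketched briefly: the angular sweep reported by the gesture recognizer accumulates, and each fixed slice of arc advances the selection by one item. The 45-degrees-per-item step below is an assumed value, not a figure from the actual demo.

```python
class DialList:
    """Hypothetical 'dial' list: a circular hand sweep rotates the
    selection, one item per fixed slice of arc."""

    def __init__(self, items, degrees_per_item=45.0):
        self.items = items
        self.degrees_per_item = degrees_per_item
        self.accumulated = 0.0  # arc swept since the last step
        self.index = 0

    def on_sweep(self, degrees):
        """Feed an angular delta from the gesture recognizer.
        Positive = clockwise = advance; negative = go back."""
        self.accumulated += degrees
        steps = int(self.accumulated / self.degrees_per_item)  # truncates toward zero
        if steps:
            self.accumulated -= steps * self.degrees_per_item
            self.index = (self.index + steps) % len(self.items)

    @property
    def selected(self):
        return self.items[self.index]
```

Because the dial wraps around, the motion on screen mirrors the hand’s circular motion, which is exactly the reinforcement the visual cue is meant to provide.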
Implementation and Development
Now we were ready to put all the pieces together. We worked closely with Canesta’s engineering team to ensure our interface synced with all the cues detected by the camera and pattern matched via the gestural library. Then, once the front- and back-end were cleanly married, we again did a series of user tests to ensure the proper events occurred when expected. After polishing up a few small areas, we were ready to present the demo to the public, first at CES and then at the 2009 TV of Tomorrow conference.
While the Canesta camera can (and will) be embedded into new televisions (as well as appliances and other consumer electronics), we could easily imagine customers who might want to supplement their existing televisions with stand-alone cameras so that they too could control their sets with gestures. We sketched and modeled dozens of different camera configurations before settling on one we liked.
This camera can perch on top of the TV and rotate to get a better view of the room. Soft white and blue LEDs indicate whether the camera is on, when it is observing a user, and when it is accepting gestural commands from the user.
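The camera’s three indicated conditions (on, observing a user, accepting commands) suggest a small state machine. The sketch below is hypothetical: the states follow the description above, but which LED pattern maps to which state, and the event names, are assumptions for illustration.

```python
class CameraIndicator:
    """Hypothetical state machine for the stand-alone camera's LED cues."""

    # (current state, event) -> next state
    TRANSITIONS = {
        ("off", "power_on"): "on",
        ("on", "user_detected"): "observing",
        ("observing", "engage_gesture"): "accepting",
        ("accepting", "disengage"): "observing",
        ("observing", "user_left"): "on",
        ("on", "power_off"): "off",
    }

    # Assumed mapping of states to the soft white and blue LEDs.
    LEDS = {
        "off": "dark",
        "on": "soft white",
        "observing": "white + blue pulse",
        "accepting": "steady blue",
    }

    def __init__(self):
        self.state = "off"

    def handle(self, event):
        """Apply an event; unknown events leave the state unchanged."""
        self.state = self.TRANSITIONS.get((self.state, event), self.state)
        return self.LEDS[self.state]
```

Making the “accepting commands” state visually unambiguous matters here: it is the user’s only confirmation that a gesture will be interpreted rather than ignored.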
Read the Canesta press release.