By Rob Knies, Managing Editor, Microsoft Research
George Robertson is taking this meeting seriously. He focuses intently on other participants in the room, making eye contact, noting posture and visual cues, interjecting comments when appropriate. He studies diagrams scrawled onto a whiteboard, and, on occasion, uses a laser pointer to call attention to something he wants to address.
Sound like just another productive business meeting? Well, you’re right—except for one little detail. The meeting is being held in Redmond, and Robertson is in Northeast Harbor, Maine, more than 3,000 miles away.
How is this possible? That’s a question for the researchers behind the Embodied Social Proxies project, a joint effort between the Human Interactions in Programming (HIP) and Visualization and Interaction for Business and Entertainment (VIBE) groups within Microsoft Research Redmond.
“For a team that’s mostly collocated, with one person somewhere else, that person is often out-of-sight, out-of-mind,” says Gina Venolia, a senior researcher in the HIP group. “They’re left out of the loop on ad hoc conversations. It’s hard to pull them in for planned meetings, because you put them on a speaker phone or on a projector and they get forgotten. They just aren’t as real as the person next door.
“It’s very common for somebody to be looking at a problem that they’re having and walk down the hall to talk to somebody about it. They’re going to talk with somebody collocated more than they’re going to talk with somebody who’s remote.”
Such a scenario was underscored for Venolia and colleagues in autumn 2008, when Robertson, a principal researcher in the VIBE group, suddenly found himself thousands of miles from his Microsoft Research colleagues. His wife’s work had taken her from the West Coast to the East Coast, and Robertson used the move as an opportunity to examine a research question.
“Microsoft was glad to let me do this remotely,” he says, “and shift some research toward finding ways to make remote or distributed teams work more effectively.”
As it turns out, Venolia had been thinking along the same lines.
“Could we build a device that always represented that remote person, to give that person some physical weight in the team’s presence, to make it easier to be aware of what that person is up to, to make it easier to initiate ad hoc communication, and to be able to represent them physically in a meeting?”
While mulling the possibilities, Venolia recalled a scenario from her past.
“Fifteen years ago or so,” she explains, “I was at [Silicon Valley-based] Silicon Graphics. I was in a group, and two of our senior people were located elsewhere. They worked out of their homes. Jim was in Columbus, Ohio, and Helga was in Reykjavik, Iceland. And these were important people.
“My manager at the time had the brilliant idea to put up two clocks, one representing the local time at each place. And rather than labeling them ‘Columbus’ and ‘Reykjavik,’ they were labeled ‘Jim’ and ‘Helga.’ Over time, photos they had sent or postcards of their locales, Christmas cards, vacation postcards—whatever reminded us of Jim and Helga—went up on the wall next to the clocks. That part of the hallway became a physical embodiment of Jim and Helga and made it hard to forget about them.”
Reflecting on that lesson, it didn’t take long for Venolia to begin imagining a nascent research project. Add more information—live information. Add a screen and provide information about the remote worker’s activities. Analyze the person’s communications. Even—consider this—combine it all into a package that could be transported into a real-life meeting room.
Welcome to the concept of Embodied Social Proxies—or, as the project inevitably came to be known, George-in-a-Box.
In a nutshell, the researchers connected a pair of computers to an array of cameras, giving the hub and satellite sites visibility into each other’s activities. When the monitor is not being used for a meeting, it displays information about the remote user’s calendar, activities, and availability.
“Rather than trying to build the perfect system and see what was wrong with it,” Venolia says, “we followed a philosophy of underdesign: What is the least we could do to use this? We got a 15-inch laptop with a Webcam and quickly found out that there was a problem with the camera’s field of view. We quickly found out that we needed a speakerphone.”
The team cobbled the necessary elements together, along the way borrowing a camera array from colleague Cha Zhang from the Communications and Collaboration Systems group at Microsoft Research Redmond. Put it all on a cart, roll it into a conference room, and there’s George, with a seat at the table and ready to engage.
“My moving to Maine was a catalyst for exploring these various things,” Robertson says, “so Gina and I got together to figure out how to make this happen.”
Robertson had concerns.
“He was worried,” Venolia recalls, “about: ‘How do I stay aware of what’s going on in my team? How do they stay aware of me? How do we not get out-of-sight, out-of-mind?’ ”
The two formulated a set of research questions they wanted to answer: Was it possible to create a meeting experience almost as good as face-to-face? Could awareness displays increase ad hoc communication with a satellite teammate? Could the physical presence of a satellite teammate increase his or her social presence in the hub team?
It was time, Robertson says, to figure things out.
“You’d take that laptop to the meeting I was attending and have it occupy a physical space at the table. That did part of what we wanted. When people interacted with me, they turned and looked at me in the laptop.”
Then came a set of refinements. One problem was that while the hub-based team could view the remote member’s reactions, gaze direction, and attention, thanks to a Webcam on the satellite computer, the experience, from the remote perspective, was more problematic.
“It didn’t work very well in terms of me being able to see what was going on in the meeting,” Robertson says. “The Webcam view just was not wide-enough-angle, and I had no control over what I was looking at.”
So the project members outfitted George-in-a-Box with two more cameras. One, stationed above the monitor, is a pan-tilt-zoom camera operated via remote control, enabling the remote participant to focus on a whiteboard, an individual speaker, slides being projected during a meeting, or team members at a conference-room table.
The second addition, a third camera fitted with a 140-degree fish-eye lens, sits below the George-in-a-Box monitor, enabling the remote user to view practically everybody in the room, see their responses, watch who’s looking at whom, and maintain awareness of what is occurring in the meeting room.
“This setup works really well in meetings,” Robertson says. “I’m able to, with all of these cameras, see how people are reacting to what’s being said. I can follow pretty much everything going on in the meeting, if it’s on the whiteboard or on the screen or interactions between people. Sometimes, I’m actually leading the meeting, and if somebody’s on one side of the table, and somebody else on the other side says something and I see that the first person doesn’t agree with it, I will ask what her response is.
“That combination of these two cameras, which I’m controlling, turns out to be really effective.”
For Venolia, the cart presence just makes everything much more lifelike.
“George can see us, and we can see George at all times,” she says. “George can see very clearly when we’re looking at him, so that enables really natural stuff like turning to him and saying, ‘What do you think?’
“These cameras are actually little embedded Web servers, and they serve up a page that allows George to pan and tilt and zoom. The wide-angle camera is also a solid-state pan-tilt-zoom, but generally he keeps it so he can see everybody. Sometimes, he’ll look down one side of the table with one camera and at the other side with the other. He wants to be able to quickly glance between who’s talking and who’s listening. You can’t do that with manual camera control.”
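The article does not describe the cameras’ control interface beyond the fact that each one is an embedded web server that serves a control page. As a minimal sketch of that idea only, the snippet below shows how a remote participant’s client might send pan-tilt-zoom commands to such a camera over HTTP; the address, endpoint path, and parameter names are hypothetical, not the actual devices’ API.

```python
import requests

# Hypothetical address of the pan-tilt-zoom camera's embedded web server
# mounted above the cart's monitor; real network cameras expose their own
# CGI-style interfaces, so the endpoint and parameters here are illustrative.
CAMERA_HOST = "http://192.168.1.50"

def send_ptz_command(pan: float, tilt: float, zoom: float) -> bool:
    """Ask the camera to move to the given pan/tilt angles and zoom level."""
    response = requests.get(
        f"{CAMERA_HOST}/ptz",                      # hypothetical control endpoint
        params={"pan": pan, "tilt": tilt, "zoom": zoom},
        timeout=2,                                 # fail fast on a poor network link
    )
    return response.status_code == 200

# Example: swing toward the whiteboard on the left, then zoom in slightly.
if __name__ == "__main__":
    send_ptz_command(pan=-45.0, tilt=0.0, zoom=1.5)
```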
Venolia, Robertson, and colleagues are not the first to head down this road. Other teams have placed a stuffed animal in a chair to represent an absent teammate. And, of course, we all have family mementoes to remind us of people who aren’t present.
But make no mistake: The Embodied Social Proxies project is unique, in part because of the hub-and-satellite model it targets.
“This is really around this hub-and-satellite case,” Venolia says. “From my surveys of software engineers at Microsoft, about 16 percent have no collocated coworkers.”
That’s a lot of disenfranchised satellites.
In addition to the conversational benefits offered by the George-in-a-Box concept, there are also its awareness capabilities, although, Venolia and Robertson indicate, more work needs to be done on that front.
“When I walk past a colleague’s office, I can see if he’s around, if he’s busy, if he’s reading a magazine, if he’s looking irritated,” Venolia says. “I can see what’s on his whiteboard, if there’s new stuff there. I can see his printouts …
“There are two kinds of ad hoc communication. One is I need to talk to somebody: ‘George is around. I’ll call him.’ For that, I need to know if George is there. Is now a good time to talk with him? What’s he working on? What’s his calendar look like? What’s his recent and predicted availability?”
The second kind of ad hoc communication relies on activity awareness: knowing what George is working on.
“I might hear a couple of coworkers talking about revising a slide deck for project X that we’re working on together,” Venolia continues. “I might jump in and say, ‘Oh, you know, I meant to change this in the slide deck,’ or ‘You know, what I think really needs to be fixed is …’ ”
Those questions and scenarios are what the George-in-a-Box awareness screen is designed to answer. His calendar is displayed, as is the time in his time zone, his busy/free status, and a description of his current activity. Also listed is a piece of information critical for long-distance colleagues: the quality of service between hub and satellite.
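The article lists what the awareness screen shows but not how that information is assembled or laid out. The sketch below simply models those fields as a small data structure and formats them for display; the field names, sample values, and rendering are hypothetical, not the team’s actual implementation.

```python
from dataclasses import dataclass
from datetime import datetime
from zoneinfo import ZoneInfo

@dataclass
class AwarenessStatus:
    """The fields the article says the awareness screen conveys."""
    name: str
    time_zone: str          # remote worker's IANA time zone, e.g. "America/New_York"
    busy: bool              # free/busy state drawn from the calendar
    current_activity: str   # short description of current work
    next_meeting: str       # next calendar entry
    link_quality: str       # quality of service between hub and satellite

def render_awareness(status: AwarenessStatus) -> str:
    """Format the status the way an awareness screen might present it."""
    local_time = datetime.now(ZoneInfo(status.time_zone)).strftime("%H:%M")
    state = "Busy" if status.busy else "Available"
    return (
        f"{status.name} (local time {local_time})\n"
        f"Status: {state}\n"
        f"Working on: {status.current_activity}\n"
        f"Next meeting: {status.next_meeting}\n"
        f"Connection to hub: {status.link_quality}"
    )

# Example with made-up values.
print(render_awareness(AwarenessStatus(
    name="George",
    time_zone="America/New_York",
    busy=False,
    current_activity="Reviewing meeting notes",
    next_meeting="Team sync, 1:00 P.M. Pacific",
    link_quality="Good",
)))
```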
“The intention,” Robertson says, “was to make it possible for somebody walking by the office to notice that I was available and have impromptu discussions.
“The availability awareness display could be improved. The activity-awareness dashboard we have now doesn’t capture all of the information we want to capture. The part that’s worked the best is improving meeting attendance.”
Still, says Venolia, the awareness work holds plenty of promise.
“This is maybe more aspirational than current,” she says, “but the degree to which we’re going to be pushing on work-activity awareness is very distinct. This concept of ‘this device is George and always George’ is definitely a novel thing.”
In addition to improving the awareness display, the team is also tackling a set of infrastructure challenges. Network reliability is crucial for this kind of work, as are dependable wireless access, low network latency, and the ability to support more than one remote user.
In recent moves, John Tang, a researcher based in Silicon Valley, also has joined the project as a satellite user. His cart sits alongside George’s in their own office in Microsoft Research’s Redmond headquarters. Work on multiple remote users is necessary for the project’s next step: a six-week summer trial with a handful of Microsoft product teams including remote workers to see how the technology changes communication patterns.
“I’ve made a bet here,” Venolia says. “I’ve made a bet that having a hotline to George is a good thing. I don’t know how that bet is going to play out. Should it be a hotline, or should it be a telephone? If it’s a hotline, you could imagine adding semi-autonomous robotics so that George could just go to his next meeting.
“On the other hand, rather than going robotic and driving the price through the ceiling, what if this was just something like a tea tray that was easily carryable and you could just place it at the end of a table?”
There are many tantalizing possibilities inherent in the technology.
“I would love to see either of these tried out in the home,” Venolia says. “What about Grandma? You’ve got an elder living independently, but you want her there for dinner. You want to hang with her and watch American Idol, because she loves that. That’s what you do together.”
Meanwhile, Robertson, lacking robotics, must count on the kindness of colleagues to transport his cart from room to room.
“Having somebody take on that role,” he says, “is critical to making this work.”
Among those happy to help George-in-a-Box travel from office to meeting and back is Bongshin Lee, a VIBE researcher particularly interested in the awareness-display component of the project.
“One of the most important things,” Lee says, “is to have a dedicated office for him. Before, we put him in one of the labs, and people didn’t interact with him much. But now, even though they don’t yet initiate an impromptu call with George often in this space, sometimes they check to see if George is available. And if the cart is not here, they wonder where George is.”
Having a human colleague represented in such a manner has its lighthearted moments. One of George’s handlers has talked about leaving the cart behind and feeling as if she had left a baby in the back seat of a car.
“These carts are still somewhat the person,” Venolia says, “even though they aren’t live. What’s super-cool is that, when it works, it’s just transparent. You stop thinking about the technology, and you’re just there with these people. To me, an interface is successful when you don’t know it’s there.”
Lee agrees.
“It helps us interact with George better and have George interact with us better,” she says. “It’s fair to say I feel this is more like a George, even though it’s a machine. It doesn’t feel awkward. It’s very natural.”
Team members—including Kori Inkpen Quinn, an expert on computer-supported cooperative work who has helped design the system and who is helping to evaluate it—meet regularly to discuss the Embodied Social Proxies project. Refinements continue.
For George Robertson, with his bi-coastal daily presence, albeit sometimes “in a box,” it’s all about the bottom line.
“When I’m in a meeting with the cart,” he says, “I feel much more engaged in the meeting, and I think people are more engaged with me. When they talk to me, they will turn and look at me in the cart, and I can tell that they’re doing that.
“These little social cues and being able to see how everybody in the room is reacting to what’s being said … Those are the things that make meetings work effectively, and the cart really does make that possible.”