CHI squared with Dr. Ken Hinckley and Dr. Meredith Ringel Morris


Episode 74, May 1, 2019

If you want to know what’s going on in the world of human computer interaction research, or what’s new at the CHI Conference on Human Factors in Computing Systems, you should hang out with Dr. Ken Hinckley, a principal researcher and research manager in the EPIC group at Microsoft Research, and Dr. Merrie Ringel Morris, a principal researcher and research manager in the Ability group. Both are prolific HCI researchers who are seeking, from different angles, to augment the capability of technologies and improve the experiences people have with them.

On today’s podcast, we get to hang out with both Dr. Hinckley and Dr. Morris as they talk about life at the intersection of hardware, software and human potential, discuss how computers can enhance human lives, especially in some of the most marginalized populations, and share their unique approaches to designing and building technologies that really work for people and for society.

Final Transcript

Host: On today’s episode, we’re mixing it up, moving the chairs, and adding a mic, to bring you the perspectives of not one but two researchers on the topic of human computer interaction. We hope you’ll enjoy this first in a recurring series of multi-guest podcasts, where we dig into HCI research from more than one angle, offering a broader look at the wide range of ideas being explored within the labs of Microsoft Research.

Host: You’re listening to the Microsoft Research Podcast, a show that brings you closer to the cutting edge of technology research and the scientists behind it. I’m your host, Gretchen Huizinga.

Host: If you want to know what’s going on in the world of human computer interaction research, or what’s new at the CHI Conference on Human Factors in Computing Systems, you should hang out with Dr. Ken Hinckley, a principal researcher and research manager in the EPIC group at Microsoft Research, and Dr. Merrie Ringel Morris, a principal researcher and research manager in the Ability group. Both are prolific HCI researchers who are seeking, from different angles, to augment the capability of technologies and improve the experiences people have with them.

On today’s podcast, we get to hang out with both Dr. Hinckley and Dr. Morris as they talk about life at the intersection of hardware, software and human potential, discuss how computers can enhance human lives, especially in some of the most marginalized populations, and share their unique approaches to designing and building technologies that really work for people and for society. That and much more on this episode of the Microsoft Research Podcast.

Host: So, I’m excited to be hosting two guests in the booth today, both working under the big umbrella of HCI, or Human-Computer Interaction. I’m here with Dr. Ken Hinckley, from MSR’s EPIC Group, which stands for Extended Perception, Interaction and Cognition. And Dr. Meredith Ringel Morris, aka Merrie, of the Ability Group. Ken and Merrie, welcome to the podcast!

Merrie Morris: Thank you.

Ken Hinckley: Thank you.

Host: So, I’ve just given our listeners about the shortest possible description of the work you do. So, before we dive deeper into your work, let’s start with a more personal take on what gets you up in the morning. Tell us in broad strokes about the research you do, why you do it and where you situate it. Merrie, why don’t you give us a start here?

Merrie Morris: Sure! One of the reasons that I’m excited about working in the space of HCI every day is that I think computers are cool, but what I think is most interesting about computers is how people use computers and how computers can help people and how computers can enhance our lives and capabilities. And so that’s why I think HCI is a really important area to be working in. When I did my undergraduate degree in computer science at Brown, they actually didn’t have any faculty in HCI and the first time I heard of HCI was when Bill Buxton, who is actually a researcher now at Microsoft, came and gave a guest lecture in my computer graphics course. And so, we had that one lecture where Bill talked about HCI, and when I heard that I was like, wow, this is the most interesting thing I’ve heard about in all my courses, so…

Host: Awesome.

Merrie Morris: Then I, you know, found a way to work on that topic area more in graduate school.

Host: Ken, what about you? What gets you up in the morning?

Ken Hinckley: Sure. Well, I guess the first answer is my dog and my kids, often earlier than I would prefer, but beyond that, I was interested in technology at a young age, but it was not for the sake of computers themselves, but like what is this technology actually good for? And I remember going through a phase early, maybe sophomore year in college, I was very interested in artificial intelligence and so forth, but I started thinking about it and I was like well, why would you actually want to do that? Like why would you want to make a computer that’s like a human? It’s really more about the humans and how can we use technology to augment their thinking and their creativity to do amazing things? So that’s what excites me about being at Microsoft Research and having research that’s getting out in terms of devices and experiences where people can do amazing things with the tools and the techniques that we’re building.

Host: I want to set the stage for the research we’ll be talking about by getting a better understanding of the research groups your work comes out of. Ken, you work in the EPIC group. We referred to what that stands for. And you called your interdisciplinary team a “pastiche” of different expertise. Since we’re using French words, tell us more about EPIC and its raison d’etre.

Ken Hinckley: Yeah, I should have been cautious in my use of French words. I did study French in high school, but I’ve forgotten all of it and my accent is terrible. But the core sort of mission of EPIC is to innovate at the nexus of hardware, software and human potential, so there’s really sort of three pillars there. There’s the hardware, so we do sensors and devices and kind of new form factors. We explore that like in terms of new technologies coming down the pike. We also look in terms of the software, like what can we do in terms of new experiences? And actually, Merrie mentioned Bill Buxton earlier. He’s actually one of the people on the team. And so we think a lot about what is happening at this, you know, as new technologies come forward like how do they complement one another, how do they work together, how do we have not just a single experience, but experience that sort of flows across multiple devices and everything, all the technology you are touching in your day? It’s still this incredibly fragmented world and we’re trying to figure out interesting ways that we can sense or provide experiences that give you this flow across all the technology you’re touching in your day. And then we also study, in sort of the human side, we study the foundations of perception and motor behavior and like these very fine-grained things about how people actually perceive the world. And it turns out it’s, you know, there’s some interesting things that go on there in terms of how you form sensory impressions when you’re hearing something and seeing something and feeling something all at the same time, or maybe not, in a virtual environment, for example. So, there’s tricks you can play sometimes. So, we sort of explore that as well.

Host: I want to drill in just a little before we go on to Merrie. I’ve looked at the scope of your team. It’s a really diverse group of people in terms of where they come from, academically and in their research interests. Talk a little bit about the kinds of people on your team and that “pastiche” that you referred to.

Ken Hinckley: Sure. So, we have people, you know, people like Bill Buxton who come more from the design world. He started in electronic music, right? So, that was his motivation for technology back in the 1970s. We have people like Mar Gonzalez Franco who comes from a neuroscience background and gives us some of that you know real interesting insight in terms of how the human brain works. We have people from computer vision like Eyal Ofek, who’s leading a lot of work in virtual environments and haptic perceptions. So, I mean it really kind of spans all those areas. We also have people from information visualization, people from multiple countries, you know, Korea, France, Switzerland, so it really is like kind of people from all over the world and we really value that diversity of perspective that it brings in terms of what we bring to our work.

Host: Right. So, Merrie, last May, we talked about what was then the newly minted Ability group on this podcast. It’s episode 25 in the archives! But as it’s been about a year now since that came out, let’s review. Tell us what’s been going on over the last 12 months, and what the Ability group is up to now.

Merrie Morris: Absolutely. So, as you mentioned, we’ve already talked about the Ability team on another podcast and our primary mission is to create technologies that augment the capabilities of people with disabilities, whether that’s a long-term disability, a temporary disability or even a situational disability. And we have a lot of new projects that we’ve started in the past year. A big one is the Ability Initiative which is a collaboration that spans not only many groups within Microsoft Research, but also externally, to a partnership with UT Austin. And this project is focused on how we can develop improved AI systems that can combine computer vision and natural language generation to automatically caption photographs to be described to people who are blind or, in the situational disability context, maybe someone who is sighted but is using an audio-only interface to their computer like a smart speaker. And so, we’re trying to tackle problems like, how do we include additional kinds of detail in these captions that are missing from today’s systems? And also, the really important problem of how do we not only reduce error, but how do we convey that error to the end-user in a meaningful and an understandable way? So that’s a really big new initiative that we’ve been working on.

Host: Mm-hmm.

Merrie Morris: Another big initiative that we’ve just started up with some new personnel is starting to think about sign language technologies. So, there’s a new post-doc, Danielle Bragg, who joined the Microsoft Research New England lab, and her background and experience is in sign language technologies like sign language dictionary development for people who are deaf and hard of hearing. So, actually, together with Danielle, and also Microsoft’s new AI for Accessibility funding program, we organized a workshop here at Microsoft Research this winter that brought together experts in deaf studies, sign language, linguistics, computer vision, machine translation… to have a summit to discuss kind of the state-of-the-art in sign language technologies, what are the challenge areas, what is a plan of action going forward, how can the Microsoft AI for Accessibility initiative accelerate the state-of-the-art in this area? So, that’s a really exciting new area of work that we’re interested in.

Host: Give me an example of where sign language technology plays – I’m actually trying to envision. How would it manifest for a person who had hearing disabilities?

Merrie Morris: Right now, the state-of-the-art is very far from realizing automatic translation between English and sign language. It’s a very complicated computer vision problem…

Host: Yeah.

Merrie Morris: …and a very complicated machine translation problem because, since sign language doesn’t have a written form, there are no existing paired corpora. So, for example, you know, many famous works of literature or newspaper articles, you know, exist online in say both English and French, since we’re talking about French today.

Host: Right.

Merrie Morris: And so, that gives you data from which you can begin to learn a translation. But since there’s no written data set of American Sign Language, that becomes very difficult to do. And so how you might generate training data is one of the challenges that we discussed at the workshop. But, if you could have a computer vision-based system, a machine translation system, that could recognize sign language and provide English captioning, that could be an important communication aid for many people. And then of course vice-versa, in the other direction, if you could go from English to generate a realistic, animated avatar that signed that same content…

Host: Ooh, interesting.

Merrie Morris: …that would actually be really important for many people because many people who speak sign language, sign language is their primary language. English is a second language for them. And so, their English literacy skills are often lower than people who learned English as a first language. So just English captions on videos, or English language content on the web, is often not very accessible to many people who speak sign language and signing avatar translations of that content would open up web accessibility. But developing these avatars to sign in a realistic fashion, and signing involves not only the hands, but facial expressions… so actually, generating that nuance in an avatar is a very challenging research problem.

Host: I’ve had a sizable number of your colleagues on the show. All of them have stressed the importance of understanding humans before designing technology for them. And to me this makes sense. You are in human computer interaction. Human being the first word there, but it’s remarkable how often it seems like nobody asked when you’re using a technology. So, give us your take on this user-first or user-centric approach that comes out of this umbrella of HCI, but particularly within your groups, why it’s important to get it right and how you go about that? Ken, why don’t you start?

Ken Hinckley: Sure. I do subscribe to the user-centric approach to interaction design and the human-centric, maybe, is the more particular word, because you really have to understand people before you can design technology that fills this impedance mismatch between what you can do with technology and sort of the level at which people operate and think about concepts. However, in my own research, I often take sort of a – I call it a pincer maneuver strategy, right? So, you think about, you know, what are the fundamental human capabilities, how do people perceive the world, how do you interact socially with other people? But then you can sort of couple it with, you see things coming down-the-road in terms of oh, there’s these new technologies coming out. There’s new sensors coming down the pike. And so, what I often try to do is I try to match these trends that I see converging. And when you can find something where, you know, say a new sensor technology meets some need that people have in terms of how they are interacting with the devices and you can sort of change the equation in terms of making that more natural or making the technologies sink into the background so you don’t even have to pay attention to it, but it just seems like it does the right thing.

Host: Right.

Ken Hinckley: I like to do that. So, I really do play both sides of the fence where I study people and I try to understand, as deeply as possible, what’s going on there. But I also study the technologies. So sometimes I do have work where it’s more technically motivated first. So, it’s a bit contrary in that sense, but it always ends up meeting like these real-world problems and real-world abilities that people have in trying to make technology as transparent as possible.

Host: Merrie, how about you?

Merrie Morris: I can give, actually, a great example from the sign language workshop that we were just talking about of the importance of this user-centered design. So, one of the important components of this workshop was having many people who are, themselves, deaf, who are sign language communicators, attend and participate in this ideation and this session. And one of the themes that came up several times that speaks to the importance of user-centered design was the example of sign language gloves. So, there are quite a few examples of very well-intentioned technologists who are not, themselves, signers and who didn’t necessarily follow a user-centered process who have invented and reinvented gloves that someone would wear that could measure their finger movements and then produce a translation into English of certain signs. But, of course, this approach doesn’t take into account the fact that sign involves many aspects of the body besides just the hands, for example. So that’s one pitfall. And then, of course, also from a more socio-cultural perspective, another pitfall of that approach is it thinks about sign language technology from the perspective of someone who’s hearing. So, it’s placing the burden of wearing this very awkward glove technology on the person who is deaf. Whereas, for example, maybe a different perspective would be people who are hearing should wear augmented reality glasses that would show them the captions and the people who are deaf wouldn’t be burdened by wearing any additional technology. And so I think that kind of perspective also is something that you can only gain by a really inclusive design process that maybe goes beyond just involving people as interview or user-study participants but also actually having a participatory design process that involves people from the target communities directly as co-creators of technology.

Host: Yeah. Ken, did you have something to add?

Ken Hinckley: Yeah. So, maybe one more thing I’d add in terms of my own perspective. So, I sort of mentioned how it’s really important to understand people and sort of what goes on, but I think one of the interesting or unique attributes of that is, I think we’re trained, here in Microsoft Research in terms of understanding people, is we sort of become these acute observers of “the seen but the unnoticed.” So, there’s lots of these sort of things that we just take for granted in terms of interpersonal interactions like, if you are at a dinner party and you are talking to a group of people, like well you are probably forming a small circle. And probably there’s five or less of you in the circle, right? And there’s a certain distance that you stand, and you’re facing each other in certain characteristic ways. None of this is something that we ever notice. Another example I like to use from my own work in sort of working on tablets and pen and touch interactions is actually just looking at how people use everyday implements with their hands. In my talks, I’ll often ask people like, okay, well which hand do you write with? And of course, you know, 75% of the audience will raise their right hand because they are right-handed and of course they will say well actually you are wrong because you use both hands to write because first you got to grab a piece of paper and then you orient it with your left hand. Then the right hand does the writing. So, there’s examples like that in terms of what people actually do with their behavior that, because we just take them for granted and it’s just part of our everyday experience, you don’t actually notice them. But as a technologist, you have to notice them and say like, oh, if we can actually design our interactions to be more like that, it would be natural. The technology would be transparent.

(music plays)

Host: Well, let’s talk about research now. Ken, I’ll start with you and maybe the best approach is to talk about the research problems you’re tackling around devices and form factors and modalities of interaction, and how that research is playing out in academic paper and project form, especially with some of the new work you’re doing with posture-aware interfaces. So, talk kind of big picture about some of the projects and papers that you’re doing.

Ken Hinckley: Yeah, so, in terms of the global problem we’re trying to solve, we’ve been thinking for a long time about these behaviors that are seen but unnoticed. Everyone goes through the day using maybe their mobile device or, if you have a tablet, you interact with that in certain ways. And there’s always sort of these little irritations that maybe you don’t really notice them, but over time they build up, and using technology, we can mediate some of these. So, to go back you know almost 20 years now, we were looking at mobile devices and you had to go through certain settings. You could go in the settings and say like, oh, I want it to be in portrait mode now if you are taking some notes, or maybe you needed to look at a chart, you had to go back into settings and say, oh, I want it to be in landscape orientation. And so we started looking at oh, you know, if we had some sensors, like a little accelerometer that was on the device, you know, maybe we could actually just sense which way you were holding the device and the screen could automatically rotate itself. I mean, so that was something you could publish as a, you know, award-winning paper in the year 2000, and now it’s like sort of in everyday use, right? So, taking that same perspective now to, you know, the modern era of interacting with tablets, how do you actually understand how the person’s using the device and what they are trying to do with it? So, tablets have this interesting attribute where you can use them directly on your desktop. Maybe you are doing some very focused work, you can kind of be leaning over it and writing with your pen, maybe marking up some document. You know, or maybe you are just kicked back on your couch and watching like YouTube videos of like cats chasing laser pointers, right? So, kind of in that whole spectrum. But obviously, it’s particular to the situation, right? So, if you are kicking back on your couch, you’re probably not trying to mark up your Excel spreadsheet.
And likewise, if you are kind of hunched over your desk, you are probably not watching the cat video. So, by sensing those contexts and actually adapting the behavior of the tablet so it understands how you’re holding it, which hand you’re grasping it with. Are you holding it on the bottom? Are you holding it on the left or the right? Are you holding the pen in your hand? Are you reaching for the screen? By being able to sense all those attributes, we can actually simplify the experience and sort of bring things that are useful to you at those moments.

Host: Well, and it’s interesting that you referred to that toggle between landscape and portrait mode, which is actually annoying to me. I sometimes lock my phone because I don’t want it to go there. I want it to stay there! But anyway, that’s a side note…

Ken Hinckley: It’s a side note but it’s funny because we actually noticed that when we first built that technique and we kind of knew that was a problem with it, but actually, in our most recent research now, we can also sense how you’re gripping the phone and so you can understand that if you lay down and your grip on the device has not changed, we know that you didn’t intend to reorient it, so we can actually now suppress screen rotation finally, twenty years later!
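Editor's note: the two ideas Ken describes here (choosing screen orientation from an accelerometer, then suppressing the rotation when the grip has not changed) can be illustrated with a minimal Python sketch. This is not the team's implementation. The function names, the 15-degree dead zone, the `min_tilt` threshold, and the boolean `grip_changed` signal are all assumptions made for illustration; a real device would derive the grip signal from its own touch or grip sensors.

```python
import math


def orientation_from_gravity(ax, ay, current, dead_zone_deg=15.0, min_tilt=1.0):
    """Guess 'portrait' or 'landscape' from accelerometer x/y readings (m/s^2).

    Hypothetical helper: ax and ay are gravity components along the screen's
    short and long edges. 'current' is returned whenever the reading is
    ambiguous, which gives simple hysteresis.
    """
    # Device lying nearly flat: gravity barely projects onto the screen plane.
    if math.hypot(ax, ay) < min_tilt:
        return current
    # Angle of gravity within the screen plane; 0 degrees means upright portrait.
    angle = abs(math.degrees(math.atan2(ax, ay)))
    # Fold upside-down portrait onto portrait so the angle lies in 0..90.
    if angle > 90.0:
        angle = 180.0 - angle
    if angle < 45.0 - dead_zone_deg:
        return "portrait"
    if angle > 45.0 + dead_zone_deg:
        return "landscape"
    return current  # inside the dead zone: keep what we have


def next_orientation(ax, ay, current, grip_changed):
    """Rotate only if gravity suggests it AND the grip changed (assumed signal)."""
    candidate = orientation_from_gravity(ax, ay, current)
    if candidate != current and not grip_changed:
        # Same grip but a different gravity direction: the user probably lay
        # down with the device, so the rotation is suppressed.
        return current
    return candidate
```

In this sketch, lying down on your side changes the gravity reading but not the grip, so `next_orientation` keeps the current orientation, which matches the behavior the host asks for in the next turn.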

Host: That’s exactly the problem, because I would lie down and want to read it sideways and then it would go like that with my head! Um… Well, Merrie, let’s talk about the work you’re doing in the accessibility space, particularly for people with low or no vision. We might expect to see this manifested in real life, I’d like to say IRL, or on the web or other screen applications. That’s just sort of how I think about it. But you’ve taken it even further to enhance accessibility in emerging, virtual reality, or VR, technologies, which is fascinating to me. So, give us the big-picture look at the papers and projects you’ve been working on and then tell us about Seeing VR, which is both the paper and the project as I understand it.

Merrie Morris: I think this project initiative around accessible virtual reality is trying to think okay, virtual reality isn’t that commonly in use right now, but in ten years it’s going to be a big deal. And we want to think about accessibility from the beginning…

Host: Nice…

Merrie Morris: …so that we can design these systems right the first time instead of trying to have post hoc accessibility fixes later. So, part of the aim of this work, beyond the specific prototypes, like the cane-troller or Seeing VR, is really to start a conversation and just raise people’s awareness and be provocative and have people think oh, we need to make VR accessible. So, if people only remember one thing, that’s what I want them to remember, separate from the details of the specific projects. But yes, so last year, we presented, at the CHI Conference, the cane-troller which was a novel haptic device that allows people who are blind, who are white cane users IRL, to transfer their real-life skills into the virtual environment…

Host: Wow.

Merrie Morris: …in order to navigate and actually physically walk around a VR environment with haptic sensations that mimic the sensations they’d get from a white cane. And then that project was led by our fabulous intern, Yuhang Zhao from Cornell Tech, and also with strong contributions from our intern Cindy Bennett from the University of Washington, who is, herself, a white cane user. So, she had some really important design insights for that project. And so Yuhang returned last summer for a second internship and she wanted to extend the accessibility experience in VR based on her passion around interfaces for people with low vision. So, low vision affects more than 200 million people worldwide and it refers to people who are not fully blind, but who have a visual condition that can’t be corrected by glasses or contact lenses. And so, working together with Yuhang, and also several people from Ken’s team like Eyal Ofek and Mike Sinclair, as well as Andy Wilson and Christian Holz from MSR AI, and also Ed Cutrell from my team, this was really a big effort, we developed Seeing VR. And so, Seeing VR is a toolkit of fourteen different options that can be combined in different ways to make VR more accessible. And we went with this toolkit approach because low vision encompasses a wide range of abilities. And so, we wanted people to be able to select and combine tools that best met their own visual needs. And the great thing about Seeing VR is that most of these tools can actually be applied post hoc to any Unity VR application. So, Unity is the most popular language for developing VR. And so even if the application developer hadn’t thought about accessibility beforehand, we can still apply these tools.
And so for example, the tools do things like increasing the magnification of the VR scene, changing the contrast, adding edge detection around things so that you can more easily tell the borders of different objects, being able to point to objects and hear them described out loud to you. Special tools for helping you measure the depth of different objects in case you have challenges with depth of perception. And so we’ll be presenting Seeing VR at CHI this year, and we’ll not only be presenting the paper, but we’ll also present a demonstration, so people who want to actually come and try on the VR headset and experience the tools directly will be able to do so.

Host: Well that’s a perfect segue into talking about CHI, the conference. It’s a big conference for HCI, maybe the biggest?

Merrie Morris: Yes, it is!

Host: And Microsoft Research typically has a really big presence there. So, talk about the conference itself and why it’s a big deal, and then give us an overview of what MSR is bringing to the party in Scotland this year in the form of people, papers and presentations.

Ken Hinckley: Sure. So, in terms of you know some of the papers that people on my own team touched, yeah, we have a couple of the Honorable Mention awards which just sort of recognizes the top 5% of papers appearing at the conference. So, one in particular, we talked about the, you know, sensing posture awareness in tablets. Another one that’s a really fun effort is using ink as a way to think with data, right? So, “think by inking” kind of thing. So if you have some sort of visualization that you are looking at in your screen, what if you can just sort of mark up some data points and, simply by marking it up as you are thinking about and just sort of glancing at these visualizations? And so then you take sort of these simple marks that people would do anyway in terms of using you know, very pen and paper like behaviors, but now translated to a digital context on your tablet for example, I can just mark something up with my pen and then I can use those marks as ways to actually link the data points back to the underlying data. Can I actually split my data set just by drawing a line across it as opposed to doing some complicated formula? And that’s our Active Ink project. So, instead of just having you know, sort of dead ink on paper, you can actually imbue it with this digital life that actually, you know, just naturally how people think, but then you can just start going deeper and deeper in terms of it actually touches live data that’s underneath.
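Editor's note: the idea of splitting a data set "just by drawing a line across it" can be illustrated with a small geometric sketch in Python. This is not the Active Ink implementation. It assumes the pen stroke has already been simplified to a straight segment from `start` to `end`, that points are `(x, y)` tuples in a y-up coordinate system, and the function name `split_by_stroke` is hypothetical.

```python
def split_by_stroke(points, start, end):
    """Partition 2D points by which side of a directed stroke they fall on.

    Returns (left, right). With y pointing up, 'left' holds points to the
    left of the direction of travel from start to end. Points exactly on the
    line are placed in 'right'.
    """
    (x1, y1), (x2, y2) = start, end
    left, right = [], []
    for (px, py) in points:
        # Sign of the 2D cross product tells us the side of the line.
        cross = (x2 - x1) * (py - y1) - (y2 - y1) * (px - x1)
        if cross > 0:
            left.append((px, py))
        else:
            right.append((px, py))
    return left, right
```

For example, a vertical stroke drawn upward through a scatter plot would put every point with a smaller x value in the first group and the rest in the second, with no formula typed by the user.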

Host: Go ahead, Merrie.

Merrie Morris: What I wanted to point out was the paper at CHI from Microsoft Research that I find most exciting this year is one of the Best Paper award winners, the Guidelines for Human-AI Interaction work, which was led by Saleema Amershi, who is a researcher here in the Redmond lab. I’m really glad this paper won the Best Paper award because it has a lot of immediate practical value for people in the HCI and AI community. So, if you are going to read one paper from CHI this year, listening audience, I’d suggest you read this one. It has a list of eighteen very concrete, actionable guidelines for developing AI systems that are meant to be used by real people…

Host: Yeah.

Merrie Morris: …in a way that’s pleasing to the end-user and it has lots of good examples and counter examples of applying each of these guidelines. So, I think it’s a really great read and I’m glad it was recognized with the award.

Host: Any other highlights that you want to talk about? I mean, you’ve got colleagues… Steven Drucker has a couple of papers I think in there and Cecily Morrison who is out of the UK lab and…

Ken Hinckley: I mean, there’s tons of great stuff. In a sense, it’s an embarrassment of riches because even myself being at Microsoft Research, I haven’t had a chance to look at all this work, um, and I haven’t looked at the HCI-AI paper that Merrie just mentioned, so now it’s like I’ve learned something from this podcast, too. I need to go read this. Yeah, in terms of other areas that we’re addressing, there’s actually quite a bit of work on virtual environments coming from Microsoft Research, so just looking at different ways that we can manipulate things in VR in very unique and sort of crazy and creative ways. So, we have work exploring that. We also have work exploring in terms of if you are in a virtual environment and you actually want to reach out and touch something, your hand is just in empty space, so how do you give the illusion that there’s actually objects you can interact with and give you sort of dexterous ways to manipulate them? So, we’ve been doing a series of technologies around how to simulate the, you know, grasping motions for example. So, you can actually feel an object and you can squeeze it and it feels like it has compliance. We have other work looking at personal productivity when you have wearable devices. And just numerous other topics. Even in terms of like oh, can you use the language of graphic novels as sort of a way to present visualizations to people? So, you sort of have these data comics or “data toons” we call them. Just exploring that as sort of another language for interacting with data. So, there’s all kinds of great stuff going on.

Merrie Morris: So, you mentioned Cecily Morrison. So, Cecily and her collaborators from the UK lab will be presenting a demo at CHI this year of Code Jumper, which I believe you interviewed her about in another podcast…

Host: And it wasn’t called Code Jumper then…

Merrie Morris: …it was Project Torino then. It’s been renamed. But that’s a tangible toolkit for children who are blind that they can use to actually learn programming skills themselves. And so, that will be on demo at CHI for people to try out. I think, besides all the papers, another thing to point out is that many people from Microsoft Research are involved in organizing workshops. And so, one that I’ll call out is there’s a workshop, it’s called The Workshop on Hacking Blind Navigation, that discusses the challenges of developing indoor and outdoor navigation technology for people who are visually impaired. And Ed Cutrell, from the Ability team, is one of the co-organizers of that workshop. But that workshop is going to feature a panel discussion that includes Melanie Kneisel, who is an engineer on the Soundscape team here. And so, as you may know, Soundscape is a Microsoft Research app that uses spatial audio to provide a unique outdoor navigation experience for people who are blind. So, Melanie will be speaking at that workshop and then the workshop will also be featuring some research from an intern, Annie Ross, from the University of Washington, who was co-advised between the Soundscape team and the Ability team. And what Annie was looking at was a very different take on accessible virtual reality. So, she was thinking about designing an audio-only virtual reality experience that would allow a Soundscape user to practice walking a route in the comfort of their own home or office before they actually walked that route in the real world. And it’s a very challenging design problem and so she’ll be discussing that at the workshop also.

(music plays)

Host: So, it’s about now that I always ask the infamous “what keeps you up at night?” question. Your work has direct human impact, perhaps more directly than other research that we might encounter. And often with people who are marginalized when it comes to technology. Merrie, I’d like you to start this one since I know you’re involved in work around AI ethics and some of the work that’s going on in the broader community about that. So, what are your biggest concerns right now and how and with whom are you working to address them?

Merrie Morris: Yes, so there’s been increased awareness, both in the industry and in the public consciousness, over the past year or so about some of the fairness and ethics challenges of AI systems. But a lot of that discussion, rightfully so, has focused on issues around gender and race. But one thing that I’ve noticed in these conversations is that there hasn’t been any discussion around these issues as far as how they might impact people with disabilities.

Host: Right.

Merrie Morris: And one goal of mine for the coming year is to really elevate awareness and conversation around this issue. So, for example, are people with disabilities being included in training data? Because if they are not, AI systems are only as good as the data. So, as an example, if you think about smart speaker systems like an Alexa or a Siri or a Cortana, do they recognize speech input from someone with a speech disability? So, if you have dysarthria or a stutter or some other atypical speech pattern, the answer in general is no, because people with those speech patterns weren’t included in the training data set. And so that locks them out of access to these AI systems.

Host: Well you might want to add children. There’s a video about a little girl asking Alexa to play Baby Shark, and never, ever gets it until the mom comes in and says it in the right way.

Merrie Morris: Well, absolutely! And that goes back to a longer historical issue. The original speech recognition systems were trained only on men, so they didn’t even recognize women’s voices at all, if you want to look back several decades. So yes, there are many groups. People with accents, children, that all come to mind. And so, that’s a big issue. I think another example of an ethical issue, particularly for sensitive populations like people with disabilities, is expectation setting around the near-term capabilities of AI systems. I think a lot of the language that’s used by researchers and marketing people when we talk about AI, leads people to overestimate the capabilities of AI systems.

Host: Right.

Merrie Morris: I mean, even just the word artificial intelligence suggests a human-like semantic understanding which, really, today’s AI systems are all about deep learning, it’s all just statistical pattern recognition with no semantics at all. And so, I think, if you want to go back to the sign language example, I feel like every month I see some kind of headline that says, you know, researchers have invented a sign language recognition system. But then if you actually read between the lines, you know, they’ve invented a system that can recognize fifty isolated signs, which is nowhere near recognizing a whole language.

Host: Right.

Merrie Morris: And I think for people whose lives may be fundamentally changed by these kinds of technologies, being careful about how we communicate the capabilities of AI to those audiences is really important. So, in terms of going forward with this issue of raising awareness around issues around disability and AI fairness, I’ve been collaborating with Megan Lawrence who is from the office of the Chief Accessibility Officer, as well as Kate Crawford from the New York City lab. We’ve been organizing some external events to raise awareness around this issue. Megan led a panel discussion at the recent CSUN conference, and the three of us co-hosted a workshop last month at the AI Now Institute, so I think you’ll be hearing more about this topic in the future.

Host: Ken, aside from having a French test that you didn’t study for, is there anything that keeps you up at night? Anything that we should be concerned about?

Ken Hinckley: Yeah, my research, it doesn’t keep me up at night. It’s kind of, I’m really looking for what are some unique ways that we can add to human potential? So, for me, it’s a very satisfying and very exciting topic to think about. So, it doesn’t keep me awake, it kind of gets me excited at night. Or maybe it keeps me up because I’m just… my mind keeps buzzing with creative new ways that we can use technology to kind of amplify human thought, to make it so people can express new and better, or maybe more expressive, concepts than they would, you know, with pen and paper and physical materials. I often like to return to, you know, Kranzberg’s laws as sort of a way to think about this, and particularly the first one where, um, you know, basically, the gist of it is that technology is not inherently good, nor is it bad, but nor is it neutral, right? So, it just kind of exists out there and it’s often, you know, human desires and needs and those shape how technologies get applied and we need to think really carefully about that when we’re bringing something new into the world. So yeah, so I don’t really like the approach of like oh, let’s just build something and kind of hope it works out. You really need to think carefully through like what is the human impact and is this actually going to empower people and make it so they can do new things?

Host: I love stories. And one I’d like to hear is how you two know each other. And another is how you each ended up at Microsoft Research. So, Ken, tell us your story, and then we’ll have Merrie talk about how she, in her own words, “wormed her way” in here.

Ken Hinckley: Yeah, so I was a graduate student at the University of Virginia and I was a computer science student but my desk was actually situated in the neurosurgery department and most of my funding was through some grants that the neurosurgery department had because they had these really interesting medical image data sets of, you know, MRI scans of patients who were coming in, and the lab that I was working with were actually working on ways to create new tools for neurosurgeons so they could actually access these things directly and look at them. So, they were looking for tools to visualize and plan surgeries. So that was my entrée into the world of HCI and research. So, George Robertson had been sort of one of the pioneers of 3D interaction at Xerox PARC. He had moved over to Microsoft Research. He was starting up a team here. And so, he was like, hey, Ken, do you want to come work with us on these things? So, we started, you know, doing that, and just over time, you know, because there’s so many great things going on here in Microsoft Research, there’s lots of opportunities to jump to different topics. So, I started working with Mike Sinclair and he was saying like hey, look, I got this new accelerometer sensor and maybe you should try putting that on your phone stuff that you are looking at! And so that’s where this automatic screen rotation came from, for example. So that’s sort of how I ended up here and on this track of exploring devices and so forth.

Merrie Morris: So that’s actually a great segue into my story. The automatic screen rotation. So that’s how I met Ken and I don’t even know if Ken knows this story because it was very important to me at the time. So, I was a student at Stanford in the computer science department, doing my PhD and it was my first year of school, and Ken was an invited speaker for the seminar series. And Ken spoke about this work that he and Mike had done on attaching the sensors to the phone with the rotation. And it was an amazing talk and amazing project, but also it was the first time I had heard of Microsoft Research because I was a brand-new grad student. And so, I was really impressed. And after that talk I thought oh, I really want to get an internship at Microsoft Research. But it’s a very competitive internship, and usually they want students who are further along in their career, who already have a track record of publication. So, I wasn’t really a competitive applicant on paper. And so, then a couple months later, I attended the HCIC workshop and at this workshop I met a different Microsoft employee who invited me to come interview for a more product-oriented internship. And so, they were going to fly me up from California to Seattle. So, I took advantage of the fact I was going to be in Seattle, and I mailed Ken, and I said, Ken, you know, I’m a student at Stanford. I heard your talk. It turns out that I’m going to be in the Seattle area next week. While I’m there, could I come talk to you and your group about internship opportunities? And so I kind of snuck my way in the back door to having this on-site interview at the time and I did get the internship and I ended up actually spending my time that summer working with Eric Horvitz and Susan Dumais, who were part of the same team, but it was because of seeing Ken’s talk that I managed to “worm my way” in to Microsoft Research!

Host: And then you went back to finish your studies and then came back here? How did that go?

Merrie Morris: Right. Because I had such a great experience at that internship, I was very focused on becoming a professor of computer science, but I had a great experience at my Microsoft Research internship and so Eric and Sue and all of the people that I worked with really encouraged me to come and also interview with Microsoft Research, and so that’s how I ended up back here.

Host: Ken, what do you remember of that?

Ken Hinckley: That’s fantastic. I do remember talking with Merrie after I gave this talk. Her advisor, Terry Winograd, had invited me down to give a talk, another sort of wonderful person in our field. But yeah, but I was definitely impressed by Merrie. She asked me a lot of great questions after the talk and, you know so, sometimes you get students coming up afterwards but then they don’t really say anything that kind of gets to the next level. So, I remember Merrie asking a lot of really good questions. And so, I was very interested in bringing her in as an intern, but as it turned out, I didn’t actually have an intern slot to offer anybody that summer. So, it’s great that she sort of managed to work her way into actually getting in the building. And I think Sue was the one who had the internship to offer. And Sue Dumais, I don’t know if she’s been on these podcasts or not, but she’s also wonderful and sort of one of the people here at Microsoft Research that I look up to the most. So, it was even better than the opportunity with me, but working with Sue is wonderful and Merrie has just continued to blaze new trails in just every topic that she touches, so, it’s been great.

Host: Merrie, what’s fascinating about what you just said – and you are not the first one who has told a story about a kind of “bold move” – it’s like take a risk. Why not? Right?

Merrie Morris: Absolutely. That’s how I “wormed my way” into grad school, too! I had heard Bill Buxton’s lecture about HCI. We didn’t have HCI at Brown. I did some searching about HCI on the internet and came across one of Terry Winograd’s projects at Stanford. And so, just out of the blue, I emailed him and said, I read about your project on the internet. I’m an undergraduate in computer science. I really want to get to know about HCI research, can I come spend the summer with you working? And he said yes! And that was how I got involved with HCI research. Did a good job. He offered me to come back for graduate school. But if I hadn’t sent that email, it never would have happened. So, I guess the lesson is be bold and advocate for yourself!

Host: Well, you kind of just answered the last question I’m going to ask you, which is at the end of every podcast, I give my guests a chance to talk to our listeners with some parting thoughts, maybe wisdom, advice and stories about how being bold got you what you wanted. So, here’s your chance to say something you believe is important to researchers who might be interested in making humans and computers have better interactions. Ken, what do you think?

Ken Hinckley: I think for me, in terms of like what I see working well and having good research outcomes, I see lots of people, they try to sort of plan out this grand scheme of like oh, I’m going to do this and this and this in my PhD, and you kind of outline everything to death. But where the really exciting sort of non-linear breakthroughs come in is, you just start walking down this path and, at some point, you’re kind of beyond the area that is where the lights are from the city, right? And you’re kind of out in the wilderness, in the darkness, and you are like, well, I’m not quite sure where to go! But you just start trying things, right? You basically just try ideas, see what works, see what doesn’t work, and then from that, you take the next step. And if you’re kind of willing to just take this approach where, don’t try to plan everything out but be willing to walk off into the darkness in terms of just exploring unknown terrain, that’s where the really interesting things come up.

Host: Merrie, what do you say?

Merrie Morris: I guess my advice would be around just being curious and as part of that, asking questions. I know that for years, I think it was at least five years after I had my PhD before I felt confident enough to ask a question during a talk, instead of just waiting afterwards to talk to the speaker one-on-one, because I was worried like, ah, if I ask this question, will people think I’m stupid or they’ll think I don’t understand. And after I asked questions, you know, I realized that like well, lots of other people had the same question as me. So, not only did I learn new things, but I helped enrich the dialogue at the conference and the conference experience for other people and, a lot of times, asking these questions leads to really good conversations and collaborations and new research ideas afterwards. So, I think my advice is to ask questions.

Host: That’s what research is all about, isn’t it? Ken Hinckley, Merrie Ringel Morris, thank you so much for joining us today. It’s been really fun to have two people who play off each other and have great stories. Thanks for coming on the podcast.

Merrie Morris: Thanks for having us.

Ken Hinckley: Yes, thanks very much.

(music plays)

To learn more about Dr. Ken Hinckley and Dr. Merrie Ringel Morris, and their up-to-the-minute research in human computer interaction, visit Microsoft.com/research
