Microsoft’s AI Transformation, Project Turing and smarter search with Rangan Majumder

Episode 112 | March 25, 2020

Rangan Majumder is the Partner Group Program Manager of Microsoft’s Search and AI, and he has a simple goal: to make the world smarter and more productive. But nobody said simple was easy, so he and his team are working on better – and faster – ways to help you find the information you’re looking for, anywhere you’re looking for it.

Today, Rangan talks about how three big trends have changed the way Microsoft is building – and sharing – AI stacks across product groups. He also tells us about Project Turing, an internal deep learning moonshot that aims to harness the resources of the web and bring the power of deep learning to a search box near you.

Transcript

Rangan Majumder: At the time, deep learning was really impressive in terms of these perception tasks like vision, you know, speech… so we were thinking, like, could it be really good at these other higher level tasks like language? So, that’s when we started Project Turing, and the idea was, what if we could do, like, end-to-end deep learning across the entire web to be able to answer these questions?

Host: You’re listening to the Microsoft Research Podcast, a show that brings you closer to the cutting-edge of technology research and the scientists behind it. I’m your host, Gretchen Huizinga.

Host: Rangan Majumder is the Partner Group Program Manager of Microsoft’s Search and AI, and he has a simple goal: to make the world smarter and more productive. But nobody said simple was easy, so he and his team are working on better – and faster – ways to help you find the information you’re looking for, anywhere you’re looking for it.

Today, Rangan talks about how three big trends have changed the way Microsoft is building – and sharing – AI stacks across product groups. He also tells us about Project Turing, an internal deep learning moonshot that aims to harness the resources of the web and bring the power of deep learning to a search box near you. That and much more on this episode of the Microsoft Research Podcast.

Host: Rangan Majumder, welcome to the podcast.

Rangan Majumder: Thank you. It’s great to be here.

Host: So you’re a different kind of guest here in the booth. You’re a Partner Group Program Manager over in Search and AI at Microsoft. Let’s start by situating your group and its work since you’re not in Microsoft Research per se, but you do a lot of work with the folks here. How and where do you “roll up” as they say?

Rangan Majumder: Yeah, great question. So, as you know, the broader organization is called Microsoft AI and Research, and Microsoft Research is one of the sub-groups there. So another sister team of Microsoft Research is the Bing Search team, and my group is actually Search and AI, which is inside of Bing. So we’re a sister team to Microsoft Research. And it’s really great to be on this team because what we get to do is work closely with Microsoft researchers and then productionize some of their great research efforts…

Host: Yeah.

Rangan Majumder: …put it into production, and then, once we get it to work at scale in Bing, we can actually go take that technology and place it elsewhere, like in Office and Dynamics and other parts of Microsoft.

Host: Right. So I’m getting the visual of those nesting dolls with each part going inside the other. So top big doll is Microsoft AI and Research.

Rangan Majumder: Correct.

Host: And then Microsoft Research is part of that.

Rangan Majumder: Right.

Host: And Bing is part of that.

Rangan Majumder: That’s right.

Host: And then your group, Search and AI, is nested within the Bing group.

Rangan Majumder: That’s correct.

Host: Okay, and who do you all roll up to?

Rangan Majumder: Kevin Scott, who is our CTO…

Host: Okay.

Rangan Majumder: …and also EVP.

Host: Got it. Alright, well let’s talk about what you all do in Search and AI, now that we’ve situated you. You have a delightfully short, but incredibly ambitious mission statement. Tell us what it is and, if you can, what’s your high-level strategy for making it real?

Rangan Majumder: Yeah, so our mission statement is to make the world smarter and more productive. And you’ll notice that our mission statement doesn’t just talk about search because search is obviously the biggest thing that we do, but it’s important to understand what is the underlying user need for why people are searching, and it’s really to learn about something or to get something done, right? So people want to learn a lot about what’s happening with the coronavirus today. So that’s an example of how our technology helps make people smarter so they can know what’s going on in the world. An example of where we’re helping make people productive is something like when, you know, I got my sink clogged, right? So that’s something I just want to learn really quickly like, how do I unclog my sink? So that’s an example of productivity. So, the reason you need to understand the underlying user need versus like how they do it today is, the solutions actually change over time.

Host: Okay.

Rangan Majumder: So, we want to be really close to, what is the user need that people have? And then our technology helps provide that and satisfy that need. As I said, the mission is about making the world smarter and more productive. If we just focused on the users on Bing, we can still have a lot of impact. But if you look at the entire pie of customers from Microsoft, there’s a lot more we can do. So that’s where we’ve been working a lot with Office, taking our AI technology and not just bringing it to Bing, but bringing it to Office. So that’s an example where we increase the number of people we can impact by, like, a billion.

Host: Right.

Rangan Majumder: Because there’s a lot more users using Word. And then, if you think about, like, Azure becoming, you know, the world’s computer. So there’s a lot more impact we could have by bringing our technology into Azure as well.

Host: Well let’s talk about what gets you up in the morning. In your role as a Partner Group Program Manager for Search and AI, do you have a personal mission, or a personal passion, is maybe a better way to put it?

Rangan Majumder: Yeah, well as any program manager, your goal is really to maximize product/market fit, but my personal mission is basically the same as my team’s mission, which is really around making the world smarter and more productive, and if you just look at what’s happening today, which is, people are finding new ways to look for information, right? Like ten years ago was all about search. Like, people just kind of typed in words. But now people want to find stuff more naturally like smart assistants, people just want to ask a question, you know, in an ambient room and get the answer.

Host: Mm-hmm.

Rangan Majumder: People want to be able to take a picture of a flower and say, hey, what is this flower? How do I take care of this? Then the amount of information is changing too, so people aren’t just writing web pages like they were ten years ago. People are now taking pictures, uploading photos, uploading videos. So going back to my example of, you know, how do I unclog a sink? You don’t just want a web page walking through the steps. Sometimes you want a video…

Host: Right.

Rangan Majumder: …that just shows you, hey, here’s how I unclog a sink. So I think there’s just a lot to do in that mission, and something that I feel like we’ll be doing for like easily a decade or more.

Host: You know, as you brought up the flower and taking a picture of it, I’m thinking of this music app that I use, Shazam, where you find out what a song is. I’ve often said I want Shazam for all these different categories. What’s that tree? I don’t even know what it is, but if I take a picture of it could you tell me? Are you guys working on stuff like that too?

Rangan Majumder: Uh, we’ve actually shipped it already! So if you go install the Bing app you can actually go take a… I’ve done this when I moved into my new house, like there were these flowers and I’m like what are these flowers? They look really interesting. I could take a picture of it and it tells you what it is, and then you can find out more information. So plants, dogs, those kinds of things, the Bing Visual Search app does really well so, go install it today and try it out!

Host: Well, much of what we call AI is still work in progress, and there are some fundamentally different ways of doing AI within a company, especially one as large as Microsoft, so give us a snapshot of how product groups have traditionally approached building AI stacks and then tell us some of the big trends that you’ve noted in the science of AI that have enabled a disruption in that approach.

Rangan Majumder: I think this is probably the most exciting thing happening at Microsoft today. So the way we’re doing AI is definitely transforming. If you think about how we used to do AI, maybe five years ago, we would have multiple different product groups doing AI kind of independently, and for the most part didn’t share anything. But there’s three trends that have really been changing that. The first trend is really around transfer learning, which is this concept that, as you train a model on one type of data and one set of tasks, you can actually reuse that model for other tasks and it does, sometimes, even better than it would if you just trained it on that task specifically.

Host: Huh.

Rangan Majumder: The second one that’s happening is this trend with large pre-trained models. I think there’s a couple of these out there, right, like you probably heard about Open AI’s GPT, Google has BERT, Microsoft has its MT-DNN. So you can take these models and just train them on a bunch of data in a self-supervised way, it’s called, make it very large, and then you can actually apply it on lots of other tasks and it just does phenomenal. Just to give you an example, like, let’s say the Search team was about a hundred people and they’re working on various parts of search all the time so what we did is take about ten folks and said, okay, I want you guys to look at these large transformer networks and see what kind of impact could you have. So in just, like, a few months they were able to ship an improvement so large that it was larger than all the other, like, ninety folks, all the work they did, combined. So we were just, like, shocked how important and how impactful this kind of work was.

Host: Right.

Rangan Majumder: So much so that, at first, we thought, well, does that mean we don’t need these other ninety folks? We can just work on these ten folks? But instead we really embraced it and we said well, let’s get all hundred folks working on these large transformer networks. And then, in the end, like we just had a wave of improvements over the last six months of just like improvement after improvement equally as impactful as the one we had before, so this is a really big trend right now in these large pre-trained models.

Host: Okay.

Rangan Majumder: The third trend is really around the culture of Microsoft and how it’s changing. And this really started with Satya when he became CEO. He really has been focused on changing the culture and making it a lot more collaborative. In fact, he’s changed incentive structure in the team, so when you’re actually going through a performance review, it’s not just about, you know, what did you do? But it’s about, how did you use someone else’s work or how did you contribute to someone else’s work? The other, like, person who’s really changed a lot is Kevin Scott, our CTO. So he did a bunch of AI reviews and realized like there’s a lot of teams doing similar stuff, but some teams are a little bit better than others. So why don’t we do this work in a coordinated way? So when you take those three trends together, what we’re doing is, we’re starting to build this coordinated AI stack across Microsoft where we have certain teams saying, look, we are going to build these really large NLP models for the company, not just ourselves, because the problem is, if each team tried to do that, it would be just way too costly, and then, through transfer learning, I can now reuse this model in other parts. So the stack is kind of looking like this: at the very top you have applications like Bing, the different Office apps, you know, Dynamics, Azure Cognitive Services. The layer underneath is a bunch of these pre-trained models. Like we have one called the Turing Neural Language Representation, we’ve got Language Generation, we’ve got these vision models… The layer underneath is these software systems, which can actually run these models really, really fast because the models are very big and they’re very expensive.

Host: Yeah.

Rangan Majumder: So if you, if you run them in a naïve way, it would just, like, take too long and you’d hurt the customer experience so you need to actually do a lot of software optimizations. And then the final layer is around the hardware, so that’s around like CPU, GPUs and we even have our own little effort on chips with FPGAs.

(music plays)

Host: I want to talk a little bit about four big areas you’ve identified as important to progress and innovation in Search and AI and you’ve sort of labeled them web search, question answering, multi-media, and platform. So why are each of these areas important, especially as it relates to the customer experience, and what innovations are you exploring as you seek to improve that experience?

Rangan Majumder: Yeah, so I would say, about five years ago these things seemed pretty different. Like web search, question answering, multi-media and then the platform team would sort of support all those teams. I’ve noticed, and now you’ll see it more and more, that these experiences are very integrated. So if you go to Bing today and you search for, what do alligators eat? You’ll see, at the very top, an answer that says, you know, alligators eat things like fish, turtles, birds… but then you’ll also see an image there, sort of fused in with that answer, because an image actually helps you really get the emotional part. So just reading it is one thing, but humans also need that emotional experience, so by showing that image right next to the answer, it just makes the answer come to life.

Host: Right.

Rangan Majumder: So that’s one way where these things are kind of related. Like the experience, putting them all together, makes it much better for the customer…

Host: Right.

Rangan Majumder: …but also the technology stacks are becoming very similar too, especially with deep learning. So with deep learning, it’s mostly operating on vectors. So the first step in all of these systems, whether it’s question answering, web search and multi-media, is really taking this corpus of information and converting it to vectors using an encoder. So that part is pretty different for each one, but then, once you have this corpus of vectors, the rest of the stack is very similar. Like the first thing you do when a query comes in is, you do a vector search to say, all right, what are the most similar vectors here? And then you run a cascade of different deep learning models, and each one gets heavier and a little bit more costly, and that’s what’s been super interesting, where, before, each team had its own very different stack, but with deep learning and everything just betting on these vectors, there’s just a few services I need to build really, really well. One is this inference service which is, you know, given some content, vectorize it really quick. The other one is this vector search service which is, given a set of vectors how do I search them extremely fast?
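
The flow described above (encode the corpus, run a vector search for the query, then pass the shortlist through progressively heavier rankers) can be sketched in a few lines of Python. This is a toy illustration only, not Bing's implementation: `encode` is a hypothetical stand-in for a learned encoder, the search is brute force rather than an approximate index, and the function names and the single `word_overlap` stage are assumptions made for the example.

```python
import math


def encode(text):
    """Hypothetical stand-in for a learned encoder: a unit-length
    letter-count vector. A real system would use a neural model."""
    counts = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            counts[ord(ch) - ord("a")] += 1.0
    norm = math.sqrt(sum(c * c for c in counts)) or 1.0
    return [c / norm for c in counts]


def cosine(u, v):
    """Dot product; equals cosine similarity when both vectors are unit length."""
    return sum(a * b for a, b in zip(u, v))


def vector_search(query_vec, corpus_vecs, k):
    """Brute-force search: indices of the k corpus vectors most similar to the query."""
    order = sorted(
        range(len(corpus_vecs)),
        key=lambda i: cosine(query_vec, corpus_vecs[i]),
        reverse=True,
    )
    return order[:k]


def cascade(query, candidates, stages):
    """Run (scoring_fn, keep) stages in order, cheapest first.
    Each stage re-ranks the survivors and keeps only the top `keep`."""
    for score, keep in stages:
        ranked = sorted(candidates, key=lambda doc: score(query, doc), reverse=True)
        candidates = ranked[:keep]
    return candidates


def word_overlap(query, doc):
    """Toy ranker standing in for a heavier model: count of shared words."""
    return len(set(query.lower().split()) & set(doc.lower().split()))


corpus = [
    "alligators eat fish turtles and birds",
    "how to unclog a kitchen sink",
    "fishing license age in washington",
]
corpus_vecs = [encode(doc) for doc in corpus]  # offline step: vectorize the corpus

query = "what do alligators eat"
top = vector_search(encode(query), corpus_vecs, k=2)  # online step 1: shortlist
answer = cascade(query, [corpus[i] for i in top], stages=[(word_overlap, 1)])  # step 2
print(answer)
```

The point of the split is the one made in the conversation: only `encode` is specific to the content type, while the search and cascade services can be shared across web search, question answering and multi-media.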

Host: Your team has been involved in achieving several impressive milestones over the past five years. So take us on a little guided tour of that timeline and tell us about the challenges you face, along with the rewards that you reap when you try to bring research milestones into production.

Rangan Majumder: So first, I think a lot of the milestones, I have to give most of the credit to Microsoft Research because they’re the ones really leading the way on pushing the state-of-the-art on those benchmarks. Like our team doesn’t really focus too much on the academic benchmarks. So, ever since we went on this mission of, let’s really push deep learning for NLP, the first academic data set that came out that was really aligned with that mission was by Stanford called the Stanford Question Answering Dataset, SQuAD. So it came out around 2016 and Microsoft Research Asia was actually at the top of the leader board, like, throughout its existence. So for, like 2016, 2017, they kept building better and better models until around 2018 they actually achieved human parity, which is just a big milestone in general when you have these academic benchmarks. I think that was like one of the most exciting milestones around the natural language space, that we were able to achieve human parity on this SQuAD data set…

Host: Right.

Rangan Majumder: …within two years. And then, I think around 2018, another dataset came out, which is Conversational Question Answering. A year later, 2019, once again Microsoft Research Asia, along with some other folks in, I think, XD’s Speech Team was able to…

Host: Yeah.

Rangan Majumder: …achieve human parity on that. Around that same time there was this GLUE benchmark, which was also very interesting.

Host: And GLUE stands for?

Rangan Majumder: General Language Understanding [Evaluation] benchmark. So they had, I think, ten very different natural language tasks. So they thought, well this one’s going to be very hard. If we can build one model that can do well on all ten of these, that’s going to be pretty impressive and once again, in a year, Microsoft Research was able to do that.

Host: Unbelievable.

Rangan Majumder: So that’s where they came up with this MT-DNN model.

Host: Which stands for?

Rangan Majumder: Multi-Task Deep Neural Network.

Host: Right.

Rangan Majumder: Yeah. So basically, like, in language, Microsoft Research has been doing a really awesome job.

Host: Yeah.

Rangan Majumder: And while they’re doing that, our team is just taking those models and productionizing them. And what’s interesting is, just because you do well on academic tasks doesn’t necessarily mean it’s really ready to be shipped into production. And the first big learning was with the SQuAD dataset…

Host: Yeah.

Rangan Majumder: …which I talked about back in 2016, 2018. So the model they used there was called Reading Net or R-Net.

Host: Mm-hmm.

Rangan Majumder: And we realized that data set they had was a little bit biased because every – like the way this data set works is, you have a question and you have, like, a passage and you’re basically trying to answer the question, but their entire dataset was guaranteed that every question has an answer.

Host: Hmm.

Rangan Majumder: But in a production context, when people are asking questions to the search engine, not every question has an answer. And in fact, some questions shouldn’t be answered at all, right? So we need to actually also add unanswerable questions.
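
The gap described above, a dataset where every question is guaranteed an answer versus production traffic where many questions have none, can be illustrated with a small sketch. Everything here is an assumption for illustration: the `Example` record, the idea of a separate no-answer score, and the `margin` parameter are one common way to let a reader abstain, not a description of R-Net or of what the team shipped.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Example:
    """One training record. `answer` is None for an unanswerable question,
    which is the case an all-answerable dataset never contains."""
    question: str
    passage: str
    answer: Optional[str]


def decide(best_span, span_score, no_answer_score, margin=0.0):
    """Return the best span only if it beats the no-answer score by more
    than `margin`; otherwise abstain by returning None."""
    if span_score - no_answer_score > margin:
        return best_span
    return None


training_set = [
    Example("What do alligators eat?", "Alligators eat fish, turtles and birds.", "fish, turtles and birds"),
    Example("What do alligators eat?", "To unclog a sink, start with a plunger.", None),
]
```

Raising `margin` makes the system abstain more often, which trades coverage for fewer wrong or inappropriate answers.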

Host: Well, I want to talk about a project that you’ve been involved in called Project Turing, named after the eponymous Alan Turing, and you call it your internal deep learning moonshot, which I love! What was the motivation and inspiration behind that project and what are some of the cool products, or product features, that have come out of that work?

Rangan Majumder: Yeah, so Project Turing was started about 2016. The motivation for it was, we were doing a bunch of analysis on, basically, the types of queries we were getting. And there was one segment that really stood out because it was the fastest growing segment of queries. It was question queries. So people were no longer just typing in key words, they were asking questions to a search engine. So, like, instead of people typing in, you know, fishing license, they would say like, fishing age in Washington, right? What is the fishing age in Washington when I could go fish? So we looked at that and we thought well, people just don’t want to click on a web page, they just want you to find the answer for them.

Host: Right.

Rangan Majumder: And then, many times, the words that were in the question and the words that were actually in the answer were very different.

Host: Right.

Rangan Majumder: So the previous approach, which was, like, let’s just do key word matching, was not going to work. We had to match at a different level, at the semantic level. So, at the time, deep learning was really impressive in terms of these perception tasks like vision, you know, speech… so we were thinking like could it be really good at these other higher level tasks like language? So, that’s when we started Project Turing, and the idea was, what if we could do, like, end-to-end deep learning across the entire web to be able to answer these questions? And it basically completely changed our search architecture to be able to do this kind of thing. And that’s why it was a moonshot. So today, every time you issue a query, we’re running deep learning across, basically, the entire web to get that answer. And if we didn’t use deep learning, we wouldn’t be able to answer a lot of these questions…

Host: Right.

Rangan Majumder: …because key word matching just wouldn’t work.

Host: So that actually is happening now?

Rangan Majumder: Yes, that’s correct. That is happening and as we did it, there are all sorts of new innovations that came out of it that we realized are reusable for other parts of the company. And as we kept pushing the system, we noticed users kept asking harder and harder questions so then we just had to build better and better models. So, there are a lot of interesting things that came out of Project Turing. So first was, we’ve got this deep learning search stack, deep learning question answering system, but then we started to build this Turing Neural Language Representation. And then, just recently we announced the Turing NLG, or Natural Language Generation. So we realized, many times, the passage itself can be kind of long, that comes from a web page, so sometimes we need to rewrite it and shorten it for people, so that’s why we started to look into this generation task. We were able to train one of the largest deep learning language models and that’s called Turing NLG and we announced that I think last month.

Host: Right, so it’s very new.

Rangan Majumder: Yes, very new. It’s seventeen billion parameters, it was like impressing…

Host: Wait, wait, wait. Seventeen billion?

Rangan Majumder: Yes, seventeen billion parameters.

Host: Oh, my gosh.

Rangan Majumder: Yeah, and just like three years ago, our biggest model was probably ten million parameters. So it just shows you how quickly the space is growing.

Host: Okay, so with that kind of context, where’s the next number of parameters? Are we going to hit a trillion? I mean, is, is this scalable to that level?

Rangan Majumder: Yeah, that’s a good question. So definitely we’re going to keep pushing it because every time we get an order of magnitude, we notice it could just do better, so we’re not seeing it slowing down. So as long as you get improvements that could ship to customers, we’re going to keep pushing the boundaries. But at the same time, we need to be more and more efficient with our computation and also just not chase something for vanity’s sake, right?

Host: Right.

Rangan Majumder: Like just because we can get to a hundred billion parameters, which we want to be able to do, we also need to make sure we’re really maximizing the value that the model is actually getting with all those parameters too.

Host: I guess I should have said a hundred billion before jumping to a trillion… It’s like a “triple dog dare” right after the “dare you.”

(music plays)

Host: So drilling in a little bit on these different manifestations of your technology, I know that there’s one called Brainwave that is part of the search experience now, and you had talked a little bit about the fact that Project Turing and Brainwave were co-developed, or concurrently developed, because they each had gaps that they needed to fill. Tell our listeners how Turing and Brainwave came about together and how it speaks to the importance of collaboration, which you’ve already referred to earlier on, across research and product boundaries.

Rangan Majumder: Yeah, so these really large deep learning models are very expensive. So they really actually push both the software and the hardware to its limits. So while we’re trying to train these really big models, or even ship them to customers, we need to push the software and push the hardware. So Brainwave, the idea was, they could actually take deep learning models and accelerate them really fast, but they really needed models that were worthy of that kind of hardware, right? They spent a lot of time building this Brainwave compiler and we got all these FPGAs in our data center and when our models were kind of small, like ten million parameters, sure, you can use Brainwave, but it was just making something that was already possible just a little bit faster.

Host: It’s like taking a thoroughbred to a kid’s party…

Rangan Majumder: That’s right. But it wasn’t until we got to these really large models, like the Turing NLR model, which was, you know, three hundred million parameters, or even six hundred million parameters, and it was so big, if we tried to run it without any kind of optimizations, it would probably take about six hundred milliseconds. And we would have to run this multiple times for every search. So imagine, you know, you type in a query, hit enter and it took you like five seconds to load the page, right? So this is something that was unacceptable. But we were getting these huge improvements from it. Like I said before, it was the biggest improvements we were getting.

Host: Right.

Rangan Majumder: So imagine that we’ve got this thing, which we knew was excellent for customers, but we had no way to ship it. And that was the problem we had on our modeling side. And then, on my Brainwave team, they’re like, I’ve got this awesome hardware, but I have no really, like, big models pushing us. So that’s how these two were kind of co-developed. So they needed something to push their platform, and the modeling side of my team needed hardware that can actually run these models. So, what ended up happening is, these models, which would take six hundred milliseconds, unoptimized, we got it down to five milliseconds, which is blazing fast. So the way to think about five milliseconds is, the blink of an eye is, like, you know, three hundred milliseconds. So every time you blink, you know, we’re running about like fifty inferences. I think Brainwave was just one part of that hardware story. The other thing we’d done is we partnered with Nvidia to be able to build faster and faster ways to run inference on GPUs. So we actually open sourced that in ONNX, the ONNX Runtime, so if people want to reuse our work, they can just go download the ONNX Runtime, and the other thing we’ve been able to do, this was also part of our announcement in February is, to train that seventeen billion parameter model, we had to do all sorts of things that weren’t done before because you can’t fit this model into GPUs, right? So we open sourced this library called DeepSpeed. It’s very easy to use, and it’s just a great way to train really large models super-fast.
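
The latency figures quoted above are easy to check with a little arithmetic. The three constants come straight from the conversation; `RUNS_PER_SEARCH = 8` is an assumed value, picked only because it lands near the "like five seconds" page load mentioned earlier, and the runs are assumed to happen one after another.

```python
UNOPTIMIZED_MS = 600  # per inference, before optimization (from the conversation)
OPTIMIZED_MS = 5      # per inference, after optimization (from the conversation)
BLINK_MS = 300        # rough duration of a blink (from the conversation)
RUNS_PER_SEARCH = 8   # assumption: "multiple times for every search"


def page_latency_ms(runs, per_run_ms):
    """Total model time for one search if the runs happen sequentially."""
    return runs * per_run_ms


speedup = UNOPTIMIZED_MS / OPTIMIZED_MS          # 120.0x faster
inferences_per_blink = BLINK_MS // OPTIMIZED_MS  # 60 inferences in 300 ms

before = page_latency_ms(RUNS_PER_SEARCH, UNOPTIMIZED_MS)  # 4800 ms, close to five seconds
after = page_latency_ms(RUNS_PER_SEARCH, OPTIMIZED_MS)     # 40 ms
print(speedup, inferences_per_blink, before, after)
```

Dividing 300 ms by 5 ms gives 60, the same order of magnitude as the "about fifty" figure given in conversation.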

Host: Talk about what you’ve called the most interesting story here, something you call the network effect. What do you mean by that, and how does the network effect make everything better?

Rangan Majumder: It’s super-interesting just the type of collaboration we’re getting. So we train a model once for a scenario in Bing, and that same model is reused for lots of scenarios in Bing, lots of scenarios in Office… So the economies of scale, which is, you know, each team can just easily get huge impact by just reusing something somebody else did, is really transformative. The second type of network effect we’re seeing is, basically by open sourcing this code, like the ONNX Runtime and DeepSpeed, and we also open sourced our vector search code called SPTAG, so by doing that, other people can now reuse the work, but also contribute to the work.

Host: Wow.

Rangan Majumder: So it just keeps getting better and better. So that’s something, you know, our team really believes in. Like, if you open source something that we think is state-of-the-art, by other people contributing to it, it can continue being the state-of-the-art.

Host: Right, are you seeing this across the industry? That other companies are open sourcing these really powerful technologies and code?

Rangan Majumder: Yeah, absolutely. That’s one of the exciting things. You know, Google open sourced TensorFlow, they open sourced their BERT model… Facebook’s open sourcing a lot of their models. They have PyTorch, which is open source…

Host: Right.

Rangan Majumder: …so it’s been really great that all these AI companies and leaders are actually open sourcing their technology.

Host: Is anyone keeping their cards close to their chest on any particular things?

Rangan Majumder: Well, I mean, you can never know for sure.

Host: Right.

Rangan Majumder: But the general consensus is, researchers want to work in an open way, you know. The old way of working in an open way was just publishing papers.

Host: Right.

Rangan Majumder: But now it’s about open sourcing.

Host: Right.

Rangan Majumder: So that’s… I think open sourcing is the new, like, publishing papers. So they really want to share their achievement, and that’s one way of just proving… like, you can write a paper, but is that reproducible? Many times it’s not. But if you open source it, then people can really test it and you know it really works.

Host: Right. Well, as much as I love asking questions about the upside of technological innovation, I always have to ask about the downside. So now I’ll ask you, Rangan, is there anything about the work you’re doing that keeps you up at night, metaphorically, and if so, what are you doing at the outset to help mitigate it?

Rangan Majumder: Yeah, definitely the thing that worries me the most is around AI and ethics. So if you think about these really large models, they’re trained on existing data in a self-supervised way. So they take the data, all this text, and you know all this text that humans write actually has biases.

Host: Right.

Rangan Majumder: All the existing data out there has a bias. And I think Microsoft Research even showed this in one of their papers so if you look at the word embedding for nurse, it tends to be closer to female word embedding than the male word embedding, right? So these models, if they’re just trained on this already biased data, they’re going to learn that kind of bias and that kind of stuff definitely worries me. In fact, when we launched Turing NLG, we had a demo page for it and we wanted to share the demo with everybody, but right before we did that, I gave it to a couple of folks on my team who were like hackers and I said, hey, why don’t you try to break this? And within a couple of hours they came back and were like, you know, they could manipulate the model to say, like, offensive things. I just said, well, if I just gave this to everybody, they could easily show examples where this model was saying some things that are inappropriate.

Host: Right.

Rangan Majumder: And then, like, just one or two examples of it doing inappropriate things would just wash away all the good things that it could do.

Host: Right.

Rangan Majumder: So that’s why we decided to release it in a controlled way. But I think that’s a really important problem for the entire AI community to solve, like how do we solve the bias that we have in our data?

Host: Right.

Rangan Majumder: Especially when you’re using these AI models to make decisions that could affect people’s lives.

Host: All right, so many, many people that have been in this booth have said the same thing in terms of identifying the problem, and this is something we need to think about and something we need to talk about. Is anybody, and maybe the product side is the closest to, you know, the reification of all of this, is anybody thinking about how you do that? How you let this out in a controlled way and/or keep the bad actors from doing their best work?

Rangan Majumder: Yeah, obviously we have to because we can’t just let these models out and do inappropriate things, especially when they show up in products like Bing and Office and so on.

Host: Right.

Rangan Majumder: So the first thing we have to do is measure the problem. We have a lot of metrics to just make sure, like, okay, the question answering experience is not saying offensive things, right? And it’s actually kind of tough because, as the models get smarter, they get better at finding answers, and anywhere on the web, there’s, like, somebody who’s written some garbage, right?

Host: Right.

Rangan Majumder: So you could ask, basically, fill in the blank, like, is so and so a bad person? And there will be somebody out there…

Host: Absolutely.

Rangan Majumder: …who has written that, right? So we actually first have to measure the thing you don’t want to accidentally do, and I think that’s probably the first thing around this space you have to do. Like come up with some metrics around bias.

Host: Right.

Rangan Majumder: And then, once you have good metrics, the thing I’ve seen is, teams are very good at optimizing for that. But it’s still a very hard problem to do. Like how do you even measure bias? How do you make sure that you’re measuring all sorts of bias?

Host: Yeah, this is going all the way up to the C-Suite. Brad Smith is talking about it in his book and it’s a big deal even in academic works is how, you know, do you put out “parental controls” on a product? I use parental controls in…

Rangan Majumder: Yeah.

Host: …quotation marks, but you know?

Rangan Majumder: That’s right and he has this Aether committee that our team is actually involved in.

Host: Right.

Rangan Majumder: Around just making sure, like, this AI we are building, make sure it can’t be harmful, it’s used in, you know, like a responsible way, and so on.

Host: Yeah.

Rangan Majumder: So I think that’s one AI ethical angle that I’m worried about. The second one is really around inclusivity.

Host: Mm-hmm.

Rangan Majumder: So if you look at all the AI breakthroughs, they’re mostly coming from a few companies. They’re, you know, Microsoft, Google, Facebook, some of the Chinese companies like Alibaba, and that’s because these really large models take a lot of compute to go build, so I am worried that it’s just going to be a few companies that are just doing all the AI breakthroughs. So we really need to think about, how do we make it more inclusive? And I think the good news is people are open sourcing a lot of their technologies so others can use them, but when it comes to compute and things like that, like there are only a few companies that can afford that kind of stuff.

Host: Right.

Rangan Majumder: So we need to also think about, like, how do we make sure this AI transformation is inclusive for as many people as possible.

Host: Well, and I think you’re starting to see that in some of the AI for Good efforts that Microsoft is doing…

Rangan Majumder: Absolutely.

Host: …with, you know, these grants that aren’t just money, but they’re compute resources, right? You can use this and use Azure for free.

Rangan Majumder: That’s absolutely right, yeah. So that’s one way to do that.

Host: All right. Tell us a bit about yourself. Where did the high-tech life begin for you and how did you end up at Microsoft?

Rangan Majumder: I guess the high-tech life started for me when I went to Carnegie Mellon and I studied computer science and computer engineering. And while I was there, I took some machine learning courses. And this was early 2000s, so machine learning wasn’t nearly as impressive as it is today, but at the time it was extremely fascinating because I’ve always been interested in these open-ended questions, like the meaning of life. How do people think, is also one of those open ended questions that I was very fascinated with, so when I was learning machine learning it’s like well, one way to learn how people think is to kind of rebuild it in machines. And then, I came to Microsoft. I started as a developer for about four years. Then I switched to program management, and I also switched to the Bing team because, at that point, while they were building up the Search team, I realized this is the best place to apply machine learning, right? So if I want to really be at the cutting edge of machine learning, like this is the place to be, so… and I’ve been there for the last ten years.

Host: Yeah.

Rangan Majumder: Just applying machine learning to solve customer problems.

Host: Has anyone ever tried to say, come back to academia and get an advanced degree?

Rangan Majumder: Yeah, definitely. I’d say my parents are the ones who are saying that, because they, you know, they think, if you get a PhD it just means so much to you and the family and so on… I’m like well, I could, but I’m having so much fun here!

Host: What’s something interesting that people might not know about you? And maybe it’s a life event that impacted your career or maybe it’s a personality trait that made you who you are today? Or maybe it has no connection to any of that and it’s just an interesting data point that we couldn’t find out about you if we typed your name into the Bing box?

Rangan Majumder: The thing that probably most people would be surprised at first, that you can’t find in the web is, when I was younger, I got diagnosed with ADHD. I was getting in trouble in school all the time. I wasn’t, like, doing well, and at one point, the teacher and the principal, like, brought my parents in and they said, hey, like, you have to do something about him otherwise, like, he won’t be able to return to school. So my parents took me to a therapist and then they diagnosed me with ADHD, and then they gave me some drugs and it completely changed my life because I started to, like, get the highest grades in the class. Like, I was no longer getting into trouble. It was so strange, I remember, my parents noticed that my behavior was different, so they took me off it for a little while and almost immediately I started getting in trouble again. But one thing that was different this time was, I noticed it and I realized well, I don’t like getting in trouble. Like, this is not fun for me! So then I actually made a conscious effort to try and do well at school and not get in trouble. And then I was able to kind of make up for it. So I really made an effort to sort of control and change my behavior.

Host: As we close, I’d like to circle back to the beginning, and sort of tie things together. If the big goal is to make us smarter and more productive, and we’re not there yet, what are the big open problems in the fields that, if solved, would get us closer to the big goal, and what kinds of people do we need to help us get there?

Rangan Majumder: So the first few things that you’re going to see over the next year or two is multi-modal models. So that’s mixing text, images, and videos together in a single representation. That’s something we’re experimenting with today. We’re seeing some good results. So I think you’ll be able to, like, ask questions and get an answer in images or in videos, like look inside a video, look inside an image. So I think that’s going to be pretty cool. The other thing is, like, we’re definitely betting big on this deep learning, so you’re going to see us be more and more efficient around, how do we run these models, train them with less compute? How do we get more out of it? Data efficiency is another thing. Given there’s a limited amount of data, how do we make sure that we’re maximizing it to build better models? But I’d say, in the long term, the thing that is still missing is, um… Like, I think there’s two AI camps. There’s this deep learning camp and then there’s this, what they call, a symbolist camp, which is looking at graphs and structured data, and I think there still needs to be a way to fuse those two so that you can actually take unstructured data and reason over it the way you can with structured data, and you can create new knowledge and things like that, because there’s a lot of questions we’re seeing people ask, and sure, the answer isn’t written there, but if you combine the information in two paragraphs, you can actually get the answer by combining them, so I think that’s something we’re still thinking about. It’s not going to be an easy problem, but I think that’s something the academic field and industry still needs to do.

Host: Rangan Majumder, thank you so much for joining us today. It’s been a real pleasure.

Rangan Majumder: Thank you for having me!

(music plays)

To learn more about Rangan Majumder and the latest advances in Search and AI technology, visit Microsoft.com/research.
