Episode 83, July 17, 2019
Johannes Gehrke is a Microsoft Technical Fellow and head of Architecture and Machine Learning for the Intelligent Communications and Conversations Cloud in Microsoft’s Experiences and Devices division. But lest you think his lofty position makes him in any way superior to you, let me tell you, he knows who works for whom, and he’ll be the first to tell you that you are his boss!
On today’s podcast, Gehrke frames the new, cloud-powered work world as a fast-paced, widely distributed workplace that demands real-time decision-making and collaboration, and explains how products like Microsoft Teams are meeting those demands. He also tells us, both directly and indirectly, about the future of work, which, for Microsoft, involves a pivot from an app-centric approach to a people-centric approach where, by using an AI-infused productivity suite coupled with the power of the cloud, we can essentially “hire Microsoft” to help us get our work done.
Transcript
Johannes Gehrke: I think maybe one difference for us is that we think of this not as, hey, here’s something that, you know, forces you, or tries to engage you more, suck you more into our applications. But again, you hire us to get a job done. The job is not to spend more time in the application. The job is to be more productive. The job is to get this document done. The job is to finish this proposal. And therefore, our goal is not, with all of these different parts of our experiences, to have more minutes in our application, because that’s not what you pay us for.
Host: You’re listening to the Microsoft Research Podcast, a show that brings you closer to the cutting-edge of technology research and the scientists behind it. I’m your host, Gretchen Huizinga.
Host: Dr. Johannes Gehrke is a Microsoft Technical Fellow and head of Architecture and Machine Learning for the Intelligent Communications and Conversations Cloud in Microsoft’s Experiences and Devices division. But lest you think his lofty position makes him in any way superior to you, let me tell you, he knows who works for whom, and he’ll be the first to tell you that you are his boss!
On today’s podcast, Dr. Gehrke frames the new, cloud-powered work world as a fast-paced, widely distributed workplace that demands real-time decision-making and collaboration, and explains how products like Microsoft Teams are meeting those demands. He also tells us, both directly and indirectly, about the future of work, which, for Microsoft, involves a pivot from an app-centric approach to a people-centric approach where, by using an AI-infused productivity suite coupled with the power of the cloud, we can essentially “hire Microsoft” to help us get our work done. That and much more on this episode of the Microsoft Research Podcast.
(music plays)
Host: Johannes Gehrke, welcome to the podcast!
Johannes Gehrke: It’s great to be here.
Host: I like to begin each podcast by introducing and “situating” my guests, to use a research term. So here we go, for you and it’s long. It’s a mouthful, what you do! You’re a Microsoft Technical Fellow, and you’re the Chief Architect and head of AI and Machine Learning for the Intelligent Communications and Conversations Cloud in Microsoft’s Experiences and Devices group!
Johannes Gehrke: Yep.
Host: There’s a lot to unpack there, so we will in a second. But first, since you’re currently working under the product mantle, but you have a PhD and deep roots in research, tell us how that all comes together for you. What do you do for a living? Why do you do it? What gets you up in the morning?
Johannes Gehrke: Well, in the morning, usually it’s just a good cup of coffee, but… So, what I do in my job, I’m responsible, basically, for two things. One of them is architecture. So, for example, in the Intelligent Communications and Conversations Cloud, one of the things we’re doing is we’re powering Microsoft Teams, which is sort of a chat-based experience where, you know, we send messages back and forth, and we have an existing chat service that has “reached its age,” and we’re now… we’re now designing a new chat service. So, I’ve been very deeply involved in helping with that design. I’m also responsible for artificial intelligence and machine learning. I can tell you much more about this, but basically what we’re doing there is, we’re taking a bunch of existing code and replacing it with models. And we’re therefore creating both more robust experiences and new, innovative experiences for our customers. Then I’m also responsible for innovation. It’s not only me, it’s the whole team, but I’m especially interested in this, given my research background. And this is especially exciting in the context of Microsoft Research because I get the opportunity to work closely with Microsoft Research in Redmond. I work with Microsoft Research in India. And also, MSRA. So, for example, this summer I am co-supervising a few interns here at Microsoft Research, actually in sort of the database area, but just to keep my research muscle going. And as part of that, I’m also, therefore, responsible for mentoring some of the principal engineers. So, I’m actually working with several engineers across the team to see, you know, who is the next generation of architects for our team?
Host: So, let’s unpack a couple of those meaty phrases. Experiences and Devices. Is this a fairly new umbrella, if you will? It’s broad enough to include just about anything that has to do with computing. Can you sharpen the focus a bit?
Johannes Gehrke: In short, Experiences and Devices basically combines Windows, Office, Devices and Browsers. So those four things. But really what this means is that, at the core, there’s this group of products that we call Microsoft 365, which is basically our productivity cloud that spans both work and life. And it’s this communication and collaboration platform for all of our customers, and it integrates business processes directly in our experiences, and it encompasses security for devices and applications, so it’s really a big, comprehensive productivity solution for all of our customers. And what has changed there, and that’s maybe where the sort of E + D focus comes in, is that it used to be very app-centric. So, we used to have Outlook, which was like messaging and calendaring. Then we had SharePoint, which was about documents and workflows and content management and so on. And then we had Skype and Skype for Business for real-time communication. And, basically, in the consumer world, we had a sort of similar app-centric worldview: there was basically OneDrive, there was Skype, and there was Hotmail. And what we’ve recently done is, we’ve changed this and pivoted this to put the customer at the center. So, people are really at the center of the suite. And so, we now have workflows that go across all of the different devices and apps, basically to get the job done. So, the way I like to think about, actually, everything that we’re building is that you basically hire Microsoft products to get a job done. So, when you need to have the job done, you’d have to have all of these digital assets at your fingertips.
Host: Right. Right. All right, so drilling in there just a little bit, if I’m the customer, and I’m used to an app-centric world – in fact, I even conceptualize things as, I’m going to get an app for that. There’s got to be an app for that, right? – how does it change my experience if you, aka Microsoft, say, you’re the center now?
Johannes Gehrke: Yeah, so, it’s pretty transparent for you. And it basically means that we bring, sort of, the power of the whole suite, integrated across all of the different apps, and we make workflows across them pretty seamless. For example, assume you’re in Outlook. Now, you would like to attach a document to your meeting request or even to your email. Well now, you click on Attach File, and a very simple experience now is that we actually show you the MRU, which is the most recently used files that you had, because that’s very often the file that you want to attach there, and we have a lot of telemetry that shows us what the recent files actually are. You can think of this as a brain-dead example, because, why didn’t we do this before? But this is an example of the difference: when I think only within Outlook, I think, well, I want to open up a file box and then, you know, let you pick the right file, whereas if I think across the suite, I’ll think, oh, actually, I’ve worked on a file before in SharePoint that I’ve edited that I now want to share with you. And it should be directly there at your fingertips.
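A minimal sketch of the most-recently-used (MRU) idea mentioned above, assuming a simple in-memory list with a fixed capacity. The class name, method names, and file names are hypothetical and chosen for illustration; the episode does not describe how Outlook actually implements this.

```python
from collections import OrderedDict


class MostRecentlyUsed:
    """Hypothetical MRU tracker: remembers the last `capacity` distinct files."""

    def __init__(self, capacity=10):
        self.capacity = capacity
        self._items = OrderedDict()  # oldest entry first, newest entry last

    def touch(self, name):
        """Record that `name` was just opened or edited."""
        self._items.pop(name, None)   # drop any older entry for the same file
        self._items[name] = True      # re-insert at the newest position
        while len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the oldest entry

    def recent(self, n=5):
        """Return up to `n` file names, newest first."""
        return list(reversed(self._items))[:n]


mru = MostRecentlyUsed(capacity=3)
for name in ["a.docx", "b.xlsx", "c.pptx", "a.docx", "d.txt"]:
    mru.touch(name)
print(mru.recent())  # ['d.txt', 'a.docx', 'c.pptx']
```

An attach dialog built on something like this could offer `recent()` first and fall back to a file browser only when the wanted file is not in the list.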
Host: Well, I love that you call it an MRU and then immediately told me what that meant, because I was going to ask, MRU? Everything’s a TLA, a three-letter acronym! But the “automatically knowing what I just did,” does this involve a lot of machine learning algorithms that are looking at what I’m doing and sort of processing? How does that…
Johannes Gehrke: Right, so that actually, I think, is now where, a little bit, the power of the cloud comes in. So, if you look at previous generations of software, they were basically running on-premises. You basically had all the data there as well, but it wasn’t really utilized. Whereas now, we have the power of the cloud, where basically every interaction with the system from Microsoft gets recorded. And it’s not that we software engineers can now play around with it, because Microsoft actually has very strict controls on what we can look at and what not. But we can actually now make this data available for all of our customers to build interesting applications on. For example, what we can now do is, we can compute the list of the people that you’re working with. And this is not only the people that you’re emailing with a lot. It’s not only the people that you’re sharing documents with a lot. It’s not only the people that you are chatting with a lot on Teams. It’s a combination of all of them. So basically, what we can do is, we can take all of these signals and rank them and then see, oh, who are the people that are actually working with you? And there are many other applications, or scenarios, like this across the suite.
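The idea of combining several signals into one ranking of collaborators can be sketched as a weighted sum. This is only an illustration under stated assumptions: the channel names, the weights, and the function `rank_collaborators` are hypothetical, and the episode does not say how the real ranking is computed.

```python
from collections import defaultdict

# Hypothetical per-channel weights; not taken from any Microsoft system.
WEIGHTS = {"email": 1.0, "doc_share": 2.0, "chat": 0.5}


def rank_collaborators(signals, weights=WEIGHTS):
    """Rank people by a weighted sum of interaction counts.

    `signals` is an iterable of (person, channel, count) tuples.
    Unknown channels contribute nothing to the score.
    """
    scores = defaultdict(float)
    for person, channel, count in signals:
        scores[person] += weights.get(channel, 0.0) * count
    # Highest score first; ties broken by name so the order is deterministic.
    return sorted(scores, key=lambda person: (-scores[person], person))


signals = [
    ("ana", "email", 10),     # ana: 10 * 1.0 = 10.0
    ("ben", "chat", 30),      # ben: 30 * 0.5 = 15.0
    ("ben", "doc_share", 3),  # ben:  3 * 2.0 =  6.0, total 21.0
    ("cy", "email", 2),       # cy:   2 * 1.0 =  2.0
]
print(rank_collaborators(signals))  # ['ben', 'ana', 'cy']
```

Here "ben" ranks first even though he never appears in email, which is the point made above: no single channel determines who you work with.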
Host: Interesting. Because I’m looking at it from, you know, a broader perspective of how software works on social media. For example, Snapchat would have the person that you Snap the most, right? And it knows automatically, because it’s collecting the data… And then hopefully serving you, so that you don’t have to scroll down your list if they’re in the… lower in the alphabet.
Johannes Gehrke: Yeah, I think it’s very similar to that. I think maybe one difference for us is that we think of this not as, hey, here’s something that, you know, forces you, or tries to engage you more, and suck you more into our applications. But again, you hire us to get a job done. The job is not to spend more time in the application. The job is to be more productive. The job is to get this document done. The job is to finish this proposal. And therefore, our goal is not, with all of these different parts of our experiences, to have more minutes in our application, because that’s not what you pay us for. And so, therefore, some of the metrics that we’re optimizing for are very different than, you know, various consumer companies.
Host: You know, I’m grinning so big right now, because the way you frame this is, you’re working for me. And I’m thinking, yeah, that’s right! You’re working for me! Don’t make me do what you want me to do. You do what I want to do! Well, how about this phrase, the Intelligent Communications and Conversations Cloud. When I first read it, I wanted to put an, “in the,” like, Intelligent Communications and Conversations in the Cloud. Right? But that’s not it. It is the cloud.
Johannes Gehrke: Right. It’s actually the thing.
Host: The thing.
Johannes Gehrke: Right.
Host: So, give us a picture of what this is, because I do want to talk about your role in the division, but tell us more a little bit about the Intelligent Communications and Conversations Cloud.
Johannes Gehrke: I think it came from the observation that, as businesses are transforming and are getting more distributed, and decision-making is getting faster and faster, communications, and especially real-time communications, is becoming much more important, and real-time collaboration as well. And really that’s what Microsoft Teams is all about. In a way, it’s sort of a new application that puts chat at the center. But really collaboration is at the center of the application. Teams has these four different capabilities built into it that are sort of living off this Communications and Conversations Cloud. So, the first one is, basically, it has video conferencing and meetings. So, if we were not sitting in the same studio here, we could be using Teams to just collaborate together directly and see each other. It has collaboration built in, in that we can co-edit documents together, and basically, in a way, it has, sort of, Office built into the single shell. And anything that we can do in Office, we can do directly within Teams. And then it also integrates sort of these other business workflows, so that, if you are working with third-party tools, we can actually integrate all of that into Teams as well.
Host: Really? So, example. Third party tool. What would it…
Johannes Gehrke: Well, so for example, I mean, you know, I said Office is sort of built in. But there’s also Planner built in, there’s GitHub, and there’s sort of a long tail of other applications that you can directly just hook up to Teams and then you have sort of your workflow directly at your fingertips.
Host: So how new is all of this?
Johannes Gehrke: So, I think the general model was pioneered by Slack, which is a competitor. And they sort of had this observation that many third-party applications can actually have a messaging interface. And so, you sort of, when you check something in, you get a return message. And so, we’re using the same model as well. And then we have a lot of other innovative ideas around how we expose this directly in the UX to our customers as well.
Host: Well, switching streams here a little, because we’re going to come back to Teams and some other things that you’re working on, but one of your big research interests is database systems. In fact, you’re a bit of an expert in that area. You co-authored a book called Database Management Systems way back when. It’s a leading text for database education courses. And the field has changed a lot since the most recent edition of the book, in 2002. I think that was the third edition…?
Johannes Gehrke: Yep, that was the third edition.
Host: So, tell us what the database systems landscape looked like when this book came out in 2002, the third edition. Because you alluded to the fact that it didn’t change a lot in the first three editions, but things have gotten way different now. What changes have you seen to warrant the editions and what might future editions look like?
Johannes Gehrke: So, this is actually an interesting story. So, the first edition of this book was written by my advisor, Raghu Ramakrishnan, and when I joined as a PhD student, I was actually really interested in writing and we talked a lot about the book. And, at some point in time, he said, well, I’m writing a second edition, would you like to join me? And so, I joined in the second edition and then we wrote the third edition. And basically, the book gives you a really deep introduction to what it takes to build a relational database system. Sort of, you know, SQL Server, Oracle, really the hallmarks of traditional data management systems. And I think now, database systems have changed significantly. So, I think, especially the move to the cloud has changed many things. Usually the first attempt at bringing any system into the cloud is sort of this “lift and shift.” You take the on-prem system and then you just run it in the cloud. But that only gets you so far, and I think one of the main misunderstandings of the cloud is that it’s just about cost savings and multi-tenancy and elasticity. But it’s really about standing on the shoulders of giants. So, basically, somebody builds something really awesome and then everybody can use it. And that’s now also what’s happening in the database community. So, people have started to, basically, go ahead and build these cloud-native database systems that are specifically designed for the cloud and that build on other cloud infrastructure pieces. And they look very different from the traditional relational database systems. What has also changed is that, therefore, a bunch of other aspects have become much more important. For example, distribution and wide-area availability have become super important. Distributed database systems used to be a niche area, but now with the cloud, and sort of 24-7 availability across different zones… I just remember when the terabyte was called the “terror bite.”
Host: Terror bite!
Johannes Gehrke: And now we have, you know, exabytes of data. So, this is just…
Host: And even zettabytes…
Johannes Gehrke: And even zettabytes. It’s really grown so much over the last decade, really, and over the last two decades, that the book has really become somewhat outdated now.
Host: So, the last edition was 2002. It’s now 2019. Is there another edition coming out?
Johannes Gehrke: Yes, so Raghu and I have been talking. And every year we sort of sit and ruminate and say, yes, we should do it. At least right now we’re talking and, you know, we’re thinking about bringing out another edition. I think what has changed, also, a little bit is that, you know, if you had asked me, fifteen years back or so, what is a database system? I think it was clear to everybody what a database system is. And I think, over the last decade, this has also changed quite a lot. So, I think we’re now early even in understanding what are sort of the lasting principles again for this next class of database systems. I think early on, it was clear, you know, on the theory side maybe you need to know relational algebra, there’s an index, there’s the B-tree. There is a very well-understood notion of what concurrency control and recovery means. There is a very well-understood notion of what serializability means. And now, in the cloud, everything has changed. So, it’s not clear anymore what it actually means to be sort of a first-class database system right now and what the foundations are.
Host: I hope you have a big team! There’s a lot of thinking. I mean it isn’t just the work to do. There’s the conceptual work up front…
(music plays)
Host: Well, I want to talk about some interesting work you’re involved in. And I’m going to suggest that we go sort of free range here because there’s a lot of moving parts and pieces as we’ve just discussed. All of it has to do with, as you’ve alluded to earlier, technologies that help us work better and smarter in different ways. Right? So, talk about the spectrum of projects you’ve got going and how they’re manifesting, either as products or in products. Because that’s the other thing, you’re no longer just doing boxes of software and saying, here, buy this. It just appears in my workflow. So, talk about that.
Johannes Gehrke: Yeah, so when I first came to Microsoft, I came with the idea to build something that we called, at that point in time, Delve and the Office Graph. So, I used to work in Enterprise Search at a startup that Microsoft actually acquired in 2008. But in Enterprise Search, we always had this kind of frustrating experience that we would basically sell our software, and we had customers like Best Buy and Financial Times and so on, and they would buy our Enterprise Search software and then get a bunch of consultants to go in and customize it. Then afterwards, it looked like a black box. Then, when we came to Microsoft, we were suddenly running in the cloud. What this actually means is that we now see all of these signals. So, when you go ahead and share a file with me, that’s actually a signal that we are collaborating. So, if you think about search in the enterprise as compared to search on the internet, searching in the enterprise has never been as great as on the internet. Now in the enterprise, you have, you know, small and large enterprises, but you have a reduced audience and they’re not searching all the time, so you have much less engagement, you have many fewer links, and the content is spread across all these different OneDrives and SharePoint sites, and you don’t really know which of them is more important than the other. So, enterprise search, by itself, is a much harder problem. So, what we did in Delve is we said, well, let’s build an underlying data asset, and we called it, at that point in time, the Office Graph, and then it became the Microsoft Graph. And in the Office Graph, you would capture all of these signals that came from people just working in the cloud. And these signals now help us to do ranking much, much better in the enterprise.
And, because it sounds kind of abstract, and the Office Graph sort of as a data asset is kind of abstract, we built Delve as an experience on top of it, where Delve is basically people-centric search. So, I could go to your Delve and I would see all the documents that you are working on and that sort of your organization is working on around you. And you get this basically as a relevance feed that comes from everything around you. So that was Delve, and that was really exciting because, in a way, it also shifted Office, I think, a little bit from this app-centric view to more of a data-centric view, and also to this people-centric view. And then, more recently, over the last couple of years, I’ve been working on Teams. And with Teams, I started mainly with architecture, because there were basically a bunch of things that I think were older systems that we needed to bring into the modern, cloud-based era. But, in addition, recently I started to also look at AI and machine learning, and there, especially what I’ve been looking at is what’s called Software 2.0. It’s basically, you know, you replace code with code and data, and you manifest the data in models, especially deep neural networks. And so, we’re basically on a journey where we take pieces of Teams software and replace them with deep neural networks. And we have lots of individual components, and these components were built at different times. And what we’re doing now is we’re replacing this whole pipeline, basically, with deep neural networks.
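To make "replace code with code and data" concrete, here is a deliberately tiny stand-in, not a deep neural network and not anything from Teams. It contrasts a rule whose constant is written by hand with the same rule whose constant is fitted from labelled examples. The function names, the threshold of 100, and the sample data are all hypothetical.

```python
def handwritten_rule(value):
    """Software 1.0 style: an expert hard-codes the threshold."""
    return value > 100  # hypothetical constant chosen by a person


def fit_threshold(examples):
    """Software 2.0 style, in miniature: learn the threshold from data.

    `examples` is a non-empty list of (value, label) pairs with boolean labels.
    Returns the threshold t that minimises training errors for `value > t`.
    """
    if not examples:
        raise ValueError("need at least one labelled example")
    values = sorted({value for value, _ in examples})
    candidates = [values[0] - 1] + values  # includes "predict True for everything"
    best_t, best_errors = None, None
    for t in candidates:
        errors = sum((value > t) != label for value, label in examples)
        if best_errors is None or errors < best_errors:
            best_t, best_errors = t, errors
    return best_t


examples = [(10, False), (20, False), (30, True), (40, True)]
learned_t = fit_threshold(examples)
print(learned_t)       # 20
print(25 > learned_t)  # True
```

The surrounding code (`value > t`) stays; what changes is that the behaviour now comes from the data, so changing the behaviour means changing the examples rather than editing the constant.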
Host: What’s your timeline on this?
Johannes Gehrke: So, I hope that we can ship something later this year…
Host: Okay.
Johannes Gehrke: …and then all throughout next year.
Host: I’m going off-script, here, but how cognizant are customers, now, that this is happening, and is this going to be surprising, to put it nicely, or disturbing, that “it knows?” You know, that’s the phrase I use now: “It knows!” How are you assimilating these products into people’s lives?
Johannes Gehrke: Right, yeah, I think it’s a bit of a journey. For example, when we first brought out Delve, again, Delve works on all of the existing customer signals. These are actually signals that the customer has access to already. So, who has updated a document? You can look at the document version history. You can look at your email to see with whom you’re communicating. On the other hand, we had some customers who said, well, you should turn Delve off because they said, well, this is something that is maybe too privacy invasive. So, I think this is also an important lesson that we learned that even though the data may be out there, the information that you, in some sense, “cook” out of it may still be more disturbing than the individual pieces of data that just are lying dispersed out there.
Host: And we’ll come back to the “what keeps you up at night” question in a second, or what keeps me up at night and I hope you’re working on, is probably better! Because I think what we’re seeing now is a shift in the idea of privacy. You know, what trade-off am I willing to make for the super productivity promises that I get from what you’re saying, hey, we just want to help you get the job done. We’re working for you. It’s like, well what do I have to let you know so that you can help me work?
Johannes Gehrke: Yeah, that’s a really good question and I think there are probably two different scenarios out there. There’s a scenario where you are in control of your data and then there’s the other scenario, which is usually the work scenario, where actually, your employer owns all the data, at least in the United States.
Host: Right. Exactly. Yeah.
Johannes Gehrke: And so I think there, you know, I can at least say, so when I first came to Microsoft, the one thing I was most impressed by was how strong the controls are that we have against our own engineers to get at any kind of customer data.
Host: Absolutely.
Johannes Gehrke: I have never seen, in my whole time here at Microsoft, a single piece of customer data. I would not be able to log into the machines. There are, like, triple escalation barriers in between. We have Customer Key and Customer Lockbox. So, we actually have extremely strong controls in there to protect the privacy of our customers and enterprises…
Host: Yeah. I’m visualizing the red light going off and the siren, beep! beep! But it’s interesting. I just had Ganesh Ananthanarayanan who’s doing computer vision and, as a researcher, he wants access to camera feeds. And he says, it’s both frustrating and reassuring that I can’t.
Johannes Gehrke: Right.
Host: Even as a researcher, I’m saying I could make products better if I had that data and Microsoft says, nope!
Johannes Gehrke: Yeah, exactly. That’s sort of this interesting tension. So, for example, what we’re therefore trying to use a lot are mechanisms like reinforcement learning, where basically, in some sense, the model adjusts itself, and we have sort of controls on the model so that it hopefully doesn’t go off the rails. But basically, we’re training models where we never see the model, and we don’t see the customer signals in plain text either. We only see something about the performance of the model. So, everything is basically done indirectly. Even for Delve, the people ranking that we have, again, we can compute a people ranking for you, which are your colleagues, and then we see your engagement on it. And if you always click on person number fifteen but not the first fourteen, that gives us somewhat of a signal that the ranking is maybe not the right ranking.
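The "person number fifteen" example can be sketched with a standard ranking metric, mean reciprocal rank (MRR), computed from click positions alone. This is an assumption made for illustration: the episode does not name the metric Microsoft uses, and `mean_reciprocal_rank` is a hypothetical helper. Note that the function only needs the position that was clicked, not who the person was.

```python
def mean_reciprocal_rank(click_positions):
    """Average of 1/position over observed clicks (positions are 1-based).

    Values near 1.0 mean users click near the top of the list;
    values near 0 mean they have to go far down to find what they want.
    """
    if not click_positions:
        return None  # no engagement observed, so nothing to measure
    if any(position < 1 for position in click_positions):
        raise ValueError("positions are 1-based and must be >= 1")
    return sum(1.0 / position for position in click_positions) / len(click_positions)


good_ranking = mean_reciprocal_rank([1, 1, 2])       # (1 + 1 + 0.5) / 3
poor_ranking = mean_reciprocal_rank([15, 15, 15])    # 1 / 15
print(good_ranking > poor_ranking)  # True
```

A consistently low value, as in the second case, is the kind of indirect signal described above: it suggests the ranking is off without exposing any underlying content.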
Host: Well, while we’re on the topic of technology that helps us get work done, I’d like to touch on a subject that everyone is talking about, and one that’s actually a theme of Microsoft Research’s Faculty Summit this year, and it’s this “small topic” of the future of work. So, since you’re giving a keynote on that subject this year, what do we need to know, or at least what thoughts could you share, about this subject of the future of work?
Johannes Gehrke: First of all, I think the future of work is going to be powered by data and the idea is that we can make you more productive with all the data that both you, and all of your colleagues and all the people around you leave in the cloud. Our goal at Microsoft at least is always to put the user at the center and the user in control. So, I think Delve is one such example where we really put people at the center and then make search personalized, like the personal relevance feed. Another example is what we’re doing right now with Teams. Again, where we’re trying to use AI to make you more productive and not to suck you into our applications. I think what’s also important is that we use the data to make you more productive rather than to tell your manager, you know, oh, you clocked in at nine o’clock and you left already at four o’clock today, because we believe that goals are aligned: if you become more productive, that helps the enterprise as well. Our goal is to make you, as a person, overall more productive or make a team more productive.
Host: Mmm-hmm. So, that’s one thing about the future of work. What other kinds of things, computer science wise, could we think about?
Johannes Gehrke: So, I think what also changes is how we, as software developers, work. I mean, this comes back again to this software 2.0 theme, that previously you hired experts in an area and these experts, they sort of transferred, in some sense, their expert knowledge from their head into code. And now, what we have to do is, we have to collect data and then we take basically this data and now we have machine learning experts. And what these machine learning experts do is they take the data and then transform it into a model.
Host: Okay.
Johannes Gehrke: And then we transform this model into code. And what this means is that, actually, everything that we know about software infrastructure is now changing quite a bit as well. So, for example, you know, if you think about Software 1.0, we have lots of really great tooling. We have tooling around repeatability, around testing. We have ways of decomposing software into modules. We have great ways of debugging software… you know, good luck with debugging the deep net! We have really good abstractions. We have high-level languages, you know, optimizing compilers. We have performance tuners and we have, sort of, a whole dev-ops culture. It has basically taken software from the computer science discipline really into an engineering discipline. Now, in the Software 2.0 world, we don’t really have such well-developed tools or infrastructure for all of these aspects. For example, for repeatability, well, maybe we have model pipelines and environments for freezing code… For testing, we have maybe test generation with GANs. We have simulated environments. We have adversarial training and testing. For debugging, I think it’s a very nascent field for deep nets. In terms of abstractions, you know, for Software 2.0, we don’t have really great abstractions. I mean, here, actually, I think you interviewed Patrice Simard…?
Host: I did.
Johannes Gehrke: About machine teaching…
Host: Machine teaching, yeah.
Johannes Gehrke: …which I think could be an interesting abstraction, right? You have AutoML. You have this notion of learned variables.
Host: Nicolo Fusi… did that.
Johannes Gehrke: Exactly.
Host: And back to what you were talking about before, Tom Zimmermann was recently on, talking about data-driven decision-making for software productivity for software engineers… All these pieces are mind-boggling and so exciting.
Johannes Gehrke: Right. So, there’s a whole development from basically traditional dev-ops to data-driven dev-ops, then to data-driven dev-ops with models actually in it. And then all the way, sort of in the end, is, maybe, Software 2.0. I always think that, you know, there’s a spectrum from 1.0 to 2.0 and we’re clearly not there at 2.0 yet, but we’re somewhere in the middle, maybe at 1.5 or 1.6 or so.
Host: So, Johannes, I’ve asked several guests on the show to give us their take on the value of research, or maybe more specifically the value of research models, since we’re talking about models. Where do you fall on this spectrum of “publish or perish” versus “ship or perish?” Should we let a thousand flowers bloom, or should we try to do fewer things and do them better? Or both? Or what?
Johannes Gehrke: I think, clearly, both. I think also the sort of publish or perish, or ship or perish model is maybe… I never thought about it this way. When I was an Assistant Professor at Cornell, one of the things that is clearly top of mind is, how do you get tenure? And actually, the answer, which I found very satisfying, was that you just have to do great research that changes the world. Now, you can see this in two ways, right? You can either say, wow, this sounds really scary because there are no metrics, so, I’m doomed anyway! Or you can say, wow, this sounds really great because I’ll just do whatever I am best at and what I have fun with, and hopefully that’ll work out.
Host: Right.
Johannes Gehrke: So, I think, in research, what you basically have to do, therefore, is find a research area where you think you can make a big difference. And during that time, you sort of really scramble. You’re very fast, you try lots of little experiments and once you’ve found an area, then you can go deep with research. At that point in time, you then take your time. So, you sort of have this fast part, or this exploratory part, and then actually some sort of deeper part because you make certain bets. And that’s sort of a combination of let a thousand flowers bloom. But then also, for some of them, you actually now, you know, want to really build a greenhouse and, you know, maybe want to go, sort of, into really big production. Right? And actually, in the product world, it’s very similar. Basically, you want to try, as quickly as possible, what you want to build. And there may be lots of different risks. You know, there could be technology risks, there could be people risks or product market fit risks. But once you have found a good match for all of these three, then again, you want to take your time because now you want to create a product that customers love. And so, on both sides, I think you have to scramble, let a thousand flowers bloom, let’s explore… But then, you want to make a few bets, and then go deep on them.
(music plays)
Host: All right. We’ve reached the point in the podcast where I ask, “what could possibly go wrong?” I ask everybody this, mainly to just, kind of, get out there the fact that people are thinking ahead in terms of what they’re doing and not just saying, well, I’m just doing what I’m doing and you all can deal with it later when things break, or things go wrong. So, is there anything about what you’re doing – you’ve alluded to a couple of things already – that literally or figuratively keeps you up at night?
Johannes Gehrke: I think the main thing that keeps me up at night, now that I’m in a product group, is to make sure that our service is running 24/7. And I’ve always been impressed by how professional that side of the whole cloud at Microsoft is. So, I think, what’s really important is that we have our services up and running 24/7 for our customers, but at the same time, that we can allow a certain amount of innovation. And innovation means change, and we have to manage the risk between the two of them, and then we have to have the right kind of mechanisms in place so that we can roll out change without impacting our customers, while at the same time understanding whether this is a stable build, for example, or whether there are any regressions or anything like this.
Host: So, when you’re talking about that, I mean, literally, you’re not up at night… probably some, you know, system…
Johannes Gehrke: Oh, I’m up at night as well.
Host: Oh, seriously?
Johannes Gehrke: From time to time, I’m an incident manager. I feel like it’s important that people who are not that close to any of the actual code also understand what it means to run a service 24/7. If you’ve never experienced that pain, then you actually don’t really know what you’re talking about. And so, I think everybody in our leadership team, actually, is an incident manager, you know, maybe every eight to ten weeks or so. And therefore, when there’s an incident, then I’m actually also up in the middle of the night, if necessary. But then also, it shows us how good our tools are because, if there’s an incident, we try to mitigate it very quickly, and often it happens very, very quickly as well.
Host: Right.
Johannes Gehrke: I think it just speaks to, sort of, how we work as a team, and that, basically, everybody has to pitch in here. I think it’s important to really understand what our customers experience…
Host: Right.
Johannes Gehrke: …because otherwise you don’t develop that kind of customer empathy.
Host: Right.
Johannes Gehrke: And so therefore, as soon as we hear something like this, we try to respond extremely quickly and basically mitigate the incident as quickly as possible.
Host: So, you are actually the first guest on the podcast that says, “I’m literally up at night!” I want to get just a little personal with you, because you have a funny story about having come to Microsoft twice. Tell us how you got here the first time, what happened in between, and how you ended up where you are now.
Johannes Gehrke: So, I joined a startup in 2005 because two of my colleagues at Cornell were already with the startup. The startup was called FAST Search & Transfer. It was a Norwegian business intelligence search company, and it happened that the CTO, Bjorn Olstad, who is now a good friend of mine, came to Cornell to give a talk and then we had a half-hour meeting and we ended up talking for, like, two or three hours. At that point in time, I didn’t know anything about search, but I learned that there are lots of interesting database and distributed systems problems out there about which I actually had some interesting thoughts. Then I became an advisor to the company, and then I had a sabbatical in 2007 and 2008, where I then worked for FAST from Germany, and then actually during 2008, FAST was acquired by Microsoft. So that’s when I actually got my blue badge for the first time! But then I actually went back to Cornell, because at that point in time I couldn’t think about leaving academia. Then, around 2012, I had this other idea, together with people from FAST where we were then building Delve and the Office Graph. And so, I thought, well you know, I really want to build this, and Microsoft is the place to build it because it has all the data, so I negotiated a year of leave, and so, I joined Microsoft again during that time. I had a fantastic time at Cornell, but there were so many more things to learn here at this point in time…
Host: Right.
Johannes Gehrke: …that I thought I’ll stay here. And so far, it’s been a fantastic ride here. I mean, I had a really fantastic time at Cornell and at FAST, but it’s also really great here to be at Microsoft.
Host: Right. And you’re Microsoft proper, not Microsoft Research.
Johannes Gehrke: Correct. So, I’m in a product group. I’m actually in this Experiences and Devices division.
Host: Right. The mouthful we talked about at the beginning. Which is interesting too, because I don’t think there are all that many PhDs floating around in the product groups, or are there more than we know?
Johannes Gehrke: Actually, there are more than you would think. So, my team is full of PhDs. These are basically machine learning experts. Then, even throughout the product teams, we have many PhDs. For example, especially in our audio-video stack we have PhDs. We have PhDs in our distributed systems group that builds a chat service. So, you wouldn’t imagine. I mean, Microsoft is actually full of smart people.
Host: Well, we know that!
Johannes Gehrke: And I’m learning from them every day, so it’s…
Host: It’s just crawling with PhDs! All right. Well, my newest fun question for my guests is, tell us one thing we might not know about you if we did a web search. You know, the standard things would show up, but this might not. An interesting characteristic, life event, personal quirk… side quest – I don’t know! – um, that may have shaped, informed, or influenced your career as a researcher. Do you have anything?
Johannes Gehrke: So, when I first came to the U.S., I came in 1993, and I came to UT Austin specifically because I wanted to work with Avi Silberschatz, who was actually my final advisor’s advisor. So, sort of, my grandfather advisor! But actually, that year he left for Bell Labs. So, I was actually there at UT Austin and there was no database person there. But I found this other really great advisor, Greg Plaxton, and he was a theoretician and he was actually working on algorithms. So, for two years I actually worked with him on algorithms and I learned so much from him, but I also learned that I’m not a really great algorithms researcher. So, after two years, I actually then switched to Wisconsin where I started to work with Raghu Ramakrishnan. But I think what these two years taught me is that there’s great value in writing things down precisely. There’s great value in thinking formally about problems. And I think UT Austin, in general, the training there was very good at that. And I think it still influences my research in that, whenever there’s a good question, I don’t try to jump to the first answer, but I try to understand, you know, has this question been solved before, maybe even in the theory community, and how does it relate to sort of more formal problems that maybe other people have looked at?
Host: I’m actually sad that our time is coming to a close, Johannes. This has been more fun than I expected… That’s a terrible thing to say! [laughter]
Johannes Gehrke: I always set low expectations! That’s great! [laughter]
Host: Um… at the end of every podcast, I give my guests a chance to say anything they want to our listeners. And often, it’s in the form of advice or encouragement. Sometimes it’s, you know, cautionary tales, things I wish I would’ve known… something profound, unrelated, but interesting. You get the last word. What do you say?
Johannes Gehrke: So, maybe two or three things. The first one is, I’m a big believer in habits, similar to how our software teaches you workflows. I think in your personal life as well, I’m a really big believer in that you should have certain workflows or habits. And then these habits determine who you are and what you do, but also what you become. And so therefore, just by setting the right kind of expectations, starting with very small mini-habits and then growing them over time, you can change who you are significantly. Second of all, I think you should put people at the center of your life. You know, wherever you are, what you want is to work with the right people, the smartest people, the best people. And it’s the people that matter in the end, and it’s just great to have these kinds of relationships, and also the kind of guidance, help, and advice from other people who have been with you throughout your whole career. And the last thing, you know, coming back to this, sort of, career notion, I think, you have to be proactive about your career. Nobody’s going to look out for your career except yourself. Other people, they will give you advice, but also only if you ask them.
Host: Right.
Johannes Gehrke: So, there are lots of benevolent people out there, they’ll help you. But without you being proactive about learning and contributing, you’re not going to grow. And, you know, especially, then, also once you reach a certain level and maybe seniority in the community, it’s also up to you to mentor and give back to the next generation.
Host: Johannes Gehrke, thank you for joining us today.
Johannes Gehrke: Thank you very much, Gretchen.
(music plays)
To learn more about Dr. Johannes Gehrke, and how the Intelligent Communications and Conversations Cloud is helping us “git ’er done,” visit Microsoft.com/research.