Podcast: A Random Walk from Complexity Theory to Machine Learning. With Dr. Neeraj Kayal and Dr. Ravishankar Krishnaswamy
Episode 012 | May 30, 2022
Sridhar Vedantham: Welcome to the MSR India podcast. In this podcast, Ravishankar Krishnaswamy, a researcher at the MSR India lab, speaks to Neeraj Kayal. Neeraj is also a researcher at MSR India and works on problems related to or at the intersection of Computational Complexity and Algebra, Number Theory and Geometry. He has received multiple recognitions through his career, including the Distinguished Alumnus award from IIT Kanpur, the Gödel prize (opens in new tab) and the Fulkerson Prize (opens in new tab). Neeraj received the Young Scientist Award from the Indian National Science Academy (opens in new tab) (INSA) in 2012 and the Infosys Prize in Mathematical Sciences in 2021. Ravi talks to Neeraj about how he became interested in this area of computer science and his journey till now.
Neeraj Kayal: It’s just a matter of time before we figure out how computers can themselves learn like humans do. Just human babies, they have an amazing ability to learn by observing things around them. And currently, despite all the progress, computers don’t have that much ability. But I just think it’s a matter of time before we figure that out, some sort of general artificial intelligence.
For more information about the Microsoft Research India click here.
Related
- Microsoft Research India Podcast: More podcasts from MSR India (opens in new tab)
- iTunes (opens in new tab): Subscribe and listen to new podcasts on iTunes
- Android (opens in new tab)
- RSS Feed (opens in new tab)
- Spotify (opens in new tab)
- Google Podcasts (opens in new tab)
- Email (opens in new tab)
Transcript
[Music]
Ravi Krishnaswamy: Hi Neeraj, how are you doing? It’s great to see you after two years of working from home.
Neeraj Kayal: Hi Ravi, yeah thank you.
Thank you for having me here and it’s great to be back with all the colleagues in office.
Ravi Krishnaswamy: First of all, congratulations on the Infosys prize and it’s an amazing achievement.
And it’s a great privilege for all of us to have you as a colleague here. So, congratulations on that.
Neeraj Kayal: Thank you.
Ravi Krishnaswamy: Yeah, so maybe we can get started on the podcast. So, you work in complexity theory, which is I guess one extreme of, I mean, it’s very theoretical end of the spectrum in computer science almost bordering mathematics. So hopefully by the end of this podcast we can, uh, I mean, convince the audience that there’s more to it than intellectual curiosity. Before that right, let me ask you about how you got into theoretical computer science and the kind of problems that you work on. So, could you maybe tell us a bit about your background and how you got interested into this subject?
Neeraj Kayal: Yeah, so in high school I was doing well in maths in general and I also wrote some computer programs to play some board games, like a generalized version of Tic Tac Toe where you have a bigger board, say 20 by 20, and you try to place five things in the row, column, or diagonal continuously and then I started thinking about how could a computer learn to play or improve itself in such a game? So, I tried some things and didn’t get very far with that, but at that time I was pretty convinced that one day computers will be able to really learn like humans do. I didn’t see how that will happen, but I was sure of it and I just wanted to be in computer science to eventually work on such things. But in college in the second year of my undergrad, I enrolled for a course in cryptography taught by Manindra Agrawal at IIT Kanpur and then the course started off with some initial things which are like fairly predictable that something called symmetric key cryptosystems where, essentially it says that let’s say we two want to have a private conversation, but anyone else can listen to us. So how do we have a private conversation? Well, if we knew a language, a secret language which no one else did, then we could easily just converse in that language, and no one will understand this. And so, this is made a little more formal in this symmetric key cryptosystem. And then, one day, Manindra ended one of the lectures with the following problem: but now suppose we did not know a secret language. Then we just know English, and everyone knows English and then how do we talk privately when everyone can hear us? I thought about it for a few days. It seemed completely impossible. And then Manindra told us about these wonderful cryptosystems, called the Diffie Hellman cryptosystem and the RSA cryptosystem where they achieved this and it was very surprising. And the key thing that these cryptosystems use is something that lies at the heart of computer science, a big mystery still even to this day at the heart of computer science. There are these problems which we believe are hard for computers to solve in the following sense, that even if a computer takes a very long amount of time, if we give it a fairly long amount of time, a reasonable amount of time it cannot solve it. But if we give it time like till the end of the universe, it can in principle solve such problems. So that got me interested into which problems are hard and can we prove they are actually hard or not? And to this day, we don’t know that.
Ravi Krishnaswamy: So, I’m guessing that you’re talking about the factoring problem, right?
Neeraj Kayal: Yes, factoring is one of the big ones here. And the RSA cryptosystem uses factoring.
Ravi Krishnaswamy: So, it’s actually very interesting, right? You started out by trying to show that some of these problems are very, very hard, but I think, looking back, your first research paper, which happens to be a breakthrough work in itself, is in showing that a certain problem is actually easier to solve. Then we had originally thought right so, it is this seminal work on showing that primality testing can be solved in deterministic polynomial time. I mean, that’s an amazing feat and you had worked on this paper with your collaborators as an undergrad, right?
Neeraj Kayal: Yes.
Ravi Krishnaswamy: Yeah, that’s an incredible achievement. So maybe to motivate others who are in undergrad and who have an interest and inclination in such topics, could you maybe share us some story on how you got working in that problem and what sort of led you to this spark that eventually got you to this breakthrough result?
Neeraj Kayal: So, my advisor Manindra, who also was the professor who taught us cryptography – he had been working on this problem for a long time and there were already algorithms that existed which are very good in practice- very very fast in practice, but they had this small chance that they might give the wrong answer. The chance was so small that practically it did not matter, but still as a mathematical challenge, it remained whether we could remove that small chance of error, and that’s what the problem was about. So, Manindra had this approach and he had worked with other students also- some of our seniors- on it, and in that course, he came up with a conjecture. And then when we joined, me and my colleague Nitin, we joined this project , we came across this conjecture and my first reaction was that the conjecture is false. So, I tried to write a program which would find a counterexample and I thought we would be done in a few days-Just find that counterexample and the project would be over. So, I wrote a program- it will train for some time, didn’t find a counterexample, so I decided to parallelize it. A huge number of machines in the computer center in IIT Kanpur started looking for that counterexample. And then to my surprise, we still couldn’t find the counterexample. So there seemed to be something to it. Something seemed to be happening there which we didn’t understand, and in trying to sort of prove that conjecture, we managed to prove some sort of weaker statement which sufficed for obtaining the polynomial time algorithm to test if a number is prime or not. But it was not the original conjecture itself. Many days after this result came out, we met a mathematician called Hendrik Lenstra who had worked on primality testing, and we told him about this conjecture. And after a few days he got back to us and it showed that if you assume some number theoretic conjecture is true, which we really really believe, it’s true.
Ravi Krishnaswamy: Ok, I see. So, the original conjecture, which you hoped to prove true is false, but the weaker conjecture was actually true, you proved it to be true, and that was enough for your eventual application.
Neeraj Kayal: Yes, so in some sense we are very lucky that in trying to prove something false we managed to prove something useful.
Ravi Krishnaswamy: Yeah, I mean it’s a fascinating story, right? All the experiments that you ran pointed you towards proving it, and then you actually went and proved it. If you had found, I imagine what would have happened if you found a counterexample at that time, right?
Neeraj Kayal: So yeah, Hendrix proof was very interesting. He showed that modulo this number theory conjecture a counterexample existed. But it would have to be very, very large and that’s why you couldn’t find it. So, he explained it beautifully.
Ravi Krishnaswamy: Yeah, thanks for that story Neeraj. So. I guess from then on you’ve been working in complexity theory, right?
Neeraj Kayal: That’s right, yeah.
Ravi Krishnaswamy: So, for me at least, the Holy Grail in complexity theory that I’ve often encountered or seen is the P versus NP problem, which many of us might know. But you’ve been working on a very equally important, but a very close cousin of the problem, which is called the VP versus VNP problem, right? So, I’m going to take a stab at explaining what I understand of the problem. So, correct me whenever I’m wrong. So, you are interested in trying to understand the complexity of expressing polynomials using small circuits. So, for example, if you have a polynomial of the form X ^2 + Y ^2 + 2 XY, you could represent it as a small circuit which has a few addition operations and a few multiplication operations like you could express it as X ^2 + Y ^2 + 2 XY itself. Or you could express it as (X + Y)^2. Which may have a smaller representation in terms of a circuit. So, you have been working on trying to identify which polynomials have small representations and which polynomials are natural but still don’t have small representations.
Neeraj Kayal: That’s right.
Ravi Krishnaswamy: Is that a reasonable approximation of the problem you’re thinking about?
Neeraj Kayal: Yes, that’s right. So, another way to put the same thing is what is the power of computation when you do additions, multiplications, subtractions, all these arithmetic operations. You could include division, square roots also.
Ravi Krishnaswamy: So, I have seen this VP class and it makes a lot of sense to me. It’s the set of all the polynomials that can be captured by small sized circuits with the plus I mean addition and multiplication gates. I’ve also seen the VNP class, which seems to me at least to be a bit mysterious, right? So, these are all the polynomials whose coefficients of the individual monomials can be computed efficiently. Is that a reasonable definition, at least? Is my understanding correct?
Neeraj Kayal: Yeah, that’s the technical definition of this class, but there’s another natural sort of intuition why we want to look at it, and the intuition is that it relates to counting the number of solutions to a problem, and also therefore to computing probabilities of various things happening.
Ravi Krishnaswamy: I see. Ok, so that gives me a lot more understanding. I guess when you’re able to estimate probabilities, you could also do sampling over those objects.
Neeraj Kayal: Yes exactly.
Ravi Krishnaswamy: Yeah, that’s a very nice connection. I did not know about this. Thanks for that. So, you have been working, you have an agenda on trying to show some sort of a separation between the two classes, right, VP and VNP, by constructing these low depth circuits. So, you’re able to show that all polynomials in VP have admit the low depth representation and your hope in this agenda is to find one polynomial in VNP which does not have a low depth representation, right?
Neeraj Kayal: That’s right.
Ravi Krishnaswamy: So, how far are you in this agenda and do you think we have all the tools needed to actually achieve success through this sort of a method?
Neeraj Kayal: Yeah, so just historically for converting a circuit or a program into a low depth program, this was done earlier. Most of this work was done by other people. So, we haven’t contributed much in that direction. We have been trying to prove certain polynomials don’t have small depth and small sized arithmetic circuits. So, it’s not clear to us whether the existing techniques are good enough to prove this or not. And like on Mondays, Wednesdays, and Fridays, I think they are capable maybe, and on the other days I think maybe not. And this is what researchers generally deal with. Especially in these areas where you don’t know whether your tools are good enough or not. And very recently, just last year, there was a big breakthrough by trio of complexity theorists who showed somewhat good lower bounds for all constant depth arithmetic formulas or circuits. And what was surprising also about this result is that, they use in a very clever way, techniques that were already known.
Ravi Krishnaswamy: So, they would have probably shown it on a Monday or Wednesday or Friday.
Neeraj Kayal: Yes, yes. [Laughs]
Ravi Krishnaswamy: OK, that’s very interesting. So, you still don’t know whether this will lead to success or not through this route.
Neeraj Kayal: Yes, yeah, we still don’t know that.
Ravi Krishnaswamy: Are there other people approaching this problem through other techniques?
Neeraj Kayal: So, there’s a program called the Geometric Complexity Theory program initiated independently by other people who basically try to understand symmetries. Because implicit in this question is a whole bunch of symmetry, then they try to exploit that. And there’s a field of mathematics called group theory and representation theory, which is all about understanding symmetries of objects. That area is beautiful, really beautiful, and a lot of advancement has been made there. So, people have been trying to also attack this problem through using those tools.
Ravi Krishnaswamy: Yeah, that’s very nice, I think. So basically, you’re saying a lot of like diverse techniques from math and computer science are at play here and trying to help you on your progress.
Neeraj Kayal: That’s right.
Ravi Krishnaswamy: I see. I mean, it’s very beautiful. I find it fascinating and beautiful that a lot of these different diverse techniques from mathematics and computer science come into play into establishing these lower bounds. And what’s more fascinating to me is that they are all not just from an intellectual curiosity standpoint. They seem to be powering a lot of things that we take for granted, right, right from, like, as you said, messaging each other through social networks or whatever it is. They seem to be like at the foundation- the inherent hardness of certain problems seem to be at the foundation of a lot of things that we take for granted.
Neeraj Kayal: Yeah, that’s right, Ravi. So, for example, I do transactions using my mobile phone and anyone who is within a reasonable distance of my mobile phone can read all the signals that my phone is sending. So, they can see all the communication that I’m having with the bank. And the fact that despite that they are not able to infer my banking passwords relies on the fact that certain problems are very inherently hard to solve and that’s what we are trying to prove.
Ravi Krishnaswamy: OK, so that’s very interesting Neeraj. And in the last part of this podcast, I want to flip the topic around a little bit. So, you’ve been working a lot on showing lower bounds, and in lower bounds in arithmetic complexity. But lately in the last couple of years you have also been using those insights into showing some very nice algorithms for some learning problems. I find that also very cool, so maybe you can talk a little bit about that.
Neeraj Kayal: Yeah, so the algorithms that we are trying to devise are trying to solve the following problem. More general version of it is the following. Given a function or a polynomial, what’s the smallest number of operations that you need to do to be able to compute that function or polynomial? So, for Boolean functions this has a very long history. That essentially is like designing chips, and you can imagine it was naturally very useful to think about. But more recently, it turns out a lot of works have found another very surprising connection because of which this problem specifically for polynomials has also become very interesting. And the connection is this. Suppose you have some very big data set. For now, think of this data set as consisting of a bunch of points in high dimensional space. For example, you can think of images as a point, every image as a point in the high dimensional space. Now it turns out that you can take statistics of this data. So, for example, you can take what’s the average value of the first coordinate, what’s the average value of the second coordinate? Or what’s the average value of the product of the first two coordinates in this data set and so on. So, you can take some of these statistics, encode them as the coefficients of a polynomial. And here’s the interesting part. When the data has some very nice structure, then this polynomial tends to have a small circuit.
Ravi Krishnaswamy: I see.
Neeraj Kayal: And so, when you want to understand the structure of data, so this general area is called unsupervised learning. Turns out that it’s useful to find small circuits for polynomials. So, this is the computational problem that we are looking at: given a polynomial, what’s the smallest number of operations, or what’s the smallest circuit representing this polynomial.
Ravi Krishnaswamy: So, if you’re able to find the smallest circuit representing this, then from that you will be able to infer the underlying distribution or the structure of the underlying data.
Neeraj Kayal: Yes, yes, that’s right. So, this is one connection, and it also turns out that the lower bounds that we are proving, showing that certain things are very hard to compute are also useful for now devising algorithms to find the circuits of polynomials which do have small circuits and maybe let me give you some very rough sense of how that comes about, and I find this a bit fascinating. Here’s how the lower bounds proofs work. So, underlying all those lower bounds for the various subclasses of circuits that we do have is a collection of linear maps and now it turns out that when you are given a polynomial which has a small circuit, using this polynomial and the collection of linear maps, which go into the lower bound proof you can form another big linear map, such that, very roughly, the eigen spaces of this new linear map correspond to the smallest circuit for F.
Ravi Krishnaswamy: I see.
Neeraj Kayal: And this was the connection that we discovered some time ago, which helped us find small circuits.
Ravi Krishnaswamy: So, you find small circuits by computing the eigen space of the of the map.
Neeraj Kayal: Yes, of this other linear map. That’s right Ravi.
Ravi Krishnaswamy: I see that’s very nice. Ok, so I think we covered a lot of the topics that I wanted to cover, so maybe I’ll end with two philosophical questions. So, one is you began the podcast by talking about how as a kid, you thought computers or machines could be able to do everything that human intelligence can do. So, I think it’s a vague question, but what’s your take on that now? And two is what advice would you give for budding theoreticians, whether they’re in school or college or grad school? What sort of advice would you give them?
Neeraj Kayal: So, for the first question, Ravi, I know a lot of other people also share this feeling, that it’s just a matter of time before we figure out how computers can themselves learn like humans do. Just human babies, they have an amazing ability to learn by observing things around them. And currently, despite all the progress, computers don’t have that much ability. But I just think it’s a matter of time before we figure that out, some sort of general artificial intelligence. To your second question, Ravi, I don’t have much to offer other than perhaps a banal comment that anyone looking to work in this area should really enjoy thinking about these kinds of problems. They tend to be rather abstract, sometimes the applications are not always apparent, but if you enjoy thinking about them, I’m sure you’ll do well.
Ravi Krishnaswamy: That’s great, Neeraj. It’s been a pleasure chatting with you. Thanks a lot for your time and hope you had fun.
Neeraj Kayal: Yeah, thanks Ravi. Thanks a lot for having me.
[Music]