Members of the research community at Microsoft work continuously to advance their respective fields. Abstracts brings its audience to the cutting edge with them through short, compelling conversations about new and noteworthy achievements.
In this episode, Senior Researcher Chang Liu joins host Gretchen Huizinga to discuss “Overcoming the barrier of orbital-free density functional theory for molecular systems using deep learning.” In the paper, Liu and his coauthors present M-OFDFT, a variation of orbital-free density functional theory (OFDFT). M-OFDFT leverages deep learning to help identify molecular properties in a way that minimizes the tradeoff between accuracy and efficiency, work with the potential to benefit areas such as drug discovery and materials discovery.
Transcript
[MUSIC PLAYS]
GRETCHEN HUIZINGA: Welcome to Abstracts, a Microsoft Research Podcast that puts the spotlight on world-class research in brief. I’m Dr. Gretchen Huizinga. In this series, members of the research community at Microsoft give us a quick snapshot—or a podcast abstract—of their new and noteworthy papers.
[MUSIC FADES]
Today, I’m talking to Dr. Chang Liu, a senior researcher from Microsoft Research AI4Science. Dr. Liu is coauthor of a paper called “Overcoming the barrier of orbital-free density functional theory for molecular systems using deep learning.” Chang Liu, thanks for joining us on Abstracts!
CHANG LIU: Thank you. Thank you for this opportunity to share our work.
HUIZINGA: So in a few sentences, tell us about the issue or problem your paper addresses and why people should care about this research.
LIU: Sure. Since this is an AI4Science work, let’s start from this perspective. In science, people always want to understand the properties of matter, such as why some substances can cure disease and why some materials are heavy or conductive. For a very long period of time, these properties could only be studied by observation and experiments, and the outcome would just look like magic to us. If we can understand the underlying mechanism and calculate these properties on our computer, then we can do the magic ourselves, and it can, hence, accelerate industries like medicine development and material discovery. Our work aims to develop a method that handles the most fundamental part of such property calculation with better accuracy and efficiency. If you zoom into the problem, properties of matter are determined by the properties of the molecules that constitute the matter. For example, the energy of a molecule is an important property. It determines which structure the molecule mostly takes, and the structure indicates whether it can bind to a disease-related biomolecule. You may know that molecules consist of atoms, and atoms consist of nuclei and electrons, so properties of a molecule are the result of the interaction among the nuclei and the electrons in the molecule. The nuclei can be treated as classical particles, but electrons exhibit significant quantum effects. You can imagine it like this: electrons move so fast that they appear like a cloud or mist spreading over space. To calculate the properties of the molecule, you need to first solve the electronic structure, that is, how the electrons spread over this space. This is governed by an equation that is hard to solve. The target of our research is hence to develop a method that solves the electronic structure more accurately and more efficiently so that properties of molecules can be calculated at a higher level of accuracy and efficiency, which leads to better ways to solve the industrial problems.
HUIZINGA: Well, most research owes a debt to work that went before but also moves the science forward. So how does your approach build on and/or differ from related research in this field?
LIU: Yes, there are indeed quite a few methods that can solve the electronic structure, but they show a harsh tradeoff between accuracy and efficiency. Currently, density functional theory, often called DFT, achieves a preferred balance for most cases and is perhaps the most popular choice. But DFT still requires a considerable cost for large molecular systems. It has a cubic cost scaling. We hope to develop a method that scales with a milder cost increase. We noted an alternative type of method called orbital-free DFT, or OFDFT, which has a lower order of cost scaling. But existing OFDFT methods cannot achieve satisfactory accuracy on molecules. So our work leverages deep learning to achieve an accurate OFDFT method. The method can achieve the same level of accuracy as conventional DFT; meanwhile, it inherits the cost scaling of OFDFT and hence is more efficient than conventional DFT.
HUIZINGA: OK, so we’re moving acronyms from DFT to OFDFT, and you’ve got an acronym that goes M-OFDFT. What does that stand for?
LIU: The M stands for molecules, since it is especially hard for classical or existing OFDFT to achieve good accuracy on molecules. So our development tackles that challenge.
HUIZINGA: Great. And I’m eager to hear about your methodology and your findings. So let’s go there. Tell us a bit about how you conducted this research and what your methodology was.
LIU: Yeah. Regarding methodology, let me delve a bit into some details. We follow the formulation of OFDFT, which solves the electronic structure by optimizing the electron density, where the optimization objective is to minimize the electronic energy. The challenge in OFDFT is that part of the electronic energy, specifically the kinetic energy, is hard to calculate accurately, especially for molecular systems. Existing computation formulas are based on approximate physical models, but the approximation accuracy is not satisfactory. Our method uses a deep learning model to calculate the kinetic energy. We train the model on labeled data, and thanks to its powerful learning ability, the model can give a more accurate result. This is the general idea, but there are many technical challenges. For example, since the model is used as an optimization objective, it needs to capture the overall landscape of the function. The model cannot recover the landscape if only one labeled data point is provided. For this, we made a theoretical analysis of the data generation method and found a way to generate multiple labeled data points for each molecular structure. Moreover, we can also calculate a gradient label for each data point, which provides the slope information on the landscape. Another challenge is that the kinetic energy has a strong non-local effect, meaning that the model needs to account for the interaction between any pair of points in space. This incurs a significant cost if using the conventional way to represent density, that is, using a grid. For this challenge, we choose to expand the density function on a set of basis functions and use the expansion coefficients to represent the density. The benefit is that it greatly reduces the representation dimension, which in turn reduces the cost of the non-local calculation. These two examples are also the differences from other deep learning OFDFT works. There are more technical designs, and you may check them in the paper.
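The formulation Liu describes can be written down compactly. The block below is a sketch in standard OFDFT notation, not a transcription of the paper's equations: the symbols $T_S$ (kinetic energy), $E_{\mathrm{H}}$, $E_{\mathrm{XC}}$, $E_{\mathrm{ext}}$ (the remaining energy terms), $N$ (number of electrons), $\omega_\mu$ (basis functions), and $p_\mu$ (expansion coefficients) are assumptions chosen for illustration.

```latex
\begin{align}
% Electronic energy as a functional of the electron density rho
E[\rho] &= T_S[\rho] + E_{\mathrm{H}}[\rho] + E_{\mathrm{XC}}[\rho] + E_{\mathrm{ext}}[\rho], \\
% OFDFT solves the electronic structure by minimizing the energy over densities
\rho^\star &= \arg\min_{\rho \ge 0} E[\rho]
  \quad \text{s.t.} \quad \int \rho(\mathbf{r})\,\mathrm{d}\mathbf{r} = N, \\
% Density represented by expansion coefficients instead of grid values
\rho(\mathbf{r}) &= \sum_{\mu} p_\mu\, \omega_\mu(\mathbf{r}).
\end{align}
```

In this notation, the deep learning model stands in for $T_S$ as a function of the coefficient vector $\mathbf{p}$ and the molecular structure, and a gradient label corresponds to the derivative of the kinetic energy with respect to $\mathbf{p}$. Optimizing over $\mathbf{p}$ rather than over grid values is what keeps the dimension of the representation small.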
HUIZINGA: So talk about your findings. After you completed and analyzed what you did, what were your major takeaways or findings?
LIU: Yeah, let’s dive into the details, into the empirical findings. We find that our deep learning OFDFT, abbreviated as M-OFDFT, is much more accurate than existing OFDFT methods, with tens to hundreds of times lower error, and achieves the same level of accuracy as conventional DFT.
HUIZINGA: Wow …
LIU: On the other hand, the speed is indeed improved over conventional DFT. For example, on a protein molecule with more than 700 atoms, our method achieves a nearly 30 times speedup. The empirical cost scaling is lower than quadratic and is one order less than that of conventional DFT. So the speed advantage would be more significant on larger molecules. I’d also like to mention an interesting observation. Since our method is based on deep learning, a natural question is, how accurate would the method be if applied to much larger molecules than those used for training the deep learning model? This is the generalization challenge and is one of the major challenges of deep learning methods for molecular science applications. We investigated this question in our method and found that the error increases more slowly than linearly with molecular size. Although this is not perfect since the error is still increasing, it is better than using the same model to predict the property directly, which shows an error that increases faster than linearly. This shows, to some extent, the benefits of leveraging the OFDFT framework when using a deep learning method to solve molecular tasks.
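An "empirical cost scaling" of this kind is commonly estimated as the slope of a straight-line fit to running time against system size on a log-log scale. The sketch below illustrates that idea only. The timings are synthetic, the constants and the exponent 1.5 are arbitrary stand-ins for "lower than quadratic," and the function name `scaling_exponent` is hypothetical; none of the numbers come from the paper.

```python
import math

def scaling_exponent(sizes, times):
    """Least-squares slope of log(time) against log(size)."""
    xs = [math.log(n) for n in sizes]
    ys = [math.log(t) for t in times]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    num = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    den = sum((x - x_mean) ** 2 for x in xs)
    return num / den

# Synthetic timings (NOT measurements from the paper):
# one method costs ~n^3, the other ~n^1.5, with made-up prefactors.
sizes = [100, 200, 400, 800]
cubic_times = [1e-6 * n ** 3 for n in sizes]
subquadratic_times = [1e-4 * n ** 1.5 for n in sizes]

cubic_fit = scaling_exponent(sizes, cubic_times)
subquadratic_fit = scaling_exponent(sizes, subquadratic_times)

# The ratio of the two costs grows with size, which is why a lower
# scaling order gives a larger speedup on larger systems.
speedups = [c / s for c, s in zip(cubic_times, subquadratic_times)]

print(f"fitted exponents: {cubic_fit:.2f} vs {subquadratic_fit:.2f}")
print("speedup grows with size:", speedups == sorted(speedups))
```

With real measurements the fitted exponent would be noisy and would depend on the range of sizes tested, so a fit like this is a summary of observed behavior rather than a guarantee about larger systems.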
HUIZINGA: Well, let’s talk about real-world impact for a second. You’ve got this research going on in the lab, so to speak. How does it impact real-life situations? Who does this work help the most and how?
LIU: Since our method achieves the same level of accuracy as conventional DFT but runs faster, it could accelerate molecular property calculation and molecular dynamics simulation, especially for large molecules; hence, it has the potential to accelerate solving problems such as medicine development and material discovery. Our method also shows that AI techniques can create new opportunities for other electronic structure formulations, which could inspire more methods to break the long-standing tradeoff between accuracy and efficiency in this field.
HUIZINGA: So if there was one thing you wanted our listeners to take away, just one little nugget from your research, what would that be?
LIU: If there is only one thing, it would be that we have developed a method that solves for molecular properties more accurately and efficiently than the current portfolio of available methods.
HUIZINGA: So finally, Chang, what are the big unanswered questions and unsolved problems that remain in this field, and what’s next on your research agenda?
LIU: Yeah, sure. There indeed remain problems and challenges. One remaining challenge mentioned above is the generalization to molecules much larger than those in training. Although the OFDFT method is better than directly predicting properties, there is still room to improve. One possibility is to draw on the success of large language models by including more abundant and more diverse data in training and using a large model to digest all the data. This can be costly, but it may give us a surprise. Another way we may consider is to incorporate mathematical structures of the learning target functional into the model, such as convexity, lower and upper bounds, and some invariance. Such structures could regularize the model when it is applied to larger systems than it has seen during training. We have actually incorporated some such structures into the model, for example, the geometric invariance, but other mathematical properties are nontrivial to incorporate. We discuss this in the paper, and we’ll keep working in that direction in the future. The ultimate goal underlying this technical development is to build a computational method that is fast and accurate universally so that we can simulate the molecular world of any kind.
[MUSIC PLAYS]
HUIZINGA: Well, Chang Liu, thanks for joining us today, and to our listeners, thanks for tuning in. If you want to read this paper, you can find a link at aka.ms/abstracts. You can also read it on arXiv, or you can check out the March 2024 issue of Nature Computational Science. See you next time on Abstracts!
[MUSIC FADES]