Presented by Jiang Bian at Microsoft Research Forum, June 2024
There is “a substantial demand for advanced generative AI tailored to enhance core business operations. However, in our dialogues with strategic partners, we have identified crucial gaps in current generative AI capabilities versus the specific needs of industry applications. … Our research is crucial in addressing these limitations and amplifying the underappreciated potential of generative AI.”
– Jiang Bian, Senior Principal Research Manager, Microsoft Research Asia
Transcript: Lightning Talk
Driving Industry Evolution: Exploring the Impact of Generative AI on Sector Transformation
Jiang Bian, Senior Principal Research Manager, Microsoft Research Asia
Jiang Bian discusses how generative AI transforms industries by bridging gaps between AI capabilities and sector needs. He will showcase domain-specific foundation models and versatile AI agents, setting new industry standards.
Microsoft Research Forum, June 4, 2024
JIANG BIAN: Hello, everyone. My name is Jiang. Today, I’m excited to discuss the work we are undertaking at Microsoft Research Asia focusing on leveraging generative AI to drive transformation and evolution across various industries.
Our efforts are inspired by our unique co-innovation initiative with world-renowned partners from a few core sectors, including finance, manufacturing, energy, and so on. These collaborations have highlighted a substantial demand for advanced generative AI tailored to enhance core business operations. However, in our dialogues with strategic partners, we have identified crucial gaps in current generative AI capabilities versus the specific needs of industry applications. These include a too-narrow focus on human-like AI but not critical industry applications, limitations in processing complex and noisy data, and concerns about reliability in complex decision-making scenarios. Our research is crucial in addressing these limitations and amplifying the underappreciated potential of generative AI in high-value sectors. We are focusing on two main approaches: developing domain-specific foundation models that enhance analytical and predictive capabilities or enable interactive and controllable simulations and creating a versatile foundation-model-as-agent system for diverse industry decision-making tasks.
Our first project is transforming the way industrial data is analyzed and utilized. Facing diverse data formats like tabular, time series, and graph from various sectors, we are employing Generative Data Learning to enhance the large language model with strong ability to interpret and process diverse data formats by transforming them into a unified, instruction-oriented language. With training over this diverse sector data for [numerous] tasks, this approach enables more intuitive data analytics and predictions across various industries. Initial experiments on a typical classification and regression task over tabular data have shown that even a relatively small-scale model enhanced by our Generative Data Learning approach can outperform both general large language models and traditional models like tree ensembles, particularly in few-shot scenarios. This suggests the significant potential for a single-model solution with no extensive model training or fine-tuning in exploring industrial data intelligence maybe with only few-shot examples.
Our second project is exploring building foundation models over domain-specific data, and we focus on financial markets given its fundamental data is orders. We have developed a dual-level foundation model called Large Market Model that uses transformers on both the order sequence to model the market dynamics and the order-batch sequence to align the market trend with control signals. The performance of financial market simulations based on this Large Market Model has been very promising. They have excelled in forecasting market trends, simulating extreme scenarios for stress tests, and detecting market manipulations efficiently.
Our third project focuses on creating a decision-making agent through the knowledge-augmented generation and adaptive retrieval. This agent is essentially a trainable model that generates and extracts domain-specific knowledge, dynamically updating itself and retrieving most appropriate knowledge to handle changing environment. This adaptive approach is particularly useful in many industry control applications, such as HVAC control with the goal of optimizing energy use while maintaining comfort. Deploying this agent into this scenario has shown it can outperform traditional reinforcement learning methods, saving significantly more energy, especially in unknown environments or when facing perturbations.
In summary, at MSR Asia, we are committed to advancing the development of generative AI to catalyze industry evolution through innovative research and partnership. We will soon be sharing more details about these projects through upcoming papers and open-source initiatives. We invite you, especially our industry partners, to stay tuned and join us in driving these transformative efforts forward. Thank you.
“Foundation models, also known as large language models, possess immense potential across a variety of industries. Yet, some companies and organizations limit their use of these expansive AI models to niche areas, including intelligent customer service, chatbots, or text and image generation. In reality, these foundation models demonstrate robust abilities in reasoning, content creation, and generalization, making them exceptionally fit for high-stakes business tasks. These tasks include making accurate predictions and forecasts, optimizing industrial control and complex decision-making, and conducting intelligent and interactive industrial simulations.”
— Jiang Bian, Senior Principal Research Manager, Microsoft Research Asia
As the development of large AI models, also known as foundation models, progresses, companies and organizations are becoming increasingly excited about their potential for enhancing productivity. However, a significant trend has been observed: many industry practitioners focus heavily on the human-like qualities of AI, such as conversational abilities, writing skills, creativity, and perceptual capabilities. In deploying these large AI models, there is a tendency to prioritize applications in intelligent customer service, chatbots, and other so-called “human-like” functions. Unfortunately, this emphasis may restrict our comprehension and use of these potent models, hindering our ability to fully unleash their capabilities within various industries.
This limitation is not without reason. Incorporating foundation models into practical, production-oriented scenarios is still in its infancy, with few mature and widespread examples to follow. Viewing AI as a “production tool” is akin to possessing a tool before fully understanding its potential applications. Furthermore, humanity has rarely, if ever, encountered such a versatile yet uncertain tool that is not designed for specific tasks.
Additionally, the complexity and variety inherent in different industries require foundation models that move beyond traditional perceptions. This necessitates synchronized innovation in models at the industry level, enabling them to fully exploit the capabilities of foundation models across diverse industrial landscapes and to better align with AI applications. Instead of limiting AI to a “chat robot” role, we should broaden our perspective. Transforming industries in the AI era involves rethinking current business processes and frameworks, leading to collaborative models that can effortlessly integrate humans and foundation models.
Unlocking the boundless potential of foundation models in industry
Foundation models are endowed with broad capabilities in data representation, knowledge comprehension, and reasoning, allowing them to adjust seamlessly across various domains and scenarios, and swiftly adapt to new environments. Concurrently, digital platforms across industries have evolved, amassing substantial amounts of industry-specific data. This rich repository of knowledge and information positions foundation models to integrate effortlessly into industrial settings.
In practical terms, the advanced reasoning abilities of foundation models provide users with a deeper understanding of data. By extracting valuable insights from large datasets and identifying patterns and correlations, these models deliver more effective recommendations and deeper insights. This benefit is especially vital in industrial contexts, where prediction, decision-making, and simulation play crucial roles.
One of the standout features of foundation models is their exceptional ability to generalize. Before their advent, each industry scenario required specific data to train bespoke AI models, limiting scalability and hindering the full commercial exploitation of AI. Foundation models, with their access to a global pool of knowledge, markedly improve generalization. As a result, industries are freed from the necessity of developing unique models for every situation, overcoming a major limitation of traditional AI solutions.
Moreover, foundation models can work in tandem with generative AI to increase the accuracy, realism, and interactivity of industrial simulations and intelligent modeling, facilitating the creation of digital twins. These simulations and models aim to mimic and test real-world scenarios, which often involve complex roles and intricate environments. Traditional AI models may simplify real-world complexities or miss crucial extreme events, compromising the fidelity and authenticity of simulations. In contrast, generative large AI models, steeped in domain-specific knowledge, establish accurate mappings between specific data dimensions and real-world occurrences. This method allows for simulations that closely mirror reality, significantly aiding industrial forecasting and decision-making processes while maintaining adherence to industry standards.
Navigating four key challenges in implementing foundation models for industry
In the industrial sector, tasks of paramount importance and commercial value include precise forecasting and control, efficient optimization of decisions, and complex duties associated with intelligent and interactive industrial simulations. These areas should be the primary focus for traditional industrial enterprises. Yet, when assessing existing foundation models like GPT and the actual needs within industrial domains, we uncover significant mismatches between the capabilities of these models and the real demands of industry. To bridge this gap and fully leverage their potential, several challenges must be addressed.
First, there is a notable absence of a universal framework capable of effectively extracting complex domain knowledge from diverse field data and using this knowledge to construct intelligent agents. Various domains contain rich and complex data, such as logistics companies dealing with customs information and cross-national policies, pharmaceutical industries with FDA drug review documents, and the legal industry with numerous regulations. Developing intelligent agents that are deeply rooted in domain knowledge calls for a more generalized framework. This framework should be proficient in extracting crucial domain knowledge, identifying hidden connections between data and knowledge, and managing this information efficiently.
Second, while foundation models are adept at generating textual content, their proficiency in processing and understanding structured data, like numerical or tabular information, is lacking. Industrial scenarios often involve structured data, such as health monitoring indicators, battery charge-discharge cycles, and financial transactions. Current large models are not specifically designed or optimized for processing such data, which complicates accurate prediction and classification tasks based on structured inputs.
Third, in practical applications, foundation models currently fall short in stability and reliability for decision-making. Critical industries like energy, logistics, finance, and healthcare require dependable decision-making for tasks such as optimizing logistics routes, controlling energy equipment, formulating investment strategies, and allocating medical resources. These tasks often involve numerous variables and constraints, especially under dynamic environmental changes. Foundation models have yet to fully adapt to these complex industrial tasks, making direct application challenging.
Lastly, there is a lack of insight into domain-specific foundational data, as well as methodologies and experience for developing domain-specific foundation models. Essential information in many specialized fields extends beyond mere text, incorporating unique data structures and semantic relationships. For example, transaction order information in the financial investment field or molecular structure details in the biopharmaceutical industry contain critical knowledge often embedded in such foundational data. A deeper, more nuanced analysis is required. Creating domain-specific foundation models grounded in this detailed understanding is crucial for effectively leveraging and unlocking the potential of data in these fields.
Constructing industry foundation models: harmonizing general knowledge and domain expertise
To expedite the adoption and application of foundation models in industry, we can concentrate on several pivotal areas.
First, we can harness rich and complex industrial domain data to construct a more versatile, efficient, and practical retrieval-augmented generation (RAG) framework. This framework is designed to adapt seamlessly to various vertical domains, extracting essential domain knowledge, uncovering hidden associations between data and knowledge, and effectively organizing and managing this wealth of information.
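As a rough illustration of the retrieval step in such a framework, the following Python sketch ranks a few domain snippets against a query and assembles a grounded prompt. The bag-of-words cosine scoring, the function names (`retrieve`, `build_rag_prompt`), and the sample documents are assumptions made for illustration only. They do not describe the framework discussed here, and a production system would more likely rely on learned embeddings and a vector index.

```python
import math
import re
from collections import Counter
from typing import List


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, documents: List[str], k: int = 2) -> List[str]:
    """Return the k documents most similar to the query (hypothetical scorer)."""
    q = Counter(tokenize(query))
    ranked = sorted(
        documents,
        key=lambda d: cosine(q, Counter(tokenize(d))),
        reverse=True,
    )
    return ranked[:k]


def build_rag_prompt(query: str, documents: List[str], k: int = 2) -> str:
    """Place the retrieved snippets ahead of the question as context."""
    context = "\n".join(f"- {d}" for d in retrieve(query, documents, k))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
```

The resulting prompt string would then be passed to whichever language model the application uses; that call is deliberately left out because it depends on the deployment.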
Second, by carefully considering critical numerical data and the corresponding structured dependencies prevalent in industrial scenarios, we can design foundation models specifically optimized for industrial applications. These models effectively integrate general knowledge with domain-specific expertise derived from temporal or tabular data, thereby enabling more effective solutions for tasks such as prediction and classification within the industry.
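To make the idea of feeding tabular records to a language model more concrete, the sketch below serializes rows into instruction-oriented text with a few labeled examples, in the spirit of the unified, instruction-oriented format mentioned in the talk. The "column is value" template, the column names, and the function names (`row_to_text`, `build_few_shot_prompt`) are hypothetical choices for illustration and are not the actual format used in the research.

```python
from typing import Mapping, Sequence, Tuple


def row_to_text(row: Mapping[str, object]) -> str:
    """Serialize one tabular record as 'column is value' clauses."""
    return "; ".join(f"{col} is {val}" for col, val in row.items())


def build_few_shot_prompt(
    task: str,
    examples: Sequence[Tuple[Mapping[str, object], str]],
    query: Mapping[str, object],
) -> str:
    """Combine an instruction, labeled example rows, and the row to predict."""
    lines = [f"Instruction: {task}"]
    for features, label in examples:
        lines.append(f"Input: {row_to_text(features)}")
        lines.append(f"Answer: {label}")
    # The final answer is left open for the model to complete.
    lines.append(f"Input: {row_to_text(query)}")
    lines.append("Answer:")
    return "\n".join(lines)
```

Because every table is reduced to the same textual template, one model can in principle be prompted on different datasets without a per-dataset architecture, which is the property the few-shot results in the talk point toward.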
Another avenue we are actively exploring involves harnessing the potent generation, generalization, and transfer capabilities inherent in foundation models to elevate the quality and efficiency of industrial decision-making. We are pursuing two distinct paths: first, treating foundation models as intelligent agents; and second, leveraging foundation models to assist reinforcement-learning agents.
Treating foundation models as intelligent agents: By leveraging the pre-existing knowledge encoded in foundation models and integrating offline reinforcement learning, we can continuously acquire new domain-specific insights and fine-tune the models. This evolutionary process enhances the optimization and decision-making capabilities of foundation models, enabling them to prioritize industry-specific tasks.
Foundation models optimized for specific tasks can play a pivotal role across various industrial contexts. In formula racing, for example, these foundation models can optimize tire-maintenance strategies. By considering tire wear and repair costs, they determine the optimal pit stop timing, thereby shortening race duration and improving car rankings. In chemical manufacturing, leveraging these foundation models can significantly enhance efficiency in product storage and pipeline coordination during production processes, ultimately boosting overall production-execution efficiency. Furthermore, due to their generalization capabilities and robustness, foundation models can be swiftly adapted to optimize air conditioning control, ensuring comfortable temperatures while minimizing energy consumption.
Assisting reinforcement learning agents with foundation models: We can empower models to acquire universal representations that rapidly adapt to diverse environments and tasks, thereby enhancing their generalization capabilities. In this approach, we introduce a pre-trained world model that emulates human learning and decision-making processes, ultimately bolstering industrial decision-making. By harnessing a pre-trained world model with extensive knowledge and adopting a two-stage pre-training framework, developers can comprehensively and flexibly train foundation models for industrial decision-making, extending their applicability to any specific decision scenario.
We partnered with the Microsoft Xbox team to rigorously validate the effectiveness of our framework in game-testing scenarios. By harnessing this framework, we pre-trained a specialized world model tailored for game maps. This model directly tackles the challenge of long-term spatial reasoning and navigation, leveraging landmark observations within novel game environments. The results were remarkable: our pre-trained model significantly outperformed counterparts that lacked a world model or relied on traditional learning methods. As a result, game exploration efficiency was greatly enhanced.
Moreover, we can harness domain-specific foundational data and the precise semantic information it encapsulates to develop foundation models within the domain, thereby unlocking novel opportunities for intelligent, interactive decision-making, and simulation. For example, by analyzing transactional data from financial markets, we can construct robust investment models. These foundational datasets extend beyond mere textual characters; they embody intricate semantic structures and valuable information. Leveraging this financial foundation model, we can generate customized order flows for various market styles, simulate large-scale order transactions across diverse market environments, and conduct controlled experiments in the financial investment landscape. This approach empowers us to gain deeper insights into market fluctuations and devise strategies for extreme scenarios.
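The talk describes modeling financial markets at two levels, the order sequence and the order-batch sequence. As a loose sketch of what the input data for such a model might look like, the Python below turns raw orders into discrete tokens and groups them into fixed time windows. The `Order` fields, the tick size, the reference price, and the one-second window are all assumptions chosen for illustration; the actual representation used by the Large Market Model is not specified in this text.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Order:
    timestamp: float  # seconds since session open (assumed unit)
    side: str         # "buy" or "sell"
    price: float
    volume: int


def order_to_token(
    order: Order, ref_price: float = 100.0, tick: float = 0.01
) -> Tuple[str, int, int]:
    """Encode an order as (side, price offset in ticks, volume)."""
    offset = round((order.price - ref_price) / tick)
    return (order.side, offset, order.volume)


def batch_orders(orders: List[Order], window: float = 1.0) -> List[List[Order]]:
    """Group time-sorted orders into consecutive non-empty time windows."""
    batches: Dict[int, List[Order]] = {}
    for order in sorted(orders, key=lambda o: o.timestamp):
        batches.setdefault(int(order.timestamp // window), []).append(order)
    return [batches[key] for key in sorted(batches)]
```

In this sketch, the token sequence would feed an order-level model, while per-batch summaries (for example, net volume per window) could feed a batch-level model; how the two levels are actually coupled is beyond what this text describes.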
Foundation models propel the next industrial digital transformation
Microsoft Research Asia has long recognized that the widespread adoption of AI in industry necessitates continuous technological exploration, experimentation, and breakthroughs. Through collaborative efforts with partners across various industries, we have developed open-source platforms and tools, including the Qlib AI quantitative investment platform, the MARO multi-agent resource optimization platform, the FOST spatial-temporal prediction tool, and the BatteryML battery performance analysis and prediction platform. These industry-oriented AI platforms, tools, and models not only play a pivotal role in industry but also serve as critical data and foundational components for implementing cutting-edge foundation models.
Building upon successful experiences in industrializing AI, we have embarked on the exploration of domain-specific foundation models tailored for industry, drawing from the dimensions previously discussed. Our findings reveal that these foundation models possess significant potential to diverge from conventional large-scale model paradigms and profoundly impact industrial transformation.
Envision a future where foundation models empower knowledge management, extraction, and iterative processes across industries. Furthermore, we are actively investigating how foundation models can support companies in achieving automated research and development (R&D). This encompasses tasks such as automatically identifying R&D directions, generating algorithmic research proposals, automating R&D processes and scientific experiments, and iteratively refining research approaches. In essence, AI will autonomously propel data-centric industrial R&D, fundamentally revolutionizing industry operations.
Foundation models are poised to become the driving force behind industrial digital transformation, mirroring the transformative impact of the internet and cloud computing. These models are set to unleash a new wave of industrial innovation. We eagerly anticipate collaborating with additional industry partners, immersing ourselves in real-world scenarios, and exploring diverse applications for foundation models within the industrial landscape, thereby fully unlocking their commercial potential.
Author
Dr. Jiang Bian currently serves as a senior principal research manager at Microsoft Research Asia. He leads the Machine Learning Group and the Industry Innovation Center at Microsoft Research Asia.
His team’s research spans deep learning, reinforcement learning, and privacy computing, with a focus on cutting-edge applications of AI in vertical domains such as finance, energy, logistics, manufacturing, healthcare, and sustainable development.
Dr. Jiang Bian has authored over a hundred research papers published in top-tier international conferences and journals. He also holds several U.S. patents. Dr. Bian actively contributes to the academic community by serving on program committees for various prestigious international conferences and acting as a reviewer for leading international journals. In recent years, Dr. Bian’s team has made significant strides in applying AI-based prediction and optimization techniques to critical scenarios across diverse fields, such as finance, logistics, and healthcare. Furthermore, they have generously shared relevant technologies and frameworks with the open-source community.
Dr. Jiang Bian completed his undergraduate studies at Peking University, earning a bachelor’s degree in computer science. He then pursued further studies at the Georgia Institute of Technology in the United States, where he obtained his Ph.D. in computer science.
Related resources
- Research Lab: Microsoft Research Lab – Asia
- Publication: RD2Bench: Toward Data-Centric Automatic R&D