MonitorAssistant: Simplifying Cloud Service Monitoring via Large Language Models
- Zhaoyang Yu,
- Minghua Ma,
- Chaoyun Zhang,
- Si Qin,
- Yu Kang,
- Chetan Bansal,
- Saravan Rajmohan,
- Yingnong Dang,
- Changhua Pei,
- Dan Pei,
- Qingwei Lin,
- Dongmei Zhang
Foundations of Software Engineering (FSE), organized by ACM
In large-scale cloud service systems, monitoring metric data and detecting anomalies is an important way to maintain reliability and stability. However, a great disparity exists between academic approaches and industrial practice in anomaly detection. Industry predominantly uses simple, efficient methods because of their interpretability and ease of implementation. In contrast, academia favors deep-learning methods, which, despite their advanced capabilities, face practical challenges in real-world applications. To address these challenges, this paper introduces MonitorAssistant, an end-to-end practical anomaly detection system built on Large Language Models. MonitorAssistant automates model configuration recommendation to achieve knowledge inheritance, and interprets alarms with guidance-oriented anomaly reports, facilitating a more intuitive engineer-system interaction through natural language. By deploying MonitorAssistant in Microsoft's cloud service system, we validate its efficacy and practicality, marking a significant advancement in the field of practical anomaly detection for large-scale cloud services.
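As a rough illustration of the kind of LLM-driven workflow described above, the sketch below shows a hypothetical pipeline that asks a language model to recommend a detection configuration for a metric (reusing configurations of similar metrics as a simple form of knowledge inheritance) and to turn a raised alarm into a guidance-oriented report. All names here (`call_llm`, `recommend_configuration`, `interpret_alarm`, the prompt wording) are assumptions for illustration and are not taken from the paper.

```python
import json
from dataclasses import dataclass
from typing import List


def call_llm(prompt: str) -> str:
    # Placeholder for a real LLM API call (hypothetical stub); returns canned
    # responses so the sketch stays self-contained and runnable.
    if "Recommend an anomaly-detection configuration" in prompt:
        return json.dumps({
            "detector": "threshold",
            "window_minutes": 30,
            "sensitivity": "medium",
            "rationale": "Latency for this service spikes briefly during rollouts; "
                         "a 30-minute window reduces false alarms.",
        })
    return ("Report: the metric spiked well above its recent range. "
            "Check recent deployments and downstream dependency health first.")


@dataclass
class MetricProfile:
    name: str
    service: str
    recent_values: List[float]


def recommend_configuration(profile: MetricProfile, past_configs: List[dict]) -> dict:
    # Ask the LLM for a detection configuration, providing configurations of
    # similar metrics as in-context examples (a simple form of knowledge inheritance).
    prompt = (
        "You are assisting with cloud-service monitoring.\n"
        f"Metric: {profile.name} (service: {profile.service})\n"
        f"Recent values: {profile.recent_values}\n"
        f"Configurations used for similar metrics: {json.dumps(past_configs)}\n"
        "Recommend an anomaly-detection configuration as JSON with keys "
        "'detector', 'window_minutes', 'sensitivity', and 'rationale'."
    )
    return json.loads(call_llm(prompt))


def interpret_alarm(profile: MetricProfile, config: dict, anomalous_value: float) -> str:
    # Turn a raised alarm into a short, guidance-oriented report in natural language.
    prompt = (
        f"An alarm fired on metric {profile.name} of service {profile.service}.\n"
        f"Anomalous value: {anomalous_value}; detector config: {json.dumps(config)}\n"
        "Write a brief report: what likely happened, and what the on-call "
        "engineer should check first."
    )
    return call_llm(prompt)


if __name__ == "__main__":
    profile = MetricProfile("request_latency_p99_ms", "frontdoor", [210, 205, 220, 980])
    config = recommend_configuration(profile, past_configs=[{"detector": "threshold"}])
    print(config)
    print(interpret_alarm(profile, config, anomalous_value=980))
```

In a real deployment the stubbed `call_llm` would be replaced by an actual model endpoint, and the recommended configuration would feed an existing detection pipeline rather than being printed; the point of the sketch is only the natural-language interface between engineers, historical configurations, and alarms.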