Learning to Log: Helping Developers Make Informed Logging Decisions
- Jieming Zhu ,
- Pinjia He ,
- Qiang Fu ,
- Hongyu Zhang ,
- Michael R. Lyu ,
- Dongmei Zhang
Published by ICSE 2015
Logging is a common programming practice of practical importance to collect system runtime information for postmortem analysis. Strategic logging placement is desired to cover necessary runtime information without incurring unintended consequences (e.g., performance overhead, trivial logs). However, in current practice, there is a lack of rigorous specifications for developers to govern their logging behaviours. Logging has become an important yet tough decision which mostly depends on the domain knowledge of developers. To reduce the effort on making logging decisions, in this paper, we propose a “learning to log” framework, which aims to provide informative guidance on logging during development. As a proof of concept, we provide the design and implementation of a logging suggestion tool, LogAdvisor, which automatically learns the common logging practices on where to log from existing logging instances and further leverages them for actionable suggestions to developers. Specifically, we identify the important factors for determining where to log and extract them as structural features, textual and syntactic features. Then, by applying machine learning techniques (e.g., feature selection and classifier learning) and noise handling techniques, we achieve high accuracy of logging suggestions. We evaluate LogAdvisor on two industrial software systems from Microsoft and two open-source software systems from GitHub (totally 19.1M LOC and 100.6K logging statements). The encouraging experimental results, as well as a user study, demonstrate the feasibility and effectiveness of our logging suggestion tool. We believe our work can serve as an important first step towards the goal of “learning to log”.