MergeBERT: Program Merge Conflict Resolution via Neural Transformers
- Alexey Svyatkovskiy
- Todd Mytkowicz
- Negar Ghorbani
- Sarah Fakhoury
- Elizabeth Dinella
- Christian Bird
- Neel Sundaresan
- Shuvendu Lahiri
Collaborative software development is an integral part of the modern software development life cycle and is essential to the success of large-scale software projects. When multiple developers make concurrent changes around the same lines of code, a merge conflict may occur. Such conflicts can stall pull requests and continuous integration pipelines for anywhere from hours to several days, seriously hurting developer productivity.
In this paper, we introduce MergeBERT, a novel neural program merge framework based on token-level three-way differencing and a transformer encoder model. Exploiting the restricted nature of merge conflict resolutions, we reformulate the task of generating the resolution sequence as a classification task over a set of primitive merge patterns extracted from real-world merge commit data.
Our model achieves 64–69% precision in merge resolution synthesis, yielding nearly a 2x performance improvement over existing structured and neural program merge tools. Finally, we demonstrate the versatility of our model, which is able to perform program merge in a multilingual setting with the Java, JavaScript, TypeScript, and C# programming languages, generalizing zero-shot to unseen languages.
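To make the reformulation concrete, the following minimal Python sketch treats a token-level conflict as a triple of token lists (the two edited versions `a` and `b`, and their common ancestor `base`) and expresses a resolution as a label over a small set of patterns. The label names, the `PATTERNS` table, and both helper functions are hypothetical and chosen purely for illustration; they are not the pattern set or implementation used by MergeBERT, and the classifier that would predict the label is omitted.

```python
from typing import Callable, Dict, List, Optional

Tokens = List[str]

# Hypothetical label set (an assumption for illustration, not the paper's list).
# Each pattern maps the conflicting token spans (a, base, b) to a resolution.
PATTERNS: Dict[str, Callable[[Tokens, Tokens, Tokens], Tokens]] = {
    "take_a": lambda a, base, b: a,
    "take_b": lambda a, base, b: b,
    "take_base": lambda a, base, b: base,
    "a_then_b": lambda a, base, b: a + b,
    "b_then_a": lambda a, base, b: b + a,
}


def apply_pattern(label: str, a: Tokens, base: Tokens, b: Tokens) -> Tokens:
    """Turn a predicted class label back into a resolution token sequence."""
    return list(PATTERNS[label](a, base, b))


def label_for_resolution(
    a: Tokens, base: Tokens, b: Tokens, resolved: Tokens
) -> Optional[str]:
    """Return the first pattern that reproduces a known resolution, if any.

    This mirrors how classification labels could be derived from historical
    merge commits: a resolution that no pattern reproduces gets no label.
    """
    for label in PATTERNS:
        if apply_pattern(label, a, base, b) == resolved:
            return label
    return None


if __name__ == "__main__":
    # Both sides renamed the same token differently; the developer kept b.
    a, base, b = ["x"], ["v"], ["y"]
    print(label_for_resolution(a, base, b, ["y"]))
    print(apply_pattern("a_then_b", a, base, b))
```

Under this framing, a model only has to choose one label per conflict rather than generate the resolution token by token, which is what makes an encoder-only classifier a plausible fit for the task.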