Predictability of identifier naming with Copilot: A case study for mixed-initiative programming tools
- Michael Jing Long Lee
- Advait Sarkar
- Alan F. Blackwell
Studies show that predictive text entry systems make writing faster, but make written content more predictable. We ask whether these trade-offs extend to code synthesis tools such as GitHub Copilot. While Copilot can help developers produce code faster, it may also affect how they choose identifiers for methods and classes. This may have non-trivial effects on the activity of programming, because identifier names are a primary semantic signal in code and play important roles in authoring, debugging, and developer communication. In a controlled, within-subjects experiment (n=12), we compared identifiers chosen in the presence and absence of Copilot suggestions. We find that identifiers chosen in the presence of Copilot suggestions were significantly more predictable (had lower mean entropy), even when suggestions were only visible and could not be automatically accepted. These results imply that mixed-initiative systems can take an active role in shaping programmer intentions and potentially affect their sense of agency. We consider whether an increased convergence towards predictable names is an asset or a liability for the practice of programming, and suggest design opportunities for surfacing surprising identifiers and for conceptual refactoring tools.
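To make the phrase "lower mean entropy" concrete, the following is a minimal sketch of one way predictability could be quantified. It assumes Shannon entropy over the distribution of names that participants chose for each naming task, averaged across tasks; the abstract does not specify the exact computation used in the study. The function names, task labels, and identifier lists are hypothetical and purely illustrative, not data from the experiment.

```python
import math
from collections import Counter


def shannon_entropy(names):
    """Shannon entropy (in bits) of the distribution of chosen names.

    Assumption: each element of `names` is one participant's choice
    for the same naming task.
    """
    if not names:
        raise ValueError("names must not be empty")
    counts = Counter(names)
    total = len(names)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def mean_entropy(choices_by_task):
    """Average the per-task entropy over all tasks in a condition."""
    entropies = [shannon_entropy(names) for names in choices_by_task.values()]
    return sum(entropies) / len(entropies)


# Hypothetical, made-up choices for two naming tasks in each condition.
with_suggestions = {
    "task_a": ["getUser", "getUser", "getUser", "fetchUser"],
    "task_b": ["Parser", "Parser", "Parser", "Parser"],
}
without_suggestions = {
    "task_a": ["getUser", "fetchUser", "loadUser", "findUser"],
    "task_b": ["Parser", "Reader", "Parser", "Tokenizer"],
}

print(mean_entropy(with_suggestions))
print(mean_entropy(without_suggestions))
```

Under this measure, a condition in which participants converge on the same few names yields a lower mean entropy, which is the sense in which the chosen identifiers are more predictable.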