A Large-Scale Empirical Study on Code-Comment Inconsistencies

Type of publication Peer-reviewed
Publication form Proceedings (peer-reviewed)
Author Wen Fengcai, Nagy Csaba, Bavota Gabriele, Lanza Michele
Project PROBE - Live Actionable Software Analytics

Proceedings (peer-reviewed)

Title of proceedings Proceedings of ICPC 2019 (27th International Conference on Program Comprehension)

Open Access

Type of Open Access Repository (Green Open Access)


Code comments are a primary means to document source code. Keeping comments up-to-date during code change activities requires substantial time and attention. For this reason, researchers have proposed methods to detect code-comment inconsistencies (i.e., comments that are not kept in sync with the code they document), and studies have been conducted to investigate this phenomenon. However, these studies were performed at a small scale, relying on quantitative analysis, thus limiting the empirical knowledge about code-comment inconsistencies. We present the largest study to date investigating how code and comments co-evolve. The study has been performed by mining 1.3 billion AST-level changes from the complete history of 1,500 systems. Moreover, we manually analyzed 500 commits to define a taxonomy of code-comment inconsistencies fixed by developers. Our analysis discloses the extent to which different types of code changes (e.g., change of selection statements) trigger updates to the related comments, identifying cases in which code-comment inconsistencies are more likely to be introduced. The defined taxonomy categorizes the types of inconsistencies fixed by developers. Our results can guide the development of tools aimed at detecting and fixing code-comment inconsistencies.