
An Exploratory Study on the Usage of Gherkin Features in Open-Source Projects

Type of publication Peer-reviewed
Form of publication Proceedings (peer-reviewed)
Author Chandorkar Adwait, Patkar Nitish, Di Sorbo Andrea, Nierstrasz Oscar
Project Agile Software Assistance

Proceedings (peer-reviewed)

Title of proceedings 5th Workshop on Validation, Analysis and Evolution of Software Tests (VST 2022, co-located with SANER 2022)

Open Access

Type of Open Access Repository (Green Open Access)


With behavior-driven development (BDD), domain experts describe system behavior and desired outcomes through natural language-like sentences, e.g., using the Gherkin language. BDD frameworks partially convert the content of Gherkin specifications into executable test code. Previous studies have reported several issues with current BDD practice, for example long, repetitive Gherkin specifications and slow-running test suites. Data tables and additional features were added to the Gherkin syntax to express test inputs compactly (e.g., to provide different combinations of input values and desired outputs so that tests run multiple times) and to improve the readability of Gherkin files (henceforth called spec files). However, there is no empirical evidence about the actual usage of these Gherkin features. To fill this gap, we analyzed the content of 1,572 spec files extracted from 23 open-source projects. For each spec file, we collected a set of metrics modeling its structure and its usage of the different Gherkin features. We found that only a minority of the considered spec files (i.e., 590) used data tables, and that these tables contain two rows on average. We also used statistical tests to compare the contents of spec files with and without data tables and found significant differences between the two populations, especially regarding the number of lines of code (LoC). On the one hand, our results shed some light on the discrepancies between the recommendations for defining Gherkin specifications and their actual adoption in practice. On the other hand, our findings demonstrate that the adoption of additional features, such as data tables, might only partially help to reduce the length of Gherkin specifications.
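
To make the kind of per-file metrics mentioned in the abstract more concrete, the following is a minimal Python sketch of how lines of code, scenarios, and data-table rows might be counted in a single spec file. It is not the authors' tooling: the names `SpecMetrics`, `spec_metrics`, and `SAMPLE` are hypothetical, and the heuristic (any run of consecutive lines starting with `|` is one table whose first row is a header) is an assumption made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class SpecMetrics:
    """Hypothetical per-file metrics; not the metric set used in the study."""
    loc: int          # non-blank, non-comment lines
    scenarios: int    # "Scenario:" and "Scenario Outline:" blocks
    tables: int       # runs of consecutive "|"-prefixed lines
    table_rows: int   # table rows, excluding the first (assumed header) row


def spec_metrics(text: str) -> SpecMetrics:
    loc = scenarios = tables = table_rows = 0
    in_table = False
    for raw in text.splitlines():
        line = raw.strip()
        # Blank lines and comments are not counted as lines of code.
        if not line or line.startswith("#"):
            continue
        loc += 1
        if line.startswith(("Scenario:", "Scenario Outline:")):
            scenarios += 1
        if line.startswith("|"):
            if in_table:
                table_rows += 1      # a row below the assumed header
            else:
                tables += 1          # first row of a new table
                in_table = True
        else:
            in_table = False
    return SpecMetrics(loc, scenarios, tables, table_rows)


# Illustrative spec file (made up for this sketch).
SAMPLE = """
Feature: Login
  Scenario Outline: Sign in
    Given a user "<name>"
    When they sign in with "<password>"
    Then the result is "<outcome>"

    Examples:
      | name  | password | outcome |
      | alice | secret   | ok      |
      | bob   | wrong    | denied  |
"""

if __name__ == "__main__":
    print(spec_metrics(SAMPLE))
```

For the sample above, the sketch reports 9 lines of code, 1 scenario, 1 table, and 2 table rows. A real analysis would more likely rely on a proper Gherkin parser, since this line-based heuristic does not distinguish step data tables from `Examples` tables and ignores doc strings.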