Today's software environment is becoming increasingly complex. Developers must write code for parallel and distributed systems, systems that are often constructed from components written in multiple languages using complex libraries and frameworks, and that operate on an alphabet soup of data formats. Ensuring reliability while providing maintainability and high performance presents a difficult challenge.
One approach to these challenges is to use domain-specific languages (DSLs). These ``little languages'' introduce new abstractions that use application-specific notation and that are checked by application-specific analyses. Embedded DSLs are hosted within another programming language, allowing them to work with the host language's existing editors, build managers, and other tools. However, because an embedded DSL is restricted to the syntax and semantics of its host language, the host language compiler performs no DSL-specific semantic checking, and the syntax of the DSL may be constrained and awkward.
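The limitation on semantic checking can be illustrated with a minimal sketch of an embedded DSL in Java. The QueryDsl class and its select, from, and build methods are hypothetical names chosen for illustration; they are not part of any existing library or of the proposed framework. The sketch assumes a DSL rule that every query must name a table. The host compiler cannot enforce this rule, so a violation is detected only at run time.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical embedded DSL for building queries; all names are illustrative.
public class QueryDsl {
    private final List<String> columns = new ArrayList<>();
    private String table; // remains null if from() is never called

    public static QueryDsl select(String... cols) {
        QueryDsl q = new QueryDsl();
        for (String c : cols) {
            q.columns.add(c);
        }
        return q;
    }

    public QueryDsl from(String table) {
        this.table = table;
        return this;
    }

    // Domain-specific check. It can only run at run time, because the
    // host compiler knows nothing about the rules of the DSL.
    public String build() {
        if (table == null) {
            throw new IllegalStateException("query has no FROM clause");
        }
        return "SELECT " + String.join(", ", columns) + " FROM " + table;
    }

    public static void main(String[] args) {
        // A well-formed DSL program.
        System.out.println(select("name", "age").from("people").build());
        try {
            // Type-checks in the host language, but is invalid in the DSL.
            select("name").build();
        } catch (IllegalStateException e) {
            System.out.println("rejected at run time: " + e.getMessage());
        }
    }
}
```

Both queries are accepted by the Java compiler. A DSL-aware static analysis could instead reject the second query at compile time.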
To address these limitations and allow for richer syntax and stronger correctness guarantees, languages such as Java and Scala allow programmers to write compiler extensions, or plugins, that extend the host programming language. Plugins modify or extend an existing compiler with new functionality, enabling additional static checking and code transformations, including optimizations. They thus provide a mechanism for analyzing and transforming programs written in an embedded DSL.
However, compiler plugins present a number of challenges. By permitting code transformations, plugins can change the language semantics in unexpected ways. Plugins can also interfere with each other, introducing conflicting syntax or semantics. These incompatibilities can lead developers to create separate, incompatible language extensions, fragmenting the developer community into groups that each use a different dialect of the host language. Finally, compiler extensions are difficult to write and to maintain. They require intimate knowledge of the internals of the host language compiler; moreover, as the compiler itself evolves, the extensions have to be updated.
In the proposed project:
* We will develop a theoretical framework for safe, modular compiler extensions that formalizes what plugins are and are not permitted to do. The challenge is to provide sufficient restrictions to be safe, but not so many restrictions that expressivity is lost.
* We will instantiate this framework in the Scala compiler, investigating how to ensure compiler extension safety and modularity in an existing statically typed language.
* We will implement Kanga, a new dynamic language designed from the ground up to support syntactic and semantic language extensions without the limitations imposed by working in the context of an existing programming language.
We will validate our approach, first, by formally proving safety properties of our framework and, second, by using the framework to implement several compiler extensions. The project builds on our previous work with the Polyglot extensible compiler framework, with the X10 programming language, and with the plugin system of Thorn.