Generating Compilers from Formal Semantics

Herman Venter
Department of Computer Science
University of Port Elizabeth

Introduction

Given a complete, formal description of the semantics of a programming language, it should be possible to generate a compiler for the language using an automated process. This is highly desirable, since it reduces the work required to produce a compiler to the absolute minimum and makes it more likely that the compiler will be correct. This text describes a set of tools and a methodology that does just this, for a realistic class of programming languages.

There are already numerous methods for giving formal descriptions of programming language semantics. However, none of the compiler generators built around them has yet reached the point where realistic compilers can be generated for real languages.

There are also numerous other, more realistic tool kits that aid compiler construction, but none of these can generate a compiler given only a formal description of the semantics of a programming language. In general, the work required to generate a compiler using such a tool kits is considerably more than the absolute minimum.

Perhaps the main reason why it is proving so difficult to use formal methods to generate compilers, is that the complexity of formal descriptions of programming languages usually rival or exceed the complexity of hand-crafted compilers. Moreover, these descriptions are in notations which are complex and usually very different from programming languages.

The chief contribution of the work reported here is that it provides a way to express the formal semantics of a programming language in another programming language, which makes it possible to use the techniques of object-oriented programming to manage the complexity of the description

The key to understanding how this can be done is to realize that an abstract syntax tree can be viewed as an expression in an object oriented programming language and that evaluating this expression amounts to running the program corresponding to the abstract syntax tree.

Abstract Syntax Trees as Expressions

An example compiler

Optimizing compilers

Conclusion