Compilers produced with the techniques outlined in this text are about as far away from optimizing compilers as one can get. Consider the output produced by the example compiler for the expression
a + a
namely
plus(value(new_identifier(`a')), value(new_identifier(`a')))
To evaluate this expression, it is necessary to construct two identifier objects and three number objects. When this is translated simplistically into machine code, several thousand instructions may be executed. Most of these instructions accomplish at run time what compilers normally do at compile time.
Ideally, this should not be a problem. The output from the expression compiler must still be submitted to the Slim compiler before it becomes machine code. If the Slim compiler is really clever, it should be able to remove all the redundant instructions.
Unfortunately, this is not yet the case. In fact, using Slim as a target language is still a disadvantage, since Slim itself is implemented using the techniques described in this paper. One is thus using a really slow interpreter to interpret another really slow interpreter. (Fortunately, today's computers are so fast that using this scheme is actually quite tolerable.)
Until such time as advances in the Slim compiler make it unnecessary to do so, there are a number of things that can be done to speed things up. The easiest is to translate the ASTs into C++ (or similar) instead of into Slim. This is fairly trivial to do. The main disadvantage is that one then has to write the semantics in C++ rather than in Slim. For simple languages this should not be too much of a pain. This is pretty much the approach currently used to compile Slim itself. The result is quite acceptable on a fast processor and can serve to produce a very useful prototype.
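To make the idea concrete, here is a minimal Python sketch of translating an expression AST into C++ source text, so that an ordinary C++ compiler does the final code generation. The tuple encoding of the AST and the runtime helper `read_number` are illustrative assumptions, not part of the text's actual implementation:

```python
def to_cpp(ast):
    """Emit a C++ expression for a tiny tuple-encoded AST."""
    kind = ast[0]
    if kind == 'identifier':
        # 'read_number' is a hypothetical C++ runtime helper that
        # prompts for and returns the variable's value at run time.
        return f'read_number("{ast[1]}")'
    if kind == 'plus':
        return f'({to_cpp(ast[1])} + {to_cpp(ast[2])})'
    raise ValueError(kind)

# The AST for the expression a + a:
ast = ('plus', ('identifier', 'a'), ('identifier', 'a'))
print(to_cpp(ast))
# → (read_number("a") + read_number("a"))
```

The translator itself stays trivial; all the semantic work is pushed into the C++ runtime helpers, which is why the language semantics must then be written in C++.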
Another approach is to partially evaluate the AST before translating it into code. To do this, one augments the semantic classes with classes for unknown (or partially known) values. For example, to turn the expression compiler into a partial evaluator, one first turns it into a proper interpreter by changing the AST walker to directly call the semantic class methods, instead of producing code which calls them. Next, one changes the value method of the identifier class to produce unknown numbers instead of doing an actual input operation and then producing a known number. In other words, we change
func value(self : Identifier) -> Number
    result := stored_value(name(self))
    if result = Undefined
        repeat
            put name(self)`?' &
            get v
        until v in number    !number is the built-in numeric type
        result := new_number(mkstr(v))
        stored_value(name(self)) := result
    end
to
func value(self : Identifier) -> Unumber
    result := stored_value(name(self))
    if result = Undefined
        result := new_unumber(number, `get')
        code_name(result) := name(self)
        stored_value(name(self)) := result
    end
Now, when an expression such as a+a is processed, the plus method for unknown numbers is called, instead of the plus method for known numbers. The result is another unknown number. Unknown numbers keep track of how they are to be computed.
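The essential trick can be sketched in a few lines of Python. The class name `Unknown` and its fields are illustrative stand-ins for the Slim `Unumber` class, not the text's actual code; the point is that arithmetic on unknown values builds a record of the computation instead of performing it:

```python
import itertools

_counter = itertools.count(1)

class Unknown:
    """A number whose value is not known at partial-evaluation time."""
    def __init__(self, operator, operands=(), name=None):
        self.operator = operator      # e.g. 'get' for an input, '+' for addition
        self.operands = operands      # non-empty when the result of arithmetic
        self.name = name or f"temp{next(_counter)}"

    def __add__(self, other):
        # Adding unknowns yields another unknown that remembers how it
        # is to be computed, rather than computing anything now.
        return Unknown('+', (self, other))

a = Unknown('get', name='a')   # the value of identifier a: known only at run time
s = a + a
print(s.operator, [o.name for o in s.operands])
# → + ['a', 'a']
```

Note that both operands of `s` are the very same object, which matters later when code is generated for it.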
Along with changing input operations to produce unknown numbers, it is necessary to change output operations to produce code. For example, one changes the display method of the number class from
func display(self : Number) -> Number
    put slim_value(self)
    return self
to
func display(self : Number) -> Number
    Code_string +:= `put '+mkstr(slim_value(self))+`\n'
    return self
More interesting code generation happens when one calls the display method for an unknown number:
func display(self : Unumber) -> Unumber
    Code_string +:= code_for(self)+`put '+code_name(self)+`\n'
    return self
For this to work, each unknown number must keep track of whether it was the result of an input operation or an arithmetic operation. In the latter case, it must also keep track of the operands. The code_for method looks as follows:
func code_for(self : Unumber) -> String
    code_for(self) := `'    !only execute this function once
    return `put \`'+code_name(self)+`?\' &\n' +
           `repeat get '+code_name(self)+`: Integer until '+code_name(self)+` /= Undefined'+`\n'
        when operator(self) = `get'
    result := code_for(operand1(self)) + code_for(operand2(self))
    result +:= code_name(self)+` := '
    if operator(self) = `divide'
        result +:= code_name(operand1(self)) + `/' + code_name(operand2(self))+`\n'
    elsif operator(self) = `minus'
        result +:= code_name(operand1(self)) + `-' + code_name(operand2(self))+`\n'
    elsif operator(self) = `multiply'
        result +:= code_name(operand1(self)) + `*' + code_name(operand2(self))+`\n'
    elsif operator(self) = `plus'
        result +:= code_name(operand1(self)) + `+' + code_name(operand2(self))+`\n'
    else
        assert False
    end
Note that the code string could easily be three-address intermediate code, which is then submitted to a classical code generator. Also note that, since (at most) one of the operands of an unknown number can be a known number, it is necessary to extend the class definition of known numbers with methods code_for and code_name.
An esoteric feature of the code_for method is that it produces code only the first time it is invoked. For example, the code for the expression a+a is:
put `a?' &
repeat get a: Integer until a /= Undefined
temp1 := a + a
The second call to value(new_identifier(`a')) produces the same unknown number as the first. The code_for method is thus called twice for the same object. The first time, code is returned; the second time, only an empty string.
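The once-only behaviour of code_for can be sketched in Python as well. Again, `Unknown` and its `emitted` flag are illustrative assumptions standing in for Slim's Unumber class and its trick of overwriting code_for with the empty string:

```python
import itertools

_counter = itertools.count(1)

class Unknown:
    def __init__(self, operator, operands=(), name=None):
        self.operator = operator
        self.operands = operands
        self.name = name or f"temp{next(_counter)}"
        self.emitted = False          # plays the role of clearing code_for

    def __add__(self, other):
        return Unknown('+', (self, other))

    def code_for(self):
        if self.emitted:              # second call on the same object:
            return ''                 # the code already exists
        self.emitted = True
        if self.operator == 'get':    # result of an input operation
            return f"get {self.name}\n"
        left, right = self.operands   # result of an arithmetic operation
        return (left.code_for() + right.code_for()
                + f"{self.name} := {left.name} {self.operator} {right.name}\n")

a = Unknown('get', name='a')
print((a + a).code_for(), end='')
# get a
# temp1 := a + a
```

Because both operands of a + a are the same object, the recursive second call returns the empty string and the input code is emitted exactly once.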
Note that the identifier and number classes are now used at compile time rather than at run time. They come into play when the AST is walked. The result of the walk is a partial number, which is in effect another AST. The recursive calls to code_for walk this second AST, producing the desired output.
With the addition of the class for unknown numbers, the semantics of the expression language now approach the complexity of a traditional compiler. The complete partial evaluator-based compiler for expressions can be viewed by following this link.
Things become somewhat more complicated when control flow is added, but the overall complexity still seems manageable and probably compares favourably with many conventional optimizing compilers.