The final phase of our compiler is code generation and it takes the input from intermediate code generation phase and produces an equivalent target program.
The schematic representation of position of code generator,
Input to the code generator
The input to the code generator comes from intermediate code generation phase with information in the symbol table which determines the runtime addresses of the data objects.
The output of the code generator is the target program. The output may be of any form, absolute machine language, relocatable machine language or assembly language.
Mapping of the names in the source program to the addresses of the data object in runtime memory is done cooperatively by the front end and the code generator.
The instructions having register operands are shorter and faster than the operands used in memory.
The use of registers usually has two sub problems
- During register allocation, the set of variables that will reside in registers are selected at a point in the program.
- During subsequent register assignment, we pick the specific registry that a variable will reside in.
Choice of evaluation order
The order in which the computations are done can affect the efficiency of target code. Some computations require lesser registeres to hold the values generated in intermediate code generation. The code would be generated for three address statements in the same order produced by intermediate code generator.
A graphic representation of three address statements is called a flow graph. It is useful in understanding code generation algorithms.
A basic block is a sequence of consecutive statements in which the control flow enters the beginning and leaves the end without a possibility of halt or branching except for the end.
The algorithm used to partition of a sequence into basic blocks,
Algorithm: Partition into basic blocks
Input: A sequence of three address statement
Output: A list of basic blocks
Transformations of basic blocks
The primary structure preserving transformations on basic blocks are,
- Common sub-expression elimination
- Dead code elimination
Suppose x is dead and never to be used then the expression x:=y+z appears in a basic block can be safely removed without affecting a basic block
- Renaming of temporary variables
Consider a statement t:=a+b where t is a temporary. If the statement is changed to u:=a+b where u is a new temporary variable. The change all uses the value u in place of t , the value of basic blocks remains the same.
- Interchange of two adjacent statements
Suppose we have a block with two adjacent statements,
Then we can interchange the two statements without affecting the value of a block.