The back-end is more complicated. There are several steps to be taken here.

  1. Tree creation
  2. Instruction selection
  3. Register allocation
  4. Peep hole optimization
digraph codegen {
1 [label="IR-code"]

10 [label="irdag"]
20 [label="dagsplitter"]
30 [label="instruction selector"]
40 [label="register allocator"]

49 [label="assembly parser"]
50 [label="outstream"]
60 [label="object file"]
61 [label="text output"]
1 -> 10
10 -> 20 [label="Selection DAG"]
20 -> 30 [label="Selection Trees"]
30 -> 40 [label="frame"]
40 -> 50 [label="frame"]

49 -> 50
50 -> 60
50 -> 61

Code generator

Machine code generator.

The architecture is provided when the generator is created.

class ppci.codegen.codegen.CodeGenerator(arch, optimize_for='size')

Machine code generator

emit_frame_to_stream(frame, output_stream, debug=False)

Add code for the prologue and the epilogue. Add a label, the return instruction and the stack pointer adjustment for the frame. At this point we know how much stack space must be reserved for locals and what registers should be saved.

generate(ircode:, output_stream, reporter, debug=False)

Generate machine code from ir-code into output stream

generate_function(ir_function, output_stream, reporter, debug=False)

Generate code for one function into a frame

select_and_schedule(ir_function, frame, reporter)

Perform instruction selection and scheduling

digraph "classes_foo" {
"0" [label="{CodeGenerator|arch\ldebug_db : DebugDb\linstruction_scheduler : InstructionScheduler\linstruction_selector : InstructionSelector1\llogger : RootLogger, NoneType, Logger\lregister_allocator : GraphColoringRegisterAllocator\lsgraph_builder : SelectionGraphBuilder\lverifier : Verifier\l|emit_frame_to_stream()\lgenerate()\lgenerate_function()\lselect_and_schedule()\l}", shape="record"];


During this phase, the IR-code is made simpler. Also unsupported operations are rewritten into function calls. For example soft floating point is introduced here.

Tree building

From IR-code a tree is generated which can be used to select instructions.


The process of instruction selection is preceeded by the creation of a selection DAG (directed acyclic graph). The dagger take ir-code as input and produces such a dag for instruction selection.

A DAG represents the logic (computation) of a single basic block.

To do selection with tree matching, the DAG is then splitted into a series of tree patterns. This is often referred to as a forest of trees.

class ppci.codegen.irdag.FunctionInfo(frame)

Keeps track of global function data when generating code for part of a functions.

class ppci.codegen.irdag.Operation(op, ty)

A single operation with a type

class ppci.codegen.irdag.SelectionGraphBuilder(arch)

Create a selectiongraph from a function for instruction selection

block_to_sgraph(ir_block:, function_info)

Create dag (directed acyclic graph) from a basic block.

The resulting dag can be used for instruction selection.

build(ir_function:, function_info, debug_db)

Create a selection graph for the given function.

Selection graph is divided into groups for each basic block.


When a terminator instruction is encountered, handle the copy of phi values into the expected virtual register


Process ir.AddressOf instruction


Process the alloc instruction


Visit a binary operator and create a DAG node


Process conditional jump into dag


Create a cast of type


Process constant instruction


Transform a function call


Literal data is stored after a label


Create dag node for load operation


Refer to the correct copy of the phi node


Transform a procedure call


Move result into result register and jump to epilog


Create a DAG node for the store operation


Visit an unary operator and create a DAG node


Determine address for load or store.

new_node(name, ty, *args, value=None)

Create a new selection graph node, and add it to the graph


Generate a new temporary fitting for the given type


Return blocks in depth first search order


Add an attribute to the class that is a map of ir types to handler functions. For example if a function is called do_phi it will be registered into f_map under key ir.Phi.

ppci.codegen.irdag.prepare_function_info(arch, function_info, ir_function)

Fill function info with labels for all basic blocks

DAG splitting into trees.

The selection dag is splitted into trees by this module.

class ppci.codegen.dagsplit.DagSplitter(arch)

Class that splits a DAG into a forest of trees.

This series is sorted such that data dependencies are met. The trees can henceforth be used to match trees.

assign_vregs(sgraph, function_info)

Give vreg values to values that cross block boundaries

check_vreg(node, frame)

Determine whether node outputs need a virtual register


Determine the register class suited for this data flow line

make_op(op, typ)

Construct a string opcode from an operation and a type

make_trees(nodes, tail_node)

Create a tree from a list of sorted nodes.

split_into_trees(sgraph, ir_function, function_info, debug_db)

Split a forest of trees into a sorted series of trees for each block.

ppci.codegen.dagsplit.topological_sort_modified(nodes, start)

Modified topological sort, start at the end and work back

code emission

Code is emitted using the outputstream class. The assembler and compiler use this class to emit instructions to. The stream can output to object file or to a logger.

class ppci.binutils.outstream.OutputStream

Interface to generate code with.

Contains the emit function to output instruction to the stream.


Actual emit implementation


Encode instruction and add symbol and relocation information


Emit all items from an iterable


Switch output to certain section