Back-end

The back-end is more complicated. There are several steps to be taken here.

  1. Tree creation
  2. Instruction selection
  3. Register allocation
  4. Peep hole optimization
digraph codegen {
1 [label="IR-code"]

10 [label="irdag"]
20 [label="dagsplitter"]
30 [label="instruction selector"]
40 [label="register allocator"]

49 [label="assembly parser"]
50 [label="outstream"]
60 [label="object file"]
61 [label="text output"]
1 -> 10
10 -> 20 [label="Selection DAG"]
20 -> 30 [label="Selection Trees"]
30 -> 40 [label="frame"]
40 -> 50 [label="frame"]

49 -> 50
50 -> 60
50 -> 61
}

Code generator

Machine code generator.

The architecture is provided when the generator is created.

class ppci.codegen.codegen.CodeGenerator(arch, optimize_for='size')

Machine code generator

emit_frame_to_stream(frame, output_stream, debug=False)

Add code for the prologue and the epilogue. Add a label, the return instruction and the stack pointer adjustment for the frame. At this point we know how much stack space must be reserved for locals and what registers should be saved.

generate(ircode: ppci.ir.Module, output_stream, reporter, debug=False)

Generate machine code from ir-code into output stream

generate_function(ir_function, output_stream, reporter, debug=False)

Generate code for one function into a frame

select_and_schedule(ir_function, frame, reporter)

Perform instruction selection and scheduling

digraph "classes_foo" {
charset="utf-8"
rankdir=BT
"0" [label="{CodeGenerator|arch\ldebug_db : DebugDb\linstruction_scheduler : InstructionScheduler\linstruction_selector : InstructionSelector1\llogger : RootLogger, NoneType, Logger\lregister_allocator : GraphColoringRegisterAllocator\lsgraph_builder : SelectionGraphBuilder\lverifier : Verifier\l|emit_frame_to_stream()\lgenerate()\lgenerate_function()\lselect_and_schedule()\l}", shape="record"];
}

Canonicalize

During this phase, the IR-code is made simpler. Also unsupported operations are rewritten into function calls. For example soft floating point is introduced here.

Tree building

From IR-code a tree is generated which can be used to select instructions.

IR to DAG

The process of instruction selection is preceeded by the creation of a selection DAG (directed acyclic graph). The dagger take ir-code as input and produces such a dag for instruction selection.

A DAG represents the logic (computation) of a single basic block.

To do selection with tree matching, the DAG is then splitted into a series of tree patterns. This is often referred to as a forest of trees.

class ppci.codegen.irdag.FunctionInfo(frame)

Keeps track of global function data when generating code for part of a functions.

class ppci.codegen.irdag.Operation(op, ty)

A single operation with a type

class ppci.codegen.irdag.SelectionGraphBuilder(arch)

Create a selectiongraph from a function for instruction selection

block_to_sgraph(ir_block: ppci.ir.Block, function_info)

Create dag (directed acyclic graph) from a basic block.

The resulting dag can be used for instruction selection.

build(ir_function: ppci.ir.SubRoutine, function_info, debug_db)

Create a selection graph for the given function.

Selection graph is divided into groups for each basic block.

copy_phis_of_successors(ir_block)

When a terminator instruction is encountered, handle the copy of phi values into the expected virtual register

do_address_of(node)

Process ir.AddressOf instruction

do_alloc(node)

Process the alloc instruction

do_binop(node)

Visit a binary operator and create a DAG node

do_c_jump(node)

Process conditional jump into dag

do_cast(node)

Create a cast of type

do_const(node)

Process constant instruction

do_function_call(node)

Transform a function call

do_literal_data(node)

Literal data is stored after a label

do_load(node)

Create dag node for load operation

do_phi(node)

Refer to the correct copy of the phi node

do_procedure_call(node)

Transform a procedure call

do_return(node)

Move result into result register and jump to epilog

do_store(node)

Create a DAG node for the store operation

do_unop(node)

Visit an unary operator and create a DAG node

get_address(ir_address)

Determine address for load or store.

new_node(name, ty, *args, value=None)

Create a new selection graph node, and add it to the graph

new_vreg(ty)

Generate a new temporary fitting for the given type

ppci.codegen.irdag.depth_first_order(function)

Return blocks in depth first search order

ppci.codegen.irdag.make_map(cls)

Add an attribute to the class that is a map of ir types to handler functions. For example if a function is called do_phi it will be registered into f_map under key ir.Phi.

ppci.codegen.irdag.prepare_function_info(arch, function_info, ir_function)

Fill function info with labels for all basic blocks

DAG splitting into trees.

The selection dag is splitted into trees by this module.

class ppci.codegen.dagsplit.DagSplitter(arch)

Class that splits a DAG into a forest of trees.

This series is sorted such that data dependencies are met. The trees can henceforth be used to match trees.

assign_vregs(sgraph, function_info)

Give vreg values to values that cross block boundaries

check_vreg(node, frame)

Determine whether node outputs need a virtual register

get_reg_class(data_flow)

Determine the register class suited for this data flow line

make_op(op, typ)

Construct a string opcode from an operation and a type

make_trees(nodes, tail_node)

Create a tree from a list of sorted nodes.

split_into_trees(sgraph, ir_function, function_info, debug_db)

Split a forest of trees into a sorted series of trees for each block.

ppci.codegen.dagsplit.topological_sort_modified(nodes, start)

Modified topological sort, start at the end and work back

code emission

Code is emitted using the outputstream class. The assembler and compiler use this class to emit instructions to. The stream can output to object file or to a logger.

class ppci.binutils.outstream.OutputStream

Interface to generate code with.

Contains the emit function to output instruction to the stream.

do_emit(item)

Actual emit implementation

emit(item)

Encode instruction and add symbol and relocation information

emit_all(items)

Emit all items from an iterable

select_section(name)

Switch output to certain section