IR-code

The purpose of an intermediate representation (IR) of a program is to decouple the implementation of a front-end from the implementation of a back-end. That is, front ends generate IR-code, optimizers optimize this code and lastly backends transform it into machine code or something else.

A good IR has several characteristics:

  • It should be simple enough for front-ends to generate code.
  • It should be rich enough to exploit the target instructions best.

The IR in the ppci.ir module has the following properties:

  • It is static single assignment form. Meaning a value can only be assigned once, and is then never changed. This has several advantages.
  • It contains only basic types. Structures, arrays and void types are not represented.

Top level structure

The IR-code is implemented in the ir package.

class ppci.ir.Module(name: str, debug_db=None)

Container unit for variables and functions.

add_external(external)

Add an externally located thing

add_function(function)

Add a function to this module

add_variable(variable)

Add a variable to this module

display()

Display this module

functions

Get all functions of this module

get_function(name: str)

Get a function with the given name

stats()

Returns a string with statistic information such as block count

variables

Get all variables of this module

External declarations

class ppci.ir.External(name)

External object

class ppci.ir.ExternalSubRoutine(name, argument_types)

External subroutine base class

class ppci.ir.ExternalProcedure(name, argument_types)

External procedure

class ppci.ir.ExternalFunction(name, argument_types, return_ty)

External function

Global declarations

class ppci.ir.Binding

Enum for public / private-ness of global values.

This can be used to keep value local to the module. This can be useful when you compile two modules with symbols with the same name. If the symbols are defined as local, this will not cause a name clash during linking.

class ppci.ir.Variable(name, binding, amount, alignment, value=None)

Global variable, reserves room in the data area. Has name and size

class ppci.ir.SubRoutine(name, binding)

Base class of function and procedure. These two differ in that a function returns a value, where as a procedure does not.

A subroutine contains basic blocks which refer to each other, forming a control flow graph (CFG). Each SubRoutine has a single entry basic block.

Design trade-off: In C, a void type is introduced to permit functions that return nothing (void). This seems somewhat artificial, but keeps things simple for the users. In pascal, the procedure and function types are explicit, and the void type is not needed. This is also the approach taken here.

So instead of a Function and Call types, we have Function, Procedure, FunctionCall and ProcedureCall types.

add_block(block)

Add a block to this function.

Parameters:block (Block) – the basic block to add.
add_parameter(parameter)

Add an argument to this function

block_names

Get the names of all the blocks in this function

calc_reachable_blocks()

Determine all blocks that can be reached

delete_unreachable()

Calculate all reachable blocks from entry and delete all others

dump()

Print this function

get_instructions()

Get all instructions in this routine.

get_out_calls()

Return the calls that leave this function.

is_function

Test if this routine is a function.

is_leaf()

Test if this procedure is a leaf function.

A leaf function calls no other functions.

is_procedure

Test if this routine is a procedure.

make_unique_name(dut)

Check if the name of the given dut is unique and if not make it so. Also add it to the used names

num_instructions()

Count the number of instructions contained in this function

remove_block(block)

Remove a block from this function

class ppci.ir.Procedure(name, binding)

A procedure definition that does not return a value.

Parameters:
  • name (str) – the name of the procedure
  • binding (Binding) – The linkage of this procedure
class ppci.ir.Function(name, binding, return_ty)

Represents a function.

A function always returns a value.

Parameters:
  • name (str) – the name of the procedure
  • binding (Binding) – The linkage of this procedure
  • return_ty – The return value of this function.

Basic block

class ppci.ir.Block(name)

Uninterrupted sequence of instructions.

A block is properly terminated if its last instruction is a FinalInstruction.

add_instruction(instruction)

Add an instruction to the end of this block

change_target(old, new)

Change the target of this block from old to new

delete()

Delete all instructions in this block, so it can be removed

first_instruction

Return this blocks first instruction

insert_instruction(instruction, before_instruction=None)

Insert an instruction at the front of the block

is_closed

Determine whether this block is propert terminated

is_empty

Determines whether the block is empty or not

is_entry

Check if this block is the entry block of a function

is_used

True if this block is referenced by an instruction

last_instruction

Gets the last instruction from the block

phis

Return all Phi instructions of this block

predecessors

Return all predecessing blocks

remove_instruction(instruction)

Remove instruction from block

replace_incoming(block, new_blocks)

For each phi node in the block, change the incoming branch of block into new block with the same variable.

successors

Get the direct successors of this block

Types

class ppci.ir.Typ(name)

Built in type representation

is_blob

Test if this type is bytes blob

is_integer

Test if this type is of integer type

is_signed

Test if this type is of signed integer type

is_unsigned

Test if this type is of unsigned integer type

class ppci.ir.BlobDataTyp(size: int, alignment: int)

The type of a opaque data blob.

Note that blob types can be compared by using the is operator:

>>> from ppci.ir import BlobDataTyp
>>> typ1 = BlobDataTyp(8, 8)
>>> typ2 = BlobDataTyp(8, 8)
>>> typ1 is typ2
True

These simple types are available:

ppci.ir.ptr

Pointer type

ppci.ir.i64

Signed 64-bit type

ppci.ir.i32

Signed 32-bit type

ppci.ir.i16

Signed 16-bit type

ppci.ir.i8

Signed 8-bit type

ppci.ir.u64

Unsigned 64-bit type

ppci.ir.u32

Unsigned 32-bit type

ppci.ir.u16

Unsigned 16-bit type

ppci.ir.u8

Unsigned 8-bit type

ppci.ir.f64

64-bit floating point type

ppci.ir.f32

32-bit floating point type

Instructions

The following instructions are available.

Memory instructions

class ppci.ir.Load(address, name, ty, volatile=False)

Load a value from memory.

Parameters:
  • address – The address to load the value from.
  • name – The name of the value after loading it from memory.
  • ty – The type of the value.
  • volatile – whether or not this memory access is volatile.
class ppci.ir.Store(value, address, volatile=False)

Store a value into memory

class ppci.ir.Alloc(name: str, amount: int, alignment: int)

Allocates space on the stack. The type of this value is a ptr

Data instructions

class ppci.ir.Const(value, name, ty)

Represents a constant value

class ppci.ir.LiteralData(data, name)

Instruction that contains labeled data. When generating code for this instruction, a label and its data is emitted in the literal area

class ppci.ir.Binop(a, operation, b, name, ty)

Generic binary operation

class ppci.ir.Cast(value, name, ty)

Base type conversion instruction

Control flow instructions

class ppci.ir.ProcedureCall(callee, arguments)

Call a procedure with some arguments

class ppci.ir.FunctionCall(callee, arguments, name, ty)

Call a function with some arguments and a return value

class ppci.ir.Jump(target)

Jump statement to another Block within the same function

class ppci.ir.CJump(a, cond, b, lab_yes, lab_no)

Conditional jump to true or false labels.

class ppci.ir.Return(result)

This instruction returns a value and exits the function.

This instruction is only legal in a Function.

class ppci.ir.Exit

Instruction that exits the procedure.

Note that this instruction can only be used in a Procedure.

Other

class ppci.ir.Phi(name, ty)

Imaginary phi instruction to make SSA possible.

The phi instruction takes a value input for each basic block which can reach the basic block in which this phi instruction is placed. So for each incoming branch, there is a value.

The phi instruction is an artificial instruction which allows the IR-code to be in SSA form.

class ppci.ir.Undefined(name: str, ty: ppci.ir.Typ)

Undefined value, this value must never be used.

Abstract instruction classes

There are some abstract instructions, which cannot be used directly but serve as base classes for other instructions.

class ppci.ir.Instruction

Base class for all instructions that go into a basic block

function

Return the function this instruction is part of

class ppci.ir.Value(name: str, ty: ppci.ir.Typ)

Base of all values

is_used

Determine whether this value is used anywhere

class ppci.ir.FinalInstruction

Final instruction in a basic block.

This instruction terminates the basic block. No more instruction may appear after this instruction.

Uml

digraph "classes_foo" {
charset="utf-8"
rankdir=BT
"0" [label="{AddressOf|src\lsrc\l|}", shape="record"];
"1" [label="{Alloc|alignment\lamount\l|}", shape="record"];
"2" [label="{BasicTyp|bits\lsize\l|}", shape="record"];
"3" [label="{Binding|GLOBAL : str\lLOCAL : str\l|}", shape="record"];
"4" [label="{Binop|a\la\lb\lb\loperation\lops : list\l|}", shape="record"];
"5" [label="{BlobDataTyp|alignment\lsize\l|}", shape="record"];
"6" [label="{Block|first_instruction\lfunction : NoneType\linstructions : list\lis_closed\lis_empty\lis_entry\lis_used\llast_instruction\lname\lphis\lpredecessors\lreferences : OrderedSet\lsuccessors\l|add_instruction(instruction)\lchange_target(old, new)\ldelete()\ldump()\linsert_instruction(instruction, before_instruction)\lremove_instruction(instruction)\lreplace_incoming(block, new_blocks)\l}", shape="record"];
"7" [label="{CJump|a\la\lb\lb\lcond\lconditions : list\llab_no\llab_no\llab_yes\llab_yes\l|}", shape="record"];
"8" [label="{Cast|src\lsrc\l|}", shape="record"];
"9" [label="{Const|value\l|}", shape="record"];
"10" [label="{CopyBlob|amount\ldst\ldst\lsrc\lsrc\l|}", shape="record"];
"11" [label="{Exit|targets : list\l|}", shape="record"];
"12" [label="{External|\l|}", shape="record"];
"13" [label="{ExternalFunction|return_ty\l|}", shape="record"];
"14" [label="{ExternalProcedure|\l|}", shape="record"];
"15" [label="{ExternalSubRoutine|argument_types\l|}", shape="record"];
"16" [label="{ExternalVariable|\l|}", shape="record"];
"17" [label="{FinalInstruction|\l|}", shape="record"];
"18" [label="{FloatingPointTyp|\l|}", shape="record"];
"19" [label="{Function|return_ty\l|}", shape="record"];
"20" [label="{FunctionCall|arguments\lcallee\lcallee\l|replace_use(old, new)\l}", shape="record"];
"21" [label="{GlobalValue|binding\l|}", shape="record"];
"22" [label="{InlineAsm|clobbers\linput_values : list\loutput_values : list\ltemplate\l|add_input_variable(value)\ladd_output_variable(value)\lreplace_use(old, new)\l}", shape="record"];
"23" [label="{Instruction|block : NoneType\lfunction\lis_phi\lis_terminator\lposition\luses : OrderedSet\l|add_use(value)\ldel_use(v)\ldelete()\lremove_from_block()\lreplace_use(old, new)\l}", shape="record"];
"24" [label="{IntegerTyp|\l|}", shape="record"];
"25" [label="{Jump|target\ltarget\l|}", shape="record"];
"26" [label="{JumpBase|targets\l|change_target(old, new)\ldelete()\lset_target_block(name, block)\l}", shape="record"];
"27" [label="{JumpTable|lab_default\llab_default\ltable\lv\lv\l|}", shape="record"];
"28" [label="{LiteralData|data\l|}", shape="record"];
"29" [label="{Load|address\laddress\lvolatile : bool\l|}", shape="record"];
"30" [label="{LocalValue|\l|used_in_blocks()\l}", shape="record"];
"31" [label="{Module|debug_db : NoneType\lexternals : list\lfunctions\lname\lvariables\l|add_external(external)\ladd_function(function)\ladd_variable(variable)\ldisplay()\lget_function(name)\lstats()\l}", shape="record"];
"32" [label="{Parameter|\l|}", shape="record"];
"33" [label="{Phi|inputs : dict\l|del_incoming(block)\lget_value(block)\lreplace_use(old, new)\lset_incoming(block, value)\l}", shape="record"];
"34" [label="{PointerTyp|\l|}", shape="record"];
"35" [label="{Procedure|\l|}", shape="record"];
"36" [label="{ProcedureCall|arguments\lcallee\lcallee\l|replace_use(old, new)\l}", shape="record"];
"37" [label="{Return|result\lresult\ltargets : list\l|}", shape="record"];
"38" [label="{SignedIntegerTyp|signed : bool\l|}", shape="record"];
"39" [label="{Store|address\laddress\lvalue\lvalue\lvolatile : bool\l|}", shape="record"];
"40" [label="{SubRoutine|arguments : list\lblock_names\lblocks : list\ldefined_names : OrderedSet\lentry : NoneType\lis_function\lis_procedure\llogger : NoneType, RootLogger\lunique_counter : int\l|add_block(block)\ladd_parameter(parameter)\lcalc_reachable_blocks()\ldelete_unreachable()\ldump()\lget_instructions()\lget_instructions_of_type(typ)\lget_out_calls()\lis_leaf()\lmake_unique_name(dut)\lnum_instructions()\lremove_block(block)\l}", shape="record"];
"41" [label="{Typ|is_blob\lis_integer\lis_signed\lis_unsigned\lname\l|}", shape="record"];
"42" [label="{Undefined|\l|}", shape="record"];
"43" [label="{Unop|a\la\loperation\lops : list\l|}", shape="record"];
"44" [label="{UnsignedIntegerTyp|signed : bool\l|}", shape="record"];
"45" [label="{Value|is_used\lname\lty\luse_count\lused_by : OrderedSet\l|add_user(i)\ldel_user(i)\lreplace_by(value)\l}", shape="record"];
"46" [label="{Variable|alignment\lamount\lvalue : NoneType, tuple\l|}", shape="record"];
"0" -> "30" [arrowhead="empty", arrowtail="none"];
"1" -> "30" [arrowhead="empty", arrowtail="none"];
"2" -> "41" [arrowhead="empty", arrowtail="none"];
"4" -> "30" [arrowhead="empty", arrowtail="none"];
"5" -> "41" [arrowhead="empty", arrowtail="none"];
"7" -> "26" [arrowhead="empty", arrowtail="none"];
"8" -> "30" [arrowhead="empty", arrowtail="none"];
"9" -> "30" [arrowhead="empty", arrowtail="none"];
"10" -> "23" [arrowhead="empty", arrowtail="none"];
"11" -> "17" [arrowhead="empty", arrowtail="none"];
"12" -> "21" [arrowhead="empty", arrowtail="none"];
"13" -> "15" [arrowhead="empty", arrowtail="none"];
"14" -> "15" [arrowhead="empty", arrowtail="none"];
"15" -> "12" [arrowhead="empty", arrowtail="none"];
"16" -> "12" [arrowhead="empty", arrowtail="none"];
"17" -> "23" [arrowhead="empty", arrowtail="none"];
"18" -> "2" [arrowhead="empty", arrowtail="none"];
"19" -> "40" [arrowhead="empty", arrowtail="none"];
"20" -> "30" [arrowhead="empty", arrowtail="none"];
"21" -> "45" [arrowhead="empty", arrowtail="none"];
"22" -> "23" [arrowhead="empty", arrowtail="none"];
"24" -> "2" [arrowhead="empty", arrowtail="none"];
"25" -> "26" [arrowhead="empty", arrowtail="none"];
"26" -> "17" [arrowhead="empty", arrowtail="none"];
"27" -> "26" [arrowhead="empty", arrowtail="none"];
"28" -> "30" [arrowhead="empty", arrowtail="none"];
"29" -> "30" [arrowhead="empty", arrowtail="none"];
"30" -> "23" [arrowhead="empty", arrowtail="none"];
"30" -> "45" [arrowhead="empty", arrowtail="none"];
"32" -> "30" [arrowhead="empty", arrowtail="none"];
"33" -> "30" [arrowhead="empty", arrowtail="none"];
"34" -> "41" [arrowhead="empty", arrowtail="none"];
"35" -> "40" [arrowhead="empty", arrowtail="none"];
"36" -> "23" [arrowhead="empty", arrowtail="none"];
"37" -> "17" [arrowhead="empty", arrowtail="none"];
"38" -> "24" [arrowhead="empty", arrowtail="none"];
"39" -> "23" [arrowhead="empty", arrowtail="none"];
"40" -> "21" [arrowhead="empty", arrowtail="none"];
"42" -> "30" [arrowhead="empty", arrowtail="none"];
"43" -> "30" [arrowhead="empty", arrowtail="none"];
"44" -> "24" [arrowhead="empty", arrowtail="none"];
"46" -> "21" [arrowhead="empty", arrowtail="none"];
}