Register allocation

Instructions generated during instruction selection phase use virtual registers and some physical registers (e.g. when an instruction expects arguments in particular register(s)). Register allocation is the process of assigning physical location (register or memory) to the remaining virtual registers.

Some key concepts in the domain of register allocation are:

  • virtual register: A location which must be mapped to a physical register.
  • physical register: A real CPU register
  • interference graph: A graph in which each node represents a location. An edge indicates that the two locations cannot have the same physical register assigned.
  • pre-colored register: A location that is already assigned a specific physical register.
  • coalescing: The process of merging two nodes in an interference graph which do not interfere and are connected by a move instruction.
  • spilling: Spilling is the process when no physical register can be found for a certain virtual register. Then this value will be placed in memory.
  • register class: Most CPU’s contain registers grouped into classes. For example, there may be registers for floating point, and registers for integer operations.
  • register alias: Some registers may alias to registers in another class. A good example are the x86 registers rax, eax, ax, al and ah.

Interference graph

Each instruction in the instruction list may use or define certain registers. A register is live between a definition and a use of that register. Registers that are live at the same point in the program interfere with each other. An interference graph is a graph in which each node represents a register and each edge represents interference between those two registers.

Graph coloring

In 1981 Chaitin presented the idea to use graph coloring for register allocation.

In a graph a node can be colored if it has less neighbours than possible colors (referred to as K from now on). This is true because when each neighbour has a different color, there is still a valid color left for the node itself.

The outline of the algorithm is: Given a graph, if a node can be colored, remove this node from the graph and put it on a stack. When added back to the graph, it can be given a color. Now repeat this process recursively with the remaining graph. When the graph is empty, place all nodes back from the stack one by one and assign each node a color when placed. Remember that when adding back, a color can be found, because this was the criteria during removal.

See: https://en.wikipedia.org/wiki/Chaitin%27s_algorithm

[Chaitin1982]

Coalescing

Coalescing is the process of merging two nodes in an interference graph. This means that two temporaries will be assigned to the same register. This is especially useful if the temporaries are used in move instructions, since when the source and the destination of a move instruction are the same register, the move can be deleted.

Coalescing a move instruction is easy when an interference graph is present. Two nodes that are used by a move instruction can be coalesced when they do not interfere.

However, if we coalesce too many moves, the graph can become uncolorable, and spilling has to be done. To prevent spilling, coalescing must be done in a controlled manner.

A conservative approach to the coalescing is the following: if the merged node has fewer than K neighbours, then the nodes can be coalesced. The reason for this is that when all nodes that can be colored are removed and the merged node and its non-colored neighbours remain, the merged node can be colored. This ensures that the coalescing of the nodes does not have a negative effect on the colorability of the graph.

[Briggs1994]

Spilling

Iterated register coalescing

Iterated register coalescing (IRC) is a combination of graph coloring, coalescing and spilling. The process consists of the following steps:

  • build an interference graph from the instruction list
  • remove trivial colorable nodes.
  • coalesce registers to remove redundant moves
  • spill registers
  • select registers

See: https://en.wikipedia.org/wiki/Register_allocation

[George1996]

Graph coloring with more register classes

Most instruction sets are not uniform, and hence simple graph coloring cannot be used. The algorithm can be modified to work with several register classes that possibly interfere.

[Runeson2003] [Smith2004]

Implementations

The following class can be used to perform register allocation.

class ppci.codegen.registerallocator.GraphColoringRegisterAllocator(arch: ppci.arch.arch.Architecture, instruction_selector)

Target independent register allocator.

Algorithm is iterated register coalescing by Appel and George. Also the pq-test algorithm for more register classes is added.

alloc_frame(frame: ppci.arch.stack.Frame)

Do iterated register allocation for a single frame.

This is the entry function for the register allocator and drives through all stages of the algorithm.

Parameters:frame – The frame to perform register allocation on.
apply_colors()

Assign colors to registers

assign_colors()

Add nodes back to the graph to color it.

Any potential spills might turn into real spills at this stage.

calc_num_blocked(node)

Calculate for the given node how many registers are blocked by it’s adjecent nodes.

This is an advanced form of a nodes degree, but then for register of different classes.

check_invariants()

Test invariants

coalesc()

Coalesc moves conservative.

This means, merge the variables of a move into one variable, and delete the move. But do this only when no spill will occur.

combine(u, v)

Combine u and v into one node, updating work lists

common_reg_class

Determine the smallest common register class of two nodes

conservative(u, v)

Briggs conservative criteria for coalescing.

If the result of the merge has fewer than K nodes that are not trivially colorable, then coalescing is safe, because when coloring, all other nodes that can be colored will be popped from the graph, leaving the merged node that then can be colored.

In the case of multiple register classes, first determine the new neighbour nodes. Then assume that all nodes that can be colored will be colored, and are taken out of the graph. Then calculate how many registers can be blocked by the remaining nodes. If this is less than the number of available registers, the coalesc is safe!

decrement_degree(m)

If a node was lowered in degree, check if there are nodes that can be moved from the spill list to the freeze of simplify list

freeze()

Give up coalescing on some node, move it to the simplify list and freeze all moves associated with it.

freeze_moves(u)

Freeze moves for node u

has_edge(t, r)

Helper function to check for an interfering edge

init_data(frame: ppci.arch.stack.Frame)

Initialize data structures

is_colorable(node) → bool

Helper function to determine whether a node is trivially colorable. This means: no matter the colors of the nodes neighbours, the node can be given a color.

In case of one register class, this is: n.degree < self.K

In case of more than one register class, somehow the worst case damage by all neighbours must be determined.

We do this now with the pq-test.

Check if a node is used by move instructions

Associate move with its source and destination

ok(t, r)

Implement coalescing testing with pre-colored register

q

The number of class B registers that can be blocked by class C.

release_pressure(node, reg_class)

Remove some register pressure from the given node.

remove_redundant_moves()

Remove coalesced moves

rewrite_program(node)

Rewrite program by creating a load and a store for each use

select_spill()

Select potential spill node.

Select a node to be spilled. This is optimistic, since this might be turned into a real spill. Continue nevertheless, to discover more potential spills, or we might be lucky and able to color the graph any ways.

simplify()

Remove nodes from the graph

class ppci.codegen.registerallocator.MiniGen(arch, selector)

Spill code generator

gen(frame, tree)

Generate code for a given tree

gen_load(frame, vreg, slot)

Generate instructions to load vreg from a stack slot

gen_store(frame, vreg, slot)

Generate instructions to store vreg at a stack slot

make_fmt(vreg)

Determine the type suffix, such as I32 or F64.