Jitting

Warning

This section is a work in progress. It outlines how things should work, but it is not thoroughly tested. Also, keep in mind that C support is still immature. An alternative is c3.

This howto explains how to JIT (just-in-time) compile code and use it from python. At some point, a piece of your python code may become a performance bottleneck. When that happens, you have multiple options:

  • Rewrite your code in C/Fortran/Rust/Go/Swift, compile it to machine code with a compiler such as gcc and load it with swig/ctypes.
  • Use pypy, which contains a built-in JIT compiler. Running your code with pypy usually makes it faster.

There are many more options, but in this howto we will use ppci to compile and load a piece of code that forms a bottleneck.

To do this, first we need some example code. Take the following function as an example:

def x(a, b):
    return a + b + 13

This function does some magic calculations :)

>>> x(2, 3)
18

C-way

Suppose that profiling shows this function to be a major bottleneck. We then decide to rewrite it in C:

int x(int a, int b)
{
  return a + b + 13;
}

We put this function in a python string and compile it:

>>> from ppci import api
>>> import io
>>> src = io.StringIO("""
... int x(int a, int b) {
...   return a + b + 13;
... }""")
>>> arch = api.get_current_arch()
>>> obj = api.cc(src, arch, debug=True)
>>> obj  
CodeObject of ... bytes

Now that the object is compiled, we can load it into the current python process:

>>> from ppci.utils.codepage import load_obj
>>> m = load_obj(obj)
>>> dir(m)  
[..., 'x']
>>> m.x  
<CFunctionType object at ...>

Now, let's call the function:

>>> m.x(2, 3)
18

Follow-up

Instead of translating our code to C, we can also compile python directly, by using type hints and a restricted subset of the python language. For this we can use the load_py function from the ppci.lang.python module:

>>> from ppci.lang.python import load_py
>>> f = io.StringIO("""
... def x(a: int, b: int) -> int:
...     return a + b + 13
... """)
>>> n = load_py(f)
>>> n.x(2, 3)
18

By doing this, we do not need to reimplement the function in C; we only need to add some type hints to make it work. This may be preferable to writing C. Please note that integer arithmetic is unbounded in python, but not in compiled code, where integers have a fixed width.
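The difference in integer behaviour can be illustrated in pure python. The sketch below assumes a 32-bit signed integer, which is common but not guaranteed for every target, and it assumes that overflow wraps around in two's complement. The helper wrap_int32 is hypothetical and not part of ppci.

```python
import ctypes


def x(a: int, b: int) -> int:
    return a + b + 13


def wrap_int32(value: int) -> int:
    """Reduce value to the range of a signed 32-bit integer (hypothetical helper)."""
    # ctypes does no overflow checking; the value is truncated to 32 bits.
    return ctypes.c_int32(value).value


big = 2**31 - 1  # largest signed 32-bit value

exact = x(big, 0)            # python integers grow as needed
wrapped = wrap_int32(exact)  # what a wrapping 32-bit result would look like

print(exact)    # 2147483660
print(wrapped)  # -2147483636
```

If the inputs can approach these limits, the compiled function and the python function may return different results.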

Calling python functions

To call back into python functions from compiled code, we can do the following:

Warning

The code below is only an idea; it does not work yet!

>>> func = lambda x: print('x=', x)
>>> f = io.StringIO("""
... def x(a: int, b: int) -> int:
...     func(a+3)
...     return a + b + 13
... """)
>>> o = load_py(f, functions={'func': func})
>>> o.x(2, 3)
18
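Since this feature is not implemented, the following is only a sketch of one mechanism that could underlie such callbacks: wrapping a python callable in a C function pointer with the standard ctypes module. It does not use ppci at all, and the names CALLBACK_TYPE, c_func and seen are illustrative assumptions.

```python
import ctypes

seen = []


def func(value):
    # Record the argument instead of printing, so the effect is easy to inspect.
    seen.append(value)
    return 0


# C signature of the callback: int func(int)
CALLBACK_TYPE = ctypes.CFUNCTYPE(ctypes.c_int, ctypes.c_int)

# c_func is a C-callable function pointer that forwards to the python function.
# Compiled code could be handed its address and call back into python.
c_func = CALLBACK_TYPE(func)

c_func(2 + 3)
print(seen)  # [5]
```

Keep a reference to the wrapper object for as long as compiled code may call it; ctypes callbacks become invalid once they are garbage collected.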

Benchmarking

Now for an interesting plot twist, let's compare the two functions in a benchmark:

>>> import timeit
>>> timeit.timeit('x(2,3)', number=100000, globals={'x': x})
0.015114138000171806
>>> timeit.timeit('x(2,3)', number=100000, globals={'x': m.x})
0.07410199400010242

It turns out that the compiled code is actually slower, in this measurement by roughly a factor of five. This can be due to the overhead of calling C functions or to poor code generation. Lessons learned: first profile, then try pypy, then improve the python code, and only as a last resort convert your code to C.
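A single timeit run can be noisy. A slightly more robust measurement, sketched here with the standard library only, repeats the run several times and keeps the fastest result. The helper best_time is hypothetical. To measure the compiled function, pass m.x from the C example instead of the pure python x.

```python
import timeit


def x(a, b):
    return a + b + 13


def best_time(func, number=100_000, repeat=5):
    """Return the fastest of several timing runs, in seconds (hypothetical helper)."""
    # The lambda adds a small, constant call overhead to every measurement.
    timings = timeit.repeat(lambda: func(2, 3), number=number, repeat=repeat)
    return min(timings)


print(f"pure python: {best_time(x):.4f} s")
```

Taking the minimum reduces the influence of other processes that happen to run during the benchmark.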

Warning

Before optimizing anything, run a profiler. Your expectations about performance bottlenecks might be wrong!
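As a minimal sketch of such a profiling run, the standard cProfile and pstats modules can be used as follows. The function workload is a hypothetical stand-in for a real program.

```python
import cProfile
import io
import pstats


def x(a, b):
    return a + b + 13


def workload():
    # Stand-in for the real program: call the suspected hot function many times.
    total = 0
    for i in range(10_000):
        total += x(i, i)
    return total


profiler = cProfile.Profile()
profiler.enable()
result = workload()
profiler.disable()

# Report the five entries with the highest cumulative time.
report = io.StringIO()
pstats.Stats(profiler, stream=report).sort_stats("cumulative").print_stats(5)
print(report.getvalue())
```

The report lists call counts and time per function, which shows whether x is really where the time goes.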