JITting
=======

.. warning::

    This section is a work in progress. It outlines how things should
    work, but it has not been thoroughly tested. Also, keep in mind that
    C support is still at a very early stage. An alternative is C3.

This howto is about how to JIT (just-in-time-compile) code and use it from
Python. At some point, a piece of your Python code may become a performance
bottleneck. When that happens, you have multiple options:

- Rewrite your code in C/Fortran/Rust/Go/Swift, compile it to machine code
  with GCC or a similar compiler and load it with SWIG/ctypes.
- Use PyPy, which contains a built-in JIT compiler. Using PyPy usually means
  more speed.
- Use a specialized JIT engine, like Numba.

In this howto we will implement our own specialized JIT engine, using PPCI as
a backend.

To do this, we first need some example code. Take the following function as an
example:

.. testcode:: jitting

    def x(a, b):
        return a + b + 13

This function does some magic calculations :)

.. doctest:: jitting

    >>> x(2, 3)
    18

C-way
-----

Now, after profiling, we might discover that this function is a bottleneck.
We may decide to rewrite it in C:

.. code-block:: c

    int x(int a, int b)
    {
        return a + b + 13;
    }

We then put this function in a Python string and compile it:

.. doctest:: jitting

    >>> from ppci import api
    >>> import io
    >>> src = io.StringIO("""
    ... int x(int a, int b) {
    ...     return a + b + 13;
    ... }""")
    >>> arch = api.get_current_arch()
    >>> obj = api.cc(src, arch, debug=True)
    >>> obj  # doctest: +ELLIPSIS
    CodeObject of ... bytes

Now that the object is compiled, we can load it into the current Python
process:

.. doctest:: jitting

    >>> from ppci.utils.codepage import load_obj
    >>> m = load_obj(obj)
    >>> dir(m)  # doctest: +ELLIPSIS
    [..., 'x']
    >>> m.x  # doctest: +ELLIPSIS
    <function x at 0x...>

Now, let's call the function:

.. doctest:: jitting

    >>> m.x(2, 3)
    18

Python-way
----------

Instead of translating our code to C, we can also compile Python code
directly, by using type hints and a restricted subset of the Python language.
For this we can use the :mod:`ppci.lang.python` module:

.. doctest:: jitting

    >>> from ppci.lang.python import load_py
    >>> f = io.StringIO("""
    ... def x(a: int, b: int) -> int:
    ...     return a + b + 13
    ... """)
    >>> n = load_py(f)
    >>> n.x(2, 3)
    18

By doing this, we do not need to reimplement the function in C; we only need
to add some type hints to make it work. This may be preferable to C. Please
note that integer arithmetic is arbitrary-precision in Python, but with the
compiled code above, large values will silently wrap around.

To easily compile some of your Python functions to native code, use the
:func:`ppci.lang.python.jit` decorator:

.. testcode:: jitting

    from ppci.lang.python import jit

    @jit
    def y(a: int, b: int) -> int:
        return a + b + 13

Now the function can be called like a normal function; JIT compilation and
the call into native code are handled transparently:

.. doctest:: jitting

    >>> y(2, 3)
    18

Calling Python functions from native code
-----------------------------------------

To call back into Python functions from native code, we can do the following:

.. doctest:: jitting

    >>> def callback_func(x: int) -> None:
    ...     print('x=', x)
    ...
    >>> f = io.StringIO("""
    ... def x(a: int, b: int) -> int:
    ...     func(a+3)
    ...     return a + b + 13
    ... """)
    >>> o = load_py(f, imports={'func': callback_func})
    >>> o.x(2, 3)
    x= 5
    18

Benchmarking and call overheads
-------------------------------

To conclude this section, let's benchmark the original function ``x`` with
which we started this section, and its JIT counterpart:

.. code:: python

    >>> import timeit
    >>> timeit.timeit('x(2,3)', number=100000, globals={'x': x})
    0.015114138000171806
    >>> timeit.timeit('x(2,3)', number=100000, globals={'x': m.x})
    0.07410199400010242

It turns out that the compiled code is actually slower, by roughly a factor of
five in this measurement. This is because, for a trivial function like this
one, argument conversion and call preparation overheads dominate the execution
time. To see the benefits of native code execution, we would need to JIT
functions which perform many operations in a loop, e.g. while processing large
arrays.

.. warning::

    Before optimizing anything, run a profiler. Your expectations about
    performance bottlenecks might be wrong!
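
As a minimal sketch of that advice, the following uses only the standard
library modules ``cProfile`` and ``pstats``. The functions ``cheap``,
``hot_loop`` and ``workload`` are hypothetical stand-ins for application code;
they are not part of PPCI:

.. code-block:: python

    import cProfile
    import io
    import pstats


    def cheap(a, b):
        # Trivial work: call overhead dominates, so a poor JIT candidate.
        return a + b + 13


    def hot_loop(n):
        # Many operations inside a single call.
        total = 0
        for i in range(n):
            total += i * i + 13
        return total


    def workload():
        # Hypothetical program: many cheap calls plus one heavy loop.
        for _ in range(1000):
            cheap(2, 3)
        return hot_loop(200_000)


    def profile_workload():
        # Run the workload under the profiler and collect a text report.
        profiler = cProfile.Profile()
        result = profiler.runcall(workload)

        out = io.StringIO()
        stats = pstats.Stats(profiler, stream=out)
        # Show the five entries with the highest cumulative time.
        stats.sort_stats("cumulative").print_stats(5)
        return result, out.getvalue()


    result, report = profile_workload()
    print(report)

If the report shows that most of the cumulative time is spent in a loop-heavy
function such as ``hot_loop``, that function is likely a more promising
candidate for native compilation than a trivial one such as ``cheap``.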