What's new in Cython 0.27?

Stefan Behnel

2017-09-23 07:15

Cython 0.27 is freshly released and comes with several great improvements. It finally implements all major features that were added in CPython 3.6, such as async generators and variable annotations. The long list of new features and resolved issues can be found in the changelog, or in the list of resolved tickets.

Probably the biggest new feature is the support for asynchronous generators and asynchronous comprehensions, as specified in PEP 525 and PEP 530 respectively. They allow using yield inside of async coroutines and await inside of comprehensions, so that the following becomes possible:

async def generate_results(source):
    async for i in source:
        yield i ** 2
    ...
    d = {i: await result for i, result in enumerate(async_results)}
    ...
    l = [s for c in more_async_results
         for s in await c]

As usual, this feature is available in Cython compiled modules across all supported Python versions, starting with Python 2.6. However, using async cleanup in generators, e.g. in a finally-block, requires CPython 3.6 in order to remember which I/O-loop the generator must use. Async comprehensions do not suffer from this.
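For readers who want to try the two features in isolation, here is a minimal, self-contained sketch in plain Python syntax. The helper names (numbers, double, main, run_demo) are made up for illustration and are not part of Cython or of the example above. It assumes Python 3.6 or later when run uncompiled.

```python
import asyncio


async def numbers(n):
    # Hypothetical async source: yields 0..n-1, giving up control each step.
    for i in range(n):
        await asyncio.sleep(0)
        yield i


async def generate_results(source):
    # Async generator (PEP 525): "yield" inside an "async def" function.
    async for i in source:
        yield i ** 2


async def double(x):
    # Hypothetical coroutine that produces a result after one loop iteration.
    await asyncio.sleep(0)
    return 2 * x


async def main():
    # Async comprehension (PEP 530): "async for" inside a comprehension.
    squares = [x async for x in generate_results(numbers(4))]
    # "await" inside a comprehension over a list of awaitables.
    doubled = {i: await result
               for i, result in enumerate([double(1), double(2)])}
    return squares, doubled


def run_demo():
    # Run main() on a fresh event loop and close the loop afterwards.
    loop = asyncio.new_event_loop()
    try:
        return loop.run_until_complete(main())
    finally:
        loop.close()


if __name__ == "__main__":
    print(run_demo())
```

The same source should also compile as a Cython module, although that was not verified for this sketch.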

The next big and long awaited feature is support for PEP 484 compatible typing. Both signature annotations (PEP 484) and variable annotations (PEP 526) are now parsed for Python types and cython.* types like list or cython.int. Complex types like List[int] are not currently evaluated, as their semantics are less clear in the context of static compilation. Support for them will be added in a future release.

One special twist here is exception handling, which tries to mimic Python more closely than the defaults in Cython code. Thus, it is no longer necessary to explicitly declare an exception return value in code like this:

@cython.cfunc
def add_1(x: cython.int) -> cython.int:
    if x < 0:
        raise ValueError("...")
    return x + 1

Cython will automatically return -1 (the default exception value for C integer types) when an exception is raised, and callers will check for a pending exception whenever the function returns that value. This is identical to the Cython signature declaration except? -1.
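The idea behind such an exception return value can be modelled in a few lines of plain Python. This is only a conceptual sketch, not the C code that Cython generates. The names c_add_1, call_add_1 and _pending_error are hypothetical, and the error threshold is changed to x < -10 (an assumption made for this sketch) so that -1 can also occur as a regular result.

```python
# Pure-Python model of the "except? -1" calling convention.
# _pending_error stands in for the C-level error indicator.
_pending_error = None


def c_add_1(x):
    """Models the C function: it cannot raise, it can only return an int."""
    global _pending_error
    if x < -10:  # hypothetical threshold, see the note above
        _pending_error = ValueError("input too small")
        return -1  # the exception return value
    return x + 1


def call_add_1(x):
    """Models the caller: look for an error only when -1 comes back."""
    global _pending_error
    result = c_add_1(x)
    if result == -1 and _pending_error is not None:
        # -1 was an error signal: clear the indicator and raise.
        error, _pending_error = _pending_error, None
        raise error
    # Either a normal value, or a legitimate -1 without a pending error.
    return result
```

The question mark in except? -1 corresponds to the second half of the condition: -1 alone is not proof of an error, so the caller has to look at the error indicator as well. All other return values skip that check entirely.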

In cases where annotations are not meant as static type declarations during compilation, the extraction can be disabled with the compiler directive annotation_typing=False.

The new release brings another long awaited feature: automatic __richcmp__() generation. Previously, extension types differed substantially from Python classes in how they implemented the special comparison methods __eq__, __lt__ etc. Users had to write a single special __richcmp__() method that handled all comparisons at once. Now, Cython can automatically generate an efficient __richcmp__() method from the normal comparison methods, including inherited base type implementations. This brings Python classes and extension types another huge step closer.
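To illustrate what this saves, the following plain Python sketch models the dispatch that a hand-written __richcmp__(self, other, op) had to perform. It is an approximation for illustration only and not the code that Cython generates. The class Version and the helper richcmp are hypothetical, while the operator codes are the ones defined by the CPython C-API.

```python
# Comparison operator codes as defined by the CPython C-API.
Py_LT, Py_LE, Py_EQ, Py_NE, Py_GT, Py_GE = range(6)


class Version:
    """Hypothetical type; in Cython this would be a 'cdef class'."""

    def __init__(self, number):
        self.number = number

    # With Cython 0.27, writing the normal methods like this is enough.
    def __eq__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self.number == other.number

    def __lt__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self.number < other.number


_METHOD_FOR_OP = {
    Py_LT: "__lt__", Py_LE: "__le__", Py_EQ: "__eq__",
    Py_NE: "__ne__", Py_GT: "__gt__", Py_GE: "__ge__",
}


def richcmp(self, other, op):
    """Model of the single entry point that receives every comparison."""
    method = getattr(type(self), _METHOD_FOR_OP[op], None)
    if method is None:
        return NotImplemented
    return method(self, other)
```

Comparisons that the class does not define, such as <= here, fall through to the defaults inherited from object and yield NotImplemented, which is also what Python does for ordinary classes.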

To also bring extension modules closer to Python modules, Cython now implements the new extension module initialisation process of PEP 489 in CPython 3.5 and later. This makes the special global names like __file__ and __path__ correctly available to module level code and improves the support for module-level relative imports. As with most internal features, existing Cython code will benefit from this by simple recompilation.

As a last feature worth mentioning, the IPython/Jupyter magic integration gained a new option %%cython --pgo for easy profile guided optimisation. This allows the C compiler to make better decisions during its optimisation phase based on a (hopefully) realistic runtime profile. The option compiles the cell with PGO settings for the C compiler, executes it to generate the runtime profile, and then compiles it again using that profile for C compiler optimisation. This is currently only tested with gcc. Support for other compilers can easily be added to the IPythonMagic.py module and pull requests are welcome, as usual.

By design, the Jupyter cell itself is responsible for generating a suitable profile. This can be done by implementing the functions that should be optimised via PGO, and then calling them directly in the same cell on some realistic training data like this:

%%cython --pgo
def critical_function(data):
    for item in data:
        ...

# execute function several times to build profile
from somewhere import some_typical_data
for _ in range(100):
    critical_function(some_typical_data)

Together with the improved module initialisation in Python 3.5 and later, you can now also distinguish between the profile and non-profile runs as follows:

if "_pgo_" in __name__:
    ...  # execute critical code here