Archive for the ‘Planet Python’ Category

Cython close to hand-crafted C code for generators

Mittwoch, April 6th, 2011

I did a couple of experiments compiling itertools with the new generator support in Cython. In CPython, the itertools module is actually written in hand tuned C and does very little computation in its generators, so I knew it would be hard to reach with generated code. But Cython does a pretty good job.

Something as trivial as chain() is exactly as fast as in the C implementation, but compared to the more than 60 lines of C code, it is certainly a lot more readable in Cython:

def chain(*iterables):
    """Make an iterator that returns elements from the first iterable
    until it is exhausted, then proceeds to the next iterable, until
    all of the iterables are exhausted. Used for treating consecutive
    sequences as a single sequence.
    """
    for it in iterables:
        for element in it:
            yield element

Other functions, like islice(), are faster in C, partly because CPython actually takes a couple of shortcuts, e.g. by only looking up the iterator slot method once. You cannot do that in Python code, and I wanted to keep the implementation compatible with regular Python. Specifically, the C speed advantage for islice() is currently about 30-50% in general, although the Cython implementation can also be up to 10% faster for some cases, e.g. when extracting only a couple of items from the middle of a longer sequence. The C implementation is about 90 lines, here is the Cython implementation:

import sys
import cython

# Python 2/3 compatibility
_max_size = cython.declare(cython.Py_ssize_t,
        getattr(sys, "maxsize", getattr(sys, "maxint", None)))

@cython.locals(i=cython.Py_ssize_t, nexti=cython.Py_ssize_t,
               start=cython.Py_ssize_t, stop=cython.Py_ssize_t, step=cython.Py_ssize_t)
def islice(iterable, *args):
    """Make an iterator that returns selected elements from the
    iterable. If start is non-zero, then elements from the iterable
    are skipped until start is reached. Afterward, elements are
    returned consecutively unless step is set higher than one which
    results in items being skipped. If stop is None, then iteration
    continues until the iterator is exhausted, if at all; otherwise,
    it stops at the specified position. Unlike regular slicing,
    islice() does not support negative values for start, stop, or
    step. Can be used to extract related fields from data where the
    internal structure has been flattened (for example, a multi-line
    report may list a name field on every third line).
    """
    s = slice(*args)
    start = s.start or 0
    stop = s.stop or _max_size
    step = s.step or 1
    if start < 0:
        raise ValueError("...")
    if step < 1:
        raise ValueError("...")
    if start >= stop:
        return
    nexti = start
    for i, element in enumerate(iterable):
        if i == nexti:
            yield element
            nexti += step
            if nexti >= stop or nexti < 0:
                return

Here is one that is conceptually quite simple: count(). I had to optimise it quite a bit, because the iteration code in the C code is extremely tight. Even the tuned version below runs about 10% slower than the hand tuned C version, which is about 230 lines long.

@cython.locals(i=cython.Py_ssize_t)
def count(n=0):
    """Make an iterator that returns consecutive integers starting
    with n. If not specified n defaults to zero. Often used as an argument to imap()
    to generate consecutive data points. Also, used with zip() to add
    sequence numbers.
    """
    try:
        i = n
    except OverflowError:
        i = _max_size # skip i-loop
    else:
        n = _max_size # first value after i-loop
    while i < _max_size:
        yield i
        i += 1
    while True:
        yield n
        n += 1

Note that all of the above generators execute in the order of microseconds, so even a slow-down of 50% will likely not be measurable in real world code.

So far, I did not try any of the more fancy functions in itertools (those that actually do something). The Cython project has announced a Google Summer of Code project with exactly the intent to rewrite some of the C stdlib modules of CPython in pure Python code with Cython compiler hints. So I leave this exercise to interested readers for now.

A faster Python difflib

Montag, März 21st, 2011

As part of a push for making it easier to develop faster and more readable core/stdlib code for the CPython runtime, I have written a short patch against the difflib module in the Python standard library to make it a) compile with Cython and b) run faster as a compiled binary module. The net result is that it runs more than 50% faster with only the minor code modifications provided in the patch, and still about as fast in the normal CPython interpreter.

Seriously, there is a function for that

Freitag, Oktober 29th, 2010

I keep running into code like this:

tree = lxml.etree.parse( StringIO(bytes_data) )

The docs are actually very clear about this. There is a function called etree.fromstring(data) that is meant to parse from a string. It is the same as in ElementTree. Obviously, no-one reads documentation. But it’s there, really.

Locally loading DTDs from XML catalogs with lxml

Donnerstag, Oktober 28th, 2010

It seems that it is not obvious to all lxml users how DTDs and external entities are loaded by an XML processor. Specifically, if the system is misconfigured, it can happen that lxml fails to parse a document that needs a DTD or that it tries to load the DTD from the network repeatedly, when the no_network parser option is set to False (obviously, network access is blocked by default).

I commented on this on the lxml mailing list in 2008 when there was a discussion about high web traffic at the W3C due to excessive DTD loading, which was also attributed to parts of the Python standard library.

The right way to handle this (in general, but especially for lxml) is to configure the XML catalogs on the local system. The libxml2 site has some documentation on how to do this. The advantage of using catalogs is that most XML tools will use them when available, so it is a system wide fix for the problem. Most Linux installations come with readily configured XML catalogs, but other systems may have to get fixed up.

Unicode support in Cython 0.13

Montag, Mai 17th, 2010

I’m really happy about all the new Unicode and type inference features in the upcoming Cython 0.13. It finally has support for CPython’s Unicode code point type, Py_UNICODE, and can transform various operations on Unicode characters into plain C code. For example, this will run as a plain C integer for loop:

def count_lower_case_characters(unicode ustring):
    cdef Py_ssize_t count = 0
    for uchar in ustring:
         if uchar.islower():
             count += 1
    return count

Or, if you only want to know if there are any lower case characters in a string at all, here’s another plain C solution:

def any_lower_case_characters(unicode ustring):
    return any(uchar.islower() for uchar in ustring)

The latter is actually somewhat of a fake as Cython does not generally support generator expressions yet. However, it still shows where the language is going. In the examples above, Cython can infer that the loop variable can only ever hold a single Unicode character, so it can safely map the entire loop into C space. IMHO, a beautiful example that the integration of Python objects and C types is getting so tight that even totally innocent looking code starts running at the speed of light.

Cython C++ wrapping benchmarks

Sonntag, Februar 21st, 2010

A couple of weeks ago, I found the C++ wrapping benchmarks on the PyBindGen homepage. The author posted a short blog intro about them. I wondered why Cython wasn’t used as a comparison at the time, until I found out that the wrapper was rather tricky to write in Cython back then due to the lack of good C++ language support (especially for overloaded functions/methods).

Cython has improved its C++ support considerably since then, due to the work of Danilo Freitas and Robert Bradshaw, which was recently merged into mainline and is now scheduled for Cython 0.13. This allowed me to provide a simple and short implementation of the wrapper module used in the above benchmark. The timings are rather unsurprising: Cython beats them all. This is mainly due to the fact that Cython uses highly optimised argument handling code, which greatly reduces the call overhead of a wrapper.

I also like how readable the Cython wrapper code is, especially compared to the rather unwieldy PyBindGen implementation. Obviously, this comparison is a bit unfair because Cython is a programming language with an optimising compiler, whereas the other tools are simply glue code generators. But the benchmark results certainly speak volumes.

XPath 2.0, XSLT 2.0 and Python

Montag, Januar 7th, 2008

I noticed a few people getting desperate about XPath/XSLT 2.0 support in Python, especially since the normal sources like XML-SIG or comp.lang.python magically failed to provide helpful answers in the past.

A good place to look is the web site of the W3C XQuery group, which has a general and quite comprehensive list of XQuery 1.0 implementations. By design, these also implement XPath 2.0. Now, some of them provide their XPath2 support separately, some additionally implement XSLT2, and some even seem to have Python bindings. So there should at least be something useful to start from.

Another take on this: while some implementations are in C (just as the CPython interpreter) or at least C++, many others are written in Java. GNU’s gcj is pretty good in compiling such tools and libraries into executable binaries and it also has support for generating the necessary C headers. What I’d love to see is one of the good XPath2/XQuery tools compiled as a Python extension and/or interfaced with lxml at the C level.

One project that did something like that was PyLucene. They now seem to have switched to a new tool called JCC, which (according to the docs) generates a complete CPython extension from a Java application that interfaces with the JVM through the JNI. While I like the simplicity of this (I haven’t used it yet), I would still prefer a GCJ compiled binary over a dependency on a full-fledged JVM. But I definitely like both approaches. Any volunteers? :)

easy, easy, XML

Mittwoch, Oktober 31st, 2007

I keep answering XML questions on Python’s XML-SIG mailing list and on comp.lang.python. Most of them go: “Here’s my SAX/minidom/PyXML code. I can’t find the bug. Can you help me?”

Yes, I can. My default answers are:

  • Don’t use SAX. It’s well designed to hide the bugs in your code, but it’s no good for doing XML work.
  • Don’t use PyXML. It hasn’t been maintained for years, so if you have problems with it, you’re lucky if there is someone still alive who can help you out.
  • Avoid minidom and other DOM implementations. DOM is good if you need to write code the Java way. It’s no good if you want to get XML work done.

The second paragraph of my answer then goes:

Use ElementTree or lxml instead. The first is part of the standard library since Python 2.5, and the latter is available from http://lxml.de/. Both are mostly compatible, both are well maintained, both provide mature and stable libraries that are easy to use. Read the tutorial and get started.

It’s an advantage for Python to have these tools available, and I’m still missing a straight link from the Python homepage to both of them. These are the tools that newbees should run straight into when they look for XML support in Python, not the outdated pages that are currently hidden behind the “XML” link.