Bugs in CPython extension modules

Stefan Behnel

2013-03-24 14:27

A while back, I wrote up my opinion on writing CPython extensions by hand, using the C-API. Most of the replies I got were asking for a proof, but the article was more of a summary of my prior experience than anything really new.

Now, David Malcolm, author of the GCC Python plugin, has given a talk at this year's PyCon-US where he used a static analysis tool chain that he's been working on based on his GCC plugin to find bugs in CPython extension modules. Being a Fedora developer, he ran it against the wealth of binary Python packages in that distribution and ended up finding a lot of bugs. Very unsurprisingly to me, most of them were refcount bugs, mainly memory leaks, especially in error handling cases, but also lots of other issues with reference handling, e.g. missing NULL/error tests etc. At the end of the talk, he was asked what bugs his tools found not only in manually written code but in generated code, specifically C code generated by Cython. He answered that it was rather the other way round: he had used Cython generated code to prune false positives from his analysis tool, because it was quite obvious that the code that Cython generated was actually correct.

I think that nicely supports what I wrote in my last post.