I recently received a request for an online interview by Jonathan Ruiz, a CS student in Berlin. He's implementing graph algorithms as part of his final Bachelor thesis, and was evaluating and using Cython to get performance improvements. During his work, he thought it'd be nice to get some comments from a Cython core dev and sent me a couple of questions. Here's what I answered.
First of all, thank you Stefan for your time in this difficult situation.
Thanks for asking me.
How did your interest in programming and then compilers begin?
I have a pretty straightforward background and education in computer science and software development. But I'm not a compiler expert. In fact, I'm not even working on a compiler in the true sense. I'm working on Cython, which is a source code translator and code generator. The actual native code generation is then left to a C compiler. However, we avoid that distinction ourselves in the project because in the end, people use Cython to compile Python down to native code. So the distinction is more of an implementation detail.
I came to Cython through a bit of a diversion. I needed a Python XML library for the proof-of-concept implementation of my doctoral thesis somewhere around 2005. Not long before that, Martijn Faassen had started writing an ElementTree-like wrapper for the XML library libxml2, called lxml, which had several features that I needed and was easy enough for me to hack on to implement the features that I was missing.
lxml was written in a code generator called Pyrex, and I ended up implementing a couple of features in that code generator that helped me in my work on lxml. Not all of these changes were accepted upstream, at least not in a timely fashion, and at some point I found that others had that problem, too, and had ended up with their own long-term forks. Together with Robert Bradshaw and William Stein from the University of Washington in Seattle, USA, we decided to fork Pyrex for good, and start a new official project, which we named Cython. That was in 2007, and I've worked on the Cython project ever since.
What advice would you give to students who want to break into this field?
Read code. Seriously. There is a lot that you can learn at a university about algorithms, about smart ideas that people came up with, about ways to tell and decide what's smart and what isn't, about the way things work (and should work) in general. A CS degree is an excellent way to set a basis for your future software design endeavours.
But there's nothing that comes close to reading other people's code when you're trying to understand how things work in real life and why the tools at hand don't do what you want them to do. And then fixing them to do it.
Which branches of mathematics do you think are important to become a good programmer or what particularly benefited you in Cython optimisation, for example?
I would love to say that my math education at university helped me here and there, but in retrospect, I need to admit that I could have come to the same point where I stand now with just my math lessons at school (although those were pretty decent, I guess). I would claim that statistics are surprisingly important in real life and software development, and are not always handled deeply enough at school (nor in CS studies), IMHO. Even just the understanding that the result of a benchmark run is just a single value in a cloud of scattered results really helps to put those numbers in context in your head.
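To make the benchmark remark concrete, here is a minimal Python sketch that repeats a timing several times and reports the spread rather than a single number. The `workload` and `summarize` functions are hypothetical names invented for this illustration; they are not part of Cython or of any benchmark mentioned in the interview.

```python
import statistics
import timeit


def summarize(timings):
    """Describe a list of benchmark timings (in seconds) as a distribution."""
    return {
        "runs": len(timings),
        "min": min(timings),
        "median": statistics.median(timings),
        "mean": statistics.mean(timings),
        # The sample standard deviation needs at least two data points.
        "stdev": statistics.stdev(timings) if len(timings) > 1 else 0.0,
    }


def workload():
    # Hypothetical stand-in for the code under test.
    return sum(i * i for i in range(10_000))


if __name__ == "__main__":
    # Each entry is the total time for 100 calls of workload();
    # repeating the measurement 7 times yields the "cloud" of results.
    timings = timeit.repeat(workload, number=100, repeat=7)
    for name, value in summarize(timings).items():
        print(f"{name:>6}: {value}")
```

Looking at the minimum, the median and the standard deviation together gives a rough idea of how far a single run may sit from the typical result.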
There are definitely fields in software development in which math is more helpful than in the fields I've touched mostly. Graphics comes to mind, for example. But I think what's much more important than a math education is the ability to read and learn, and to be curious about the work of others. Because these days, 95+% of our software development work is based on what others have already done before us (and for us). Use existing tools, learn how they work and what their limits are, and then extend those limits when you need to.
If I'm not mistaken, since April 2019 you have also been a core developer in CPython: what responsibilities does this position entail?
The main (and most obvious) difference is the ability to click the green merge button on GitHub. :) Seriously, you can do a lot of great work in a project without ever clicking that button. You can create tickets, investigate bugs, write documentation, advertise cool projects to others, help people use them, participate in design discussions, write feature PRs. You can move a project truly forward without being a "core developer". But once you have the merge right, you are taking over the responsibility for the code that you merge by clicking that button, wherever that code came from. If that code breaks someone's computer at the end of the world, you are the one who has to fix it, somehow. Even just by reverting the merge, but you have to do something.
Being a core developer in a project is really more of an obligation than an honour. But it can also give you a better standing in a project, because others can see that you are taking responsibility for it. So it comes with a bit of a social status, too.
Cython has been and is a key tool in scientific projects, such as the Event Horizon Telescope. Which scientific libraries are you missing in Cython right now? Are there any special ones that you are working on?
I'm not working on scientific libraries myself, although I know a lot of people from other projects in the field. I'm not missing anything here. :)
OTOH, I like hearing about things that others do with Cython. And I like to help others to make Cython do great things for them.
The really cool thing about open source software development is that I'm creating "eternal" values every day. Whatever I write today may end up helping some person on the other side of the planet (or next door) to invent something cool, to answer the last questions about life, the universe and everything, to save the world or someone else's life. Those are their projects, their ideas and their work, but it's the software that I am writing together with lots of other people that helps them get their work done. And that is a great feeling.
Are there any important features that you would like to implement in Cython in the future?
The issue tracker has more than 740 open tickets. :) But that answer misses the point. I think the most important goal is to keep helping users get unstuck when they run into something that they can't really (or easily) solve themselves. Cython is a tool for others to use for their own needs. It should continue to achieve that.
How do you see Cython ten years from now?
I never liked that question when interviewing dev candidates, and I'm not going to answer it now. ;-) Ten years is about an eighth of a human's total lifetime. And it's half of an eternity in tech. It's a very long time. I like how Albert Einstein put it: "predictions are hard to make, especially about the future".
Thank you very much again for your time, Stefan. Take good care of yourself.
Have fun and stay safe.