First of all, congratulations to Joe Shaw on getting hitched! I can’t believe that, only a few months ago, I was having drinks with him, and when I asked him how he was he didn’t even mentioned getting married! I guess it was already a plain simple fact of life for him by the time.
But to keep it technical, I read through his reply to some Beagle comments, which resonated with me:
And lastly, having worked with Trow on a reasonably big desktop Python app, we wanted a strongly typed language. Writing real applications in Python requires a discipline that unfortunately most people (including myself) are unwilling to adhere to, and this easily leads to buggy and hard to maintain programs. You have to be very diligent about unit tests and code coverage for every line of code, because you can’t rely on the compiler to catch errors for you. We had been burned by this a bit, and wanted to get back to a strongly typed, but still easy to use language that integrated well with the desktop.
The core of this message is 100% true – you need discipline to write real apps in Python; you have to be very diligent about unit tests and code coverage.
The crux of the matter though is that this is true for *any* real application in *any* language. However old and boring it makes me feel to say so, doing these things are part of good software engineering practice, and for good reason. Since realizing this, all the projects I’ve started working on, as soon as they leave the single-day-hack-prototype phase, first get converted into a simple bare-bones project, and the first things I add is unit test, code coverage and API documentation infrastructure. This is very hard to retrofit on an existing codebase; you teach yourself the habit of doing so at the start so you can reap the rewards later.
Unit tests and code coverage support two separate things. The first is that they give you confidence, if done right, that overall your project code is working well. It allows you to decide to release because you have some way of measuring that the quality of your project is at least as high as on the previous release.
The second is that you can refactor correctly. Some of our hackers in the office incorrectly use the word refactor when they really mean “I don’t like or understand this big section of code and it’s buggy and I am going to rewrite it completely and see what happens.” Refactoring is much more well-defined than that. It is the act of rewriting the internal implementation of a section of code, without changing the external behaviour at all. (A logical consequence is that refactoring code does not fix bugs – you need to produce the same bugs in your refactored code!)
How do you know if you changed the external behaviour ? You can only know so if you have unit tests which cover the code you’re refactoring.
As a side note, a distinct bonus of your project having unit tests is that it makes it easier for others to hack on. Just this weekend, I found a simple bug in pychecker. Now, pychecker has a unit test setup, so that means I can add the unit test that is failing, figure out where in the code it’s failing, fix the code, verify that it works with my unit test, then verify that I did not break any other functionality by running the complete unit test. The result is 15 minutes of patching work, and it’s a joy to go bug fixing this way.
Getting back to Joe’s post. I’m not sure Joe intended it to sound that way, but I felt the conclusion he drew from his experience with Python was “I don’t use Python anymore because it forces me to think about unit tests and code coverage.”
I don’t know what kind of tests Beagle has as part of its codebase and release procedure. I wrote about my experience with Beagle at some point
and Joe commented to the effect that the kernel gives you 8192 inotify watches and when you run out, it acts funny.
Well, no shit. Of course I have over 8192 directories under my home directory. If Beagle is, for example (I have no idea if it does), really watching every .svn and every CVS directory, no wonder it is running out. Two questions spring to mind – how did Joe get away with not having over 8192 .svn and CVS directories, and why does the software not deal with this gracefully ? If I would learn about this limit, I would write an integration test (probably not a unit test, in this case) and test graceful handling of this boundary condition. (It’s entirely possible that a test like this exists in the Beagle code, I haven’t checked – I don’t intend this as an attack on Beagle or Joe, for those who assume the worst in me :))
To cast a stone at myself, GStreamer for a long time only had a bunch of small ad-hoc test applications that developers would write as they hacked on some feature, then at best it would be integrated in some make check command, or at worst it would be forgotten about and bitrot. I think one of my biggest contributions to GStreamer was to reorganize the test suite into a coherent whole, and making it easy for developers to integrate into it. It’s obviously not the only reason 0.10 is so much better than 0.8, but it has definitely helped a lot.
I’m not saying a program is automatically better for having unit tests and code coverage, that would be silly. A perfect programmer can knock out a perfect program with zero coverage and tests. A crap programmer can knock out a crap program with 100% test coverage. But all things equal, having tests and coverage makes your program better. These days, I definitely trust programs more, and prefer to use them more, if I’ve seen they come with tests, and the developers care about their quality.
The first project that really convinced me of these simple principles was the amazing Twisted project. In spite of the sometimes vicious social commentary those guys receive from their peers (bloat, overdesigned, badly documented, downright crazy ?), which makes them sound jaded sometimes, these guys have, over several years, created a consistently excellent framework. They care about their quality and their users, and have integrated so many of these good engineering practices in their daily workflow. I’m not sure they themselves realize how rare this was for a FOSS project back then, but for me, Twisted as a project was an eye opener.
Wow, this post is a lot longer than I intended it to be, after waking up way too early and still fighting the aftermath of a stomach flu. I have to get going.
Anyway, here’s what I want you to take away from today’s post.
I program in Python precisely because it forces me to write unit tests and think about coverage. You should be doing this anyway, regardless of the language you do it in. By choosing a language Python, you will realize why much earlier on in your project, or your project will fail spectacularly :)