Artillery Row

A series of tubes

Imperial College’s Covid-19 coding is unintelligible

It may be surprising to those outside the bubble, but it is no great secret within the academic community that the quality of most of the computer codes in use is very poor. Prof. Neil Ferguson’s group at Imperial have now published their pandemic modelling code, and even by those standards it is abysmal. What is even more alarming is that the published code has supposedly been improved over a number of weeks to be less dire. So what it looked like back in March, when Prof. Ferguson was predicting hundreds of thousands of deaths, is anyone’s guess.

But code quality is not merely stylistic. It goes to the heart of the problem: if others cannot read what you have written, then they cannot verify that it is correct. I have squinted and stared at this code for a day now and I am still none the wiser as to what is going on in many parts. Egyptian hieroglyphics may well be easier to interpret. Latin scribal abbreviations certainly are.

But this raises the question: just why are academic codes (in the modelling community the noun is codes, not programs) so invariably dreadful? That is not to say that they are all incorrect, of course. Dreadfully written just means that verifying correctness is vexing to the point of impossibility. In other words, we simply can’t know if the ICL code is correct. The point of a clearly written code is that others can audit it and understand it.

Almost no scientific programmers are trained as such. Instead, C, or FORTRAN, or whichever language they use, is simply a tool for solving problems, one they learn on the go. Thus no one ever imparts good practices at the beginning. Then, because they work alone or in small groups, there is no external driver to improve matters either. I only learned good practices when I started teaching C to others, for example.

Why does this chronic problem arise? Because of the secondary problem sometimes called “publish or perish”. It is actually better for one’s career to publish a result quickly, from one of these aforementioned unverified codes, and then just say in a future paper “oh, yeah, due to a bug, night is actually day”. I did exactly this, albeit unintentionally, back when I was a PhD student. Lewis, Bate, and Price (2015) was essentially revoked by an erratum (Lewis, Bate, and Price 2017a), after a bug which had lurked in our magnetohydrodynamics code since about 2005 fatally compromised its results, and those of at least seven other papers.

But in the academic world, I could still list the 2015 paper (and scurrilously omit the erratum) on my CV, and list the second paper I published in 2017, which was the “correct” version (whether you believe that is another matter). Worse, the 2015 paper still had “impact”, that all-important academic metric. Indeed, people still sometimes cite it now, even with an erratum, which also makes you wonder how often people read the papers they cite. And what about all the papers whose authors were not so dispassionate and did not publish errata?

Even worse than this, even if you were an academic who decided to spend some time writing a decent, well-documented and verifiable code, there are precious few rewards for doing so. It is all but impossible to make a career out of ‘code papers’ (that is, papers which describe in detail not an algorithm or its results, but how it is implemented on a computer). Believe me, I tried. Instead, work spent improving a code is often seen by the rest of the community as wasted effort. After all, if it “works” and produces “results” then that is all that matters. Whether those results are necessarily correct or not matters less than you might expect.

So not only is there no reward for good coding, there is actually a perverse incentive to not worry too much about the quality of one’s code at all. And that’s before we talk about writing tests. Worse, spending time fixing or improving a code is not only not rewarded, but by reducing one’s output of impactful papers, it actually has a negative effect on career prospects. In other words, the status quo rewards the production of incomprehensibly dreadful codes like the ICL one, and would actively punish anyone who spent a period of time trying to make it better.

This could be fixed if academics published their codes. But a few minor exceptions aside, there is no compulsion to do so from journals. And there is little internal incentive to do so either — all the more so if others can then see just how bad things are. The narrative around the ICL group’s model in March would have been very different had this code been published with it, after all.

But it must be fixed. There must be an expectation that when the results of a model are published, the tool used to produce them is too. We already require publication of the mathematical description of the algorithm, and a paper which did not provide one would rightly be rejected. Hence, it’s a small jump to also include the implementation. Some journals already require the publication of input parameters and output data files, so this is, if anything, simply completing the circle.

Some academics will be concerned that they will be “scooped” if they publish their codes. But that concern sees the problem entirely in reverse: yes, of course others might use the code if it is clear and intelligible. That is better than another group writing its own equally secret, likely equally dire, code, which is wasted effort. Equally, the argument that the code must not be published lest others use it “badly” holds no water either. Others can equally use your mathematics badly, and we still publish that.

If it were up to me, I would go further still and require those codes to also have published test cases and proof that the version used in the paper passed them. Additionally, we need proper career progression for academics who code.
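To make that concrete, here is a rough sketch in Python of what published tests plus a record of the tested version might look like. Everything in it is an assumption for illustration: the toy model `simulate_infections`, its parameters, the three checks and the report format are all hypothetical, and none of it is taken from the ICL code or from any journal's requirements.

```python
import hashlib
import random


def simulate_infections(population, contact_rate, days, seed):
    """Toy stochastic outbreak model (hypothetical, for illustration only)."""
    rng = random.Random(seed)  # a fixed seed makes the run reproducible
    infected = 1
    for _ in range(days):
        susceptible = population - infected
        # Chance that an infected person infects one new person today.
        p = contact_rate * susceptible / population
        new_cases = sum(1 for _ in range(infected) if rng.random() < p)
        infected = min(population, infected + new_cases)
    return infected


def run_published_tests():
    """Run the published checks and return {test name: passed?}."""
    results = {}
    # Same inputs and same seed must give the same answer every time.
    first = simulate_infections(1000, 0.3, 30, seed=42)
    second = simulate_infections(1000, 0.3, 30, seed=42)
    results["deterministic_for_fixed_seed"] = first == second
    # With no contact there can be no spread beyond the first case.
    results["no_contact_no_spread"] = (
        simulate_infections(1000, 0.0, 30, seed=1) == 1
    )
    # Infections can never exceed the population.
    results["bounded_by_population"] = (
        simulate_infections(50, 1.0, 60, seed=7) <= 50
    )
    return results


def make_report(source_text):
    """Tie the test results to a fingerprint of the exact source tested."""
    results = run_published_tests()
    return {
        # SHA-256 of the source text identifies the version under test.
        "version": hashlib.sha256(source_text.encode("utf-8")).hexdigest(),
        "results": results,
        "all_passed": all(results.values()),
    }


if __name__ == "__main__":
    # In practice source_text would be the contents of the model's source
    # files; a placeholder string is used here to keep the sketch runnable.
    report = make_report("contents of the model source file")
    print(report["version"], report["all_passed"])
```

The report is the part a journal could ask for: a reader who has the published source can recompute the fingerprint and rerun the same checks. A real project would more likely record a version-control commit identifier than hash a string by hand, but the principle is the same.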

Imperial College’s Covid-19 code is quite possibly the worst production code I have ever seen. I don’t say this lightly: I have used fluid dynamics codes with a heritage from the 1980s.

This should be a wake-up call for the whole of academia: it needs to sort out its coding practices or people will, rightly, start to question the results. For theoretical astronomy, this is an anodyne problem. For epidemiology, it is, quite literally, lethal.