It All Adds Up

It all comes down to the numbers - at least it does according to Dr. Ric Crossman. You’d be amazed at what can be explained by fractions, decimals, percentages and statistics; let Dr. Ric guide you…

Here’s a potentially surprising fact for you: there are mathematical methods for determining probabilities that could be completely inadmissible in court.

The jury seems to still be out - if you’ll forgive the expression - on Bayes’ Theorem. There are plenty of other methods by which evidence can be mathematically combined and processed in order to arrive at conclusions regarding the likelihood of guilt, but the opinion of the Court of Appeal above suggests any attempt to apply them will actually harm the legal process by whipping up a storm of confusion and mistrust.

That might seem strange in this age of DNA matches and bullet trajectories, but in truth the relationship between law and probability has always been a difficult one. It’s a conceptual problem at heart, I think, with a strong dash of our species’ stubborn denial of rationality mixed in too. But what exactly might be going on? Why should expert testimony be so suspect when delivered within a mathematical context? Perhaps the best way to unpick this particular issue, as is true with so many of the major intellectual and philosophical issues with which humanity must wrestle, is to talk about Blake’s 7.

When Roj Blake’s memory blocks are shaken loose and he resumes his fight against the totalitarian Federation, he is quickly re-arrested, and his captors decide to change tack. Rather than prise open his brain-pan and poke around his cortex with an egg-whisk again, the Federation manufacture evidence that Blake is a child-molester, and put him on trial.

This being Terry Nation’s warped view of the far future, there is nothing as prosaic as a human judge with robe and gavel. On Blake’s Earth, the evidence offered by both sides is simply saved on disc and run through a powerful computer, which then uses its mighty powers of logical deduction to determine the verdict.

Presumably the viewer is supposed to feel anger at the injustice of Blake’s punishment. Not only is the evidence against him false, but it’s entirely likely the Federation have swapped out the actual judgement device and replaced it with a photocopier endlessly churning out sheets of paper with the word “GUILTY” scrawled across them (say what you want about Servalan’s lap-dogs, but their office equipment is impeccably maintained).

My reaction was a little different. In my head, the rigging of Blake’s trial wasn’t nearly as interesting as the question the scene raised: If the evidence had been accurate, and the programming honest, would a justice system run by infallible computing devices actually be a good idea?

There’s another question buried within that one, actually, and we need to consider it first. A computer, after all, is no more omniscient than we are. All it can do (hypothetically) is calculate the chance of guilt to the umpteenth decimal point. How does that compare with our justice system, with its foundation of “reasonable doubt”? If we’re going to start thinking about replacing Judge John Deed with a Windows desktop, we need to think carefully about what that phrase actually means.

To me, it refers to the standard of evidence available. This is something I come across almost daily. Those of us in the arena of mathematics are no less concerned with the concept of proof than are our legal cousins*. We use the word in a slightly different context, in general; our proofs demonstrate absolute certainty in a way a prosecutor cannot. Nevertheless, we postpone our search for indisputable truth from time to time so that we can “prove” relationships between height and weight, or smoking and cancer cases, or even chewing gum purchase and pregnancy rates**. As I touched upon last time, this is done by choosing a value, most commonly 5% (or sometimes 1% if you need to be more sure) to represent how certain you need to be that something odd is occurring (this is actually called the significance level). You then calculate the chance the data you have would look the way it does if the two variables aren’t related. If that chance turns out to be less than your significance level, you decide the variables must be related after all. In plain English, when we say “These two things are related”, what we really mean is “We are sufficiently convinced that the chance they are unrelated is too small to concern us”.
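For the curious, here is one way the recipe above might look in practice. It is only a sketch: the heights and weights are made up, the function names are my own, and it uses a shuffle-based (permutation) test as a stand-in for whichever test you might actually prefer. The idea is to shuffle one variable many times, which is what "unrelated" looks like, and count how often the shuffled data appears at least as strongly related as the real data.

```python
import random
from statistics import mean

def correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def chance_if_unrelated(xs, ys, trials=10_000, seed=0):
    """Estimate how often shuffled (i.e. unrelated) data looks at least
    as strongly correlated as the data we actually observed."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    observed = abs(correlation(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(trials):
        rng.shuffle(shuffled)          # break any link between xs and ys
        if abs(correlation(xs, shuffled)) >= observed:
            hits += 1
    return hits / trials

# Made-up heights (cm) and weights (kg), purely for illustration.
heights = [150, 155, 160, 165, 170, 175, 180, 185, 190, 195]
weights = [52, 55, 61, 60, 68, 72, 75, 79, 88, 90]

THRESHOLD = 0.05                       # the 5% value chosen in advance
chance = chance_if_unrelated(heights, weights)
related = chance < THRESHOLD           # "too small to concern us"
print(f"chance if unrelated: {chance:.4f}, declare related: {related}")
```

Swapping 0.05 for 0.01 is all it takes to demand more convincing before declaring a relationship.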

Is that “proof”, though? Well, not in the way I mean it at my day job, no. It does, however, lend itself to the legal context. What level of certainty do you need to convict someone? 95%? 99%? 99.9999%? How many nines do I need to stick on the end of the number before we can safely say we’re “beyond reasonable doubt”?

One could argue that such a value always exists, even though we can’t identify it and it changes from person to person and situation to situation. And if such values do exist, there must be a value for which everyone agrees we’ve passed the point of reasonable doubt. Let’s call that value p. In theory, all we need is a computer that calculates your chance of being guilty, and if that chance is higher than p, it returns a guilty verdict. You could even have different values of p for different crimes, reflecting the fact that we probably need to be more sure someone is guilty of murder than we do of, say, shop-lifting before the long arm of the law punches them in the testicles. (Not to be sexist, ladies! You’re just as capable of being murderous, thieving scum! I’ve seen Bound! Some bits, y’know, more than others.)
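Stripped of the hard part (actually calculating the chance of guilt), the machine described above is almost embarrassingly short. A minimal sketch, in which the threshold values and the names are invented purely for illustration:

```python
# Hypothetical values of p for each crime; nobody has agreed on these.
REASONABLE_DOUBT = {
    "murder": 0.9999,
    "shop-lifting": 0.99,
}

def verdict(crime, chance_of_guilt):
    """Return a verdict by comparing the chance of guilt with that crime's p."""
    p = REASONABLE_DOUBT[crime]
    return "GUILTY" if chance_of_guilt > p else "NOT GUILTY"

# The same strength of evidence can clear one bar and not the other.
print(verdict("shop-lifting", 0.995))
print(verdict("murder", 0.995))
```

Everything contentious lives in that dictionary, and in whatever produces `chance_of_guilt` in the first place.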

People, by and large, hate this idea, especially if they’ve seen The Day the Earth Stood Still (the original, I mean; the people who saw the remake mainly just hate Scott Derrickson). We shouldn’t recoil from the concept in horror over terrifying visions of Gort dispassionately obliterating humanity, though. What made the judgement of Klaatu’s metallic companion so threatening wasn’t how he would determine our guilt, but how he would determine our punishment. I realise not everyone likes the idea of empathy influencing our legal system, but those that do could imagine a system in which an entirely logical, unfeeling device decides one’s guilt, and then a human judge determines the sentence.

Would even that be desirable, though? I’m not sure people would think so. Firstly, the value of p people would suggest be used by our judgement devices might prove markedly different to (and probably higher than) the values they would actually intuitively accept once on a jury. Indeed, if people actually processed information in the manner they claimed or even believed they processed it, a lot of my friends would be out of a job. It seems plausible to suggest that those who claim they’d need to be 99.99% sure (a number people probably quote so regularly because of its rhythm rather than anything else***) before conviction might be happy to offer a guilty verdict in circumstances which are significantly murkier.

This all means that installing the JUDGOTRON 6000 in our courts might mean a sudden and potentially significant dip in the conviction rates, as our arbitrating automaton starts demanding overwhelming evidence. Of course, if we truly are regularly convicting people at levels of certainty we don’t actually consider sufficient, then that’s a problem we need to deal with, whether we start bending the knee to our benevolent robot overlords or not.

In the end though, I think the idea is a non-starter because of people’s discomfort with cold, unyielding numbers. “Beyond reasonable doubt” seems to imply a certainty that “99.99% sure” does not, even though we know rationally that the first statement is not more powerful, it’s merely less specific. How else can we explain ostensibly intelligent judges arguing that giving probability values ignores the difficulty in making judgements, when what it actually does is immediately demonstrate how difficult those judgements are? Bypassing these issues probably makes a jury’s job easier, but almost by definition, it cannot make their judgements any better. That’s like saying you become a worse poker player once you know the odds of each hand.

Furthermore, people don’t want to be told there is a 0.01% chance they’ve convicted an innocent person. They want to be told that any doubt they did so is unreasonable. Perhaps this is why mathematical testimony is required to be so simplistic, despite there being so many other disciplines where expert witnesses are so valued. We recognise how uncomfortable people are with the numbers involved, and with the idea that they might have to base convictions on them.

There might also be something even more basic at work here. I’m not sure people would genuinely want an infallible justice system in the first place. Plenty of people might claim to want one, but then most of us don’t think twice before ripping our friends’ CDs, or breathe a sigh of relief when we pass a residential speed camera at 35mph without triggering it. Justice is something we want to see done, not have done to us, and we construct ugly shelters of self-justification and denial to convince ourselves that’s not a problem. Replacing our judges with computers would shatter those structures, and I’m not sure how many people could cope with that.

I guess what I’m really saying is this: we don’t want a justice system that’s less fallible than we can make ourselves. In the end, that might actually be the smartest thing we could possibly do. The bedrock of wisdom is the realisation that we’re idiots.



* As opposed to our illegal cousins, who I’d stay away from if you don’t want to get yourself dragged away to face the Photocopier of Judgement.

** There’s actually a five-year period in UK history during which those two variables are genuinely correlated. Anyone care to guess which five years and why?

*** Let’s consider an example. You have a coin that you think will always come down heads. You can never be completely sure just by tossing it, though, because even a fair coin can come down heads time after time. However, we can calculate (using, as usual, the binomial distribution) the number of heads we would need to get in a row before we could be 99.99% sure the coin was at least biased in favour of heads. That number is 14.

However, all that would tell you is that the ratio between heads and tails is greater than 1:1, hardly what we want when we want to know if the coin always shows heads. You can take it further by working out how many heads you need before you are, say, 99.99% sure the ratio is at least 9:1 in favour of heads. That number is 88. This is entirely unsurprising, obviously, because you need more evidence to prove the coin is biased more than 9:1 in favour of heads than you do to just decide the ratio is something other than 1:1.

You could even work out how many heads in a row you would need to be 99.99% sure the ratio is at least 9999:1 – i.e. there is a 99.99% chance that the coin comes down heads more than 99.99% of the time – which requires an almighty 92,099 throws.
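If you fancy checking this sort of sum yourself, here is a small sketch. It assumes one particular reading of "99.99% sure": we keep tossing until a coin sitting exactly on the boundary ratio would produce that many heads in a row less than 0.01% of the time. The function name and the `doubt` parameter are my own inventions.

```python
def heads_in_a_row_needed(p_heads, doubt=1e-4):
    """Smallest run length n for which a coin that lands heads with
    probability p_heads would give n straight heads with chance < doubt."""
    n = 1
    while p_heads ** n >= doubt:
        n += 1
    return n

# Boundary cases from the footnote: a fair coin, a 9:1 coin, a 9999:1 coin.
for label, p_heads in [("1:1", 1 / 2), ("9:1", 9 / 10), ("9999:1", 9999 / 10000)]:
    print(label, heads_in_a_row_needed(p_heads))
```

With an unbroken run of heads the binomial calculation collapses to a single power, which is why the loop only ever needs `p_heads ** n`.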