## Absence of evidence is evidence of absence

The W’s article about Evidence of Absence is confusing. They have an anecdote:

A simple example of evidence of absence: A baker never fails to put finished pies on her windowsill, so if there is no pie on the windowsill, then no finished pies exist. This can be formulated as modus tollens in propositional logic: P implies Q, but Q is false, therefore P is false.

But then they go on to say: “Per the traditional aphorism, ‘absence of evidence is not evidence of absence’, positive evidence of this kind is distinct from a lack of evidence or ignorance[1] of that which should have been found already, had it existed.[2]”

And at this point I go all ?????.

And then they continue with an Irving Copi quote: “In some circumstances it can be safely assumed that if a certain event had occurred, evidence of it could be discovered by qualified investigators. In such circumstances it is perfectly reasonable to take the absence of proof of its occurrence as positive proof of its non-occurrence.”

UM.

Alright so, trying to untangle this mess, they seem to want to make a qualitative distinction between “high-expectation evidence” and “low-expectation evidence.” Now, if you have read other stuff on this blog, like stuff about Bayes’ Theorem and the Bayesian definition of evidence and the many ways to look at probability and… Well, you must know by now that probability theory has no qualitative distinctions. Everything is quantitative. Any sharp divisions are strictly ad hoc and arbitrary and not natural clusters of conceptspace.

Thankfully, there is another quote in that W article that’s closer to the mark:

If someone were to assert that there is an elephant on the quad, then the failure to observe an elephant there would be good reason to think that there is no elephant there. But if someone were to assert that there is a flea on the quad, then one’s failure to observe it there would not constitute good evidence that there is no flea on the quad. The salient difference between these two cases is that in the one, but not the other, we should expect to see some evidence of the entity if in fact it existed. Moreover, the justification conferred in such cases will be proportional to the ratio between the amount of evidence that we do have and the amount that we should expect to have if the entity existed. If the ratio is small, then little justification is conferred on the belief that the entity does not exist. [For example] in the absence of evidence rendering the existence of some entity probable, we are justified in believing that it does not exist, provided that (1) it is not something that might leave no traces and (2) we have comprehensively surveyed the area where the evidence would be found if the entity existed…[5]
—J.P. Moreland and W.L. Craig, Philosophical Foundations for a Christian Worldview

This looks much more like Bayesian reasoning than the rest of that article did. But let’s delve deeper and see how to prove a negative.

Bayes’ Theorem is symmetrical. We all know that. And we also know the law of total probability:

$P(H|X) = P(H|EX)P(E|X)+P(H|\bar EX)P(\bar E|X)$

The probability of anything is always a weighted average of the probability of that thing conditional on something else and its probability conditional on the negation of that something else. Or, in the relevant case, the probability of a hypothesis is a weighted average of the probability of that hypothesis conditional on some evidence and the probability of that hypothesis conditional on the negation of that evidence. And the weights, of course, are the prior probabilities of the evidence and of its negation.
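To make the weighted average concrete, here is a minimal Python sketch. The function name `posteriors` and all the numbers are illustrative assumptions, not anything from the article; the point is only that the prior comes back out as the weighted average of the two posteriors.

```python
def posteriors(prior_h, p_e_given_h, p_e_given_not_h):
    """Return (P(E), P(H|E), P(H|not E)) from a prior and two likelihoods."""
    # Law of total probability for the evidence itself.
    p_e = p_e_given_h * prior_h + p_e_given_not_h * (1 - prior_h)
    # Bayes' Theorem, once for E and once for its negation.
    p_h_given_e = p_e_given_h * prior_h / p_e
    p_h_given_not_e = (1 - p_e_given_h) * prior_h / (1 - p_e)
    return p_e, p_h_given_e, p_h_given_not_e


# Illustrative numbers only (assumed for this sketch).
prior_h = 0.3
p_e, p_h_e, p_h_not_e = posteriors(prior_h, p_e_given_h=0.8, p_e_given_not_h=0.2)

# The prior is the average of the two posteriors, weighted by P(E) and P(not E).
weighted_average = p_h_e * p_e + p_h_not_e * (1 - p_e)

print(f"P(H|E) = {p_h_e:.3f}, P(H|not E) = {p_h_not_e:.3f}")
print(f"weighted average = {weighted_average:.3f}, prior = {prior_h:.3f}")
```

With these numbers the prior sits strictly between the two posteriors, which is what any weighted average has to do.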

What this means is that, before you observe some evidence, your uncertainty about a thing has to be somewhere between what it’d be had that evidence been there and what it’d be had it not. Yudkowsky uses a very pertinent example:

Post-hoc fitting of evidence to hypothesis was involved in a most grievous chapter in United States history: the internment of Japanese-Americans at the beginning of the Second World War. When California governor Earl Warren testified before a congressional hearing in San Francisco on February 21, 1942, a questioner pointed out that there had been no sabotage or any other type of espionage by the Japanese-Americans up to that time. Warren responded, “I take the view that this lack [of subversive activity] is the most ominous sign in our whole situation. It convinces me more than perhaps any other factor that the sabotage we are to get, the Fifth Column activities are to get, are timed just like Pearl Harbor was timed… I believe we are just being lulled into a false sense of security.”

Bayes’ Theorem is symmetrical! If no subversive activity is a sign of sabotage, then it has to be the case that subversive activity is a sign of no sabotage! No matter how unlikely the Fifth Column was to produce subversive activity, the absence of a Fifth Column is even more unlikely to produce that subversive activity!

Before I continue, I want to prove a thing. In “What is evidence?” I said that E is evidence for a hypothesis H if $P(E|HX) >P(E|X)$, or that E is more likely to be observed when H is true than baseline. Let’s show that this is equivalent to saying that $P(E|HX) > P(E|\bar HX)$ (that is, that E is more likely to be observed when H is true than when it’s false).

By the law of total probability:

$P(E|X)=P(E|HX)P(H|X)+P(E|\bar HX)P(\bar H|X)$

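If E is evidence for H, then $P(E|HX) > P(E|X)$, so replacing $P(E|HX)$ with the smaller $P(E|X)$ on the right-hand side gives (for $P(H|X) > 0$):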

$P(E|X)>P(E|X)P(H|X)+P(E|\bar HX)P(\bar H|X)$

Divide both sides by $P(E|X)$:

$1>P(H|X)+\frac{P(E|\bar HX)}{P(E|X)}P(\bar H|X)$

Now, since $P(H|X) + P(\bar H|X) = 1$, the above can only be true (for $P(\bar H|X) > 0$) if:

$\frac{P(E|\bar HX)}{P(E|X)} < 1$

So, if E is evidence for H, it follows that $P(E|\bar HX) < P(E|X) < P(E|HX)$, which means that it’s more likely to be observed when H is true than when it’s false. And this in turn means that absence of evidence must be evidence of absence, because $P(E|\bar HX) < P(E|X) \leftrightarrow P(H|\bar EX) < P(H|X)$. This is fairly easy to prove with Bayes’ Theorem.
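One way to fill in that last equivalence, as a sketch that assumes $0 < P(E|X) < 1$ and $0 < P(H|X) < 1$ so that every conditional is defined, is to start from Bayes’ Theorem applied to $\bar H$:

$P(\bar H|EX) = P(\bar H|X)\frac{P(E|\bar HX)}{P(E|X)}$

So $P(E|\bar HX) < P(E|X)$ holds exactly when $P(\bar H|EX) < P(\bar H|X)$, that is, when $P(H|EX) > P(H|X)$. The law of total probability says that $P(H|X)$ is a weighted average of $P(H|EX)$ and $P(H|\bar EX)$, so one of them sits above the prior exactly when the other sits below it:

$P(H|EX) > P(H|X) \leftrightarrow P(H|\bar EX) < P(H|X)$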

Absence of evidence is evidence of absence. There could be no other way; probability theory cannot be consistent otherwise. But how much evidence is it?

$O(H|\bar EX) = O(H|X)\frac{P(\bar E|HX)}{P(\bar E|\bar HX)}$

As usual, this has to be modulated. The best way to measure the strength of the evidence on a hypothesis is by looking at its likelihood ratio. How much less likely is the evidence to be absent when the hypothesis is true than when the hypothesis is false?

Let’s look at the elephant and flea examples. In both cases, seeing the animal would be evidence for the hypothesis that the animal is there, so not seeing it is necessarily evidence against that hypothesis. But the likelihood ratio of that evidence is different in both cases.

If someone were to not see an elephant in a room, then that is significant evidence against there being an elephant in that room. Why? Because its not being seen is massively more likely when it’s not there than when it is, in fact, there: $P(\bar E|HX) \ll P(\bar E|\bar HX)$, and the likelihood ratio is minuscule.

Not seeing a flea, however, is almost insignificant evidence that the flea is not there, because even if it were there, we’d still not expect to see it: $P(\bar E|HX)\approx P(\bar E|\bar HX)$. Not seeing the flea is evidence against the flea being there, of course, but it’s negligible evidence.
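Here is a rough Python sketch of the two cases using the odds form of Bayes’ Theorem. The helper `update_on_absence`, the 50% prior, and the detection probabilities are all made-up assumptions chosen for illustration; only the shape of the result matters.

```python
def update_on_absence(prior_h, p_seen_given_h, p_seen_given_not_h):
    """Update P(H) after NOT observing E, via the odds form of Bayes' Theorem.

    Returns (likelihood_ratio, posterior_probability).
    """
    # Likelihood ratio for the absence: P(not E | H) / P(not E | not H).
    likelihood_ratio = (1 - p_seen_given_h) / (1 - p_seen_given_not_h)
    prior_odds = prior_h / (1 - prior_h)
    posterior_odds = prior_odds * likelihood_ratio
    # Convert odds back to a probability.
    return likelihood_ratio, posterior_odds / (1 + posterior_odds)


# Hypothetical numbers: (chance of "seeing" the animal if it is there,
# chance of "seeing" it if it is not, e.g. by mistake).
cases = {
    "elephant": (0.999, 0.001),
    "flea": (0.01, 0.001),
}

results = {}
for animal, (p_seen_if_there, p_seen_if_not_there) in cases.items():
    ratio, posterior = update_on_absence(0.5, p_seen_if_there, p_seen_if_not_there)
    results[animal] = (ratio, posterior)
    print(f"{animal}: likelihood ratio = {ratio:.4f}, "
          f"P(there | not seen) = {posterior:.4f}")
```

Under these assumed numbers both posteriors fall below the 50% prior, but the elephant’s collapses while the flea’s barely moves.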

Absence of evidence is, always, without a single exception, evidence of absence. How much evidence it is, as usual, can only be inferred from how much more you would have expected to see that evidence if the hypothesis were true than if it were false. You always have to compare a hypothesis to its alternatives.