Recently, I’ve written about some introductory topics in Bayes’ Theorem. If you did not read these earlier pieces, you may want to read them before this post.

The initial impetus was to use the theorem to defend a famous maxim often attributed to Carl Sagan: extraordinary claims require extraordinary evidence. This time, I’m going to use the theorem to argue *against* another maxim associated with Sagan.

Sagan was an outspoken supporter of the Search for Extraterrestrial Intelligence (SETI). He introduced the Drake Equation on his show *Cosmos*.

This equation leads many to believe there is almost certainly intelligent life elsewhere, even in our neighborhood of the galaxy. However, we face a problem. So far our search has been fruitless. In response to this, Sagan noted that absence of evidence is not evidence of absence.

What Sagan should have said, though it wouldn’t have been as catchy, is that absence of evidence is not *proof* of absence. There is a difference between something being a proof of a claim and being evidence for or against a claim. If we go back to thinking in Bayesian terms, we can put it like this: If something is evidence against a hypothesis, then the posterior probability will be lower than the prior probability after taking said evidence into account.

Let’s run some numbers and see whether absence of evidence leads to this lower posterior probability. It might be useful to think of this like a function. A measure of probability goes in, *stuff happens*, and another measure of probability comes out the other end. It doesn’t really matter for the purposes of this post what the prior probability is; rather, we are just concerned with how the output compares to the input. In terms of the theorem, that means we’ll want to focus our discussion on the two figures assessing the likelihood of observed evidence.

Let’s consider an example with a prior probability of H set at 0.5:

- (0.5 * Pr(E│H)) / ([0.5 * Pr(E│H)] + [0.5 * Pr(E│¬H)]) = ?

We are concerned with the likelihood of our observed evidence given a true hypothesis, Pr(E│H), and the likelihood of the same evidence given a false hypothesis, Pr(E│¬H).

First, we should observe how the equation will react based on what we plug in for these numbers. If we plug in the exact same number for both figures, then our outcome will not change. The posterior probability will be 0.5, which will mean our evidence did not specifically favor either H or ¬H. Plugging in the same number for both essentially means the observed evidence was equally expected by both hypotheses.

But what happens if one is higher or lower, meaning the evidence is expected under one hypothesis more than the other? Let’s try plugging in Pr(E│H) = 0.7 and Pr(E│¬H) = 0.3. Our output is 0.7. Compared to the prior probability of 0.5, this is an increase, so this was evidence in favor of H. How about if we switch the figures so that Pr(E│H) = 0.3 and Pr(E│¬H) = 0.7? This time, the output was 0.3, a decrease, so this was evidence against H (against H and in favor of ¬H is really the same thing).
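For anyone who wants to try other numbers, here is a minimal Python sketch of the same calculation. The function name `posterior` and its parameter names are my own labels for illustration; the 0.4 used for the "same number" case is an arbitrary value, since any shared number gives the same result.

```python
def posterior(prior, p_e_given_h, p_e_given_not_h):
    """Bayes' Theorem for a hypothesis H and its negation.

    prior            -- Pr(H) before seeing the evidence
    p_e_given_h      -- Pr(E|H), likelihood of the evidence if H is true
    p_e_given_not_h  -- Pr(E|not H), likelihood of the evidence if H is false
    """
    numerator = prior * p_e_given_h
    denominator = numerator + (1 - prior) * p_e_given_not_h
    return numerator / denominator


# Same likelihood on both sides: the evidence favors neither hypothesis.
print(round(posterior(0.5, 0.4, 0.4), 2))  # 0.5

# Evidence more expected under H: the posterior rises above the prior.
print(round(posterior(0.5, 0.7, 0.3), 2))  # 0.7

# Evidence more expected under not-H: the posterior falls below the prior.
print(round(posterior(0.5, 0.3, 0.7), 2))  # 0.3
```

Comparing each printed value to the prior of 0.5 is the whole test: higher means evidence for H, lower means evidence against it.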

Now on to the big question of whether absence of evidence for some hypothesis (H) will mean a higher number in Pr(E│H) or Pr(E│¬H) or whether they will be the same. Let’s first eliminate one irrelevant possibility. In these cases, Pr(E│¬H) will *always* be =1. That is because in cases where someone claims something does not exist, like God or ghosts or aliens, there should always be an absence of evidence. That is expected 100% of the time. This means Pr(E│H) will never be higher than Pr(E│¬H); it can only be ≤1.

Whether or not Pr(E│H) will be lower than or equal to Pr(E│¬H) will depend on what H predicts. For example, say you specifically predict life on Titan, a moon of Saturn. If someone observes that there is no evidence of life on Mars, that doesn’t affect your hypothesis. So, it certainly is possible in cases of irrelevant evidence to achieve a neutral outcome. You can try plugging in some numbers yourself to see. In the following cases, the posterior probability shows no change from the prior probability because both likelihood measurements are =1:

- **0.5** * 1 / ([**0.5** * 1] + [0.5 * 1]) = **0.5**
- **0.9** * 1 / ([**0.9** * 1] + [0.1 * 1]) = **0.9**
- **0.1** * 1 / ([**0.1** * 1] + [0.9 * 1]) = **0.1**
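The neutral outcome is easy to check for more priors than the three shown. This is only a sketch: the helper `posterior` and the list of priors are assumptions I picked for illustration.

```python
def posterior(prior, p_e_given_h, p_e_given_not_h):
    """Posterior Pr(H|E) from Bayes' Theorem with H and not-H as the only options."""
    numerator = prior * p_e_given_h
    return numerator / (numerator + (1 - prior) * p_e_given_not_h)


# Irrelevant evidence: both likelihoods are 1, so nothing should move.
for prior in (0.1, 0.25, 0.5, 0.75, 0.9):
    result = posterior(prior, 1.0, 1.0)
    print(f"prior = {prior:.2f}  posterior = {result:.2f}")
```

Whatever prior goes in comes back out, which is what it means for the evidence to be irrelevant to H.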

Many hypotheses, however, will not be so lucky. That is because the search for evidence is often quite relevant to the hypothesis (otherwise it would be a pretty fruitless search). So, in most cases where the evidence is relevant to the hypothesis, Pr(E│H) will be lower than Pr(E│¬H), which leads to a lower posterior probability, as shown in the following examples:

- 0.5 * 0.9 / ([0.5 * 0.9] + [0.5 * 1]) = 0.47
- 0.5 * 0.75 / ([0.5 * 0.75] + [0.5 * 1]) = 0.43
- 0.5 * 0.5 / ([0.5 * 0.5] + [0.5 * 1]) = 0.33
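The same pattern can be sketched as a small sweep. Here Pr(E│¬H) is fixed at 1 as argued above, the prior stays at 0.5, and Pr(E│H) is stepped downward. The function name `posterior` and the extra values 1.0 and 0.1 are my additions for illustration.

```python
def posterior(prior, p_e_given_h, p_e_given_not_h):
    """Posterior Pr(H|E) from Bayes' Theorem with H and not-H as the only options."""
    numerator = prior * p_e_given_h
    return numerator / (numerator + (1 - prior) * p_e_given_not_h)


PRIOR = 0.5
P_E_GIVEN_NOT_H = 1.0  # "nothing found" is fully expected if H is false

# The more strongly H predicted that evidence would turn up, the lower
# Pr(E|H) is for "nothing found", and the harder the posterior falls.
for p_e_given_h in (1.0, 0.9, 0.75, 0.5, 0.1):
    result = posterior(PRIOR, p_e_given_h, P_E_GIVEN_NOT_H)
    print(f"Pr(E|H) = {p_e_given_h:<4}  posterior = {result:.2f}")
```

Only the first row, where H did not predict any evidence at all, leaves the prior untouched; every other row comes out below 0.5.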

In review, as long as the lack of evidence is relevant to the hypothesis, this lack of evidence is indeed evidence *against* that hypothesis being true. The degree to which that is the case will depend specifically on the initial predictions of H, as shown in the last set of examples.

## Comments

## Hendy

January 5, 2012 at 5:18 pm (UTC 0)

You show:

— 0.5 * Pr(E│H) / [0.5 * Pr(E│H)]* [0.5 * Pr(E│¬H)]

Did you mean?

— 0.5 * Pr(E│H) / [0.5 * Pr(E│H)] + [0.5 * Pr(E│¬H)]

(Note the “+” instead of “*” for the denominator).

If I use the latter formulation, I get the same answer as you for your calculations. If I use the formula how it’s shown, I do not. For example (last equation in the post):

— 0.5 * 0.5 / [ (0.5 * 0.5) * (0.5 * 1) ] = 2 (0.5^2/0.5^3)

— 0.5 * 0.5 / [ (0.5 * 0.5) + (0.5 * 1) ] = 0.33 (0.5^2/0.75)

Lastly, could you clarify this statement:

> So, in most cases where the evidence is relevant to the hypothesis, Pr(E│H) will be lower than Pr(E│¬H)

I would have expected it to be the other way around. In other words, the probability of some evidence E, given our hypothesis H (which is supposed to be explanatory) is higher than the same evidence E if our hypothesis is false.

## Mike

January 5, 2012 at 6:52 pm (UTC 0)

Ah, yes, it was a typo. Thanks. I’ll fix that later.

Re: the sentence, I may have worded it confusingly. I’m still talking about absence of evidence. So, if evidence is expected and not found, then the likelihood will necessarily be lower than 1.

## Mike

January 5, 2012 at 7:29 pm (UTC 0)

Let me elaborate a bit more. Say you have two buckets. One of them is filled entirely with red balls and the other is filled with a mixture of red and blue balls. Now, someone says they are going to blindfold you and place a random bucket in front of you. Your prior probability of the R/B bucket is 0.5 and the same for the R bucket. You pull out a ball and someone tells you it’s red (let’s assume they aren’t lying). If it’s red, then the likelihood for the R bucket is 100%, but the likelihood for the other bucket will match whatever the mixture is, maybe 50% if it’s evenly divided between red and blue.

That’s a pretty intuitive example and what I want to say is that absence of evidence is just like drawing that red ball. It is completely expected under ~H and only partially expected under H. That’s what I’m trying to say with that sentence. If the evidence is irrelevant, then I see no reason to have any difference in likelihood. However, if the evidence is relevant (and it is not found when expected), then Pr(E│H) will necessarily be lower than 1.

## Hendy

January 5, 2012 at 7:35 pm (UTC 0)

Gotcha. I get that it needs to be lower than 1 (purely statistical), but does that imply that:

— Pr(E│H) < Pr(E│¬H)?

Or you just mean that in cases of absence of evidence, it turns out that way? I don't think there's any a priori connection between Pr(E│H) and Pr(E│¬H); they're independent.

## Hendy

January 5, 2012 at 7:40 pm (UTC 0)

Sorry, posted my last bit before your analogy. I think we’re on the same page. I think I’m thinking of cases where the evidence is a bit more binary (either it implies the hypothesis or not) vs. your red/blue bucket example. That makes sense, and continuing to draw red balls would certainly mean that:

— Pr(many red balls drawn | red/blue mixed bucket) << Pr(many red balls drawn | ~red/blue bucket (red bucket))

Thanks for the example; that helped.

## Mike

January 5, 2012 at 7:42 pm (UTC 0)

Only in absent evidence cases. That’s just because the null hypothesis must predict that there will be no positive evidence. The no ghosts hypothesis predicts that every investigated claim will turn up empty, right?

## Mike

January 5, 2012 at 7:44 pm (UTC 0)

Yeah, I think we’re on the same page too. I do think they’re independent. It’s just the fact about the null hypothesis that necessitates the < result.
