## Logical Uncertainty

For a while now, I’ve been talking to a friend about logical uncertainty. Specifically, how do we deal with the fact that we’re not logically omniscient? Usually, when $A \rightarrow B$, we have that $P(B|A) = 1$. But what if we don’t know that $A \rightarrow B$? What if we only have a few hints, some clues, nothing fixed, nothing concrete yet? How do we cope with our limits?

Benja Fallenstein wrote a post that touches upon this. In it, they define a sort of finite “logical universe” with many “impossible possible logical worlds.” The gist of it is that they take a huge finite list of worlds (where a world is a conjunction of sentences), keep those that aren’t contradicted by a much huger but still finite list of theorems, and call the survivors their impossible possible logical worlds. They then distribute probability uniformly over that list, and their decision agent does utility maximisation on it.

Based on that post, Manfred wrote a post and then a Sequence where he tried to make this idea somewhat better defined. He tries to model the limited agent as an unlimited agent with limited inferences. That is, an agent that can only do a limited number of logical inferences (which will then obviously have probability $1$ or $0$), and then have a maximum entropy prior over every other logical sentence. And he says that the way he’s built things, we’re violating the desiderata of probability but…

We’re not? We’re really, really not. See, there is in fact no desideratum that needs logical omniscience, no step in the proof of Bayes’ Theorem that requires that. In fact, reasoning about logical statements is totally consistent with all of our desiderata!

But we’ll get to that in a bit.

Afterwards, I read Gaifman’s paper on the subject, and that came much closer to where I think the mark is.

Okay, so, the desiderata that are used to derive the probability rules are representing plausibilities by real numbers, “qualitative correspondence with common sense,” and consistency. The first is trivial. The third just means that two different ways of getting a result have to reach the same result, that we always use all our information, and so on. The common sense desideratum is where most of the proof is hidden: throughout the proof, we use it as a catch-all axiom that tells us where to look and what properties our reasoner should have. And in that proof, logical omniscience is never actually used. There are instances where we do use properties like logical implication, but at no point is it really necessary that we know all implications of a thing.

Particularly, at one point it’s said that if A is a direct logical consequence of C, then $P(A|C) = 1$. If we take this to mean that whenever $C \rightarrow A$ then $P(A|C) = 1$, then indeed that does require logical omniscience. However, another possible interpretation, one that’s consistent with the proof, is that the sentences $E$ and $E \rightarrow A$ are in C. Then, C itself isn’t the sentence that logically implies A, but rather C is a collection of sentences, two of which together make the conclusion that A is true certain.

And this is just modus ponens! In that case, then, $P(A|E)$ itself doesn’t necessarily equal $1$, but $P(A|E\land (E\rightarrow A))$ does.

So there’s where I think the mark is. Background knowledge includes logical knowledge. It includes proofs and derivations and inference rules. So now let me belabour this a little bit.

First, I’ll show that $A\land (A \rightarrow B)$ gives us infinite evidence for B. Suppose I have some background knowledge X, and I also know A. Then I have some prior $P(B|AX)$. Now, suppose I run some computation C and prove that $A \rightarrow B$. I want to calculate $P(B|CAX)$, so I’ll just use Bayes’ Theorem:

$P(B|CAX) = \frac{P(C|BAX)P(B|AX)}{P(C|AX)}$

Pretty standard so far, right? And we can extend the denominator:

$P(C|AX) = P(C|BAX)P(B|AX) + P(C|\bar BAX)P(\bar B|AX)$

Now, C is a computation we’ve observed whose result contains $A\rightarrow B$, right? If that’s the case, then, what’s $P(C|\bar BAX)$?

Did you guess $0$? I’d say almost. But yes, the idea is that the posterior probability here is modulated by your trust in the computation/proof C:

$P(B|CAX) = \frac{P(C|BAX)P(B|AX)}{P(C|BAX)P(B|AX)+P(C|\bar BAX)P(\bar B|AX)}$

These probabilities are all, of course, quite related, yes? If it is in fact the case that $A\rightarrow B$, then $P(C|\bar BAX) = P(\bar B|AX) = 0$ and $P(B|CAX) = 1$.
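To make the role of trust concrete, here is a minimal Python sketch of the expanded formula above. The function name `posterior_b` and all the numbers are illustrative assumptions of mine, not anything taken from the posts discussed; `p_c_given_not_b` stands for $P(C|\bar BAX)$, that is, how likely we think it is that we would see this "proof" even though B is false.

```python
def posterior_b(prior_b: float, p_c_given_b: float, p_c_given_not_b: float) -> float:
    """P(B|CAX) from Bayes' Theorem, with the denominator expanded over B and not-B.

    prior_b          -- P(B|AX)
    p_c_given_b      -- P(C|BAX)
    p_c_given_not_b  -- P(C|not-B, AX): our distrust in the computation C
    """
    numerator = p_c_given_b * prior_b
    denominator = numerator + p_c_given_not_b * (1.0 - prior_b)
    if denominator == 0.0:
        raise ValueError("C has probability zero under these inputs")
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical numbers: a 50% prior, and a proof we'd see 90% of the time if B holds.
    for distrust in (0.9, 0.1, 0.001, 0.0):
        print(distrust, posterior_b(0.5, 0.9, distrust))
```

As `p_c_given_not_b` shrinks towards 0 the posterior climbs towards 1, and at exactly 0 it equals 1 regardless of the (nonzero) prior. When C is just as likely with or without B, the posterior is simply the prior.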

Next, suppose I have two sentences, A and B, and two given probabilities for them, $P(A|X)$ and $P(B|X)$. If I were to find out that the two sentences are logically equivalent, I’d necessarily have that those two probabilities have to be the same. So, how do I update upon finding that $A\equiv B$? Let’s see what Bayes has to say about this. If C is the computation/proof that proves that equivalence:

$\frac{P(A|CX)}{P(B|CX)} = \frac{P(A|BCX)P(B|CX) + P(A|\bar BCX)P(\bar B|CX)}{P(B|CX)}$

Once again, our result is modulated by our trust in the computation C. If we have infinite trust in it, or equivalently condition on the proof being true, then $P(A|\bar BCX) = 0$ and $P(A|BCX) = 1$, from which it is immediate that $\frac{P(A|CX)}{P(B|CX)} = 1$: conditioning on a proof that states logical equivalence, the two sentences necessarily have the same probability.

(What probability, exactly? I have no idea. A and B both drift towards the same number, and that number isn’t necessarily the same number either of them had before observing C, but the exact number depends on $P(C|AX)$ and $P(C|BX)$.)
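Here is a toy Python sketch of that update, under assumptions that are entirely mine: A and B start out independent given X, and the computation C is modelled as something we observe with likelihood 1 in worlds where A and B agree, and with likelihood `distrust` in worlds where they disagree. The name `condition_on_equivalence` is hypothetical. A different likelihood model would give a different common value, which is exactly the point of the parenthetical above.

```python
from itertools import product


def condition_on_equivalence(p_a: float, p_b: float, distrust: float = 0.0):
    """Return (P(A|CX), P(B|CX)) after observing a claimed proof C of A <-> B.

    Assumption: A and B are independent under X before C is observed.
    distrust is P(C | A and B disagree); 0.0 means full trust in the proof.
    """
    weights = {}
    for a, b in product((True, False), repeat=2):
        prior = (p_a if a else 1.0 - p_a) * (p_b if b else 1.0 - p_b)
        likelihood = 1.0 if a == b else distrust
        weights[(a, b)] = prior * likelihood
    total = sum(weights.values())
    if total == 0.0:
        raise ValueError("C has probability zero under these inputs")
    p_a_post = (weights[(True, True)] + weights[(True, False)]) / total
    p_b_post = (weights[(True, True)] + weights[(False, True)]) / total
    return p_a_post, p_b_post


if __name__ == "__main__":
    print(condition_on_equivalence(0.8, 0.6, distrust=0.0))
    print(condition_on_equivalence(0.8, 0.6, distrust=0.5))
```

With full trust and priors of 0.8 and 0.6, both posteriors come out at $6/7 \approx 0.857$ in this toy model, which is neither of the numbers we started with. With partial trust the two posteriors move closer together without coinciding.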

Now, suppose I have part of a proof. Like, say, suppose a proof goes $A \equiv \alpha_0 \rightarrow \alpha_1\rightarrow ... \rightarrow\alpha_n\equiv B$. How should I update on a computation C that includes an incomplete portion of that proof?

I don’t know. But if I grope intuitively in the dark, I know that Bayes’ Theorem says that:

$P(B|CAX) = \frac{P(C|BAX)}{P(C|AX)}P(B|AX)$

The likelihood ratio there would have to be greater than 1, presumably, but I’m not entirely sure how to guarantee that. I do know, however, that mathematicians work in proofs through intuition, and very often a mathematician will see part of a proof and have a “hunch” that the final theorem is true or false. So this may not be an inappropriate way of modelling that kind of reasoning.

And finally, we have Monte Carlo or other probabilistic methods which can affirm certain logical properties of abstract elements with a given likelihood ratio. Those would also count as evidence.
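One familiar example of this kind of evidence, which I’m choosing as an illustration and which isn’t discussed in the posts above, is the Miller-Rabin primality test. A prime always passes a round, while a composite passes a round with a randomly chosen base with probability at most $1/4$, so each passed round is worth a likelihood ratio of at least 4 in favour of “n is prime.” The sketch below folds that into a Bayesian update. The prior, the function names, and the use of the worst-case $1/4$ bound are all assumptions, so the number returned is a lower bound on the posterior rather than the posterior itself.

```python
import random


def miller_rabin_round(n: int, a: int) -> bool:
    """Return True if odd n > 2 passes one Miller-Rabin round with base a."""
    d, s = n - 1, 0
    while d % 2 == 0:  # write n - 1 as d * 2**s with d odd
        d //= 2
        s += 1
    x = pow(a, d, n)
    if x == 1 or x == n - 1:
        return True
    for _ in range(s - 1):
        x = pow(x, 2, n)
        if x == n - 1:
            return True
    return False  # a is a witness: n is certainly composite


def posterior_prime(n: int, prior: float, rounds: int, rng: random.Random) -> float:
    """Lower bound on P(n is prime | test results), starting from an assumed prior."""
    if n < 5 or n % 2 == 0:
        raise ValueError("this sketch only handles odd n >= 5")
    p = prior
    for _ in range(rounds):
        a = rng.randrange(2, n - 1)  # random base in [2, n - 2]
        if not miller_rabin_round(n, a):
            return 0.0  # a failed round is a proof of compositeness
        # P(pass | prime) = 1 and P(pass | composite) <= 1/4
        p = p / (p + 0.25 * (1.0 - p))
    return p


if __name__ == "__main__":
    print(posterior_prime(104729, prior=0.1, rounds=10, rng=random.Random(0)))
```

Note the asymmetry: a failed round is a proof and sends the probability straight to 0, while a passed round is only ever finite evidence, no matter how many of them pile up.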

So, this is an initial set of ideas on how to reason about logical statements that may or may not be implied by your current set of beliefs. There are open problems, of course, but it sounds like a good place to look.

–EDIT: And there is also an addendum to this post which I forgot to mention when originally writing it.


### 5 Responses to Logical Uncertainty

1. “Background knowledge includes logical knowledge. It includes proofs and derivations and inference rules.”

This struck me as a very Jaynesian way to think about it. Often in Jaynes I have the sense of there being a symbol behind the conditional bar that is generating everything we’re doing. Generating in the sense that, everything I introduce into a derivation that doesn’t follow directly from previous steps, came somehow from the “I” or “X” behind the conditional representing our prior information.

So, I figured this wording came from your very Jaynesian perspective, and was actually surprised to see pretty much the same sentence in a paper that I’m reading. From Jon Williamson’s “Bayesian Networks for Logical Reasoning”:

“Expositions of the theory of rational belief often include a requirement of logical omniscience: if set A of sentences logically implies sentence b then p(b|A)=1. Clearly this requirement does not allow for uncertainty of logical structure, and is too strong for practical purposes. A more sensible substitute is: if X’s background knowledge contains the fact that A logically implies b then p(b|A)=1. This issue is addressed in (Williamson 1999b).”

The paper cited is Jon Williamson’s paper “Logical omniscience and rational belief”. Maybe I should check it out.

Bayesian Networks For Logical Reasoning: http://www.aaai.org/Papers/Symposia/Fall/2001/FS-01-04/FS01-04-021.pdf
Logical omniscience and rational belief: can’t find full text unfortunately