## An anti-conjunction fallacy, and why I’m a Singularitarian

When anyone talks about the possibility or probability of the creation/existence of a UFAI, there are many failure modes into which lots of people fall. One of them is the logical fallacy of generalisation from fictional evidence, where people think up instances of AI in fiction and use that as an argument. Another is the tendency to propose a solution all the faster the harder the problem is, without spending even five minutes thinking about it. The absurdity heuristic makes an appearance, too.

But someone who’s familiar with LW or the whole cognitive biases shizzaz might be a bit cleverer and argue that most futurists get it wrong and predicting the future is actually really hard (conjunction fallacy). Ozy wrote a post about donating to MIRI in which zie points this out, but in the end mentions talking to, well, yours truly about it, and I think overall there are three points where I disagree with zir.

First, I propose the existence of a fallacy related to the conjunction fallacy and the sophisticated arguer effect, something I’ll call the Anti-Conjunction Fallacy, or perhaps the Disjunction Fallacy, or something. Maybe this is not a direct countercounterargument to Ozy’s point, but it’s a more general countercounterargument to the counterargument that “predicting AIs typically invokes a highly complex narrative with a high Complexity Penalty.”

The Conjunction Fallacy is a fancy name for the idea that sometimes people judge $P(A\land B) > P(A)$, which is to say that a more complex proposition with more details seems to us more probable than a simpler one because it appeals to our sense of narrative. This is a fallacy because it’s a theorem of probability that the exact negation of that sentence is true, no matter what $A$ and $B$ are; that is, it is always the case that $P(A\land B) \leq P(A)$. But conversely, we have that $P(A\lor B)\geq P(A)$, that is, a disjunctive story is at least as likely as any of its components.
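For anyone who wants both inequalities spelled out, here is a quick sketch. It uses nothing beyond the product rule and inclusion-exclusion, and it assumes $P(A) > 0$ so that $P(B\mid A)$ is defined (if $P(A) = 0$, the first inequality holds trivially).

$$P(A\land B) = P(A)\,P(B\mid A) \leq P(A) \quad\text{because}\quad P(B\mid A) \leq 1$$

$$P(A\lor B) = P(A) + P(B) - P(A\land B) \geq P(A) \quad\text{because}\quad P(B) \geq P(A\land B)$$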

My proposed fallacy is this: many people (particularly rationalists) who see a long tale have an instinct to cry complexity penalty without actually checking whether the logical connective between the elements of that tale is a conjunction or a disjunction, AND or OR, and thus fall into the trap of saying that a disjunctive story has a low probability due to this instinct. And in my experience, most AGI predictions seem to be heavily disjunctive, in that the people making them (such as Nick Bostrom in his book) suggest a myriad of possible ways a superintelligence could arise, each of which is relatively probable given current trends (e.g. whole brain emulations are an active research area which has seen actual results), so the probability that at least one of them pans out is higher than that of any single path. This is true of many parts of the superintelligence narrative, from its formation to its takeoff to its potential powers. I don’t need five minutes to think of five different ways a superintelligence could reasonably take over the world and I’m not superintelligent.
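To make the AND-versus-OR difference concrete, here is a minimal sketch in Python. The numbers are made up for illustration (five paths at 20% each are not estimates of anything), and the helper names `p_all` and `p_any` are hypothetical. It also assumes the events are independent, which real AGI scenarios almost certainly are not; the point is only the direction of the effect.

```python
from math import prod


def p_all(probs):
    """P(A1 and A2 and ... and An), assuming the events are independent."""
    return prod(probs)


def p_any(probs):
    """P(A1 or A2 or ... or An), assuming the events are independent.

    Computed as 1 minus the probability that every event fails.
    """
    return 1 - prod(1 - p for p in probs)


# Hypothetical numbers: five story elements, each 20% likely on its own.
paths = [0.2, 0.2, 0.2, 0.2, 0.2]

print(f"conjunctive story (all five must hold): {p_all(paths):.5f}")
print(f"disjunctive story (any one suffices):   {p_any(paths):.5f}")
```

Under these assumptions the same five elements give about 0.0003 when chained with AND and about 0.67 when chained with OR, so the length of the tale alone says nothing about which way the penalty goes.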

So the moral of this part here is that, when you see a long prediction about something, first see whether it’s disjunctive or conjunctive before looking for fallacies. Isaac Asimov may have been wrong about the exact picture the future would paint, but by golly a large number of his individual predictions did in fact come true!

My second point is not so much an objection as a sort of reminder about what MIRI is actually doing. I’m not sure what its original goals were, but it most certainly isn’t trying, by itself, to program a superintelligence, at least not right now. Ozy says:

> So it seems possible the solution is not independent funding, but getting the entire AGI community on board with Friendliness as a project. At that point, I can assume that they will deal with it and I can return to thinking of technology funding as a black box from which iPhones and God-AIs come out.

The thing is, that is one of MIRI’s explicit goals: outreach about AI dangers. And they seem to be at least mildly successful, or at any rate something was, given that Google created an AI Ethics board when it bought DeepMind, and given the growing number of prominent intellectuals who have been talking about the dangers of AI lately, some of whom directly mention MIRI.

My third and final objection is that I think zie misunderstood me when I talked about the predictive skill of people who actually build technologies. I didn’t mean that they have some magical insider information or predictive superpowers that allow them to know these things; I meant that when you’re the one building a thing, what you’re doing isn’t so much predicting as setting goals. Predicting what Google is going to do is one thing, being inside Google actually doing the things is a whole ‘nother, and when AGI researchers talk about AGI there is frequently an undertone of “even if no one else is gonna do it, I am.” Someone who works at MIRI isn’t concerned so much with the prediction that a superintelligence is possible as they are with their own ability to bring it about, or to raise the odds of a good outcome if/when it arrives.

My last point is something Ozy touched upon and on which I want to elaborate. Zie mentioned that AGI is fundamentally different from earlier “large-scale” projects in that, unlike, say, nukes, the way it’s done will severely impact its outcome. As it is, I’d argue that almost no conclusions at all can be drawn from the past funding and development of technological advances because… the sample space is tiny. We can’t judge whether individuals funding research is an effective method of getting that research done because this idea, and the means to do so effectively, are brand new. During the 20th century, most technological advances happened due to the military, but that’s perfectly understandable given the climate: two full wars and a cold one spanning large powers, constant change in political and economic climates…

But large tech companies are a new invention, and it is my impression that, since at least the mid-nineties, most technological advancements have had at least a hand from the private sector, and this seems to increasingly be the case. I’m not sceptical at all of the ability of individually funded technologies, especially software technologies, to play a large part in the future, because that’s what they’re doing right now, in the present.

But at any rate, there are a number of ways AGI could come about, and MIRI is trying to do what it can. So far, other than that, the FHI, and mmmmaaaaybe Google, it seems no one else is.


### 7 Responses to An anti-conjunction fallacy, and why I’m a Singularitarian

1. Will says:

While Google said it would set up an “AI ethics board”, there hasn’t been any real talk of that since shortly after DeepMind was acquired. I strongly suspect it was an “appease these guys so we can buy the company” sort of move, certainly not something they seem to be throwing much effort into.

• pedromvilar says:

2. 1Z says:

Hi

AI threat in general isn’t highly conjunctive, but MIRI’s arguments are.

1. Humans will or should build at least one superintelligent AI.
2. It will be a singleton, i.e. much more powerful than any rivals (possibly as a result of FOOM, i.e. rapid self-improvement).
3. It will agentive.
4. It will have a utility function…
5. …which is stable under self-improvement…
6. …which is hardcoded (explicitly specified rather than trained in)
7. …which cannot be updated. (So no corrigibility, i.e. ability to “steer” it)
8. …which contains detailed information about goals (direct normativity), e.g. a specification of human happiness.

This selection of claims is highly conjunctive: all the propositions have to be true for MIRI’s predicted problem to be likely, and for MIRI’s specified solution to be applicable. In particular, without FOOM, their favoured solution is not necessary, and without goal stability, their favoured solution is not possible.

• pedromvilar says:

I disagree that these are all assumptions.
1. This is indeed an assumption.
2. Not necessarily, see Bostrom.
3. ? does not parse
4. No, they don’t assume this, this is a “goal” for the FAI; they assume nothing about non-FAI.
5. They not only don’t assume this, they actually believe the opposite, that unless active effort towards stability is made the default outcome is instability.
7. They don’t assume it cannot be updated, they assume it will be “hard” to update it past a certain level of general capabilities because the AI would probably resist changes.
8. They do not assume this; they treat the question of how to specify such a UF as open: it may or may not contain detailed information, but without such information it is likely to go wrong.

MIRI’s problem is “how to make sure AI is aligned” and they’re trying methods of attack. Their challenge exists exactly because by default an AI has *none* of these constraints and is thus very unpredictable and probably dangerous.

3. 1Z says:

2. I am talking specifically about MIRI. I am aware that what Bostrom says is different.

4. Assuming that an AI can only be safe if it has a utility function is making an assumption. Assuming that Friendliness, in some sense not synonymous with safety, is needed is another assumption.

5. I am aware they don’t regard stability as inevitable. My comment “and without goal stability, their favoured solution is not possible” indicates that goal stability is required as part of their solution. They don’t see instability as something which reduces threat or offers opportunities for safety.

6. Who’s they? MIRI? I have seen little about training or ANNs.

7. They assume both that unupdateable utility functions are a desirable safety feature, and that utility functions are naturally hard to update. Indeed, when they discovered that the ability of an AI to retain its goals under self-modification is not a given, they handwaved the problem away with talk of probabilistic reasoning, so that they could continue on the same path. See the Tiling Agents paper.

8. See The Genie Knows but does not Care, and its comments.

• pedromvilar says:

4. How is Friendliness not synonymous with safety? Also, it may not be the case that an AI can only be safe with a utility function, but even things without utility functions can be modelled as having utility functions, the mathematical framework can still be used to prove stuff even if it doesn’t reflect actual reality.

5. …yeah? Um. I uh. Yeah? I’m not sure how that’s an objection.

6. You’re getting kinda ahead of yourself by already assuming there’s a specific method, such as an ANN, that’s gonna be used to train values into the AI. Anyway, a ten-second search came up with: https://intelligence.org/files/ValueLearningProblem.pdf

7. I don’t see where they assume this? And they didn’t “discover” that this ability is not a given; it is one of their foundational theses, which Yudkowsky had been discussing for a long time even before the SIAI was a thing. Saying that they handwaved the problem away because they’re saying “we do not yet know enough to figure this out, we’re trying the field, probabilistic logic sounds like a good first step” is very uncharitable, and also false. They’re doing foundational research, they’re exploring avenues of attack, they have not committed to any single path.

Logic, proofs, provability, and probabilistic reasoning aren’t in the same category as ANN. An ANN is a very specific kind of algorithm, whereas they’re trying to prove things as general as possible. Which is how mathematical logic works: everything that follows the axioms, no matter how weirdly those axioms are expressed, will follow its conclusions.