This post is thinking out loud, catalyzed by some stuff Rev said on Twitter. Partially digested, so quality is low, but I think the topic is important.
Anecdotes are much-maligned as sources of reliable information. “The plural of anecdote is not data!”
Perhaps too maligned. Anecdotes *from a trusted source* – that part is necessary – can often answer a hidden query much more effectively than systematically collected data.
For example, if I look up crime rates while deciding whether to move to a new city, implicitly I am asking “Will my probability of being victimized rise when I move to this city?” But the crime stats don’t directly tell me that. Maybe there are tons of violent incidents per capita, but basically all of them are gang-related and geographically isolated. Or maybe this city is reporting “incidents” and the other city is reporting “criminal charges”, or one city decided to use geometric means for some bizarre reason. Or one of 100 other such stories, many of which aren’t easy to rule out, especially if you didn’t even think of them in the first place. And of course, when the stats come from some authority like law enforcement or a social science researcher, it’s a huge leap to even trust the source to be honest in the first place!
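To make the gap between the published stat and the hidden query concrete, here is a toy sketch. Every number in it is invented, and the two-neighborhood split and the `rate_per_1000` helper are my own assumptions for illustration, not real data or anyone's actual reporting method. It just shows how a citywide per-capita figure can sit far from the rate that applies to someone living outside a concentrated hotspot, and how the choice of summary (arithmetic vs. geometric mean) moves the headline number.

```python
from statistics import geometric_mean, mean

# Hypothetical city: all figures below are made up for illustration.
# Each entry maps a neighborhood to (population, violent incidents per year).
neighborhoods = {
    "hotspot": (20_000, 900),
    "elsewhere": (480_000, 480),
}


def rate_per_1000(population, incidents):
    """Incidents per 1,000 residents per year."""
    return 1000 * incidents / population


total_pop = sum(pop for pop, _ in neighborhoods.values())
total_incidents = sum(inc for _, inc in neighborhoods.values())

# The number a citywide stat would report.
citywide = rate_per_1000(total_pop, total_incidents)

# The number closer to my actual question, if I'd live outside the hotspot.
elsewhere = rate_per_1000(*neighborhoods["elsewhere"])

# The same per-neighborhood rates, summarized two different ways.
rates = [rate_per_1000(pop, inc) for pop, inc in neighborhoods.values()]
arithmetic = mean(rates)
geometric = geometric_mean(rates)

print(f"citywide rate:        {citywide:.2f} per 1,000")
print(f"outside the hotspot:  {elsewhere:.2f} per 1,000")
print(f"arithmetic mean of neighborhood rates: {arithmetic:.2f}")
print(f"geometric mean of neighborhood rates:  {geometric:.2f}")
```

With these made-up inputs the citywide figure is well above the rate outside the hotspot, and the two means disagree with each other and with both of those. None of the four numbers is "wrong"; they just answer different questions, and the published table rarely says which one it is.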
On the other hand, consider a friend in the candidate city who tells you they’ve been mugged and they know two others who have. Not a huge sample, but one targeted to your reference class with much better precision than any systematically gathered data. There are sources of bias here too (maybe your friend has a habit of midnight strolls through Flea Bottom) but they are usually easier to notice and correct for. With systematically collected data, on the other hand, it is especially true that you don’t know what you don’t know.*
There are huge failure modes involved in reasoning via anecdote, which the biases & heuristics literature goes into in great detail. I will not recapitulate them, on the assumption that readers are familiar with them. The evidence type that is consistently worst is the filtered anecdote. This is the one that all your friends are sharing on Facebook. “Fundamentalist embroiled in bear-baiting scandal.” You are hearing about this because somebody doesn’t like fundamentalists, and for no other reason. Zero epistemic content.
I think I am fonder of reasoning by anecdote now because I’ve been burned quite a few times by data that turned out to be (a) lies, (b) filtered accidentally, (c) filtered deliberately, (d) misinterpreted. I am now a bit paranoid about the trustworthiness of authoritative sources of information, so I’m much more interested in the relevant experiences of trusted friends and family.
It occurs to me that this entire post is meant as a gentle criticism of the way I perceive previous-me and my audience to be likely to think, not of the way most people think. This has characteristics of a bravery debate.
So to avoid bravery debate framing, here is a table showing the tradeoffs.
| Anecdotes from a trusted source | Systematically collected data |
| --- | --- |
| Trustworthiness easier to verify | Trustworthiness harder to verify |
| Subject to suite of cognitive biases | Sometimes randomized to avoid bias |
| Small sample ∴ weak evidence | Large sample ∴ strong evidence |
| | More difficult to interpret |