The Old Evidence Problem

I’m in the middle of writing up a post sketching some ideas I have about Bayesian inference in order to stir up a hornet’s nest – in particular to prod the hornet queen, David Chapman. In the process, I ran across this old blog post by Andrew Gelman discussing this (pdf) paper by Bandyopadhyay and Brittan criticizing one form of Bayesianism – in particular the form espoused by E.T. Jaynes. One of the issues they bring up is called the old evidence problem:

Perhaps the most celebrated case in the history of science in which old data have been used to construct and vindicate a new theory concerns Einstein. He used Mercury’s perihelion shift (M) to verify the general theory of relativity (GTR). The derivation of M is considered the strongest classical test for GTR. However, according to Clark Glymour’s old evidence problem, Bayesianism fails to explain why M is regarded as evidence for GTR. For Einstein, Pr(M) = 1 because M was known to be an anomaly for Newton’s theory long before GTR came into being. But Einstein derived M from GTR; therefore, Pr(M|GTR) = 1. Glymour contends that given equation (1), the conditional probability of GTR given M is therefore the same as the prior probability of GTR; hence, M cannot constitute evidence for GTR.

Oh man, do I have some thoughts on this problem. I think I even wrote a philosophy paper in undergrad that touched on it after reading Jaynes. But I’m going to refrain from commenting until after I finish the main post because I think the old evidence problem illustrates several points that I want to make. In the meantime, what do *you* think of the problem? Is there a solution? What do you think of the solution Bandyopadhyay and Brittan propose in their paper?

Edit: Here’s a general statement of the problem. Suppose we have some well-known piece of evidence E. Everyone is aware of this evidence and there is no doubt about it, so P(E) = 1. Next, suppose someone invents a new theory T that perfectly accounts for the evidence – it predicts it with 100% accuracy, so that P(E|T) = 1. Then by Bayes’ rule we have P(T|E) = P(E|T)P(T)/P(E) = P(T), so the posterior and prior are identical and the evidence doesn’t actually tell us anything about T.
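If you'd rather see the arithmetic than take my word for it, here is a minimal sketch in Python. The function name `posterior` and the numbers (a prior of 1/10, and a comparison case with P(E) = 1/2) are my own illustrative assumptions; none of them come from the paper.

```python
from fractions import Fraction


def posterior(prior_t, likelihood_e_given_t, prob_e):
    """Bayes' rule: P(T|E) = P(E|T) * P(T) / P(E)."""
    if prob_e == 0:
        raise ValueError("P(E) must be positive to condition on E")
    return likelihood_e_given_t * prior_t / prob_e


# Hypothetical prior for the new theory T (an assumption for illustration).
prior = Fraction(1, 10)

# Old evidence: E is already certain, P(E) = 1, and T entails E, P(E|T) = 1.
old_evidence = posterior(prior, Fraction(1), Fraction(1))

# Comparison case: T still entails E, but E was not certain beforehand.
# P(E) = 1/2 is another made-up number.
new_evidence = posterior(prior, Fraction(1), Fraction(1, 2))

print("prior P(T):              ", prior)
print("posterior, P(E) = 1:     ", old_evidence)
print("posterior, P(E) = 1/2:   ", new_evidence)
```

With P(E) = 1 the posterior comes back equal to the prior, which is exactly the complaint. In the comparison case, where the evidence was genuinely uncertain beforehand, the same theory gets a boost.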


The Mark of an Ergodic Crab

I. Chasing ergodic crabs

Yes! Ergodicity is interesting and very useful. At least in statistics. Don’t ask me what physicists do with the thing though. Those guys are crazy. The problem with ergodicity is that it’s pretty complicated. Go ahead, take a look at the Wikipedia page, I’ll wait a minute.

Ok, good. The math geniuses are gone now. You know the type – they can look at some math they’ve never seen before and understand it in minutes. Right now they’re busy proving theorems about ergodic flows on n-dimensional locally non-Hausdorff manifolds. And twitching at that last sentence. You and me? We’re going to learn about ergodicity with a pretty crabby extended metaphor.



What is statistics?

I put the call out on twitter for ideas for my first post, and Gabe asked this:

I suppose I did ask for statistics questions. This one is a bit tough to answer because, as I hinted on twitter, a wide variety of things get called statistics by the people doing them, and statisticians do an even wider variety of things. To muddy the waters even more, lots of things that typically get categorized as statistics are often also categorized as other things, like machine learning or computer science. I suppose I should blame the computer people.
