I’m in the middle of writing up a post sketching some ideas I have about Bayesian inference in order to stir up a hornet’s nest – in particular to prod the hornet queen, David Chapman. In the process, I ran across this old blog post by Andrew Gelman discussing this (pdf) paper by Bandyopadhyay and Brittan criticizing one form of Bayesianism – in particular the form espoused by E.T. Jaynes. One of the issues they bring up is called the old evidence problem:
Perhaps the most celebrated case in the history of science in which old data have been used to construct and vindicate a new theory concerns Einstein. He used Mercury’s perihelion shift (M) to verify the general theory of relativity (GTR). The derivation of M is considered the strongest classical test for GTR. However, according to Clark Glymour’s old evidence problem, Bayesianism fails to explain why M is regarded as evidence for GTR. For Einstein, Pr(M) = 1 because M was known to be an anomaly for Newton’s theory long before GTR came into being. But Einstein derived M from GTR; therefore, Pr(M|GTR) = 1. Glymour contends that given equation (1), the conditional probability of GTR given M is therefore the same as the prior probability of GTR; hence, M cannot constitute evidence for GTR.
Oh man, do I have some thoughts on this problem. I think I even wrote a philosophy paper in undergrad that touched on it after reading Jaynes. But I’m going to refrain from commenting until after I finish the main post because I think the old evidence problem illustrates several points that I want to make. In the meantime, what do *you* think of the problem? Is there a solution? What do you think of the solution Bandyopadhyay and Brittan propose in their paper?
Edit: Here’s a general statement of the problem. Suppose we have some well-known piece of evidence E. Everyone is aware of this evidence and there is no doubt about it, so P(E)=1. Next, suppose someone invents a new theory T that perfectly accounts for the evidence – it predicts it with 100% accuracy, so that P(E|T)=1. Then by Bayes’ rule we have P(T|E) = P(E|T)P(T)/P(E) = P(T), so the posterior and prior are identical and the evidence doesn’t actually tell us anything about T.
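To make the arithmetic concrete, here is a minimal Python sketch of the calculation. The function name `posterior` and the numbers (a prior of 1/10, and a contrast case with P(E) = 1/2) are made up purely for illustration; they come from neither the paper nor Gelman's post. The contrast case is only there to show what Bayes' rule does when the evidence was not already certain; it isn't meant as a proposed solution.

```python
from fractions import Fraction


def posterior(prior, likelihood, evidence):
    """Bayes' rule: P(T|E) = P(E|T) * P(T) / P(E)."""
    return likelihood * prior / evidence


# Hypothetical prior for the new theory T (an illustrative number only).
p_t = Fraction(1, 10)

# Old evidence case: E is already certain, P(E) = 1, and T entails E, P(E|T) = 1.
old = posterior(p_t, likelihood=Fraction(1), evidence=Fraction(1))

# Contrast case (also hypothetical): E was uncertain beforehand, P(E) = 1/2.
fresh = posterior(p_t, likelihood=Fraction(1), evidence=Fraction(1, 2))

print(old == p_t)  # True: the posterior equals the prior
print(fresh)       # 1/5: the posterior is double the prior
```

Whatever prior you plug in, the old evidence case returns it unchanged, which is exactly the complaint: the update only moves when P(E) is less than 1.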