I also admire Stone’s article; not just the flatland example, but his very first example of strong inconsistencies as well. Both are very simple, and it is very hard to see where the flaw in the argument lies. The blame for these problems (not just strong inconsistencies but marginalization paradoxes as well) is usually assigned to the use of improper priors, but I think that is misleading. The real problem lies with the formal posterior, which is constructed as if the improper prior were a proper one and whose validity is justified, at best, by analogy with the proper case. If we want to solve the problem of the strong inconsistencies (and of the marginalization paradoxes) we should first find a way to properly define posteriors associated with improper priors; merely writing down the formal posterior (because it looks right?) won’t do. That will only give rise to problems, as Stone and others have shown. I suggest the following construction.

If Bob really wants to use a flat prior (probably a bad choice in the first place, but not an inconsistent one) he should do so by constructing a limit, in some sense, of proper priors. I will refer to this limit as an infinitesimal prior. The proper priors can be constructed using a nested sequence of truncations of the improper prior. The value of this limit at any given path is of course 0, so the infinitesimal prior is meaningful only in an integral; that is, Bob should consider the limit of the expectations with respect to the proper priors of, say, bounded functions on the domain of the priors. This limit may not exist for all bounded functions, so a subset L of the set of bounded functions is required. The proper priors give rise to proper posteriors, and Bob can define a posterior corresponding to the infinitesimal prior by likewise considering the sequence of expectations with respect to the proper posteriors of, say, bounded functions on the domain of the priors. The following questions then arise: 1) for what bounded functions is the latter limit defined, and 2) what is the relation between this limit and the expectation with respect to the formal posterior? It turns out that, for fixed data (a fixed final path in the example) and if the formal posterior is proper and non-zero for all data (all final paths), the latter limit is defined for all functions in L, and using the formal posterior gives the same result as taking the limit. That is, the formal posterior can be used, for fixed data, without having to worry about strong inconsistencies (or marginalization paradoxes) if it is proper and non-zero.
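
In symbols, the construction might be written as follows. This is only a sketch of the notation I have in mind; the symbols $S_n$, $\pi_n$, $\Lambda$ and $\Lambda_x$ are introduced here for illustration, and sums replace the integrals when the parameter is discrete, as in flatland. Take nested sets $S_1 \subset S_2 \subset \dots$ that exhaust the parameter space, with $0 < \int_{S_n} \pi < \infty$ for the improper prior $\pi$, and write $p(x \mid \theta)$ for the likelihood:

$$
\pi_n(\theta) = \frac{\pi(\theta)\,\mathbf{1}[\theta \in S_n]}{\int_{S_n} \pi(\theta')\,d\theta'},
\qquad
\Lambda(f) = \lim_{n \to \infty} \int f(\theta)\,\pi_n(\theta)\,d\theta \quad (f \in L),
$$

$$
\pi_n(\theta \mid x) = \frac{p(x \mid \theta)\,\pi_n(\theta)}{\int p(x \mid \theta')\,\pi_n(\theta')\,d\theta'},
\qquad
\Lambda_x(f) = \lim_{n \to \infty} \int f(\theta)\,\pi_n(\theta \mid x)\,d\theta .
$$

The claim above is then that, for fixed $x$, $\Lambda_x(f) = \int f(\theta)\,\pi(\theta \mid x)\,d\theta$ with the formal posterior $\pi(\theta \mid x) \propto p(x \mid \theta)\,\pi(\theta)$, provided the formal posterior is proper and non-zero.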

To get back to the flatland example, P(A|x) = 3/4 and P(A|theta) = 1/4 are correct, but we cannot integrate over x in the first equality without first going back to the construction of the posterior and checking whether, when integrating, the formal posterior can still be used in lieu of taking the limit. In the case of the flatland example, the answer is no, and the statement that P(A|x) = 3/4 implies P(A) = 3/4 is false (and Bob did not prove that 3/4 = 1/4). This may seem strange, but it is really the same as the statement that P(theta) = 0 for the infinitesimal prior does not imply that, when integrating this prior, we will always find 0. After all, the integral over the full domain equals 1.
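
To make this concrete, here is a minimal sketch (mine, not Stone's) that works the flatland example with exact fractions. It rests on several assumptions: the truncation is by path length (a uniform prior on all reduced paths of length at most n), a step that reverses the previous one cancels it, and A is the event that the final step shortened the path. The function names are made up for illustration.

```python
from fractions import Fraction

INVERSE = {"N": "S", "S": "N", "E": "W", "W": "E"}

def step(path, z):
    """Append step z to a reduced path; reversing the last step cancels it."""
    if path and INVERSE[path[-1]] == z:
        return path[:-1]
    return path + z

def reduced_paths(max_len):
    """All paths without immediate back-tracking, of length <= max_len."""
    paths = [""]
    frontier = [""]
    for _ in range(max_len):
        frontier = [p + z for p in frontier for z in "NSEW"
                    if not p or INVERSE[p[-1]] != z]
        paths.extend(frontier)
    return paths

def truncated_analysis(n):
    """Exact analysis under the proper prior truncated at path length n.

    Returns P(A), a function x -> P(A | x), and the total probability of
    those x for which P(A | x) equals the formal-posterior value 3/4.
    """
    thetas = reduced_paths(n)
    prior = Fraction(1, len(thetas))        # uniform on the truncated set
    joint = {}                              # (x, shortened?) -> probability
    for theta in thetas:
        for z in "NSEW":                    # each direction has probability 1/4
            x = step(theta, z)
            key = (x, len(x) < len(theta))
            joint[key] = joint.get(key, 0) + prior * Fraction(1, 4)

    def p_a_given_x(x):
        num = joint.get((x, True), 0)
        return num / (num + joint.get((x, False), 0))

    p_a = sum(p for (_, a), p in joint.items() if a)
    mass_34 = sum(p for (x, _), p in joint.items()
                  if p_a_given_x(x) == Fraction(3, 4))
    return p_a, p_a_given_x, mass_34

p_a, post, mass = truncated_analysis(3)
print(p_a, post("N"), post("NNN"), mass)
```

For n = 3 this gives P(A) = 13/53, close to 1/4, while P(A|x) = 3/4 holds only for x of length 1 or 2, which together carry 16/53 of the probability. Under this truncation the share of such x tends to 1/3, not 1, as n grows, so the value 3/4 is right for each fixed x and still cannot be integrated over x.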

As is clear from the difference in results between the two strategies, the answer depends in some way on the path. But the likelihood Bob uses depends only on $\theta$, so it shouldn’t be too surprising that his answer turns out to be wrong: he’s assuming the rest of the path makes no difference.

However, if we calculate a likelihood that takes into account which way the last step in the path to $x$ is pointing, the posterior turns out to give a chance of $1/2$ of the treasure being in the direction that shortens the path, and $1/6$ of it being in each of the other directions, thus agreeing with the classical analysis.

This is straightforward to work out if you take the joint posterior of $\theta$, the last two directions on the path towards it, and the direction taken from $\theta$ to $x$, given the position $x$ and the last direction on the path towards it. Then you sum over all the cases to get the posterior of $\theta$ given $x$ and the last direction on the path towards $x$.

Note that the likelihood Bob took here would be correct, if we could only observe the position $x$ and not the path leading to it.

Can someone explain it to me without using notation?

The only thing I can see is that Carla's path requires fewer steps; all the other paths require a go and return, whereas the shorter path is one step less.

The likelihood is given as follows: at each step, you walk in each of the four directions with equal probability. In symbols, theta = x + z, where z is the last step and P(z=N) = P(z=S) = P(z=W) = P(z=E) = 1/4.