Example of a prior that, unlike Jeffreys', leads to a posterior that is not invariant

I am posting an "answer" to a question I asked two weeks ago: Why is the Jeffreys prior useful? It really was a question (and I did not have the right to post comments at the time), so I hope this is OK:

The link above discusses an interesting feature of the Jeffreys prior: when the model is reparameterized, the resulting posterior distribution gives posterior probabilities that obey the restrictions imposed by the transformation. For example, as discussed there, when moving from the success probability $\theta$ in the Beta-Bernoulli example to the odds $\psi=\theta/(1-\theta)$ , the posterior should satisfy $P(1/3\leq\theta\leq 2/3\mid X=x)=P(1/2\leq\psi\leq 2\mid X=x)$ .

I wanted to create a numerical example of the invariance of the Jeffreys prior under the transformation of $\theta$ to the odds $\psi$ and, more interestingly, of the lack of invariance of other priors (say, Haldane, uniform, or arbitrary ones).

Now, if the posterior for the success probability is Beta (for any Beta prior, not only Jeffreys), the posterior of the odds follows a Beta distribution of the second kind (see Wikipedia) with the same parameters. Then, as the numerical example below shows, it is not too surprising (to me, at least) that there is invariance for any choice of Beta prior (play around with alpha0_U and beta0_U), not only Jeffreys; cf. the output of the program.
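For reference, the claim about the Beta distribution of the second kind can be checked with a change of variables. This is only a sketch, with $\alpha$ and $\beta$ standing for the parameters of the Beta posterior of $\theta$ :

$$
\theta = \frac{\psi}{1+\psi}, \qquad 1-\theta = \frac{1}{1+\psi}, \qquad \frac{d\theta}{d\psi} = \frac{1}{(1+\psi)^2},
$$

$$
p_\psi(\psi) = p_\theta\!\left(\frac{\psi}{1+\psi}\right)\frac{d\theta}{d\psi}
\propto \left(\frac{\psi}{1+\psi}\right)^{\alpha-1}\left(\frac{1}{1+\psi}\right)^{\beta-1}\frac{1}{(1+\psi)^2}
= \psi^{\alpha-1}(1+\psi)^{-\alpha-\beta},
$$

which is the kernel of the Beta distribution of the second kind with the same $\alpha$ and $\beta$ . Nothing in this step depends on which Beta prior produced $\alpha$ and $\beta$ .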

library(GB2) 
# has the Beta density of the 2nd kind, the distribution of theta/(1-theta) if theta~Beta(alpha,beta)

theta_1 = 2/3 # a numerical example as in the above post
theta_2 = 1/3

odds_1 = theta_1/(1-theta_1) # the corresponding odds
odds_2 = theta_2/(1-theta_2)

n = 10 # some data
k = 4

alpha0_J = 1/2 # Jeffreys prior for the Beta-Bernoulli case
beta0_J = 1/2
alpha1_J = alpha0_J + k # the corresponding parameters of the posterior
beta1_J = beta0_J + n - k

alpha0_U = 0 # some other prior
beta0_U = 0
alpha1_U = alpha0_U + k # resulting posterior parameters for the other prior
beta1_U = beta0_U + n - k

# posterior probability that theta is between theta_1 and theta_2:
pbeta(theta_1,alpha1_J,beta1_J) - pbeta(theta_2,alpha1_J,beta1_J) 
# the same for the corresponding odds, based on the beta distribution of the second kind
pgb2(odds_1, 1, 1,alpha1_J,beta1_J) - pgb2(odds_2, 1, 1,alpha1_J,beta1_J) 

# same for the other prior and resulting posterior
pbeta(theta_1,alpha1_U,beta1_U) - pbeta(theta_2,alpha1_U,beta1_U)
pgb2(odds_1, 1, 1,alpha1_U,beta1_U) - pgb2(odds_2, 1, 1,alpha1_U,beta1_U)
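The same comparison can be sketched in base R, without the GB2 package. The helper names `pbetaprime` and `dbetaprime` are hypothetical and introduced only for this illustration. The sketch assumes the Jeffreys posterior parameters from the example above ($k=4$, $n=10$) and computes the odds probability in two ways: through the Beta CDF at the transformed endpoints, and by integrating the Beta-prime density directly.

```r
# CDF of the Beta distribution of the second kind, written via the Beta CDF:
# if theta ~ Beta(a, b) and psi = theta / (1 - theta), then
# P(psi <= q) = P(theta <= q / (1 + q)).
pbetaprime <- function(q, a, b) pbeta(q / (1 + q), a, b)

# Density of the Beta distribution of the second kind (hypothetical helper)
dbetaprime <- function(x, a, b) x^(a - 1) * (1 + x)^(-a - b) / beta(a, b)

a <- 1/2 + 4        # Jeffreys posterior: alpha0 + k
b <- 1/2 + 10 - 4   # Jeffreys posterior: beta0 + n - k

# Posterior probability of 1/3 <= theta <= 2/3
p_theta <- pbeta(2/3, a, b) - pbeta(1/3, a, b)

# Posterior probability of 1/2 <= psi <= 2, via the transformed CDF
p_psi_cdf <- pbetaprime(2, a, b) - pbetaprime(1/2, a, b)

# The same probability by numerical integration of the density
p_psi_int <- integrate(function(x) dbetaprime(x, a, b), 1/2, 2,
                       rel.tol = 1e-10)$value

print(c(p_theta = p_theta, p_psi_cdf = p_psi_cdf, p_psi_int = p_psi_int))
```

Written this way, the agreement is visible in the definition of `pbetaprime`: the event for the odds is the same event as the one for $\theta$, so the two probabilities must coincide for any `a` and `b`.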

This brings me to the following questions:

Am I making a mistake?
If not, is there a result saying that there is no lack of invariance in conjugate families, or something like that? (A quick check leads me to suspect that I could not produce a lack of invariance in the normal-normal case either.)
Do you know a (preferably simple) example in which we do get a lack of invariance?

bayesian mathematical-statistics fisher-information jeffreys-prior invariance Christoph Hanck

You do not need the R code (which I cannot run with R version 3.0.2) to verify the invariance, as it is a property of the likelihood. What is meant by prior invariance is the construction of a rule for selecting priors that does not depend on the choice of parametrization of the sampling model.

Xi'an

Sorry for the inconvenience. It runs with R 3.1.2 on my computer. If I may follow up: does your comment imply that I misunderstood Zen's comment on point 1 of Stephane Laurent's accepted answer to Why is the Jeffreys prior useful?

Christoph Hanck

Answers:

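Your computation seems to verify that, when we have a particular prior distribution $p(\theta)$ , the following two procedures:

Compute the posterior $p_{\theta \mid D}(\theta \mid D)$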
Transform the aforementioned posterior into the other parametrization to obtain $p_{\psi \mid D}(\psi \mid D)$

and

Transform the prior $p_\theta(\theta)$ into the other parametrization to obtain $p_\psi(\psi)$
Using the prior $p_\psi(\psi)$ , compute the posterior $p_{\psi \mid D}(\psi \mid D)$

lead to the same posterior for $\psi$ . This will indeed always occur (caveat: as long as the transformation is such that a distribution over $\psi$ is determined by a distribution over $\theta$ ).
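That the two procedures agree for any fixed prior, conjugate or not, can be illustrated numerically. The following is a minimal sketch in base R. It assumes the data from the question ($n=10$, $k=4$) and an arbitrary prior on $\theta$ (a normal-shaped bump restricted to $(0,1)$ and left unnormalized, since the constant cancels), chosen only for illustration.

```r
n <- 10; k <- 4   # data as in the question

# Arbitrary, non-Beta prior on theta (unnormalized; an assumption for illustration)
prior_theta <- function(theta) dnorm(theta, mean = 0.3, sd = 0.2)

# The same prior expressed on the odds scale by change of variables:
# theta = psi / (1 + psi), d theta / d psi = 1 / (1 + psi)^2
prior_psi <- function(psi) prior_theta(psi / (1 + psi)) / (1 + psi)^2

# Binomial likelihood in each parametrization
lik_theta <- function(theta) dbinom(k, n, theta)
lik_psi   <- function(psi) dbinom(k, n, psi / (1 + psi))

# Unnormalized posteriors
upost_theta <- function(theta) prior_theta(theta) * lik_theta(theta)
upost_psi   <- function(psi) prior_psi(psi) * lik_psi(psi)

# Procedure 1: work in theta, then read off the event 1/3 <= theta <= 2/3
prob_theta <- integrate(upost_theta, 1/3, 2/3, rel.tol = 1e-8)$value /
  integrate(upost_theta, 0, 1, rel.tol = 1e-8)$value

# Procedure 2: transform the prior first, then compute the posterior in psi
prob_psi <- integrate(upost_psi, 1/2, 2, rel.tol = 1e-8)$value /
  integrate(upost_psi, 0, Inf, rel.tol = 1e-8)$value

print(c(prob_theta = prob_theta, prob_psi = prob_psi))
```

The two printed probabilities agree up to integration error, which is the sense in which this kind of check cannot distinguish the Jeffreys prior from any other prior.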

However, this is not the point of the invariance in question. Instead, the question is whether, when we have a particular Method For Deciding The Prior, the following two procedures:

Use the Method For Deciding The Prior to decide $p_\theta(\theta)$
Convert that distribution into $p_\psi(\psi)$

and

Use the Method For Deciding The Prior to decide $p_\psi(\psi)$

result in the same prior distribution for $\psi$ . If they result in the same prior, they will indeed result in the same posterior, too (as you have verified for a couple of cases).

As mentioned in @NeilG's answer, if your Method For Deciding The Prior is 'set uniform prior for the parameter', you will not get the same prior in the probability/odds case, as the uniform prior for $\theta$ over $[0,1]$ is not uniform for $\psi$ over $[0,\infty)$ .
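A small numerical sketch of this lack of invariance, assuming the data from the question ($n=10$, $k=4$): applying the rule "flat prior" to $\theta$ gives the posterior Beta($k+1$, $n-k+1$), while applying it to $\psi$ (an improper flat prior on $[0,\infty)$) gives a posterior proportional to $\psi^{k}(1+\psi)^{-n}$ , which corresponds to $\theta \sim$ Beta($k+1$, $n-k-1$). The variable names are illustrative only.

```r
n <- 10; k <- 4   # data as in the question

# Rule applied in the theta-parametrization:
# flat prior on theta -> posterior theta ~ Beta(k + 1, n - k + 1)
post_rule_on_theta <- pbeta(2/3, k + 1, n - k + 1) - pbeta(1/3, k + 1, n - k + 1)

# Rule applied in the psi-parametrization:
# flat (improper) prior on psi, likelihood psi^k * (1 + psi)^(-n)
# -> posterior psi ~ BetaPrime(k + 1, n - k - 1),
#    equivalently theta ~ Beta(k + 1, n - k - 1)
post_rule_on_psi <- pbeta(2/3, k + 1, n - k - 1) - pbeta(1/3, k + 1, n - k - 1)

# Posterior probability of the same event (1/3 <= theta <= 2/3, i.e. 1/2 <= psi <= 2)
print(c(rule_on_theta = post_rule_on_theta, rule_on_psi = post_rule_on_psi))
```

The same rule applied in two parametrizations gives different posterior probabilities for the same event (about 0.67 versus 0.71 here).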

Instead, if your Method For Deciding The Prior is 'use the Jeffreys prior for the parameter', it will not matter whether you use it for $\theta$ and convert into the $\psi$ -parametrization, or use it for $\psi$ directly. This is the claimed invariance.
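For the Bernoulli model this can be worked out by hand. The following sketch uses the Fisher information per observation and the relations $\theta = \psi/(1+\psi)$ and $d\theta/d\psi = (1+\psi)^{-2}$ . Applying the rule to $\theta$ and then transforming:

$$
I(\theta) = \frac{1}{\theta(1-\theta)}, \qquad p_\theta(\theta) \propto \theta^{-1/2}(1-\theta)^{-1/2},
$$

$$
p_\psi(\psi) \propto \left(\frac{\psi}{1+\psi}\right)^{-1/2}\left(\frac{1}{1+\psi}\right)^{-1/2}\frac{1}{(1+\psi)^2} = \psi^{-1/2}(1+\psi)^{-1}.
$$

Applying the rule to $\psi$ directly:

$$
I(\psi) = I(\theta)\left(\frac{d\theta}{d\psi}\right)^2 = \frac{(1+\psi)^2}{\psi}\cdot\frac{1}{(1+\psi)^4} = \frac{1}{\psi(1+\psi)^2}, \qquad \sqrt{I(\psi)} = \psi^{-1/2}(1+\psi)^{-1}.
$$

Both routes give the same prior for $\psi$ .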

Juho Kokkala

It looks like you're verifying the likelihoods induced by the data are unaffected by parametrization, which has nothing to do with the prior.

If your way of choosing priors is to, e.g., "choose the uniform prior", then what is uniform under one parametrization (say Beta, i.e. Beta(1,1)) is not uniform under another, say, BetaPrime(1,1) (which is skewed); it is BetaPrime(1,-1) that would be uniform, if such a thing exists.

The Jeffreys prior is the only "way to choose priors" that is invariant under reparametrization. So it is less assumptive than any other way of choosing priors.

Neil G

I do not think the Jeffreys prior is the only invariant prior. When they differ, left and right Haar measures are both invariant.

Xi'an

@Neil G, I am not sure I can follow your reasoning that I only look at the likelihood. When plugging (e.g.) alpha1_J into pbeta and pgb2, this parameter is determined by both a prior parameter (alpha0_J) and the data (k), and likewise for all the other parameters.

Christoph Hanck

(+1) You'd hope elicitation of subjective priors would be parametrization-invariant too.

Scortchi - Reinstate Monica

@Zen: yes indeed, I was too hasty: Haar measures are an incorrect example. Still, I wonder why Jeffreys' is the only invariant prior...

Xi'an

@Xi'an: if my memory doesn't fail me, there is a theorem in Cencov's book (amazon.com/…) which, in some sense (?), proves that the Jeffreys prior is the only guy in town with the necessary invariance. His proof is inaccessible to me. It uses the language of category theory, functors, morphisms and all that. en.wikipedia.org/wiki/Category_theory

Zen