Poisson is to exponential as Gamma-Poisson is to what?

The Poisson distribution can measure events per unit time, with parameter $\lambda$. The exponential distribution measures the time until the next event, with mean $\frac{1}{\lambda}$. You can convert one distribution into the other, depending on whether it is easier to model events or times.
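A minimal sketch of that conversion, using only the standard library. The values of `lam` and `t` are made up for illustration. It relies on the fact that "no event in a window of length `t`" and "the first event arrives after `t`" are the same event.

```python
import math

def poisson_pmf(k: int, mean: float) -> float:
    """P(N = k) for a Poisson count with the given mean."""
    return math.exp(-mean) * mean**k / math.factorial(k)

def exponential_cdf(t: float, rate: float) -> float:
    """P(T <= t) for an exponential waiting time with the given rate."""
    return 1.0 - math.exp(-rate * t)

lam, t = 2.5, 0.4  # hypothetical rate (events per unit time) and window length

# Probability of zero events in [0, t] under the Poisson count model.
p_no_event = poisson_pmf(0, lam * t)
# Probability that the exponential waiting time exceeds t.
p_wait_longer = 1.0 - exponential_cdf(t, lam)

print(p_no_event, p_wait_longer)
```

The two printed numbers should agree up to floating-point rounding, which is the sense in which one distribution can be turned into the other.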

Now, a gamma-Poisson is a "stretched" Poisson with larger variance. A Weibull distribution is a "stretched" exponential with larger variance. But can these two be easily converted into each other, in the same way the Poisson can be converted into the exponential?

Or is there some other distribution that is more appropriate to use in combination with the gamma-Poisson distribution?

The gamma-Poisson is also known as the negative binomial distribution, or NBD.

poisson-distribution negative-binomial gamma-distribution exponential-family zbicyclist

Answers:

This is a fairly straightforward problem. Although there is a connection between the Poisson and negative binomial distributions, I actually think this is unhelpful for your specific question, as it encourages people to think about negative binomial processes. Basically, you have a series of Poisson processes:

$Y_i(t_i)|\lambda_i\sim Poisson(\lambda_i t_i)$

Where $Y_i$ is the process, $t_i$ is the time you observe it for, and $i$ denotes the individuals. You are saying that these processes are "similar" by tying the rates together through a distribution:

$\lambda_i\sim Gamma(\alpha,\beta)$

Doing the integration/mixing over $\lambda_i$, you have:

$Y_i(t_i)|\alpha\beta\sim NegBin(\alpha,p_i)\;\;\; \text{where} \;\;p_i=\frac{t_i}{t_i+\beta}$

This has a pmf of:

$Pr(Y_i(t_i)=y_i|\alpha\beta) = \frac{\Gamma(\alpha+y_i)}{\Gamma(\alpha)y_i!}p_i^{y_i}(1-p_i)^\alpha$
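As a sanity check on this pmf, here is a small sketch that evaluates it on the log scale and confirms numerically that it sums to one, that its mean is $\frac{\alpha}{\beta}t_i$, and that its variance is larger than its mean. The function name `negbin_pmf` and the parameter values are illustrative assumptions, and $\beta$ is treated as a rate so that $E(\lambda_i)=\frac{\alpha}{\beta}$.

```python
import math

def negbin_pmf(y: int, t: float, alpha: float, beta: float) -> float:
    """Pr(Y(t) = y | alpha, beta) with p = t / (t + beta)."""
    p = t / (t + beta)
    log_pmf = (math.lgamma(alpha + y) - math.lgamma(alpha) - math.lgamma(y + 1)
               + y * math.log(p) + alpha * math.log1p(-p))
    return math.exp(log_pmf)

alpha, beta, t = 2.0, 1.5, 3.0  # hypothetical shape, rate, observation time

# Truncate the infinite support far into the tail; the remainder is negligible here.
probs = [negbin_pmf(y, t, alpha, beta) for y in range(2000)]
total = sum(probs)
mean = sum(y * q for y, q in enumerate(probs))
var = sum((y - mean) ** 2 * q for y, q in enumerate(probs))

print(total)                        # close to 1
print(mean, alpha * t / beta)       # mean equals E[lambda] * t
print(var, mean * (1 + t / beta))   # variance exceeds the mean
```

The last line reflects the usual variance formula for this parameterisation, mean times $(1+\frac{t_i}{\beta})$, which is where the extra spread relative to a Poisson comes from.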

To get the waiting time distribution, we note that:

$Pr(T_i\leq t_i|\alpha\beta)=1-Pr(T_i> t_i|\alpha\beta)=1-Pr(Y_i(t_i)=0|\alpha\beta)$

$=1-(1-p_i)^\alpha=1-\left(1+\frac{t_i}{\beta}\right)^{-\alpha}$

Differentiate this and you have the PDF:

$p_{T_i}(t_i|\alpha\beta)=\frac{\alpha}{\beta}\left(1+\frac{t_i}{\beta}\right)^{-(\alpha+1)}$

This is a member of the generalised Pareto distributions, type II. I would use this as your waiting time distribution.
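A rough sketch of how this waiting time distribution could be used in practice. The function names, the parameter values, and the choice of inverse-transform sampling are all illustrative assumptions rather than part of the answer. The quantile function comes from solving $u = 1-\left(1+\frac{t}{\beta}\right)^{-\alpha}$ for $t$.

```python
import random

def wait_cdf(t: float, alpha: float, beta: float) -> float:
    """Pr(T <= t): one minus the probability of no event by time t."""
    return 1.0 - (1.0 + t / beta) ** (-alpha)

def wait_pdf(t: float, alpha: float, beta: float) -> float:
    """Derivative of wait_cdf with respect to t."""
    return (alpha / beta) * (1.0 + t / beta) ** (-(alpha + 1.0))

def wait_quantile(u: float, alpha: float, beta: float) -> float:
    """Inverse of wait_cdf for u in [0, 1)."""
    return beta * ((1.0 - u) ** (-1.0 / alpha) - 1.0)

alpha, beta = 3.0, 2.0      # hypothetical shape and rate of the Gamma mixing law
rng = random.Random(0)      # fixed seed so the sketch is repeatable

# Inverse-transform sampling: push uniform draws through the quantile function.
draws = [wait_quantile(rng.random(), alpha, beta) for _ in range(50_000)]
empirical = sum(d <= 1.0 for d in draws) / len(draws)

print(empirical, wait_cdf(1.0, alpha, beta))
```

The empirical fraction of draws below 1 should land close to the analytical CDF at 1, up to sampling noise.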

To see the connection with the Poisson distribution, note that $\frac{\alpha}{\beta}=E(\lambda_i|\alpha\beta)$, so that if we set $\beta=\frac{\alpha}{\lambda}$ and then take the limit $\alpha\to\infty$, we get:

$\lim_{\alpha\to\infty}\frac{\alpha}{\beta}\left(1+\frac{t_i}{\beta}\right)^{-(\alpha+1)}=\lim_{\alpha\to\infty}\lambda\left(1+\frac{\lambda t_i}{\alpha}\right)^{-(\alpha+1)}=\lambda\exp(-\lambda t_i)$

This means that you can interpret $\frac{1}{\alpha}$ as an over-dispersion parameter.
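The limit can be watched numerically. The sketch below (illustrative names and values, not from the answer) fixes the mean rate $\lambda$, sets $\beta=\frac{\alpha}{\lambda}$, and measures how far the Pareto-type density is from the exponential density as $\alpha$ grows, i.e. as the over-dispersion $\frac{1}{\alpha}$ shrinks.

```python
import math

def wait_pdf(t: float, alpha: float, beta: float) -> float:
    """Waiting time density under Gamma(alpha, beta) mixing of the rate."""
    return (alpha / beta) * (1.0 + t / beta) ** (-(alpha + 1.0))

def exponential_pdf(t: float, lam: float) -> float:
    """Waiting time density when every process shares the same rate lam."""
    return lam * math.exp(-lam * t)

lam, t = 1.2, 0.8  # hypothetical mean rate and evaluation point

alphas = (1, 10, 100, 10_000)
gaps = []
for alpha in alphas:
    beta = alpha / lam  # keeps E[lambda_i] = alpha / beta fixed at lam
    gap = abs(wait_pdf(t, alpha, beta) - exponential_pdf(t, lam))
    gaps.append(gap)
    print(alpha, gap)
```

The printed gap should shrink roughly in proportion to $\frac{1}{\alpha}$, which is consistent with reading $\frac{1}{\alpha}$ as the amount of departure from the plain Poisson/exponential pair.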

probabilityislogic

You can also note that the waiting time distribution is, roughly speaking, an exponential distribution with a Gamma random rate parameter. Strictly speaking, this is a Beta distribution of the second kind, as is any Gamma distribution with a Gamma random rate parameter.

Stéphane Laurent

Using @probabilityislogic as a basis, I found the following article providing more detail on the relationship between NBD and Pareto: Gupta, Sunil and Donald G. Morrison. Estimating Heterogeneity in Consumers' Purchase Rates. Marketing Science, 1991, 10(3), 264-269. Thanks to all who helped me answer this question.

zbicyclist

+1, I guess this nice analytical form may no longer exist for $Poisson(\lambda_i t_i + c)$, where $c$ is a constant.

Randel

@randel - you could get a "nice-ish" form by noting this rv is the sum of two independent rvs: $Z_i=Y_i+X_i$, where $Y_i$ is the same as above and $X_i\sim Poisson(c)$. As $X_i$ doesn't depend on $\lambda_i$ or $Y_i$, the pdf of $Z_i$ is the convolution of the above negative binomial pdf and a Poisson pdf. To get the waiting time distribution just multiply $Pr(Y_i=0)$ in the above answer by $Pr(X_i=0)=e^{-c}$. You then get waiting time cdf of $1-e^{-c}\left(1+\frac{t_i}{\beta}\right)^{-\alpha}$ and pdf of $e^{-c}\frac{\alpha}{\beta}\left(1+\frac{t_i}{\beta}\right)^{-(\alpha+1)}$.

probabilityislogic

This won't work in terms of the mixing distribution, because you need $\lambda_i <ct_i^{-1}$ (else the Poisson mean is negative). The gamma mixing distribution would need to be truncated (I also assumed that $c>0$ in my previous answer). This would mean no nb distribution.

probabilityislogic

One possibility: Poisson is to Exponential as Negative-Binomial is to ... Exponential!

There is a pure-jump increasing Lévy process called the Negative Binomial Process such that at time $t$ the value has a negative binomial distribution. Unlike the Poisson process, the jumps are not almost surely $1$. Instead, they follow a logarithmic distribution. By the law of total variance, some of the variance comes from the number of jumps (scaled by the average size of the jumps), and some of the variance comes from the sizes of the jumps, and you can use this to check that it is overdispersed.

There may be other useful descriptions. See "Framing the negative binomial distribution for DNA sequencing."

Let me be more explicit about how the Negative Binomial Process described above can be constructed.

Choose $0 \lt p \lt 1$.
Let $X_1, X_2, X_3, ...$ be IID with logarithmic distributions, so $P(X_i = k) = \frac{-1}{\log(1-p)} \frac{p^k}{k}.$
Let $N$ be a Poisson process with constant rate $-\log(1-p)$, so $N(t) \sim \text{Pois}(-t \log(1-p)).$
Let $NBP$ be the process so that

$NBP(t) = \sum_{i=1}^{N(t)} X_i.$

$NBP$ is a pure jump process with logarithmically distributed jumps. The gaps between jumps follow an exponential distribution with rate $-\log(1-p).$
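The overdispersion can be checked with a short worked calculation. This is a sketch in which $\mu = -t\log(1-p)$ is shorthand introduced here for the expected number of jumps by time $t$, and $X$ stands for a single logarithmic jump.

$E[X] = \frac{-1}{\log(1-p)}\frac{p}{1-p}, \qquad E[X^2] = \frac{-1}{\log(1-p)}\frac{p}{(1-p)^2}$

$\text{Var}(NBP(t)) = E[N(t)]\,\text{Var}(X) + \text{Var}(N(t))\,E[X]^2 = \mu E[X^2] = \frac{tp}{(1-p)^2}$

Since the mean is $\mu E[X] = \frac{tp}{1-p}$, the variance-to-mean ratio is $\frac{1}{1-p} \gt 1$, so the process is overdispersed relative to a Poisson process with the same mean.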

I don't think it is obvious from this description that $NBP(t)$ has a negative binomial $NB(t,p)$ distribution, but there is a short proof using probability generating functions on Wikipedia, and Fisher also proved this when he introduced the logarithmic distribution to analyze the relative frequencies of species.
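Short of a proof, the claim can be checked numerically. The sketch below is an illustration of mine, not the argument from Wikipedia or Fisher: it builds the pmf of the compound Poisson sum with Panjer's recursion and compares it against the negative binomial pmf directly. The function names and the values of `t` and `p` are assumptions chosen for the example.

```python
import math

def negbin_pmf(n: int, r: float, p: float) -> float:
    """NB(r, p) pmf: Gamma(r + n) / (Gamma(r) n!) * p^n * (1 - p)^r."""
    return math.exp(math.lgamma(r + n) - math.lgamma(r) - math.lgamma(n + 1)
                    + n * math.log(p) + r * math.log1p(-p))

def log_series_pmf(k: int, p: float) -> float:
    """Logarithmic distribution on k = 1, 2, 3, ..."""
    return -(p ** k) / (k * math.log1p(-p))

def compound_poisson_pmf(n_max: int, mu: float, jump_pmf) -> list:
    """Panjer recursion for a Poisson(mu) number of IID jumps on 1, 2, 3, ..."""
    g = [math.exp(-mu)]  # the sum is 0 only if there are no jumps
    for n in range(1, n_max + 1):
        g.append(mu / n * sum(k * jump_pmf(k) * g[n - k] for k in range(1, n + 1)))
    return g

t, p = 2.5, 0.3                 # hypothetical time and logarithmic parameter
mu = -t * math.log1p(-p)        # expected number of jumps by time t

nbp = compound_poisson_pmf(30, mu, lambda k: log_series_pmf(k, p))
nb = [negbin_pmf(n, t, p) for n in range(31)]
max_diff = max(abs(a - b) for a, b in zip(nbp, nb))
print(max_diff)
```

The largest difference should be at the level of floating-point rounding, which agrees with $NBP(t)$ having an $NB(t,p)$ distribution for these values.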

Douglas Zare

No, any compound Poisson process has an exponential waiting time. This means you add $\text{Pois}(\lambda t)$ IID random variables with some distribution.

Douglas Zare

No, that is not what is meant by a compound Poisson process. en.wikipedia.org/wiki/Compound_Poisson_process "The jumps arrive randomly according to a Poisson process and the size of the jumps is also random, with a specified probability distribution." I did not say IID Poisson variables. You take the $N$th partial sum of IID logarithmic random variables where $N$ is the value of a Poisson process.

Douglas Zare

If you multiply a Poisson process by $2$, this is not a Poisson process and the waiting times remain exponential.

Douglas Zare

I am not able to comment yet, so I apologize if this isn't a definitive solution.

You are asking for the appropriate distribution to use with an NB, but "appropriate" isn't entirely defined. If an appropriate distribution means appropriate for explaining data, and you are starting with an overdispersed Poisson, then you may have to look further into the cause of the overdispersion. The NB doesn't distinguish between a Poisson with heterogeneous means and positive occurrence dependence (where one event occurring increases the probability of another occurring). In continuous time there is also duration dependence, e.g. positive duration dependence means the passage of time increases the probability of an occurrence. It was also shown that negative duration dependence asymptotically causes an overdispersed Poisson[1]. This adds to the list of what might be the appropriate waiting time model.

Meadowlark Bradsher

On the cause of the overdispersion: this is consumer purchase data. Individual consumers are Poisson, each with a rate of purchase lambda. But not every consumer has the same lambda, and that is the cause of the overdispersion. The lambda purchasing rates are considered to be distributed as gamma. This is a common model (traces back to A.S.C. Ehrenberg), but I haven't found anything in his writing that answers this question.
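A small simulation sketch of that purchase model, with made-up values for the gamma shape `alpha`, rate `beta`, and observation window `t`. Each simulated consumer gets their own lambda from the gamma distribution and then a Poisson purchase count. The Poisson sampler is a simple stand-in written for this example.

```python
import random
import statistics

def sample_poisson(mean: float, rng: random.Random) -> int:
    """Count how many unit-rate exponential gaps fit inside [0, mean]."""
    count, elapsed = 0, rng.expovariate(1.0)
    while elapsed <= mean:
        count += 1
        elapsed += rng.expovariate(1.0)
    return count

alpha, beta, t = 2.0, 1.0, 1.0   # hypothetical gamma shape, gamma rate, window
rng = random.Random(42)          # fixed seed so the sketch is repeatable

# random.gammavariate takes (shape, scale), so a rate of beta becomes scale 1/beta.
rates = [rng.gammavariate(alpha, 1.0 / beta) for _ in range(20_000)]
purchases = [sample_poisson(lam * t, rng) for lam in rates]

mean = statistics.fmean(purchases)
var = statistics.pvariance(purchases)
print(mean, var)
```

With identical lambdas the variance would sit near the mean. Here it should come out clearly larger, which is the overdispersion produced by heterogeneity across consumers.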

zbicyclist