I have read that the sum of Gamma random variables with the same scale parameter is another Gamma random variable. I have also seen the paper by Moschopoulos describing a method for summing a general set of Gamma random variables. I have tried to implement Moschopoulos's method but have not yet succeeded.
What does the sum of a general set of Gamma random variables look like? To make this question concrete, what does it look like for:
$$\text{Gamma}(3,1) + \text{Gamma}(4,2) + \text{Gamma}(5,1)$$
If the parameters above are not particularly revealing, please suggest others.
Answers:
First, combine any sums having the same scale factor: a $\Gamma(n,\beta)$ variate plus a $\Gamma(m,\beta)$ variate form a $\Gamma(n+m,\beta)$ variate.
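As a small illustration of this first step, here is a minimal Python sketch that merges terms sharing a scale by adding their shapes. The function name `combine_same_scale` and the `(shape, scale)` tuple layout are my own choices for illustration, not anything from the answer.

```python
from collections import defaultdict

def combine_same_scale(terms):
    """Merge Gamma(shape, scale) terms that share a scale by adding their shapes."""
    merged = defaultdict(int)
    for shape, scale in terms:
        merged[scale] += shape
    return dict(merged)  # maps scale -> total shape

# The question's example: Gamma(3,1) + Gamma(4,2) + Gamma(5,1)
print(combine_same_scale([(3, 1), (4, 2), (5, 1)]))  # {1: 8, 2: 4}
```

The result says the sum has the same distribution as $\Gamma(8,1)+\Gamma(4,2)$.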
Next, observe that the characteristic function (cf) of $\Gamma(n,\beta)$ is $(1-i\beta t)^{-n}$, whence the cf of a sum of these distributions is the product
$$\prod_j \frac{1}{(1-i\beta_j t)^{n_j}}.$$
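When the $n_j$ are all integral, this product expands as a partial fraction into a linear combination of $(1-i\beta_j t)^{-\nu}$, where the $\nu$ are integers between $1$ and $n_j$. In the example with $\beta_1=1$, $n_1=8$ (from the sum of $\Gamma(3,1)$ and $\Gamma(5,1)$) and $\beta_2=2$, $n_2=4$ we find
$$\frac{1}{(1-it)^{8}(1-2it)^{4}} = \frac{1}{(1-it)^8}+\frac{8}{(1-it)^7}+\frac{40}{(1-it)^6}+\frac{160}{(1-it)^5}+\frac{560}{(1-it)^4}+\frac{1792}{(1-it)^3}+\frac{5376}{(1-it)^2}+\frac{15360}{1-it}+\frac{256}{(1-2it)^4}-\frac{2048}{(1-2it)^3}+\frac{9216}{(1-2it)^2}-\frac{30720}{1-2it}.$$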
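The inverse of taking the cf is the inverse Fourier transform, which is linear: that means we may apply it term by term. Each term is recognizable as a multiple of the cf of a Gamma distribution and so is readily inverted to yield the PDF. In the example we obtain
$$e^{-t}\left(\frac{t^7}{5040}+\frac{t^6}{90}+\frac{t^5}{3}+\frac{20t^4}{3}+\frac{280t^3}{3}+896t^2+5376t+15360\right)+e^{-t/2}\left(\frac{8t^3}{3}-128t^2+2304t-15360\right)$$
for the PDF of the sum.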
This is a finite mixture of Gamma distributions having scale factors equal to those within the sum and shape factors less than or equal to those within the sum. Except in special cases (where some cancellation might occur), the number of terms is given by the total shape parameter $n_1+n_2+\cdots$ (assuming all the $n_j$ are different).
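The partial-fraction step can be sketched in exact arithmetic for the case of two distinct scales with integer shapes. This is my own minimal Python sketch, not code from the answer: the names `mixture_weights` and `mixture_pdf` are hypothetical, and the weights come from expanding one factor as a binomial series around the pole of the other.

```python
from fractions import Fraction
from math import comb, exp, factorial

def mixture_weights(n1, b1, n2, b2):
    """Exact partial-fraction weights of (1 - i*b1*t)^(-n1) * (1 - i*b2*t)^(-n2).

    Returns {(scale, nu): weight}. Assumes integer shapes and b1 != b2.
    """
    weights = {}
    for n_a, b_a, n_b, b_b in ((n1, b1, n2, b2), (n2, b2, n1, b1)):
        r = Fraction(b_b) / Fraction(b_a)   # other scale relative to this one
        lead = (1 - r) ** (-n_b)
        step = -r / (1 - r)
        for k in range(n_a):
            # weight of the term (1 - i*b_a*t)^(-(n_a - k))
            weights[(b_a, n_a - k)] = lead * comb(n_b + k - 1, k) * step ** k
    return weights

def mixture_pdf(x, weights):
    """Density of the sum as a signed mixture of Gamma(nu, scale) densities."""
    total = 0.0
    for (scale, nu), w in weights.items():
        total += float(w) * x ** (nu - 1) * exp(-x / scale) / (scale ** nu * factorial(nu - 1))
    return total

w = mixture_weights(8, 1, 4, 2)
print(len(w), sum(w.values()))  # 12 terms; the weights sum to 1
print(mixture_pdf(16.0, w))     # density of the sum at its mean
```

Some weights are negative, so this is a signed mixture; evaluating it in floating point involves cancellation between large terms, which is worth keeping in mind for larger shapes.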
As a test, here is a histogram of $10^4$ results obtained by adding independent draws from the $\Gamma(8,1)$ and $\Gamma(4,2)$ distributions. On it is superimposed the graph of $10^4$ times the preceding function. The fit is very good.
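A lighter version of this check can be run without plotting. The following Python sketch (my own, with an arbitrary seed) draws $10^4$ sums and compares the sample mean and variance with the exact values $8\cdot 1+4\cdot 2=16$ and $8\cdot 1^2+4\cdot 2^2=24$. It checks moments only, so it is weaker than the histogram comparison.

```python
import random
import statistics

random.seed(0)  # arbitrary seed, only for reproducibility
n_draws = 10_000

# random.gammavariate(alpha, beta) takes the shape alpha and the scale beta
draws = [random.gammavariate(8, 1) + random.gammavariate(4, 2) for _ in range(n_draws)]

print(statistics.fmean(draws))     # compare with the exact mean, 16
print(statistics.variance(draws))  # compare with the exact variance, 24
```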
Moschopoulos carries this idea one step further by expanding the cf of the sum into an infinite series of Gamma characteristic functions whenever one or more of the $n_i$ is non-integral, and then truncates the infinite series at a point where it is reasonably well approximated.
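To make the series idea concrete, here is a Python sketch of such an expansion as I understand it: the density is written as a mixture of $\Gamma(\rho+k,\beta_1)$ densities, where $\beta_1$ is the smallest scale and $\rho$ the total shape, with mixture coefficients obtained by a recurrence. The function name `moschopoulos_pdf` and the fixed truncation `n_terms` are my assumptions; a careful implementation would choose the truncation point from an error bound.

```python
import math

def moschopoulos_pdf(y, shapes, scales, n_terms=200):
    """Truncated series density of a sum of independent Gamma(shape, scale) variables, y > 0."""
    b1 = min(scales)
    rho = sum(shapes)
    # log of the constant C = prod_i (b1 / b_i)^a_i
    log_c = sum(a * math.log(b1 / b) for a, b in zip(shapes, scales))
    # gam[k-1] holds gamma_k = sum_i a_i * (1 - b1/b_i)^k / k
    gam = [sum(a * (1 - b1 / b) ** k for a, b in zip(shapes, scales)) / k
           for k in range(1, n_terms)]
    # delta_0 = 1, delta_k = (1/k) * sum_{i=1..k} i * gamma_i * delta_{k-i}
    delta = [1.0]
    for k in range(1, n_terms):
        delta.append(sum(i * gam[i - 1] * delta[k - i] for i in range(1, k + 1)) / k)
    total = 0.0
    for k, d in enumerate(delta):
        nu = rho + k
        # log of the Gamma(nu, b1) density at y
        log_term = (nu - 1) * math.log(y) - y / b1 - math.lgamma(nu) - nu * math.log(b1)
        total += d * math.exp(log_c + log_term)
    return total

print(moschopoulos_pdf(10.0, [8, 4], [1, 2]))      # integer shapes
print(moschopoulos_pdf(10.0, [2.5, 4.2], [1, 2]))  # non-integer shapes also work
```

All terms are non-negative, so there is no cancellation, but convergence slows down when the scales are very unequal.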
I will show another possible solution that is quite widely applicable and, with today's R software, quite easy to implement. That is the saddlepoint density approximation, which ought to be more widely known!
For terminology about the gamma distribution, I will follow https://en.wikipedia.org/wiki/Gamma_distribution with the shape/scale parametrization, where $k$ is the shape parameter and $\theta$ is the scale. For the saddlepoint approximation I will follow Ronald W. Butler: "Saddlepoint approximations with applications" (Cambridge UP). The saddlepoint approximation is explained here: How does saddlepoint approximation work?
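Here I will show how it is used in this application.
Let $X$ be a random variable with cumulant generating function $K(s)=\log \mathbb{E}\,e^{sX}$, and for $x$ in the range of $X$ let $\hat{s}=\hat{s}(x)$ solve the saddlepoint equation $K'(\hat{s})=x$. Then the saddlepoint approximation to the density $f$ of $X$ is given by
$$\hat{f}(x)=\frac{1}{\sqrt{2\pi K''(\hat{s})}}\exp\bigl(K(\hat{s})-\hat{s}x\bigr).$$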
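Now let $X_1,X_2,\dots,X_n$ be independent gamma random variables, where $X_i$ has the distribution with parameters $(k_i,\theta_i)$. Then the cumulant generating function of their sum is
$$K(s)=-\sum_{i=1}^n k_i\log(1-\theta_i s),$$
defined for $s<1/\max(\theta_1,\dots,\theta_n)$. Its first two derivatives are
$$K'(s)=\sum_{i=1}^n\frac{k_i\theta_i}{1-\theta_i s},\qquad K''(s)=\sum_{i=1}^n\frac{k_i\theta_i^2}{(1-\theta_i s)^2}.$$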
I wrote some `R` code calculating this for a particular set of parameter values. Note that the code uses a new argument in the `uniroot` function introduced in R 3.1, so it will not run in older versions of R. It results in the following plot:
I will leave the normalized saddlepoint approximation as an exercise.
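The saddlepoint recipe for a sum of independent gammas can be sketched in a few lines. This is a Python sketch of my own, not the `R` code referred to above: it uses $K(s)=-\sum_i k_i\log(1-\theta_i s)$, solves $K'(\hat{s})=x$ by plain bisection instead of `uniroot`, and the function name `saddlepoint_pdf` and the bracketing constants are assumptions.

```python
import math

def saddlepoint_pdf(x, shapes, scales, n_iter=200):
    """Unnormalized saddlepoint density of a sum of independent Gamma(k_i, theta_i) at x > 0."""
    def K(s):   # cumulant generating function
        return -sum(k * math.log(1 - th * s) for k, th in zip(shapes, scales))

    def K1(s):  # first derivative
        return sum(k * th / (1 - th * s) for k, th in zip(shapes, scales))

    def K2(s):  # second derivative
        return sum(k * th ** 2 / (1 - th * s) ** 2 for k, th in zip(shapes, scales))

    # K1 increases from 0 (s -> -inf) to +inf (s -> 1/max(theta)):
    # bracket the root of K1(s) = x, then bisect.
    hi = (1.0 - 1e-12) / max(scales)
    lo = -1.0
    while K1(lo) > x:
        lo *= 2.0
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if K1(mid) < x:
            lo = mid
        else:
            hi = mid
    s_hat = 0.5 * (lo + hi)
    return math.exp(K(s_hat) - s_hat * x) / math.sqrt(2 * math.pi * K2(s_hat))

# At the mean x = 16 the saddlepoint is s = 0, so the value is close to 1/sqrt(48*pi)
print(saddlepoint_pdf(16.0, [8, 4], [1, 2]))
```

For a single gamma variable the approximation differs from the exact density only by a constant factor (the Stirling approximation to $\Gamma(k)$), which gives a convenient sanity check. Extremely large $x$ can fall outside the bracket used here.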
I cannot get the `R` code to work to compare the approximation with the exact answer. Any attempt to invoke `fhat` generates errors, apparently in the use of `uniroot`.
The Welch–Satterthwaite equation could be used to give an approximate answer in the form of a gamma distribution. This has the nice property of letting us treat gamma distributions as being (approximately) closed under addition. This is the approximation used in the common Welch's t-test.
(The gamma distribution can be viewed as a scaled chi-square distribution that allows a non-integer shape parameter.)
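I've adapted the approximation to the $k,\theta$ parametrization of the gamma distribution:
$$k_{\text{sum}}=\frac{\left(\sum_i\theta_i k_i\right)^2}{\sum_i\theta_i^2 k_i},\qquad \theta_{\text{sum}}=\frac{\sum_i\theta_i k_i}{k_{\text{sum}}}$$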
Let $k=(3,4,5)$, $\theta=(1,2,1)$.
So we get approximately Gamma(10.666..., 1.5).
We see that the shape parameter $k$ has been more or less totalled, but comes out slightly smaller because the input scale parameters $\theta_i$ differ. $\theta$ is such that the sum has the correct mean value.
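The arithmetic behind this example is short enough to check directly. The sketch below assumes the approximation amounts to matching the mean $\sum_i\theta_i k_i$ and the variance $\sum_i\theta_i^2 k_i$ of the sum, which reproduces the numbers quoted above; the function name is hypothetical.

```python
def welch_satterthwaite_gamma(shapes, scales):
    """Approximate a sum of independent Gamma(k_i, theta_i) by one Gamma(k_sum, theta_sum)."""
    mean = sum(k * th for k, th in zip(shapes, scales))           # sum of theta_i * k_i
    variance = sum(k * th ** 2 for k, th in zip(shapes, scales))  # sum of theta_i^2 * k_i
    k_sum = mean ** 2 / variance
    theta_sum = mean / k_sum
    return k_sum, theta_sum

k_sum, theta_sum = welch_satterthwaite_gamma([3, 4, 5], [1, 2, 1])
print(k_sum, theta_sum)  # about 10.667 and 1.5
```

Here the mean is 16 and the variance 24, so $k_{\text{sum}}=256/24$ and $\theta_{\text{sum}}=24/16$.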
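An exact solution to the convolution (i.e., sum) of $n$ gamma distributions is given as Eq. (1) in the linked pdf by DiSalvo. As this is a bit long, it will take some time to copy it over here. For only two gamma distributions, their exact sum in closed form is specified by Eq. (2) of DiSalvo and without weights by Eq. (5) of Wesolowski et al., which also appears on the CV site as an answer to that question. That is, with shape parameters $a,\alpha$ and rate parameters $b,\beta$, the density of the sum at $\tau>0$ is
$$\frac{b^{a}\beta^{\alpha}}{\Gamma(a+\alpha)}\,e^{-b\tau}\,\tau^{a+\alpha-1}\,{}_1F_1\bigl[\alpha,\,a+\alpha,\,(b-\beta)\tau\bigr],$$
and zero for $\tau\le 0$.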
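For two gamma variables, the closed form can be evaluated with nothing more than a power series for Kummer's function ${}_1F_1$. The Python sketch below is my own and assumes the shape/rate parametrization (shapes $a,\alpha$, rates $b,\beta$); `hyp1f1` and `two_gamma_sum_pdf` are hypothetical helper names, and the naive series is only meant for moderate arguments.

```python
import math

def hyp1f1(a, b, z, tol=1e-16, max_terms=10_000):
    """Kummer's confluent hypergeometric function by its power series (intended for z >= 0)."""
    term, total = 1.0, 1.0
    for n in range(max_terms):
        term *= (a + n) * z / ((b + n) * (n + 1))
        total += term
        if abs(term) < tol * abs(total):
            break
    return total

def two_gamma_sum_pdf(t, a, b, alpha, beta):
    """Density at t > 0 of Gamma(shape a, rate b) + Gamma(shape alpha, rate beta)."""
    if b < beta:
        # The convolution is symmetric; swap so that (b - beta) * t >= 0
        # and every term of the series is positive.
        a, b, alpha, beta = alpha, beta, a, b
    log_front = (a * math.log(b) + alpha * math.log(beta) - math.lgamma(a + alpha)
                 - b * t + (a + alpha - 1) * math.log(t))
    return math.exp(log_front) * hyp1f1(alpha, a + alpha, (b - beta) * t)

# Gamma(8, scale 1) + Gamma(4, scale 2), i.e. rates 1 and 0.5
print(two_gamma_sum_pdf(10.0, 8, 1.0, 4, 0.5))
```

Unlike the partial-fraction mixture, this form does not require integer shapes.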