Does zero covariance imply independence for binary random variables?

If $X$ and $Y$ are two random variables that can each take only two possible values, how can I show that $Cov(X,Y) = 0$ implies independence? This kind of goes against what I learned back in the day, that $Cov(X,Y) = 0$ does not imply independence...

The hint says to start with $1$ and $0$ as the possible values and generalize from there. I can do that and show $E(XY) = E(X)E(Y)$, but that does not imply independence???

I am a bit confused about how to do this mathematically.

covariance independence user3604869

In general it is not true, as the title of the question suggests…

Michael R. Chernick

The statement you are trying to prove is indeed true. If $X$ and $Y$ are Bernoulli random variables with parameters $p_1$ and $p_2$, then $E[X]=p_1$ and $E[Y]=p_2$. Thus $\operatorname{cov}(X,Y)=E[XY]-E[X]E[Y]$ equals $0$ only if $E[XY]=P\{X=1,Y=1\}$ equals $p_1p_2=P\{X=1\}P\{Y=1\}$, which means that $\{X=1\}$ and $\{Y=1\}$ are independent events. It is a standard result that if $A$ and $B$ are a pair of independent events, then so are $A,B^c$, and $A^c,B$, and $A^c,B^c$ independent events, i.e. $X$ and $Y$ are independent random variables. Now generalize.

Dilip Sarwate

Answers:

For binary variables, the expected value equals the probability that the variable equals one. Therefore,

$\begin{align*} E(XY) &= P(XY = 1) = P(X=1 \cap Y=1) \\ E(X) &= P(X=1) \\ E(Y) &= P(Y=1) \end{align*}$

If the two have zero covariance, this means $E(XY) = E(X)E(Y)$, which means

$P(X=1 \cap Y=1) = P(X=1) \cdot P(Y=1)$

It is trivial to see that all the other joint probabilities multiply as well, using the basic rules about independent events (i.e. if $A$ and $B$ are independent, then their complements are independent, etc.), which means that the joint mass function factorizes, which is the definition of two random variables being independent.
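The argument can be checked numerically. The following is only a minimal sketch: the helper names `joint_pmf` and `factorizes` are hypothetical (they are not part of the answer), and the values $p = 3/10$, $q = 7/10$ are arbitrary. It builds the $2\times 2$ joint table from $P(X=1)$, $P(Y=1)$ and $P(X=1 \cap Y=1)$ and confirms that once the $(1,1)$ cell factors, the other three cells factor too.

```python
from fractions import Fraction

def joint_pmf(p, q, r):
    """Joint pmf of binary (X, Y) from p = P(X=1), q = P(Y=1), r = P(X=1, Y=1)."""
    return {
        (1, 1): r,
        (1, 0): p - r,
        (0, 1): q - r,
        (0, 0): 1 - p - q + r,
    }

def factorizes(pmf, p, q):
    """True if every cell equals the product of its marginal probabilities."""
    px = {1: p, 0: 1 - p}
    py = {1: q, 0: 1 - q}
    return all(pmf[(x, y)] == px[x] * py[y] for x in (0, 1) for y in (0, 1))

p, q = Fraction(3, 10), Fraction(7, 10)        # arbitrary illustrative values
zero_cov = joint_pmf(p, q, p * q)              # r = pq, i.e. zero covariance
nonzero_cov = joint_pmf(p, q, Fraction(1, 4))  # r != pq, nonzero covariance

print(factorizes(zero_cov, p, q))     # True
print(factorizes(nonzero_cov, p, q))  # False
```

Exact rational arithmetic (`fractions.Fraction`) is used so that the equality tests are not affected by floating-point rounding.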

gammer

Concise and elegant. Classy! +1 =D

Marcelo Ventura

Both correlation and covariance measure linear association between two given variables, and neither has any obligation to detect any other form of association.

So those two variables might be associated in several other non-linear ways, and covariance (and therefore correlation) could not distinguish them from the independent case.

As a very didactic, artificial and non-realistic example, one can consider $X$ such that $P(X=x)=1/3$ for $x=-1,0,1$, and also consider $Y=X^2$. Notice that they are not only associated, but one is a function of the other. Nonetheless, their covariance is 0, because their association is orthogonal to the association that covariance can detect.
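This counterexample is small enough to verify exactly. The snippet below is only an illustrative sketch (the variable names are mine): it computes the covariance of $X$ and $Y=X^2$ for $X$ uniform on $\{-1,0,1\}$ and compares one joint probability with the product of its marginals.

```python
from fractions import Fraction

# X is uniform on {-1, 0, 1} and Y = X**2, as in the example above.
support = [-1, 0, 1]
prob = Fraction(1, 3)

e_x = sum(prob * x for x in support)           # E(X)
e_y = sum(prob * x**2 for x in support)        # E(Y) = E(X^2)
e_xy = sum(prob * x * x**2 for x in support)   # E(XY) = E(X^3)
cov = e_xy - e_x * e_y

# Dependence: Y = 0 exactly when X = 0, so P(X=0, Y=0) = P(X=0) = 1/3,
# whereas P(X=0) * P(Y=0) = 1/3 * 1/3 = 1/9.
p_joint = prob
p_product = prob * prob

print(cov)                   # 0
print(p_joint == p_product)  # False
```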

EDIT

Indeed, as pointed out by @whuber, the original answer above was actually a comment on how the assertion is not universally true if the two variables are not necessarily dichotomous. My bad!

So let's math up. (The local equivalent of Barney Stinson's "Suit up!")

Particular Case

If both $X$ and $Y$ were dichotomous, then you can assume, without loss of generality, that both assume only the values $0$ and $1$ with arbitrary probabilities $p$, $q$ and $r$ given by

$\begin{align*} P(X=1) &= p \in [0,1] \\ P(Y=1) &= q \in [0,1] \\ P(X=1,Y=1) &= r \in [0,1], \end{align*}$

which characterize completely the joint distribution of $X$ and $Y$. Taking on @DilipSarwate's hint, notice that those three values are enough to determine the joint distribution of $(X,Y)$, since

$\begin{align*} P(X=0,Y=1) &= P(Y=1) - P(X=1,Y=1) = q - r\\ P(X=1,Y=0) &= P(X=1) - P(X=1,Y=1) = p - r\\ P(X=0,Y=0) &= 1 - P(X=0,Y=1) - P(X=1,Y=0) - P(X=1,Y=1) \\ &= 1 - (q - r) - (p - r) - r = 1 - p - q + r. \end{align*}$

(On a side note, of course $r$ is bound to respect $p-r\in[0,1]$, $q-r\in[0,1]$ and $1-p-q+r\in[0,1]$ beyond $r\in[0,1]$, which is to say $r\in[\max(0,p+q-1),\min(p,q)]$.)
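The admissible range of $r$ can be checked by brute force. Requiring all four cell probabilities $r$, $p-r$, $q-r$ and $1-p-q+r$ to be non-negative gives $\max(0,\,p+q-1) \le r \le \min(p,q)$. The sketch below uses hypothetical helper names and arbitrary values $p=6/10$, $q=7/10$, scans a grid of candidate values of $r$, and compares the feasible ones with that interval.

```python
from fractions import Fraction

def cells(p, q, r):
    """The four joint probabilities of binary (X, Y): (1,1), (1,0), (0,1), (0,0)."""
    return [r, p - r, q - r, 1 - p - q + r]

def feasible(p, q, r):
    """A valid pmf needs every cell in [0, 1]; the cells sum to 1 by construction."""
    return all(0 <= c <= 1 for c in cells(p, q, r))

p, q = Fraction(6, 10), Fraction(7, 10)          # arbitrary illustrative values
grid = [Fraction(k, 100) for k in range(101)]    # candidate values of r
ok = [r for r in grid if feasible(p, q, r)]

print(min(ok), max(ok))              # 3/10 3/5
print(max(0, p + q - 1), min(p, q))  # 3/10 3/5
```

Note that when $p+q>1$ the value $r=0$ is not admissible, which is why the lower bound is $\max(0,\,p+q-1)$ rather than $0$.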

Notice that $r = P(X=1,Y=1)$ might be equal to the product $p\cdot q = P(X=1) P(Y=1)$, which would render $X$ and $Y$ independent, since

$\begin{align*} P(X=0,Y=0) &= 1 - p - q + pq = (1-p)(1-q) = P(X=0)P(Y=0)\\ P(X=1,Y=0) &= p - pq = p(1-q) = P(X=1)P(Y=0)\\ P(X=0,Y=1) &= q - pq = (1-p)q = P(X=0)P(Y=1). \end{align*}$

Yes, $r$ might be equal to $pq$, BUT it can be different, as long as it respects the boundaries above.

Well, from the above joint distribution, we would have

$\begin{align*} E(X) &= 0\cdot P(X=0) + 1\cdot P(X=1) = P(X=1) = p \\ E(Y) &= 0\cdot P(Y=0) + 1\cdot P(Y=1) = P(Y=1) = q \\ E(XY) &= 0\cdot P(XY=0) + 1\cdot P(XY=1) \\ &= P(XY=1) = P(X=1,Y=1) = r\\ Cov(X,Y) &= E(XY) - E(X)E(Y) = r - pq \end{align*}$

Now, notice then that $X$ and $Y$ are independent if and only if $Cov(X,Y)=0$ . Indeed, if $X$ and $Y$ are independent, then $P(X=1,Y=1)=P(X=1)P(Y=1)$ , which is to say $r=pq$ . Therefore, $Cov(X,Y)=r-pq=0$ ; and, on the other hand, if $Cov(X,Y)=0$ , then $r-pq=0$ , which is to say $r=pq$ . Therefore, $X$ and $Y$ are independent.
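The equivalence can also be spot-checked exhaustively on a grid. This is only a sketch, not a proof: the function name `independent` and the grid of tenths are my own choices. For every admissible triple $(p,q,r)$ on the grid it compares "covariance is zero" with "all four cells factor".

```python
from fractions import Fraction
from itertools import product

def independent(p, q, r):
    """True if the joint pmf factors at all four points of {0,1} x {0,1}."""
    joint = {(1, 1): r, (1, 0): p - r, (0, 1): q - r, (0, 0): 1 - p - q + r}
    px = {1: p, 0: 1 - p}
    py = {1: q, 0: 1 - q}
    return all(joint[(x, y)] == px[x] * py[y] for x in (0, 1) for y in (0, 1))

grid = [Fraction(k, 10) for k in range(11)]
mismatches = 0
for p, q, r in product(grid, repeat=3):
    # Skip triples that do not define a valid joint distribution.
    if not (max(0, p + q - 1) <= r <= min(p, q)):
        continue
    cov = r - p * q
    if (cov == 0) != independent(p, q, r):
        mismatches += 1

print(mismatches)  # 0
```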

General Case

About the without loss of generality clause above, if $X$ and $Y$ were distributed otherwise, let's say, for $a<b$ and $c<d$,

$\begin{align*} P(X=b)&=p \\ P(Y=d)&=q \\ P(X=b, Y=d)&=r \end{align*}$

then $X'$ and $Y'$ given by

$X'=\frac{X-a}{b-a} \qquad \text{and} \qquad Y'=\frac{Y-c}{d-c}$

would be distributed just as characterized above, since

$X=a \Leftrightarrow X'=0, \quad X=b \Leftrightarrow X'=1, \quad Y=c \Leftrightarrow Y'=0 \quad \text{and} \quad Y=d \Leftrightarrow Y'=1.$

So $X$ and $Y$ are independent if and only if $X'$ and $Y'$ are independent.

Also, we would have

$\begin{align*} E(X') &= E\left(\frac{X-a}{b-a}\right) = \frac{E(X)-a}{b-a} \\ E(Y') &= E\left(\frac{Y-c}{d-c}\right) = \frac{E(Y)-c}{d-c} \\ E(X'Y') &= E\left(\frac{X-a}{b-a} \frac{Y-c}{d-c}\right) = \frac{E[(X-a)(Y-c)]}{(b-a)(d-c)} \\ &= \frac{E(XY-Xc-aY+ac)}{(b-a)(d-c)} = \frac{E(XY)-cE(X)-aE(Y)+ac}{(b-a)(d-c)} \\ Cov(X',Y') &= E(X'Y')-E(X')E(Y') \\ &= \frac{E(XY)-cE(X)-aE(Y)+ac}{(b-a)(d-c)} - \frac{E(X)-a}{b-a} \frac{E(Y)-c}{d-c} \\ &= \frac{[E(XY)-cE(X)-aE(Y)+ac] - [E(X)-a] [E(Y)-c]}{(b-a)(d-c)}\\ &= \frac{[E(XY)-cE(X)-aE(Y)+ac] - [E(X)E(Y)-cE(X)-aE(Y)+ac]}{(b-a)(d-c)}\\ &= \frac{E(XY)-E(X)E(Y)}{(b-a)(d-c)} = \frac{1}{(b-a)(d-c)} Cov(X,Y). \end{align*}$

So $Cov(X,Y)=0$ if and only if $Cov(X',Y')=0$.
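The scaling identity $Cov(X',Y') = Cov(X,Y)/[(b-a)(d-c)]$ can be confirmed on a concrete case. The numbers below ($a=2$, $b=5$, $c=-1$, $d=3$, $p=2/5$, $q=1/2$, $r=3/10$) are arbitrary choices for illustration, and the helper `cov` is a hypothetical name.

```python
from fractions import Fraction

def cov(pmf):
    """Covariance of (X, Y) from a dict {(x, y): probability}."""
    ex = sum(pr * x for (x, y), pr in pmf.items())
    ey = sum(pr * y for (x, y), pr in pmf.items())
    exy = sum(pr * x * y for (x, y), pr in pmf.items())
    return exy - ex * ey

a, b, c, d = 2, 5, -1, 3   # arbitrary values with a < b and c < d
p, q, r = Fraction(2, 5), Fraction(1, 2), Fraction(3, 10)

# Joint pmf of (X, Y) on {a, b} x {c, d}.
pmf_xy = {(b, d): r, (b, c): p - r, (a, d): q - r, (a, c): 1 - p - q + r}

# Joint pmf of the rescaled pair X' = (X-a)/(b-a), Y' = (Y-c)/(d-c).
pmf_std = {
    (Fraction(x - a, b - a), Fraction(y - c, d - c)): pr
    for (x, y), pr in pmf_xy.items()
}

print(cov(pmf_std))                       # 1/10
print(cov(pmf_xy) / ((b - a) * (d - c)))  # 1/10
```

Here $Cov(X',Y') = r - pq = 3/10 - 1/5 = 1/10$, matching the rescaled covariance of the original pair.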

Marcelo Ventura

I recycled that answer from this post.

Marcelo Ventura

Verbatim cut and paste from your other post. Love it. +1

gammer

The problem with copy-and-paste is that your answer no longer seems to address the question: it is merely a comment on the question. It would be better, then, to post a comment with a link to your other answer.

whuber

How is this an answer to the question asked?

Dilip Sarwate

Your edits still don't answer the question, at least not at the level the question is asked. You write "Notice that $r~\ldots$ not necessarily equal to the product $pq$. That exceptional situation corresponds to the case of independence between $X$ and $Y$." which is a perfectly true statement but only for the cognoscenti because for the hoi polloi, independence requires not just that

$P(X=1,Y=1)=P(X=1)P(Y=1)\tag 1$

but also

$P(X=u,Y=v)=P(X=u)P(Y=v),~u,v\in\{0,1\}.\tag 2$

Yes, $(1) \implies(2)$ as the cognoscenti know; for lesser mortals, a proof that $(1) \implies (2)$ is helpful.

Dilip Sarwate

IN GENERAL:

The criterion for independence is $F(x,y) = F_X(x)F_Y(y)$ . Or

$f_{X,Y}(x,y)=f_X(x)\,f_Y(y)\tag 1$

"If two variables are independent, their covariance is $0.$ But, having a covariance of $0$ does not imply the variables are independent."

This is nicely explained by Macro here, and in the Wikipedia entry for independence.

$\text {independence} \Rightarrow \text{zero cov}$ , yet

$\text{zero cov}\nRightarrow \text{independence}.$

Great example: $X \sim N(0,1)$ , and $Y= X^2.$ Covariance is zero (and $\mathbb E(XY)=0$ , which is the criterion for orthogonality), yet they are dependent. Credit goes to this post.
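For readers who want the computation spelled out, here is a short worked derivation. It is a sketch that uses only the standard normal moments $\mathbb E(X)=0$, $\mathbb E(X^2)=1$ and $\mathbb E(X^3)=0$; the event $\{|X|\le 1\}$ is just one convenient choice for exhibiting the dependence.

$\begin{align*} \mathrm{cov}(X,Y) &= \mathbb E(X\cdot X^2) - \mathbb E(X)\,\mathbb E(X^2) = \mathbb E(X^3) - 0\cdot 1 = 0, \\ \Pr(|X|\le 1,\, Y\le 1) &= \Pr(|X|\le 1) \neq \Pr(|X|\le 1)^2 = \Pr(|X|\le 1)\,\Pr(Y\le 1). \end{align*}$

The second line holds because $\{Y\le 1\}$ and $\{|X|\le 1\}$ are the same event, whose probability (about $0.68$) is strictly between $0$ and $1$.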

IN PARTICULAR (OP problem):

These are Bernoulli rv's, $X$ and $Y$ with probability of success $\Pr(X=1)$ , and $\Pr(Y=1)$ .

$\begin{align}\mathrm{cov}(X,Y)&=\mathrm E[XY] - \mathrm E[X]\,\mathrm E[Y]\\[2ex] &\underset{*}{=} \Pr(X=1 \cap Y=1) - \Pr(X=1)\, \Pr(Y=1),\\[2ex] \text{so}\quad \mathrm{cov}(X,Y)=0 &\implies \Pr(X=1 , Y=1) = \Pr (X=1)\,\Pr(Y=1). \end{align}$

This is equivalent to the condition for independence in Eq. $(1).$

$(*)$ :

$\mathrm E[XY]\quad \underset{**}{=} \quad \displaystyle \sum_{\text{domain X, Y}} \Pr(X=x\cap Y=y)\, x\,y \underset{\neq\,0\text{ iff } x \times y\neq 0}= \Pr(X=1 \cap Y=1).$

$(**)$ : by LOTUS.

As pointed out below, the argument is incomplete without what Dilip Sarwate had pointed out in his comments shortly after the OP appeared. After searching around, I found this proof of the missing part here:

If events $A$ and $B$ are independent, then events $A^c$ and $B$ are independent, and events $A^c$ and $B^c$ are also independent.

Proof By definition,

$A$ and $B$ are independent $\iff P(A\cap B) = P(A)P(B).$

But $B=(A\cap B) \cup ( A^c \cap B)$ is a disjoint union, so $P(B)= P(A\cap B) + P(A^c \cap B)$, which yields:

$\small P(A^c \cap B) = P(B) - P(A\cap B) = P(B) - P(A)\,P(B) = P(B) \left[1 - P(A)\right] = P(B)\,P( A^c).$

Repeat the argument for the events $A^c$ and $B^c,$ this time starting from the statement that $A^c$ and $B$ are independent and taking the complement of $B.$

Similarly, $A$ and $B^c$ are independent events.
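The complement lemma can be illustrated on a finite sample space. The setup below (two fair dice, with $A$ = "first die is even" and $B$ = "second die shows 1 or 2") is purely an assumption chosen for illustration; it is not part of the original proof.

```python
from fractions import Fraction
from itertools import product

# Sample space: two fair dice, all 36 outcomes equally likely.
omega = set(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event (a subset of omega) under the uniform measure."""
    return Fraction(len(event), len(omega))

def independent(e, f):
    """True if P(E and F) = P(E) * P(F)."""
    return prob(e & f) == prob(e) * prob(f)

A = {w for w in omega if w[0] % 2 == 0}  # first die is even
B = {w for w in omega if w[1] <= 2}      # second die shows 1 or 2
Ac, Bc = omega - A, omega - B            # complements

pairs = [(A, B), (Ac, B), (A, Bc), (Ac, Bc)]
print([independent(e, f) for e, f in pairs])  # [True, True, True, True]
```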

So, we have shown already that

$\Pr(X=1 , Y=1) = \Pr (X=1)\,\Pr(Y=1)$

and the above shows that this implies that

$\Pr(X=i , Y=j) = \Pr (X=i)\,\Pr(Y=j), ~~i, j \in \{0,1\}$

that is, the joint pmf factors into the product of marginal pmfs everywhere, not just at $(1,1)$. Hence, uncorrelated Bernoulli random variables $X$ and $Y$ are also independent random variables.

Antoni Parellada

Actually that's not an equivalent condition to Eq (1). All you showed was that

$f_{X,Y}(1,1) = f_{X}(1) f_{Y}(1)$

gammer

Please consider replacing that image with your own equations, preferably ones that don't use overbars to denote complements. The overbars in the image are very hard to see.

Dilip Sarwate

@DilipSarwate No problem. Is it better, now?

Antoni Parellada

Thanks. Also, note that strictly speaking, you also need to show that $A$ and $B^c$ are independent events since the factorization of the joint pmf into the product of the marginal pmfs must hold at all four points. Perhaps adding the sentence "Similarly, $A$ and $B^c$ are independent events" right after the proof that $A^c$ and $B$ are independent events will work.

Dilip Sarwate

@DilipSarwate Thank you very much for your help getting it right. The proof as it was before all the editing seemed self-explanatory, because of all the inherent symmetry, but it clearly couldn't be taken for granted. I am very appreciative of your assistance.

Antoni Parellada