What is the difference between

18

In general, what is the difference between $E(X|Y)$ and $E(X|Y=y)$?

Is the former a function of $y$ and the latter a function of $x$? It's so confusing...

conditional-expectation notation definition 신범준

Hmmm... The latter should not be a function of $x$ but a number! Am I wrong?

David

23

Roughly speaking, the difference between $E(X \mid Y)$ and $E(X \mid Y = y)$ is that the former is a random variable, whereas the latter is (in some sense) a realization of $E(X \mid Y)$. For example, if

$(X, Y) \sim \mathcal N\left(\mathbf 0, \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}\right)$

then $E(X \mid Y)$ is the random variable

$E(X \mid Y) = \rho Y.$

Conversely, once $Y = y$ is observed, we would more likely be interested in the quantity $E(X \mid Y = y) = \rho y$, which is a scalar.

Perhaps this seems like a needless complication, but regarding $E(X \mid Y)$ as a random variable in its own right is what makes a result like the tower law $E(X) = E[E(X \mid Y)]$ make sense: the thing inside the brackets is random, so we can ask what its expectation is, whereas there is nothing random about $E(X \mid Y = y)$. In most cases we might hope to calculate

$E(X \mid Y = y) = \int x f_{X\mid Y}(x \mid y) \ dx$

and then obtain $E(X \mid Y)$ by "plugging in" the random variable $Y$ in place of $y$ in the resulting expression. As hinted at in an earlier comment, some subtlety can creep in when defining these things rigorously and linking them up in the appropriate way. This tends to happen with conditional probability, because of technical issues in the underlying theory.
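The distinction can also be checked numerically. The following is only a rough simulation sketch, not part of the original answer: the helper names `simulate` and `cond_mean_near`, the choice $\rho = 0.6$, the sample size, and the window width are all illustrative assumptions. It estimates $E(X \mid Y = y)$ at two fixed values of $y$ (two numbers) and builds $E(X \mid Y) = \rho Y$ as one value per draw of $Y$ (a random variable).

```python
import math
import random


def simulate(rho, n=200_000, seed=0):
    """Draw n pairs (X, Y) from a standard bivariate normal with correlation rho."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        y = rng.gauss(0.0, 1.0)
        # X = rho*Y + sqrt(1 - rho^2)*eps has unit variance and Corr(X, Y) = rho.
        x = rho * y + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        pairs.append((x, y))
    return pairs


def cond_mean_near(pairs, y0, width=0.05):
    """Average X over draws with Y within `width` of y0: a crude estimate of E(X | Y = y0)."""
    xs = [x for x, y in pairs if abs(y - y0) < width]
    return sum(xs) / len(xs)


rho = 0.6  # illustrative value, not from the original answer
pairs = simulate(rho)

# E(X | Y = y) is a single number for each fixed y ...
est_at_1 = cond_mean_near(pairs, 1.0)    # should be close to rho * 1.0
est_at_m2 = cond_mean_near(pairs, -2.0)  # should be close to rho * (-2.0)

# ... while E(X | Y) = rho * Y is a random variable: one value per draw of Y.
z = [rho * y for _, y in pairs]
mean_x = sum(x for x, _ in pairs) / len(pairs)
mean_z = sum(z) / len(z)  # tower law: should be close to mean_x

print(est_at_1, est_at_m2, mean_x, mean_z)
```

The windowed average is a deliberately simple stand-in for conditioning on the zero-probability event $Y = y$, which is one place where the subtlety mentioned above shows up.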

guy

8

Suppose that $X$ and $Y$ are random variables.

Let $y_0$ be a fixed real number, say $y_0 = 1$. Then $E[X\mid Y=y_0]= E[X\mid Y = 1]$ is a number: it is the conditional expected value of $X$ given that $Y$ has value $1$. Now, for some other fixed real number $y_1$, say $y_1=1.5$, $E[X\mid Y = y_1] = E[X\mid Y = 1.5]$ is the conditional expected value of $X$ given $Y = 1.5$ (again a real number). There is no reason to suppose that $E[X\mid Y = 1.5]$ and $E[X\mid Y = 1]$ have the same value. Thus, we can also regard $E[X\mid Y=y]$ as a real-valued function $g(y)$ that maps real numbers $y$ to real numbers $E[X\mid Y = y]$. Note that the statement in the OP's question that $E[X\mid Y = y]$ is a function of $x$ is incorrect: $E[X\mid Y = y]$ is a real-valued function of $y$.

On the other hand, $E[X\mid Y]$ is a random variable $Z$ which happens to be a function of the random variable $Y$. Now, whenever we write $Z = h(Y)$, we mean that whenever the random variable $Y$ has value $y$, the random variable $Z$ has value $h(y)$. Whenever $Y$ takes on value $y$, the random variable $Z = E[X\mid Y]$ takes on value $E[X\mid Y = y] = g(y)$. Thus, $E[X\mid Y]$ is just another name for the random variable $Z = g(Y)$. Note that $E[X\mid Y]$ is a function of $Y$ (not $y$ as in the statement of the OP's question).

As a simple illustrative example, suppose that $X$ and $Y$ are discrete random variables with joint distribution

$\begin{align} P(X=0,Y=0) &= 0.1,~~ P(X=0, Y=1) = 0.2,\\ P(X=1,Y=0) &= 0.3,~~ P(X=1,Y=1) = 0.4. \end{align}$

Note that $X$ and $Y$ are (dependent) Bernoulli random variables with parameters $0.7$ and $0.6$ respectively, and so $E[X] = 0.7$ and $E[Y] = 0.6$. Now, note that conditioned on $Y=0$, $X$ is a Bernoulli random variable with parameter $0.75$ while conditioned on $Y = 1$, $X$ is a Bernoulli random variable with parameter $\frac 23$. If you cannot see why this is so immediately, just work out the details: for example

$\begin{align} P(X=1\mid Y = 0) &= \frac{P(X=1, Y=0)}{P(Y=0)} = \frac{0.3}{0.4} = \frac 34,\\ P(X=0\mid Y = 0) &= \frac{P(X=0, Y=0)}{P(Y=0)} = \frac{0.1}{0.4} = \frac 14, \end{align}$

and similarly for $P(X=1\mid Y=1)$ and $P(X=0\mid Y = 1)$. Hence, we have that

$E[X\mid Y = 0] = \frac 34, \quad E[X \mid Y = 1] = \frac 23.$

Thus, $E[X\mid Y = y] = g(y)$ where $g(y)$ is a real-valued function enjoying the properties:

$g(0) = \frac 34, \quad g(1) = \frac 23.$

On the other hand, $E[X\mid Y] = g(Y)$ is a random variable that takes on values $\frac 34$ and $\frac 23$ with probabilities $0.4 = P(Y=0)$ and $0.6 = P(Y=1)$ respectively. Note that $E[X\mid Y]$ is a discrete random variable but is not a Bernoulli random variable.
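The two objects in this example can be computed exactly. The sketch below is an illustration added for concreteness, not part of the original answer; the names `joint`, `marginal_y`, `g`, and `pmf_z` are hypothetical choices. It uses exact fractions to tabulate the function $g$ and then the distribution of the random variable $Z = g(Y)$.

```python
from fractions import Fraction as F

# Joint pmf from the example above; keys are (x, y).
joint = {
    (0, 0): F(1, 10), (0, 1): F(2, 10),
    (1, 0): F(3, 10), (1, 1): F(4, 10),
}


def marginal_y(y):
    """P(Y = y), obtained by summing the joint pmf over x."""
    return sum(p for (_, yy), p in joint.items() if yy == y)


def g(y):
    """E[X | Y = y]: an ordinary function from numbers to numbers."""
    return sum(x * p for (x, yy), p in joint.items() if yy == y) / marginal_y(y)


# The random variable Z = E[X | Y] = g(Y), described by its pmf:
# it takes the value g(y) with probability P(Y = y).
pmf_z = {g(y): marginal_y(y) for y in (0, 1)}

print(g(0), g(1))  # 3/4 2/3
print(pmf_z)
```

Here `g` is a plain deterministic function, while `pmf_z` describes a random variable whose randomness comes entirely from $Y$.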

As a final touch, note that

$E[Z] = E\left[E[X\mid Y]\right] = E[g(Y)] = 0.4\times \frac 34 + 0.6\times \frac 23 = 0.7 = E[X].$

That is, the expected value of this function of $Y$, which we computed using only the marginal distribution of $Y$, happens to have the same numerical value as $E[X]$!! This is an illustration of a more general result that many people believe is a LIE:

$E\left[E[X\mid Y]\right] = E[X].$

Sorry, that's just a small joke. LIE is an acronym for Law of Iterated Expectation which is a perfectly valid result that everyone believes is the truth.
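For discrete random variables, the general identity follows from the same bookkeeping as in the example. Here is a short derivation, added as a sketch under the assumptions that $X$ and $Y$ are discrete, that the sums converge absolutely, and that $y$ ranges over values with $P(Y=y) > 0$, with $g(y) = E[X\mid Y=y]$ as before:

$\begin{align} E\left[E[X\mid Y]\right] &= E[g(Y)] = \sum_y g(y)\,P(Y=y) \\ &= \sum_y \sum_x x\,P(X=x\mid Y=y)\,P(Y=y) \\ &= \sum_x x \sum_y P(X=x, Y=y) \\ &= \sum_x x\,P(X=x) = E[X]. \end{align}$

The third line uses $P(X=x\mid Y=y)\,P(Y=y) = P(X=x, Y=y)$, and the fourth sums the joint distribution over $y$ to recover the marginal distribution of $X$.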

Dilip Sarwate

3

$E(X|Y)$ is the expectation of a random variable: the expectation of $X$ conditional on $Y$ . $E(X|Y=y)$ , on the other hand, is a particular value: the expected value of $X$ when $Y=y$ .

Think of it this way: let $X$ represent caloric intake and $Y$ represent height. $E(X|Y)$ is then the expected caloric intake conditional on height, and in this case $E(X|Y=y)$ represents our best guess at the caloric intake ($X$) when a person has a certain height $Y = y$, say, 180 centimeters.

abaumann

4

I believe your first sentence should replace "distribution" with "expectation" (twice).

Glen_b -Reinstate Monica

4

$E(X\mid Y)$ isn't the distribution of $X$ given $Y$; this would more commonly be denoted by the conditional density $f_{X \mid Y} (x \mid y)$ or the conditional distribution function. $E(X \mid Y)$ is the conditional expectation of $X$ given $Y$, which is a $Y$-measurable random variable. $E(X \mid Y = y)$ might be thought of as the realization of the random variable $E(X \mid Y)$ when $Y = y$ is observed (but there is the possibility for measure-theoretic subtlety to creep in).

guy

1

@guy Your explanation is the first accurate answer yet provided (out of three offered so far). Would you consider posting it as an answer?

whuber

@whuber I would but I'm not sure how to strike the balance between accuracy and making the answer suitably useful to OP and I'm paranoid about getting tripped up on technicalities :)

guy

@Guy I think you have already done a good job with the technicalities. Since you are sensitive about communicating well with the OP (which is great!), consider offering a simple example to illustrate--maybe just a joint distribution with binary marginals.

whuber

1

$E(X|Y)$ is the expected value of $X$ given values of $Y$, whereas $E(X|Y=y)$ is the expected value of $X$ given that the value of $Y$ is $y$.

Generally $P(X|Y)$ is probability of values $X$ given values $Y$ , but you can get more precise and say $P(X=x|Y=y)$ , i.e. probability of value $x$ from all $X$ 's given the $y$ 'th value of $Y$ 's. The difference is that in the first case it is about "values of" and in the second you consider a certain value.

You could find the diagram below helpful.

Bayes theorem diagram from Wikipedia

Tim

This answer discusses probability, while the question asks about expectation. What is the connection?

whuber
