When computing the sample covariance matrix, am I guaranteed to get a symmetric and positive-definite matrix?
Currently my problem involves a sample of 4600 observation vectors in 24 dimensions.
Tags: sampling, covariance
Asked by Morten
Answers:
For a sample of vectors $x_i = (x_{i1}, \dots, x_{ik})^\top$, with $i = 1, \dots, n$, the sample mean vector is
$$\bar{x} = \frac{1}{n} \sum_{i=1}^n x_i,$$
and the sample covariance matrix is
$$Q = \frac{1}{n} \sum_{i=1}^n (x_i - \bar{x})(x_i - \bar{x})^\top.$$
For a nonzero vector $y \in \mathbb{R}^k$, we have
$$y^\top Q y = y^\top \left( \frac{1}{n} \sum_{i=1}^n (x_i - \bar{x})(x_i - \bar{x})^\top \right) y = \frac{1}{n} \sum_{i=1}^n y^\top (x_i - \bar{x})(x_i - \bar{x})^\top y = \frac{1}{n} \sum_{i=1}^n \left( (x_i - \bar{x})^\top y \right)^2 \ge 0. \quad (*)$$
Therefore, $Q$ is always positive semidefinite.
The additional condition for $Q$ to be positive definite was given in whuber's comment below. It goes as follows.
Define $z_i = x_i - \bar{x}$, for $i = 1, \dots, n$. For any nonzero $y \in \mathbb{R}^k$, $(*)$ is zero if and only if $z_i^\top y = 0$ for every $i = 1, \dots, n$. Suppose the set $\{z_1, \dots, z_n\}$ spans $\mathbb{R}^k$. Then there are real numbers $\alpha_1, \dots, \alpha_n$ such that $y = \alpha_1 z_1 + \dots + \alpha_n z_n$. But then $y^\top y = \alpha_1 z_1^\top y + \dots + \alpha_n z_n^\top y = 0$, which gives $y = 0$, a contradiction. Hence, if the $z_i$ span $\mathbb{R}^k$, then $Q$ is positive definite. This condition is equivalent to $\operatorname{rank}[z_1 \dots z_n] = k$.
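As a quick numerical check of both claims, here is a NumPy sketch on randomly generated data with the sizes mentioned in the question ($n = 4600$, $k = 24$; the data itself is made up):

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up data with the sizes from the question:
# n = 4600 observation vectors in k = 24 dimensions.
n, k = 4600, 24
x = rng.standard_normal((n, k))

z = x - x.mean(axis=0)   # centered observations z_i = x_i - x_bar
Q = (z.T @ z) / n        # sample covariance matrix Q (1/n convention)

# Q is symmetric by construction ...
assert np.allclose(Q, Q.T)

# ... and positive semidefinite: no eigenvalue is negative
# (allowing for floating-point round-off).
eigvals = np.linalg.eigvalsh(Q)
assert eigvals.min() > -1e-10

# Here rank [z_1 ... z_n] = k, so Q is in fact positive definite.
assert np.linalg.matrix_rank(z) == k
assert eigvals.min() > 0
```

With continuous data and $n$ much larger than $k$, the rank condition holds almost surely, which is why a 4600-by-24 sample will in practice give a positive-definite $Q$.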
A correct covariance matrix is always symmetric and positive *semi*definite.
The covariance between two variables is defined as $\sigma(x, y) = E[(x - E(x))(y - E(y))]$.
This expression doesn't change if you switch the positions of $x$ and $y$. Hence the matrix has to be symmetric.
It also has to be positive *semi-*definite because:
You can always find a transformation of your variables such that the covariance matrix becomes diagonal. On the diagonal you find the variances of your transformed variables, which are either zero or positive, and it is easy to see that this makes the transformed matrix positive semidefinite. However, since definiteness is invariant under a change of coordinates, it follows that the covariance matrix is positive semidefinite in any chosen coordinate system.
When you estimate your covariance matrix (that is, when you calculate your sample covariance) with the formula stated above, it will obviously still be symmetric. It also has to be positive semidefinite (I think), because for each sample, the pdf that gives each sample point equal probability has the sample covariance as its covariance (somebody please verify this), so everything stated above still applies.
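The parenthetical claim can be verified numerically: the discrete distribution that puts probability $1/n$ on each sample point has exactly the ($1/n$-convention) sample covariance as its covariance. A small NumPy sketch on made-up data:

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.standard_normal((50, 4))   # made-up sample: 50 points, 4 variables
n = len(x)

# Discrete distribution with probability 1/n on each sample point.
p = np.full(n, 1.0 / n)
mean = p @ x                       # E[X] under that distribution = sample mean
z = x - mean

# Covariance of that distribution: sum_i p_i (x_i - mean)(x_i - mean)^T.
cov_discrete = (z * p[:, None]).T @ z

# It coincides with the sample covariance under the 1/n convention.
Q = (z.T @ z) / n
assert np.allclose(cov_discrete, Q)
```

(Note that `np.cov` uses the $1/(n-1)$ convention by default, so it would differ from this by a factor of $n/(n-1)$.)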
Variance-covariance matrices are always symmetric, as can be proven from the equation used to calculate each term of the matrix.
Also, variance-covariance matrices are always square matrices of size $n \times n$, where $n$ is the number of variables in your experiment.
Eigenvectors of symmetric matrices are always orthogonal.
With PCA, you determine the eigenvalues of the matrix to see if you could reduce the number of variables used in your experiment.
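A minimal NumPy sketch of these facts, with made-up data in which one variable is (almost) redundant:

```python
import numpy as np

rng = np.random.default_rng(1)

# Made-up data: 200 observations of 3 variables, where the third is
# nearly a linear combination of the first two.
a = rng.standard_normal((200, 2))
third = a @ np.array([0.5, -0.3]) + 0.01 * rng.standard_normal(200)
data = np.column_stack([a, third])

cov = np.cov(data, rowvar=False)        # 3 x 3 variance-covariance matrix
assert cov.shape == (3, 3)              # square, size = number of variables
assert np.allclose(cov, cov.T)          # symmetric

# eigh is for symmetric matrices; eigenvalues come back in ascending order.
eigvals, eigvecs = np.linalg.eigh(cov)

# The eigenvectors are orthogonal (here: orthonormal columns).
assert np.allclose(eigvecs.T @ eigvecs, np.eye(3), atol=1e-10)

# The smallest eigenvalue is tiny, so PCA would suggest dropping one
# direction with almost no loss of variance.
assert eigvals[0] < 0.01 < eigvals[-1]
```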
I would add to Zen's nice argument the following, which explains why we often say that the covariance matrix is positive definite if $n - 1 \ge k$.
If $x_1, x_2, \dots, x_n$ are a random sample from a continuous probability distribution, then $x_1, x_2, \dots, x_n$ are almost surely (in the probability-theory sense) linearly independent.
Now, $z_1, z_2, \dots, z_n$ are not linearly independent, because $\sum_{i=1}^n z_i = 0$; but since $x_1, x_2, \dots, x_n$ are a.s. linearly independent, $z_1, z_2, \dots, z_n$ a.s. span a subspace of dimension $\min(n - 1, k)$. If $n - 1 \ge k$, they therefore span $\mathbb{R}^k$.
To conclude: if $x_1, x_2, \dots, x_n$ are a random sample from a continuous probability distribution and $n - 1 \ge k$, the covariance matrix is positive definite.
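A numerical illustration of the two regimes (keeping $k = 24$ as in the question; the sampling routine below is made up):

```python
import numpy as np

rng = np.random.default_rng(2)
k = 24  # number of dimensions, as in the question

def sample_cov(n):
    """Sample covariance (1/n convention) of n continuous random draws."""
    x = rng.standard_normal((n, k))
    z = x - x.mean(axis=0)
    return (z.T @ z) / n

# n - 1 < k: the z_i can span at most an (n - 1)-dimensional subspace,
# so the covariance matrix is singular (only positive semidefinite).
Q_small = sample_cov(10)
assert np.linalg.matrix_rank(Q_small) == 9   # rank n - 1 = 9 < k

# n - 1 >= k: the z_i almost surely span R^k, so Q is positive definite.
Q_big = sample_cov(100)
assert np.linalg.eigvalsh(Q_big).min() > 0
```

The questioner's 4600-by-24 sample is in the second regime by a wide margin.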
For those with a non-mathematical background like me, who don't quickly grasp abstract mathematical formulae, there is a worked-out Excel example of the most upvoted answer. The covariance matrix can also be derived in other ways.