Suppose I have the following model

y_t = f(x_t, θ) + ε_t

where y_t is a k-dimensional vector, x_t is a vector of explanatory variables, θ is the parameter vector of the non-linear function f, and ε_t ∼ N(0, Σ), where Σ naturally is a k×k matrix.

The goal is, as usual, to estimate θ and Σ. The obvious choice is the maximum likelihood method. The log-likelihood for this model (assuming we have a sample (y_t, x_t), t = 1, …, T) looks like

l(θ, Σ) = −(Tk/2) log 2π − (T/2) log det Σ − (1/2) ∑_{t=1}^{T} (y_t − f(x_t, θ))′ Σ⁻¹ (y_t − f(x_t, θ))
Now this seems simple: the log-likelihood is specified, so put in the data and use some algorithm for non-linear optimisation. The problem is how to ensure that Σ is positive definite. Using for example optim in R (or any other non-linear optimisation algorithm) will not guarantee me that Σ is positive definite.
So the question is: how to ensure that Σ stays positive definite? I see two possible solutions:
1. Reparametrise Σ as RR′, where R is an upper-triangular or symmetric matrix. Then Σ = RR′ will always be positive definite (strictly so whenever R is non-singular) and R can be unconstrained.
2. Use profile likelihood. Derive the formulas θ̂(Σ) and Σ̂(θ). Start with some Σ₀ and iterate θ̂_i = θ̂(Σ̂_{i−1}), Σ̂_i = Σ̂(θ̂_i) until convergence.
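To make the first idea concrete, here is a minimal sketch in Python, with scipy's minimize standing in for R's optim. The model f(x, θ) = θ * x, the sample sizes, and all variable names are made up for illustration; the point is only that the optimiser works on the unconstrained elements of a lower-triangular R while Σ = RR′ stays positive semi-definite by construction.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

# Toy setup (illustrative only): k = 2 responses, f(x, theta) = theta * x.
k, T = 2, 500
theta_true = np.array([1.5, -0.7])
Sigma_true = np.array([[1.0, 0.6], [0.6, 2.0]])
x = rng.normal(size=(T, k))
y = theta_true * x + rng.multivariate_normal(np.zeros(k), Sigma_true, size=T)

def unpack(p):
    """Split the parameter vector into theta and a lower-triangular R."""
    theta = p[:k]
    R = np.zeros((k, k))
    R[np.tril_indices(k)] = p[k:]
    return theta, R

def neg_loglik(p):
    theta, R = unpack(p)
    Sigma = R @ R.T               # positive semi-definite by construction
    resid = y - theta * x
    sign, logdet = np.linalg.slogdet(Sigma)
    if sign <= 0:                 # R singular -> Sigma not invertible
        return np.inf
    quad = np.einsum('ti,ij,tj->', resid, np.linalg.inv(Sigma), resid)
    return 0.5 * (T * logdet + quad)   # constants dropped

# Start from theta = 0 and R = identity (Sigma = I).
p0 = np.concatenate([np.zeros(k), np.eye(k)[np.tril_indices(k)]])
res = minimize(neg_loglik, p0, method='Nelder-Mead',
               options={'maxiter': 20000, 'xatol': 1e-8, 'fatol': 1e-8})
theta_hat, R_hat = unpack(res.x)
Sigma_hat = R_hat @ R_hat.T       # guaranteed positive semi-definite
```

Whatever the optimiser does to the entries of R, the reconstructed Σ̂ = R̂R̂′ cannot leave the positive semi-definite cone.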
Is there some other way? What about these two approaches: will they work, and are they standard? This seems like a pretty standard problem, but a quick search did not give me any pointers. I know that Bayesian estimation would also be possible, but for the moment I would rather not engage in it.
Answers:
Assuming that in constructing the covariance matrix you are automatically taking care of the symmetry issue, your log-likelihood will be −∞ when Σ is not positive definite because of the log det Σ term in the model, right? To prevent a numerical error when det Σ ≤ 0, I would precalculate det Σ and, if it is not positive, make the log-likelihood equal to −Inf; otherwise continue. You have to calculate the determinant anyway, so this is not costing you any extra calculation.
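A sketch of such a guard in Python. One detail worth noting: for k > 2 a symmetric matrix can have det Σ > 0 without being positive definite (two negative eigenvalues), so this sketch substitutes an attempted Cholesky factorisation, which tests positive definiteness exactly and yields the log-determinant as a by-product. The function name and argument layout are invented for the example.

```python
import numpy as np

def neg_loglik(Sigma, resid):
    """resid: (T, k) residuals; Sigma: candidate (k, k) covariance."""
    T = resid.shape[0]
    try:
        # Cholesky succeeds iff Sigma is (symmetric) positive definite,
        # and gives log det Sigma essentially for free from its diagonal.
        L = np.linalg.cholesky(Sigma)
    except np.linalg.LinAlgError:
        return np.inf                       # log-likelihood = -Inf
    logdet = 2.0 * np.log(np.diag(L)).sum()
    # Quadratic form via a triangular solve instead of an explicit inverse:
    # r' Sigma^{-1} r = ||L^{-1} r||^2 for each residual r.
    z = np.linalg.solve(L, resid.T)
    return 0.5 * (T * logdet + (z ** 2).sum())   # constants dropped
```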
As it turns out, you can use profile maximum likelihood to ensure the necessary properties. You can prove that for given θ̂, l(θ̂, Σ) is maximised by

Σ̂(θ̂) = (1/T) ∑_{t=1}^{T} ε̂_t ε̂_t′

where

ε̂_t = y_t − f(x_t, θ̂)

Then it is possible to show that

l(θ̂, Σ̂(θ̂)) = −(Tk/2)(log 2π + 1) − (T/2) log det Σ̂(θ̂)

hence we only need to maximise

l̃(θ̂) = −(T/2) log det Σ̂(θ̂)

Naturally, in this case Σ will satisfy all the necessary properties, since it is the sample covariance matrix of the residuals. The proofs are identical to the case when f is linear, which can be found in Time Series Analysis by J. D. Hamilton, page 295, hence I omitted them.
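The concentrated-likelihood recipe above can be sketched in a few lines of Python (scipy in place of R's optim; the toy model f(x, θ) = θ * x and the data are invented for the example). Σ̂(θ) is an average of outer products, so it is symmetric and positive semi-definite no matter what θ the optimiser tries.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)

# Toy model (illustrative names): f(x, theta) = theta * x, k = 2.
k, T = 2, 400
theta_true = np.array([0.8, 2.0])
Sigma_true = np.array([[1.0, -0.3], [-0.3, 0.5]])
x = rng.normal(size=(T, k))
y = theta_true * x + rng.multivariate_normal(np.zeros(k), Sigma_true, size=T)

def Sigma_hat(theta):
    """Profile MLE of Sigma for fixed theta: average outer product of
    residuals, hence symmetric and positive semi-definite by construction."""
    resid = y - theta * x
    return resid.T @ resid / T

def concentrated_nll(theta):
    # Up to constants, minimising this is maximising -(T/2) log det Sigma_hat.
    sign, logdet = np.linalg.slogdet(Sigma_hat(theta))
    return 0.5 * T * logdet

res = minimize(concentrated_nll, np.zeros(k), method='Nelder-Mead')
theta_ml = res.x
Sigma_ml = Sigma_hat(theta_ml)    # valid covariance matrix by construction
```

Note that the search space here is only θ; Σ never appears as a free parameter, which is the whole appeal of the profile approach.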
An alternative parameterization for the covariance matrix is in terms of the eigenvalues λ1, …, λp and p(p−1)/2 "Givens" angles θij.

That is, we can write

Σ = G Λ G′

where G is orthonormal, and

Λ = diag(λ1, …, λp)

with λ1 ≥ … ≥ λp ≥ 0.

Meanwhile, G can be parameterized uniquely in terms of p(p−1)/2 angles θij, where i = 1, 2, …, p−1 and j = i, …, p−1.[1]
(details to be added)
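Pending those details, here is a minimal sketch of the idea in Python. The loop ordering of the angles is one arbitrary convention, not necessarily the one in [1], and the eigenvalues are kept positive via an exp transform, which is an extra assumption layered on top of the answer.

```python
import numpy as np

def givens_rotation(p, i, j, angle):
    """p x p identity with a rotation by `angle` in the (i, j) plane."""
    G = np.eye(p)
    c, s = np.cos(angle), np.sin(angle)
    G[i, i] = c; G[j, j] = c
    G[i, j] = -s; G[j, i] = s
    return G

def covariance_from_angles(log_eigs, angles):
    """Sigma = G diag(exp(log_eigs)) G', with G a product of p(p-1)/2
    Givens rotations.  Positive definite for any unconstrained inputs."""
    p = len(log_eigs)
    G = np.eye(p)
    idx = 0
    for i in range(p - 1):
        for j in range(i + 1, p):
            G = G @ givens_rotation(p, i, j, angles[idx])
            idx += 1
    return G @ np.diag(np.exp(log_eigs)) @ G.T
```

An optimiser can then work on the p log-eigenvalues and p(p−1)/2 angles as a fully unconstrained vector.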
[1]: Hoffman, Raffenetti, Ruedenberg. "Generalization of Euler Angles to N-Dimensional Orthogonal Matrices". J. Math. Phys. 13, 528 (1972).
Along the lines of charles.y.zheng's solution, you may wish to model Σ = Λ + CC⊤, where Λ is a diagonal matrix and C is a Cholesky-type factor of a low-rank update to Λ. You then only need to keep the diagonal of Λ positive to keep Σ positive definite. That is, you should estimate the diagonal of Λ and the elements of C instead of estimating Σ directly.