

2.2 Generalized Method of Moments

OLS can be viewed as a special case of the generalized method of moments (GMM) estimator studied by Hansen [70]. Since you are presumably familiar with OLS, you can build your intuition about GMM by first thinking about using it to estimate a linear regression. After getting that under your belt, thinking about GMM estimation in more complicated and possibly nonlinear environments is straightforward.

OLS and GMM. Suppose you want to estimate the coefficients in the regression

$$q_t = z_t'\beta + \epsilon_t, \tag{2.31}$$

 

 

 

 

where $\beta$ is the $k$-dimensional vector of coefficients, $z_t$ is a $k$-dimensional vector of regressors, $\epsilon_t \overset{iid}{\sim} (0, \sigma^2)$, and $(q_t, z_t)$ are jointly covariance stationary. The OLS estimator of $\beta$ is chosen to minimize

$$\frac{1}{T}\sum_{t=1}^{T}\epsilon_t^2 = \frac{1}{T}\sum_{t=1}^{T}(q_t - \beta' z_t)(q_t - z_t'\beta) = \frac{1}{T}\sum_{t=1}^{T} q_t^2 - 2\beta'\,\frac{1}{T}\sum_{t=1}^{T} z_t q_t + \beta'\left[\frac{1}{T}\sum_{t=1}^{T} z_t z_t'\right]\beta. \tag{2.32}$$

When you differentiate (2.32) with respect to $\beta$ and set the result to zero, you get the first-order conditions,

$$-\frac{2}{T}\underbrace{\sum_{t=1}^{T} z_t \epsilon_t}_{(a)} = \underbrace{-\frac{2}{T}\sum_{t=1}^{T} z_t q_t + 2\left[\frac{1}{T}\sum_{t=1}^{T} z_t z_t'\right]\beta}_{(b)} = 0. \tag{2.33}$$

If the regression is correctly specified, the first-order conditions form a set of $k$ orthogonality or 'zero' conditions that you use to estimate $\beta$. These orthogonality conditions are labeled (a) in (2.33). OLS estimation is straightforward because the first-order conditions are the set of $k$ linear equations in $k$ unknowns labeled (b) in (2.33), which are solved by matrix inversion.^7

^7 In matrix notation, we usually write the regression as $q = Z\beta + \epsilon$, where $q$ is the $T$-dimensional vector of observations on $q_t$, $Z$ is the $T \times k$ matrix of observations on the independent variables whose $t$-th row is $z_t'$, $\beta$ is the $k$-dimensional vector of parameters that we want to estimate, and $\epsilon$ is the $T$-dimensional vector of regression errors. Then $\hat\beta = (Z'Z)^{-1}Z'q$.

Solving (2.33) for the minimizer $\hat\beta$, you get

 


 

 

 

 

 

 

 

 

 

$$\hat\beta = \left(\sum_{t=1}^{T} z_t z_t'\right)^{-1}\left(\sum_{t=1}^{T} z_t q_t\right). \tag{2.34}$$

Let $Q = \operatorname{plim}\frac{1}{T}\sum_{t=1}^{T} z_t z_t'$ and let $W = \sigma^2 Q$. Because $\{\epsilon_t\}$ is an iid sequence, $\{z_t\epsilon_t\}$ is also iid. It follows from the Lindeberg-Levy central limit theorem that $\frac{1}{\sqrt{T}}\sum_{t=1}^{T} z_t\epsilon_t \xrightarrow{D} N(0, W)$. Let the residuals be $\hat\epsilon_t = q_t - z_t'\hat\beta$, let the estimated error variance be $\hat\sigma^2 = \frac{1}{T}\sum_{t=1}^{T}\hat\epsilon_t^2$, and let $\hat{W} = \hat\sigma^2\,\frac{1}{T}\sum_{t=1}^{T} z_t z_t'$. While it may seem like a silly thing to do, you can set up a quadratic form using the orthogonality conditions and get the OLS estimator by minimizing

$$\left(\frac{1}{T}\sum_{t=1}^{T} z_t\epsilon_t\right)'\,\hat{W}^{-1}\left(\frac{1}{T}\sum_{t=1}^{T} z_t\epsilon_t\right) \tag{2.35}$$

with respect to $\beta$. This is the GMM estimator for the linear regression (2.31).

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

The first-order conditions to this problem are

$$\left(\frac{1}{T}\sum_{t=1}^{T} z_t z_t'\right)\hat{W}^{-1}\left(\frac{1}{T}\sum_{t=1}^{T} z_t\epsilon_t\right) = \frac{1}{\hat\sigma^2}\left(\frac{1}{T}\sum_{t=1}^{T} z_t\epsilon_t\right) = 0,$$

which are identical to the OLS first-order conditions (2.33). You also know that the asymptotic distribution of the OLS estimator of $\beta$ is

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

$$\sqrt{T}(\hat\beta - \beta) \xrightarrow{D} N(0, V), \tag{2.36}$$

where $V = \sigma^2 Q^{-1}$. If you let $D = E[\partial(z_t\epsilon_t)/\partial\beta'] = Q$, the GMM covariance matrix $V$ can be expressed as $V = \sigma^2 Q^{-1} = [D'W^{-1}D]^{-1}$. The first equality is the standard OLS calculation for the covariance matrix, and the second equality follows from the properties of (2.35).

You would never do OLS by minimizing (2.35), since to get the weighting matrix $\hat{W}^{-1}$ you need an estimate of $\beta$, which is what you want in the first place. But this is what you do in the generalized environment.
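To see the equivalence concretely, here is a minimal numerical sketch (mine, not the book's; the data-generating values are hypothetical) that minimizes (2.35) and checks the result against the closed-form OLS estimator (2.34). It builds $\hat{W}$ from a first-stage OLS fit, which is exactly the circularity just noted.

```python
# Minimal sketch: minimizing the GMM quadratic form (2.35) reproduces OLS (2.34).
# All numbers below are illustrative, not from the text.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
T, k = 500, 3
Z = rng.normal(size=(T, k))                  # rows are z_t'
beta_true = np.array([1.0, -0.5, 2.0])       # hypothetical coefficients
q = Z @ beta_true + rng.normal(size=T)       # q_t = z_t' beta + eps_t

# Closed-form OLS: beta_hat = (Z'Z)^{-1} Z'q
beta_ols = np.linalg.solve(Z.T @ Z, Z.T @ q)

# Weighting matrix W_hat = sigma2_hat * (1/T) sum_t z_t z_t',
# built from a first-stage estimate of beta (the circularity noted above)
resid = q - Z @ beta_ols
sigma2_hat = resid @ resid / T
W_hat = sigma2_hat * (Z.T @ Z) / T

def objective(beta):
    g = Z.T @ (q - Z @ beta) / T             # (1/T) sum_t z_t eps_t(beta)
    return g @ np.linalg.solve(W_hat, g)     # quadratic form (2.35)

beta_gmm = minimize(objective, x0=np.zeros(k)).x
print(np.allclose(beta_ols, beta_gmm, atol=1e-4))   # True
```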

Generalized environment. Suppose you have an economic theory that relates $q_t$ to a vector $x_t$. The theory predicts the set of orthogonality conditions

$$E[z_t\epsilon_t(q_t, x_t, \beta)] = 0,$$


where $z_t$ is a vector of instrumental variables, which may be different from $x_t$, and $\epsilon_t(q_t, x_t, \beta)$ may be a nonlinear function of the underlying $k$-dimensional parameter vector $\beta$ and observations on $q_t$ and $x_t$.^8 To estimate $\beta$ by GMM, let $w_t \equiv z_t\epsilon_t(q_t, x_t, \beta)$, where we now write the vector of orthogonality conditions as $E(w_t) = 0$. Mimicking the steps above for GMM estimation of the linear regression coefficients, you'll want to choose the parameter vector $\beta$ to minimize

$$\left(\frac{1}{T}\sum_{t=1}^{T} w_t\right)'\,\hat{W}^{-1}\left(\frac{1}{T}\sum_{t=1}^{T} w_t\right), \tag{2.37}$$

where $\hat{W}$ is a consistent estimator of the asymptotic covariance matrix of $\frac{1}{\sqrt{T}}\sum_{t=1}^{T} w_t$. It is sometimes called the long-run covariance matrix. You cannot guarantee that $w_t$ is iid in the generalized environment. It may be serially correlated and conditionally heteroskedastic. To allow for these possibilities, the formula for the weighting matrix is

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

$$W = \Omega_0 + \sum_{j=1}^{\infty}\left(\Omega_j + \Omega_j'\right), \tag{2.38}$$

where $\Omega_0 = E(w_t w_t')$ and $\Omega_j = E(w_t w_{t-j}')$. A popular choice for estimating $W$ is the method of Newey and West [114]

 

 

 

 

 

 

$$\hat{W} = \hat\Omega_0 + \sum_{j=1}^{m}\left(1 - \frac{j}{m+1}\right)\left(\hat\Omega_j + \hat\Omega_j'\right), \tag{2.39}$$

where $\hat\Omega_0 = \frac{1}{T}\sum_{t=1}^{T} w_t w_t'$ and $\hat\Omega_j = \frac{1}{T}\sum_{t=j+1}^{T} w_t w_{t-j}'$. The weighting function $1 - \frac{j}{m+1}$ is called the Bartlett window.

When $\hat{W}$ is constructed as suggested by Newey and West, it is guaranteed to be positive definite, which is a good thing since you need to invert it to do GMM. To guarantee consistency, the Newey-West lag length $m$ needs to go to infinity, but at a slower rate than $T$.^9 You might try values such as $m = T^{1/4}$.

^8 Alternatively, you may be interested in a multiple-equation system in which the theory imposes parameter restrictions across equations, so not only may the model be nonlinear, $\epsilon_t$ could be a vector of error terms.

^9 Andrews [2] and Newey and West [115] offer recommendations for letting the data determine $m$.
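Here is a compact sketch of the estimator in (2.39); the function and the simulated $w_t$ series are illustrative, not code from the text.

```python
# Sketch of the Newey-West estimator (2.39) with Bartlett weights.
import numpy as np

def newey_west(w, m):
    """w: T x n array whose t-th row is w_t'; m: lag truncation length."""
    T = w.shape[0]
    W = w.T @ w / T                            # Omega_0 hat
    for j in range(1, m + 1):
        # Omega_j hat = (1/T) sum_{t=j+1}^T w_t w_{t-j}'
        Omega_j = w[j:].T @ w[:-j] / T
        W += (1.0 - j / (m + 1)) * (Omega_j + Omega_j.T)   # Bartlett window
    return W

# Example with m = T^{1/4}, one of the rule-of-thumb choices in the text
T = 400
w = np.random.default_rng(1).normal(size=(T, 2))
W_hat = newey_west(w, m=int(T ** 0.25))        # positive definite by construction
```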


To test hypotheses, use the fact that

$$\sqrt{T}(\hat\beta - \beta) \xrightarrow{D} N(0, V), \tag{2.40}$$

where $V = (D'W^{-1}D)^{-1}$ and $D = E\left(\frac{\partial w_t}{\partial \beta'}\right)$. To estimate $D$, you can use $\hat{D} = \frac{1}{T}\sum_{t=1}^{T}\frac{\partial \hat{w}_t}{\partial \beta'}$.
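A short sketch of the covariance calculation, assuming $\hat{D}$ and $\hat{W}$ have already been computed:

```python
# Sketch: V_hat = (D_hat' W_hat^{-1} D_hat)^{-1}, as in (2.40).
import numpy as np

def gmm_covariance(D_hat, W_hat):
    """D_hat: n x k Jacobian estimate; W_hat: n x n long-run covariance."""
    A = D_hat.T @ np.linalg.solve(W_hat, D_hat)   # D' W^{-1} D
    return np.linalg.inv(A)
```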

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Let $R$ be a $q \times k$ restriction matrix and $r$ a $q$-dimensional vector of constants, and consider the $q$ linear restrictions $R\beta = r$ on the coefficient vector. The Wald statistic has an asymptotic chi-square distribution under the null hypothesis that the restrictions are true:

$$\mathcal{W}_T = T(R\hat\beta - r)'\left[RVR'\right]^{-1}(R\hat\beta - r) \xrightarrow{D} \chi^2_q. \tag{2.41}$$

It follows that the linear restrictions can be tested by comparing the Wald statistic against the chi-square distribution with q degrees of freedom.
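A sketch of the Wald calculation in (2.41), assuming you already have $\hat\beta$ and a consistent estimate $\hat{V}$ of the asymptotic covariance matrix:

```python
# Sketch of the Wald test (2.41); beta_hat, V_hat, R, r are assumed given.
import numpy as np
from scipy.stats import chi2

def wald_test(beta_hat, V_hat, R, r, T):
    """R: q x k restriction matrix, r: q-vector, V_hat: estimate of V."""
    d = R @ beta_hat - r
    stat = T * d @ np.linalg.solve(R @ V_hat @ R.T, d)   # W_T in (2.41)
    q = R.shape[0]
    return stat, chi2.sf(stat, df=q)                     # statistic, p-value
```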

GMM also allows you to conduct a generic test of a set of overidentifying restrictions. The theory predicts $n$ orthogonality conditions, where $n$ is the dimensionality of $w_t$. The parameter vector $\beta$ is of dimension $k < n$, so only $k$ linear combinations of the orthogonality conditions are actually set to zero in estimation. If the theoretical restrictions are true, however, the remaining $n - k$ orthogonality conditions should differ from zero only by chance. The minimized value of the GMM objective function, obtained by evaluating the objective function at $\hat\beta$, turns out to be asymptotically $\chi^2_{n-k}$ under the null hypothesis that the model is correctly specified.
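In practice this statistic (Hansen's J) is computed as $T$ times the quadratic form in the averaged moments evaluated at $\hat\beta$, which makes the limit nondegenerate. A sketch, with the moment contributions $\hat{w}_t$ stacked in a $T \times n$ array and `newey_west()` from the earlier sketch:

```python
# Sketch of the overidentification (J) test; w is a T x n array of the
# moment contributions w_t evaluated at beta_hat, W_hat as in (2.39).
import numpy as np
from scipy.stats import chi2

def j_test(w, W_hat, k):
    T, n = w.shape
    g_bar = w.mean(axis=0)                          # (1/T) sum_t w_t
    J = T * g_bar @ np.linalg.solve(W_hat, g_bar)   # Hansen's J statistic
    return J, chi2.sf(J, df=n - k)                  # chi-square with n-k dof
```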

2.3 Simulated Method of Moments

Under GMM, you chose $\beta$ to match the theoretical moments to sample moments computed from the data. In applications where it is difficult or impossible to obtain analytical expressions for the moment conditions $E(w_t)$, they can be generated by numerical simulation. This is the simulated method of moments (SMM) proposed by Lee and Ingram [92] and Duffie and Singleton [40].

In SMM, we match computer simulated moments to the sample moments. We use the following notation.


β is the vector of parameters to be estimated.

$\{q_t\}_{t=1}^{T}$ is the actual time-series data of length $T$. Let $q' = (q_1, q_2, \ldots, q_T)$ denote the collection of the observations.

$\{\tilde{q}_i(\beta)\}_{i=1}^{M}$ is a computer-simulated time series of length $M$, which is generated according to the underlying economic theory. Let $\tilde{q}'(\beta) = (\tilde{q}_1(\beta), \tilde{q}_2(\beta), \ldots, \tilde{q}_M(\beta))$ denote the collection of these $M$ observations.

$h(q_t)$ is some vector function of the data from which to simulate the moments. For example, setting $h(q_t) = (q_t, q_t^2, q_t^3)'$ will pick off the first three moments of $q_t$.

$H_T(q) = \frac{1}{T}\sum_{t=1}^{T} h(q_t)$ is the vector of sample moments of $q_t$.

$H_M(\tilde{q}(\beta)) = \frac{1}{M}\sum_{i=1}^{M} h(\tilde{q}_i(\beta))$ is the corresponding vector of simulated moments, where the length of the simulated series is $M$.

 

 

 

 

$u_t = h(q_t) - H_T(q)$ is $h$ in deviation-from-the-mean form.

$\hat\Omega_0 = \frac{1}{T}\sum_{t=1}^{T} u_t u_t'$ is the sample short-run variance of $u_t$.

 

$\hat\Omega_j = \frac{1}{T}\sum_{t=j+1}^{T} u_t u_{t-j}'$ is the sample cross-covariance matrix of $u_t$.

$\hat{W}_T = \hat\Omega_0 + \sum_{j=1}^{m}\left(1 - \frac{j}{m+1}\right)(\hat\Omega_j + \hat\Omega_j')$ is the Newey-West estimate of the long-run covariance matrix of $u_t$.

$g_{T,M}(\beta) = H_T(q) - H_M(\tilde{q}(\beta))$ is the deviation of the sample moments from the simulated moments.
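Before stating the estimator, here is how these pieces might be assembled in code for a hypothetical AR(1) model; the data-generating process and the choice of $h$ are my own illustrations, and the simulation shocks are held fixed across values of $\beta$ (common random numbers) so the objective is smooth.

```python
# Sketch of the SMM ingredients for a hypothetical AR(1) model
# q_t = beta * q_{t-1} + e_t, with h(q_t) = (q_t, q_t^2, q_t * q_{t-1})'.
import numpy as np

def h_moments(q):
    """Stack h(q_t) for t = 2, ..., len(q); rows are observations."""
    return np.column_stack([q[1:], q[1:] ** 2, q[1:] * q[:-1]])

def simulate(beta, shocks):
    """Generate q_tilde(beta) of length M; shocks are fixed across beta."""
    q = np.zeros(len(shocks))
    for i in range(1, len(shocks)):
        q[i] = beta * q[i - 1] + shocks[i]
    return q

def g_TM(beta, q_data, shocks):
    """Deviation of sample moments H_T(q) from simulated moments H_M."""
    H_T = h_moments(q_data).mean(axis=0)
    H_M = h_moments(simulate(beta, shocks)).mean(axis=0)
    return H_T - H_M
```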

The SMM estimator is that value of β that minimizes the quadratic distance between the simulated moments and the sample moments

$$\left[g_{T,M}(\beta)\right]'\, W_{T,M}^{-1}\left[g_{T,M}(\beta)\right], \tag{2.42}$$

where $W_{T,M} = \left(1 + \frac{T}{M}\right)\hat{W}_T$. Let $\hat\beta_S$ be the SMM estimator. It is asymptotically normally distributed with

$$\sqrt{T}(\hat\beta_S - \beta) \xrightarrow{D} N(0, V_S),$$