2.2 Generalized Method of Moments
OLS can be viewed as a special case of the generalized method of moments (GMM) estimator studied by Hansen [70]. Since you are presumably familiar with OLS, you can build your intuition about GMM by first thinking about using it to estimate a linear regression. After getting that under your belt, thinking about GMM estimation in more complicated and possibly nonlinear environments is straightforward.
OLS and GMM. Suppose you want to estimate the coefficients in the regression
\[
q_t = z_t'\beta + \epsilon_t, \tag{2.31}
\]
where β is the k-dimensional vector of coefficients, z_t is a k-dimensional vector of regressors, ε_t is iid (0, σ²), and (q_t, z_t) are jointly covariance stationary. The OLS estimator of β is chosen to minimize
\[
\frac{1}{T}\sum_{t=1}^{T}\epsilon_t^2
= \frac{1}{T}\sum_{t=1}^{T}(q_t-\beta'z_t)(q_t-z_t'\beta)
= \frac{1}{T}\sum_{t=1}^{T}q_t^2 - 2\beta'\frac{1}{T}\sum_{t=1}^{T}z_t q_t + \beta'\left(\frac{1}{T}\sum_{t=1}^{T}z_t z_t'\right)\beta. \tag{2.32}
\]
When you differentiate (2.32) with respect to β and set the result to zero, you get the first-order conditions,
\[
\underbrace{-\frac{2}{T}\sum_{t=1}^{T}z_t\epsilon_t}_{(a)}
= \underbrace{-\frac{2}{T}\sum_{t=1}^{T}z_t q_t + 2\left(\frac{1}{T}\sum_{t=1}^{T}z_t z_t'\right)\hat{\beta}}_{(b)} = 0. \tag{2.33}
\]
If the regression is correctly specified, the first-order conditions form a set of k orthogonality or 'zero' conditions that you use to estimate β. These orthogonality conditions are labeled (a) in (2.33). OLS estimation is straightforward because the first-order conditions are the set of k linear equations in k unknowns labeled (b) in (2.33), which are solved by matrix inversion.7 Solving (2.33) for the minimizer β̂, you get

7 In matrix notation, we usually write the regression as q = Zβ + ε, where q is the T-dimensional vector of observations on q_t, Z is the T × k matrix of observations on the independent variables whose t-th row is z_t', β is the k-dimensional vector of parameters that we want to estimate, ε is the T-dimensional vector of regression errors, and β̂ = (Z'Z)^{-1}Z'q.
\[
\hat{\beta} = \left(\frac{1}{T}\sum_{t=1}^{T}z_t z_t'\right)^{-1}\left(\frac{1}{T}\sum_{t=1}^{T}z_t q_t\right). \tag{2.34}
\]

Let Q = plim (1/T)Σ_{t=1}^T z_t z_t' and let W = σ²Q. Because {ε_t} is an iid sequence, {z_t ε_t} is also iid. It follows from the Lindeberg-Levy central limit theorem that (1/√T)Σ_{t=1}^T z_t ε_t →D N(0, W). Let the residuals be ε̂_t = q_t − z_t'β̂, let the estimated error variance be σ̂² = (1/T)Σ_{t=1}^T ε̂_t², and let Ŵ = σ̂²(1/T)Σ_{t=1}^T z_t z_t'. While it may seem like a silly thing to do, you can set up a quadratic form using the orthogonality conditions and get the OLS estimator by minimizing

\[
\left(\frac{1}{T}\sum_{t=1}^{T}z_t\epsilon_t\right)'\hat{W}^{-1}\left(\frac{1}{T}\sum_{t=1}^{T}z_t\epsilon_t\right), \tag{2.35}
\]

with respect to β. This is the GMM estimator for the linear regression (2.31). The first-order conditions to this problem are

\[
\hat{W}^{-1}\frac{1}{T}\sum_{t=1}^{T}z_t\epsilon_t = \frac{1}{T}\sum_{t=1}^{T}z_t\epsilon_t = 0,
\]

which are identical to the OLS first-order conditions (2.33). You also know that the asymptotic distribution of the OLS estimator of β is

\[
\sqrt{T}(\hat{\beta} - \beta) \xrightarrow{D} N(0, V), \tag{2.36}
\]
where V = σ²Q⁻¹. If you let D = E(∂(z_t ε_t)/∂β') = Q, the GMM covariance matrix V can be expressed as V = σ²Q⁻¹ = [D'W⁻¹D]⁻¹. The first equality is the standard OLS calculation for the covariance matrix, and the second equality follows from the properties of (2.35).
You would never do OLS by minimizing (2.35), since to get the weighting matrix Ŵ⁻¹ you need an estimate of β, which is what you want in the first place. But this is what you do in the generalized environment.
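To see the equivalence numerically, here is a minimal sketch in Python with NumPy. The data-generating process, seeds, and variable names are illustrative choices of my own, not from the text: the check is simply that solving the sample orthogonality conditions (2.34) reproduces the matrix-inversion OLS formula of footnote 7.

```python
import numpy as np

# Illustrative data only: q_t = z_t' beta + eps_t with made-up beta and regressors.
rng = np.random.default_rng(0)
T, k = 500, 3
beta_true = np.array([1.0, -0.5, 0.25])
Z = np.column_stack([np.ones(T), rng.normal(size=(T, k - 1))])
q = Z @ beta_true + rng.normal(scale=0.5, size=T)

# OLS via the matrix formula of footnote 7: beta_hat = (Z'Z)^{-1} Z'q.
beta_ols = np.linalg.solve(Z.T @ Z, Z.T @ q)

# The same estimate from the sample orthogonality conditions (2.34).
Qzz = (Z.T @ Z) / T          # (1/T) sum z_t z_t'
mzq = (Z.T @ q) / T          # (1/T) sum z_t q_t
beta_gmm = np.linalg.solve(Qzz, mzq)

print(np.allclose(beta_ols, beta_gmm))   # True: the two estimators coincide
```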
Generalized environment. Suppose you have an economic theory that relates q_t to a vector x_t. The theory predicts the set of orthogonality conditions
\[
E[z_t\,\epsilon_t(q_t, x_t, \beta)] = 0,
\]
where z_t is a vector of instrumental variables, which may be different from x_t, and ε_t(q_t, x_t, β) may be a nonlinear function of the underlying k-dimensional parameter vector β and observations on q_t and x_t.8 To estimate β by GMM, let w_t ≡ z_t ε_t(q_t, x_t, β), where we now write the vector of orthogonality conditions as E(w_t) = 0. Mimicking the steps above for GMM estimation of the linear regression coefficients, you'll want to choose the parameter vector β to minimize
\[
\left(\frac{1}{T}\sum_{t=1}^{T}w_t\right)'\hat{W}^{-1}\left(\frac{1}{T}\sum_{t=1}^{T}w_t\right), \tag{2.37}
\]

where Ŵ is a consistent estimator of the asymptotic covariance matrix of (1/√T)Σ_{t=1}^T w_t, sometimes called the long-run covariance matrix. You cannot guarantee that w_t is iid in the generalized environment; it may be serially correlated and conditionally heteroskedastic. To allow for these possibilities, the formula for the weighting matrix is
\[
W = \Omega_0 + \sum_{j=1}^{\infty}\left(\Omega_j + \Omega_j'\right), \tag{2.38}
\]

where Ω₀ = E(w_t w_t') and Ω_j = E(w_t w_{t−j}'). A popular choice for estimating Ŵ is the method of Newey and West [114]

\[
\hat{W} = \hat{\Omega}_0 + \sum_{j=1}^{m}\left(1 - \frac{j}{m+1}\right)\left(\hat{\Omega}_j + \hat{\Omega}_j'\right), \tag{2.39}
\]

where Ω̂₀ = (1/T)Σ_{t=1}^T w_t w_t' and Ω̂_j = (1/T)Σ_{t=j+1}^T w_t w_{t−j}'. The weighting function 1 − j/(m+1) is called the Bartlett window. When Ŵ is constructed by the Newey and West method, it is guaranteed to be positive definite, which is a good thing since you need to invert it to do GMM. To guarantee consistency, the Newey-West lag length m needs to go to infinity, but at a slower rate than T.9 You might try values such as m = T^{1/4}.
8 Alternatively, you may be interested in a multiple-equation system in which the theory imposes parameter restrictions across equations, so not only may the model be nonlinear, ε_t could be a vector of error terms.
9 Andrews [2] and Newey and West [115] offer recommendations for letting the data determine m.
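As a rough illustration of (2.39), the sketch below (Python/NumPy; the function name and default lag rule are my own, not the text's) computes Ŵ with Bartlett weights from a T × n array whose rows are the w_t.

```python
import numpy as np

def newey_west(w, m=None):
    """Newey-West estimate (2.39) of the long-run covariance of (1/sqrt(T)) sum w_t.

    w : (T, n) array whose t-th row is w_t.
    m : lag truncation; defaults to the rough rule of thumb m = floor(T**0.25).
    """
    T, n = w.shape
    if m is None:
        m = int(np.floor(T ** 0.25))
    W = (w.T @ w) / T                         # Omega_0 = (1/T) sum w_t w_t'
    for j in range(1, m + 1):
        Omega_j = (w[j:].T @ w[:-j]) / T      # Omega_j = (1/T) sum_{t=j+1}^T w_t w_{t-j}'
        weight = 1.0 - j / (m + 1.0)          # Bartlett window
        W += weight * (Omega_j + Omega_j.T)
    return W
```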
To test hypotheses, use the fact that

\[
\sqrt{T}(\hat{\beta} - \beta) \xrightarrow{D} N(0, V), \tag{2.40}
\]

where V = (D'W⁻¹D)⁻¹ and D = E(∂w_t/∂β'). To estimate D, you can use D̂ = (1/T)Σ_{t=1}^T ∂ŵ_t/∂β'.
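A sketch of how (2.40) might be put to work: assuming you can evaluate w_t(β) at arbitrary β, the hypothetical helper below approximates D̂ by central finite differences and forms V̂ = (D̂'Ŵ⁻¹D̂)⁻¹ together with the implied standard errors. The function name, the numerical Jacobian, and the step size are illustrative choices, not the text's.

```python
import numpy as np

def gmm_cov(beta_hat, w_fn, W_hat, step=1e-5):
    """Covariance V = (D' W^{-1} D)^{-1} in (2.40) and standard errors of beta_hat.

    w_fn(beta) -> (T, n) array of w_t evaluated at beta (user-supplied).
    W_hat      -> long-run covariance estimate, e.g. from newey_west above.
    """
    T, n = w_fn(beta_hat).shape
    k = beta_hat.size
    D_hat = np.zeros((n, k))
    for i in range(k):                        # D_hat[:, i] = (1/T) sum dw_t / dbeta_i
        up, dn = beta_hat.copy(), beta_hat.copy()
        up[i] += step
        dn[i] -= step
        D_hat[:, i] = (w_fn(up).mean(axis=0) - w_fn(dn).mean(axis=0)) / (2 * step)
    V = np.linalg.inv(D_hat.T @ np.linalg.solve(W_hat, D_hat))
    se = np.sqrt(np.diag(V) / T)              # std. errors implied by sqrt(T)(beta_hat - beta)
    return V, se
```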
Let R be a q × k restriction matrix and let r be a q-dimensional vector of constants. Consider the q linear restrictions Rβ = r on the coefficient vector. The Wald statistic has an asymptotic chi-square distribution under the null hypothesis that the restrictions are true
\[
W_T = T\,(R\hat{\beta} - r)'\,[RVR']^{-1}\,(R\hat{\beta} - r) \xrightarrow{D} \chi^2_q. \tag{2.41}
\]
It follows that the linear restrictions can be tested by comparing the Wald statistic against the chi-square distribution with q degrees of freedom.
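A minimal sketch of the Wald test (2.41); the function and argument names are illustrative, and R and r stand in for whatever restrictions you wish to test.

```python
import numpy as np
from scipy.stats import chi2

def wald_test(beta_hat, V, R, r, T):
    """Wald statistic (2.41) for the q linear restrictions R beta = r.

    V is the asymptotic covariance of sqrt(T)(beta_hat - beta); R is q x k.
    """
    diff = R @ beta_hat - r
    stat = T * diff @ np.linalg.solve(R @ V @ R.T, diff)
    p_value = chi2.sf(stat, df=r.size)        # compare with chi-square(q)
    return stat, p_value
```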
GMM also allows you to conduct a generic test of a set of overidentifying restrictions. The theory predicts n orthogonality conditions, where n is the dimension of w_t. The parameter vector β is of dimension k < n, so only k linear combinations of the orthogonality conditions are actually set to zero in estimation. If the theoretical restrictions are true, however, the remaining n − k orthogonality conditions should differ from zero only by chance. The minimized value of the GMM objective function (2.37), evaluated at β̂ and scaled by the sample size T, turns out to be asymptotically χ²_{n−k} under the null hypothesis that the model is correctly specified.
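Along the same lines, a sketch of the overidentification test. As noted above, the statistic scales the minimized objective (2.37) by T before comparing it with the χ²_{n−k} distribution; the helper name and argument layout are illustrative assumptions.

```python
import numpy as np
from scipy.stats import chi2

def overid_test(w_at_beta_hat, W_hat, k):
    """Chi-square(n - k) test of the overidentifying restrictions.

    w_at_beta_hat : (T, n) array of w_t evaluated at the GMM estimate.
    W_hat         : the weighting (long-run covariance) matrix used in (2.37).
    """
    T, n = w_at_beta_hat.shape
    g_bar = w_at_beta_hat.mean(axis=0)                  # (1/T) sum w_t
    J = T * g_bar @ np.linalg.solve(W_hat, g_bar)       # T x minimized objective
    return J, chi2.sf(J, df=n - k)
```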
2.3 Simulated Method of Moments
Under GMM, you choose β to match the theoretical moments to sample moments computed from the data. In applications where it is difficult or impossible to obtain analytical expressions for the moment conditions E(w_t), they can be generated by numerical simulation. This is the simulated method of moments (SMM) proposed by Lee and Ingram [92] and Duffie and Singleton [40].
In SMM, we match computer simulated moments to the sample moments. We use the following notation.
β is the vector of parameters to be estimated.
{q_t}_{t=1}^T is the actual time-series data of length T. Let q' = (q_1, q_2, . . . , q_T) denote the collection of the observations.
{q̃_i(β)}_{i=1}^M is a computer-simulated time series of length M which is generated according to the underlying economic theory. Let q̃'(β) = (q̃_1(β), q̃_2(β), . . . , q̃_M(β)) denote the collection of these M observations.
h(q_t) is some vector function of the data from which to simulate the moments. For example, setting h(q_t) = (q_t, q_t², q_t³)' will pick off the first three moments of q_t.
H_T(q) = (1/T)Σ_{t=1}^T h(q_t) is the vector of sample moments of q_t.
H_M(q̃(β)) = (1/M)Σ_{i=1}^M h(q̃_i(β)) is the corresponding vector of simulated moments, where the length of the simulated series is M.

u_t = h(q_t) − H_T(q) is h(q_t) in deviation-from-the-mean form.

Ω̂₀ = (1/T)Σ_{t=1}^T u_t u_t' is the sample short-run variance of u_t.

Ω̂_j = (1/T)Σ_{t=j+1}^T u_t u_{t−j}' is the sample cross-covariance matrix of u_t.

Ŵ_T = Ω̂₀ + Σ_{j=1}^m (1 − j/(m+1))(Ω̂_j + Ω̂_j') is the Newey-West estimate of the long-run covariance matrix of u_t.
g_{T,M}(β) = H_T(q) − H_M(q̃(β)) is the deviation of the sample moments from the simulated moments.
The SMM estimator is that value of β that minimizes the quadratic distance between the simulated moments and the sample moments
\[
\left[g_{T,M}(\beta)\right]'\,\hat{W}_{T,M}^{-1}\,g_{T,M}(\beta), \tag{2.42}
\]

where Ŵ_{T,M} = (1 + T/M)Ŵ_T. Let β̂_S be the SMM estimator. It is asymptotically normally distributed with

\[
\sqrt{T}(\hat{\beta}_S - \beta) \xrightarrow{D} N(0, V_S),
\]
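To fix ideas, here is a stripped-down SMM sketch in the notation above. Everything concrete is an assumption made for illustration: the "theory" is an AR(1) whose autoregressive coefficient is the single element of β, h picks off the second moment and first autocovariance, and an identity weighting matrix stands in for Ŵ_{T,M}⁻¹.

```python
import numpy as np
from scipy.optimize import minimize_scalar

T, M = 400, 4000                         # actual and simulated sample lengths

def simulate_ar1(rho, n, seed):
    """Generate an AR(1) series q_t = rho*q_{t-1} + e_t (illustrative 'theory')."""
    rng = np.random.default_rng(seed)
    e = rng.normal(size=n)
    q = np.zeros(n)
    for t in range(1, n):
        q[t] = rho * q[t - 1] + e[t]
    return q

def h(q):
    """Moment function h: second moment and first-order autocovariance of q_t."""
    return np.column_stack([q[1:] ** 2, q[1:] * q[:-1]])

q_data = simulate_ar1(0.6, T, seed=0)    # stands in for the actual data
H_T = h(q_data).mean(axis=0)             # sample moments H_T(q)

def objective(rho):
    # Hold the simulation draws fixed (same seed) across candidate values of rho.
    q_sim = simulate_ar1(rho, M, seed=1)
    g = H_T - h(q_sim).mean(axis=0)      # g_{T,M}(beta)
    return g @ g                         # identity weighting instead of W_{T,M}^{-1}

res = minimize_scalar(objective, bounds=(-0.95, 0.95), method="bounded")
print("SMM estimate of rho:", res.x)
```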