Commit Graph

2 Commits (cd76bdff9e1092b2d551e1742519279aeae656ec)

Author SHA1 Message Date
Stéphane Adjemian (Ryûk) cd76bdff9e
Fix and update unit tests for local_state_space_iteration_2.
Add mat file with git lfs.
2022-05-13 12:13:46 +02:00
Willi Mutschler 5a8c206760 Added parameter derivatives of perturbation solution up to 3 order
# Preliminary comments
I finished the identification toolbox at orders two and three using the pruned state space system, but before I merge request this, I decided to first merge the new functionality to compute parameter derivatives of perturbation solution matrices at higher orders. So after this is approved, I merge the identification toolbox.
I guess @rattoma, @sebastien, and @michel are best choices to review this.
I outline the main idea first and then provide some more detailed changes I made to the functions.

***

# Main idea
This merge request is concerned with the *analytical*computation of the parameter derivatives of first, second and third order perturbation solution matrices, i.e. using _closed-form_ expressions to efficiently compute the derivative of  $g_x$ , $g_u$, $g_{xx}$, $g_{xu}$, $g_{uu}$, $g_{\sigma\sigma}$, $g_{xxx}$, $g_{xxu}$, $g_{xuu}$, $g_{uuu}$, $g_{x\sigma\sigma}$, $g_{u\sigma\sigma}$ *with respect to model parameters*  $\theta$.  Note that $\theta$ contains model parameters, stderr and corr parameters of shocks. stderr and corr parameters of measurement errors are not yet supported, (they can easily be included as exogenous shocks). The availability of such derivatives is beneficial in terms of more reliable analysis of model sensitivity and parameter identifiability as well as more efficient estimation methods, in particular for models solved up to third order, as it is well-known that numerical derivatives are a tricky business, especially for large models.

References for my approach are:
* Iskrev (2008, 2010) and Schmitt-Grohé and Uribe (2012, Appendix)  who were the first to compute the parameter derivatives analytically at first order, however, using inefficient (sparse) Kronecker products.
* Mutschler (2015) who provides the expressions for a second-order, but again using inefficient (sparse) Kronecker products.
* Ratto and Iskrev (2012) who show how the first-order system can be solved accurately, fast and efficiently using existing numerical algorithms for generalized Sylvester equations by taking the parameter derivative with respect to each parameter separately.
* Julliard and Kamenik (2004) who provide the perturbation solution equation system in tensor notation at any order k.
* Levintal (2017) who introduces permutation matrices to express the perturbation solution equation system in matrix notation up to fifth order.
Note that @rattoma already implemented the parameter derivatives of $g_x$ and $g_u$ analytically (and numerically), and I rely heavily on his work in `get_first_order_solution_params_derivs.m` (previously `getH.m`). My additions are mainly to this function and thus it is renamed to `get_perturbation_params_derivs.m`.

The basic idea of this merge request is to take the second- and third-order perturbation solution systems in Julliard and Kamenik (2004), unfold these into an equivalent matrix representation using permutation matrices as in Levintal (2017). Then extending Ratto and Iskrev (2012) one takes the derivative with respect to each parameter separately and gets a computational problem that is linear, albeit large, as it involves either solving generalized Sylvester equations or taking inverses of highly sparse matrices. I will now briefly summarize the perturbation solution system at third order and the system that results when taking the derivative with respect to parameters.

## Perturbation Solution
The following systems arise at first, second, and third order:
$(ghx): f_{x} z_{x} = f_{y_{-}^*} + f_{y_0} g_{x} + f_{y_{+}^{**}} g^{**}_{x} g^{*}_{x}= A g_{x} + f_{y_{-}^*}=0$

$(ghu): f_{z} z_{u} = f_{y_0} g_{u} + f_{y_{+}^{**}} g^{**}_{x} g^{*}_{u} + f_{u}= A g_u + f_u = 0$

$(ghxx) : A g_{xx} + B g_{xx} \left(g^{*}_{x} \otimes g^{*}_{x}\right) + f_{zz} \left( z_{x} \otimes z_{x} \right) = 0$

$(ghxu) : A g_{xu} + B g_{xx} \left(g^{*}_{x} \otimes g^{*}_{u}\right) + f_{zz} \left( z_{x} \otimes z_{u} \right) = 0$

$(ghuu) : A g_{uu} + B g_{xx} \left(g^{*}_{u} \otimes g^{*}_{u}\right) + f_{zz} \left( z_{u} \otimes z_{u} \right) = 0$

$(ghs2) : (A+B) g_{\sigma\sigma} +  \left( f_{y^{**}_{+}y^{**}_{+}} \left(g^{**}_{u} \otimes g^{**}_{u}\right) + f_{y^{**}_{+}} g^{**}_{uu}\right)vec(\Sigma) = 0$

$(ghxxx) : A g_{xxx} + B g_{xxx} \left(g^{*}_{x} \otimes g^{*}_{x} \otimes g^{*}_{x}\right) + f_{y_{+}}g^{**}_{xx} \left(g^{*}_x \otimes g^{*}_{xx}\right)P_{x\_xx} + f_{zz} \left( z_{x} \otimes z_{xx} \right)P_{x\_xx} + f_{zzz} \left( z_{x} \otimes z_{x} \otimes z_{x} \right) = 0$

$(ghxxu) : A g_{xxu} + B g_{xxx} \left(g^{*}_{x} \otimes g^{*}_{x} \otimes g^{*}_{u}\right) + f_{zzz} \left( z_{x} \otimes z_{x} \otimes z_{u} \right) + f_{zz} \left( \left( z_{x} \otimes z_{xu} \right)P_{x\_xu} + \left(z_{xx} \otimes z_{u}\right) \right) + f_{y_{+}}g^{**}_{xx} \left( \left(g^{*}_{x} \otimes g^{*}_{xu}\right)P_{x\_xu} + \left(g^{*}_{xx} \otimes g^{*}_{u}\right) \right) = 0$

$(ghxuu) : A g_{xuu} + B g_{xxx} \left(g^{*}_{x} \otimes g^{*}_{u} \otimes g^{*}_{u}\right) + f_{zzz} \left( z_{x} \otimes z_{u} \otimes z_{u} \right)+ f_{zz} \left( \left( z_{xu} \otimes z_{u} \right)P_{xu\_u} + \left(z_{x} \otimes z_{uu}\right) \right) + f_{y_{+}}g^{**}_{xx} \left( \left(g^{*}_{xu} \otimes g^{*}_{u}\right)P_{xu\_u} + \left(g^{*}_{x} \otimes g^{*}_{uu}\right) \right) = 0$

$(ghuuu) : A g_{uuu} + B g_{xxx} \left(g^{*}_{u} \otimes g^{*}_{u} \otimes g^{*}_{u}\right) + f_{zzz} \left( z_{u} \otimes z_{u} \otimes z_{u} \right)+ f_{zz} \left( z_{u} \otimes z_{uu} \right)P_{u\_uu} + f_{y_{+}}g^{**}_{xx} \left(g^{*}_{u} \otimes g^{*}_{uu}\right)P_{u\_uu}  = 0$

$(ghx\sigma\sigma) : A g_{x\sigma\sigma} + B g_{x\sigma\sigma} g^{*}_x + f_{y_{+}} g^{**}_{xx}\left(g^{*}_{x} \otimes g^{*}_{\sigma\sigma}\right) + f_{zz} \left(z_{x} \otimes z_{\sigma\sigma}\right) + F_{xu_{+}u_{+}}\left(I_{n_x} \otimes vec(\Sigma)\right) = 0$
$F_{xu_{+}u_{+}} = f_{y_{+}^{\ast\ast}} g_{xuu}^{\ast\ast} (g_x^{\ast} \otimes I_{n_u^2}) + f_{zz} \left( \left( z_{xu_{+}} \otimes z_{u_{+}} \right)P_{xu\_u} + \left(z_{x} \otimes z_{u_{+}u_{+}}\right) \right) + f_{zzz}\left(z_{x} \otimes z_{u_{+}} \otimes z_{u_{+}}\right)$

$(ghu\sigma\sigma) : A g_{u\sigma\sigma} + B g_{x\sigma\sigma} g^{*}_{u} + f_{y_{+}} g^{**}_{xx}\left(g^{*}_{u} \otimes g^{*}_{\sigma\sigma}\right) + f_{zz} \left(z_{u} \otimes z_{\sigma\sigma}\right) + F_{uu_{+}u_{+}}\left(I_{n_u} \otimes vec(\Sigma_u)\right) = 0$
$F_{uu_{+}u_{+}} = f_{y_{+}^{\ast\ast}} g_{xuu}^{\ast\ast} (g_u^{\ast} \otimes I_{n_u^2})  + f_{zz} \left( \left( z_{uu_{+}} \otimes z_{u_{+}} \right)P_{uu\_u} + \left(z_{u} \otimes z_{u_{+}u_{+}}\right) \right) + f_{zzz}\left(z_{u} \otimes z_{u_{+}} \otimes z_{u_{+}}\right)$

A and B are the common perturbation matrices:

$A = f_{y_0} + \begin{pmatrix} \underbrace{0}_{n\times n_{static}} &\vdots& \underbrace{f_{y^{**}_{+}} \cdot g^{**}_{x}}_{n \times n_{spred}} &\vdots& \underbrace{0}_{n\times n_{frwd}}  \end{pmatrix}$and $B = \begin{pmatrix} \underbrace{0}_{n \times n_{static}}&\vdots & \underbrace{0}_{n \times n_{pred}} & \vdots & \underbrace{f_{y^{**}_{+}}}_{n \times n_{sfwrd}} \end{pmatrix}$

and $z=(y_{-}^{\ast}; y; y_{+}^{\ast\ast}; u)$ denotes the dynamic model variables as in `M_.lead_lag_incidence`, $y^\ast$ denote state variables, $y^{\ast\ast}$ denote forward looking variables, $y_+$ denote the variables with a lead, $y_{-}$ denote variables with a lag, $y_0$ denote variables at period t, $f$ the model equations, and $f_z$ the first-order dynamic model derivatives, $f_{zz}$ the second-order dynamic derivatives, and $f_{zzz}$ the third-order dynamic model derivatives. Then:
$z_{x} = \begin{pmatrix}I\\g_{x}\\g^{**}_{x} g^{*}_{x}\\0\end{pmatrix}$, $z_{u} =\begin{pmatrix}0\\g_{u}\\g^{**}_{x} \cdot g^{*}_{u}\\I\end{pmatrix}$, $z_{u_{+}} =\begin{pmatrix}0\\0\\g^{**}_{u}\\0\end{pmatrix}$
$z_{xx} = \begin{pmatrix} 0\\g_{xx}\\g^{**}_{x} \left( g^{*}_x \otimes g^{*}_{x} \right) + g^{**}_{x} g^{*}_{x}\\0\end{pmatrix}$, $z_{xu} =\begin{pmatrix}0\\g_{xu}\\g^{**}_{xx} \left( g^{*}_x \otimes g^{*}_{u} \right) + g^{**}_{x} g^{*}_{xu}\\0\end{pmatrix}$, $z_{uu} =\begin{pmatrix}0\\g_{uu}\\g^{**}_{xx} \left( g^{*}_u \otimes g^{*}_{u} \right) + g^{**}_{x} g^{*}_{uu}\\0\end{pmatrix}$,
$z_{xu_{+}} =\begin{pmatrix}0\\0\\g^{**}_{xu} \left( g^{*}_x \otimes I \right)\\0\end{pmatrix}$, $z_{uu_{+}} =\begin{pmatrix}0\\0\\g^{**}_{xu} \left( g^{*}_{u} \otimes I \right)\\0\end{pmatrix}$, $z_{u_{+}u_{+}} =\begin{pmatrix}0\\0\\g^{\ast\ast}_{uu}\\0\end{pmatrix}$, $z_{\sigma\sigma} = \begin{pmatrix}0\\ g_{\sigma\sigma}\\ g^{\ast\ast}_{x}g^{\ast}_{\sigma\sigma} + g^{\ast\ast}_{\sigma\sigma}\\0 \end{pmatrix}$

$P$ are permutation matrices that can be computed using Matlab's `ipermute` function.

## Parameter derivatives of perturbation solutions
First, we need the parameter derivatives of first, second, third, and fourth derivatives of the dynamic model (i.e. g1,g2,g3,g4 in dynamic files), I make use of the implicit function theorem: Let $f_{z^k}$ denote the kth derivative (wrt all dynamic variables) of the dynamic model, then let $df_{z^k}$ denote the first-derivative (wrt all model parameters) of $f_{z^k}$ evaluated at the steady state. Note that $f_{z^k}$  is a function of both the model parameters $\theta$  and of the steady state of all dynamic variables $\bar{z}$, which also depend on the parameters. Hence, implicitly $f_{z^k}=f_{z^k}(\theta,\bar{z}(\theta))$  and $df_{z^k}$ consists of two parts:
1. direct derivative wrt to all model parameters given by the preprocessor in the `_params_derivs.m` files
2. contribution of derivative of steady state of dynamic variables (wrt all model parameters): $f_{z^{k+1}} \cdot d\bar{z}$
Note that we already have functionality to compute $d\bar{z}$ analytically.

Having this, the above perturbation systems are basically equations of the following types
$AX +BXC = RHS$ or $AX = RHS$
Now when taking the derivative (wrt to one single parameter $\theta_j$), we get
$A\mathrm{d}\{X\} + B\mathrm{d}\{X\}C = \mathrm{d}\{RHS\} - \mathrm{d}\{A\}X -  \mathrm{d}\{B\}XC - BX\mathrm{d}\{C\}$
or
$A\mathrm{d}\{X\}  = \mathrm{d}\{RHS\} - \mathrm{d}\{A\}X$
The first one is a Sylvester type equation, the second one can be solved by taking the inverse of $A$. The only diffculty and tedious work arrises in computing (the highly sparse) derivatives of $RHS$.

***

# New functions: `
## get_perturbation_params_derivs.m`and `get_perturbation_params_derivs_numerical_objective.m`
* The parameter derivatives up to third order are computed in the new function`get_perturbation_params_derivs.m` both analytically and numerically. For numerical derivatives `get_perturbation_params_derivs_numerical_objective.m` is the objective for `fjaco.m` or `hessian_sparse.m` or `hessian.m`.
* `get_perturbation_params_derivs.m` is basically an extended version of the previous `get_first_order_solution_params_derivs.m` function.
* * `get_perturbation_params_derivs_numerical_objective.m`builds upon `identification_numerical_objective.m`. It is used for numerical derivatives, whenever `analytic_derivation_mode=-1|-2`. It takes from `identification_numerical_objective.m` the parts that compute numerical parameter Jacobians of steady state, dynamic model equations, and perturbation solution matrices. Hence, these parts are removed in `identification_numerical_objective.m` and it only computes numerical parameter Jacobian of moments and spectrum which are needed for identification analysis in `get_identification_jacobians.m`, when `analytic_derivation_mode=-1` only.
* Detailed changes:
      * Most important: notation of this function is now in accordance to the k_order_solver, i.e. we do not compute derivatives of Kalman transition matrices A and B, but rather the solution matrices ghx,ghu,ghxx,ghxu,ghuu,ghs2,ghxxx,ghxxu,ghxuu,ghuuu,ghxss,ghuss in accordance with notation used in `oo_.dr`. As a byproduct at first-order, focusing on ghx and ghu instead of Kalman transition matrices A and B makes the computations slightly faster for large models (e.g. for Quest the computations were faster by a couple of seconds, not much, but okay).
      * Removed use of `kstate`, see also Dynare/dynare#1653 and Dynare/dynare!1656
      * Output arguments are stored in a structure `DERIVS`, there is also a flag `d2flag` that computes parameter hessians needed only in `dsge_likelihood.m`.
      * Removed `kronflag` as input. `options_.analytic_derivation_mode` is now used instead of `kronflag`.
      * Removed `indvar`, an index that was used to selected specific variables in the derivatives. This is not needed, as we always compute the parameter derivatives for all variables first and then select a subset of variables. The selection now takes place in other functions, like `dsge_likelihood.m`.
      * Introduced some checks: (i) deterministic exogenous variables are not supported, (ii) Kronecker method only compatible with first-order approximation so reset to sylvester method, (iii) for purely backward or forward models we need to be careful with the rows in `M_.lead_la	g_incidence`, (iv) if `_params_derivs.m` files are missing an error is thrown.
      * For numerical derivatives, if mod file does not contain an `estimated_params_block`, a temporary one with the most important parameter information is created.
## `unfold_g4.m`
* When evaluating g3 and g4 one needs to take into account that these do not contain symmetric elements, so one needs to use `unfold_g3.m` and the new function `unfold_g4.m`. This returns an unfolded version of the same matrix (i.e. with symmetric elements).

***

# New test models
`.gitignore` and `Makefile.am` are changed accordingly. Also now it is possible to run test suite on analytic_derivatives, i.e. run `make check m/analytic_derivatives`

## `analytic_derivatives/BrockMirman_PertParamsDerivs.mod`
* This is the Brock Mirman model, where we know the exact policy function $g$ for capital and consumption. As this does not imply a nonzero $g_{\sigma\sigma}$, $g_{x\sigma\sigma}$, $g_{u\sigma\sigma}$ I added some artificial equations to get nonzero solution matrices with respect to $\sigma$. The true perturbation solution matrices  $g_x$ , $g_u$, $g_{xx}$, $g_{xu}$, $g_{uu}$, $g_{\sigma\sigma}$, $g_{xxx}$, $g_{xxu}$, $g_{xuu}$, $g_{uuu}$, $g_{x\sigma\sigma}$, $g_{u\sigma\sigma}$ are then computed analytically with Matlab's symbolic toolbox and saved in `nBrockMirmanSYM.mat`. There is a preprocessor flag that recreates these analytical computations if changes are needed (and to check whether I made some errors here ;-) )
* Then solution matrices up to third order and their parameter Jacobians are then compared to the ones computed by Dynare's `k_order_solver` and by `get_perturbation_params_derivs` for all `analytic_derivation_mode`'s. There will be an error if the maximum absolute deviation is too large, i.e. for numerical derivatives (`analytic_derivation_mode=-1|-2`) the tolerance is choosen lower (around 1e-5); for analytical methods we are stricter: around 1e-13 for first-order,  1e-12 for second order, and 1e-11 for third-order.
* As a side note, this mod file also checks Dynare's `k_order_solver` algorithm and throws an error if something is wrong.
* This test model shows that the new functionality works well. And analytical derivatives perform way better and accurate than numerical ones, even for this small model.
## `analytic_derivatives/burnside_3_order_PertParamsDerivs.mod`
* This builds upon `tests/k_order_perturbation/burnside_k_order.mod` and computes the true parameter derivatives analytically by hand.
      * This test model also shows that the new functionality works well.

## `analytic_derivatives/LindeTrabandt2019.mod`
* Shows that the new functionality also works for medium-sized models, i.e. a SW type model solved at third order with 35 variables (11 states). 2 shocks and 20 parameters.
* This mod file can be used to tweak the speed of the computations in the future.
* Compares numerical versus analytical parameter derivatives (for first, second and third order). Note that this model clearly shows that numerical ones are quite different than analytical ones even at first order!
## `identification/LindeTrabandt2019_xfail.mod`
* This model is a check for issue Dynare/dynare#1595, see fjaco.m below, and will fail.
* Removed `analytic_derivatives/ls2003.mod` as this mod file is neither in the testsuite nor does it work.

***

# Detailed changes in other functions
## `get_first_order_solution_params_derivs.m`
* Deleted, or actually, renamed to `get_perturbation_params_derivs.m`, as this function now is able to compute the derivatives up to third order

## `identification_numerical_objective.m`
* `get_perturbation_params_derivs_numerical_objective.m`builds upon `identification_numerical_objective.m`. It takes from `identification_numerical_objective.m` the parts that compute numerical parameter Jacobians of steady state, dynamic model equations, and perturbation solution matrices. Hence, these parts are removed in `identification_numerical_objective.m` and it only computes numerical parameter Jacobian of moments and spectrum which are needed for identification analysis in `get_identification_jacobians.m`, when `analytic_derivation_mode=-1` only.

## `dsge_likelihood.m`
* As `get_first_order_solution_params_derivs.m`is renamed to `get_perturbation_params_derivs.m`, the call is adapted. That is,`get_perturbation_params_derivs` does not compute the derivatives of the Kalman transition `T`matrix anymore, but instead of the dynare solution matrix `ghx`. So we recreate `T` here as this amounts to adding some zeros and focusing on selected variables only.
* Added some checks to make sure the first-order approximation is selected.
* Removed `kron_flag` as input, as `get_perturbation_params_derivs` looks into `options_.analytic_derivation_mode` for `kron_flag`.

## `dynare_identification.m`
* make sure that setting `analytic_derivation_mode` is set both in `options_ident` and `options_`. Note that at the end of the function we restore the `options_` structure, so all changes are local. In a next merge request, I will remove the global variables to make all variables local.

## `get_identification_jacobians.m`
* As `get_first_order_solution_params_derivs.m`is renamed to `get_perturbation_params_derivs.m`, the call is adapted. That is,`get_perturbation_params_derivs` does not compute the derivatives of the Kalman transition `A` and `B` matrix anymore, but instead of the dynare solution matrix `ghx` and `ghu`. So we recreate these matrices here instead of in `get_perturbation_params_derivs.m`.
* Added `str2func` for better function handles in `fjaco.m`.

## `fjaco.m`
* make `tol`an option, which can be adjusted by changing `options_.dynatol.x`for identification and parameter derivatives purposes.
* include a check and an informative error message, if numerical derivatives (two-sided finite difference method) yield errors in `resol.m` for identification and parameter derivatives purposes. This closes issue  Dynare/dynare#1595.
* Changed year of copyright to 2010-2017,2019

***

# Further suggestions and questions
* Ones this is merged, I will merge request an improvement of the identification toolbox, which will work up to third order using the pruned state space. This will also remove some issues and bugs, and also I will remove global variables in this request.
* The third-order derivatives can be further improved by taking sparsity into account and use mex versions for kronecker products etc. I leave this for further testing (and if anybody actually uses this ;-) )
2019-12-17 18:17:09 +00:00