Merge pull request #474 from houtanb/geweke

Geweke
time-shift
Sébastien Villemot 2013-09-25 05:54:31 -07:00
commit cfea63b4de
8 changed files with 337 additions and 18 deletions

View File

@ -4212,12 +4212,19 @@ graphs of smoothed shocks, smoothed observation errors, smoothed and historical
@algorithmshead
The Monte Carlo Markov Chain (MCMC) univariate diagnostics are generated
by the estimation command if @ref{mh_nblocks} is larger than 1, if
@ref{mh_replic} is larger than 2000, and if option @ref{nodiagnostic} is
not used. As described in section 3 of @cite{Brooks and Gelman (1998)}
the convergence diagnostics are based on comparing pooled and within
MCMC moments (Dynare displays the second and third order moments, and
The Monte Carlo Markov Chain (MCMC) diagnostics are generated
by the estimation command if @ref{mh_replic} is larger than 2000 and if
option @ref{nodiagnostic} is not used. If @ref{mh_nblocks} is equal to one,
the convergence diagnostics of @cite{Geweke (1992,1999)} is computed. It uses a
chi square test to compare the means of the first and last draws specified by
@ref{geweke_interval} after discarding the burnin of @ref{mh_drop}. The test is
computed using variance estimates under the assumption of no serial correlation
as well as using tapering windows specified in @ref{taper_steps}.
If @ref{mh_nblocks} is larger than 1, the convergence diagnostics of
@cite{Brooks and Gelman (1998)} are used instead.
As described in section 3 of @cite{Brooks and Gelman (1998)} the univariate
convergence diagnostics are based on comparing pooled and within MCMC moments
(Dynare displays the second and third order moments, and
the length of the Highest Probability Density interval covering 80% of
the posterior distribution). Due to computational reasons, the
multivariate convergence diagnostic does not follow @cite{Brooks and
@ -4358,8 +4365,8 @@ the total number of Metropolis draws available. Default:
@code{2}
@item mh_drop = @var{DOUBLE}
The fraction of initially generated parameter vectors to be dropped
before using posterior simulations. Default: @code{0.5}
@anchor{mh_drop}
The fraction of initially generated parameter vectors to be dropped as a burnin before using posterior simulations. Default: @code{0.5}
@item mh_jscale = @var{DOUBLE}
The scale to be used for the jumping distribution in
@ -4758,6 +4765,18 @@ Value used to test if a generalized eigenvalue is 0/0 in the generalized
Schur decomposition (in which case the model does not admit a unique
solution). Default: @code{1e-6}.
@item taper_steps = [@var{INTEGER1} @var{INTEGER2} @dots{}]
@anchor{taper_steps}
Percent tapering used for the spectral window in the @cite{Geweke (1992,1999)}
convergence diagnostics (requires @ref{mh_nblocks}=1). The tapering is used to
take the serial correlation of the posterior draws into account. Default: @code{[4 8 15]}.
@item geweke_interval = [@var{DOUBLE} @var{DOUBLE}]
@anchor{geweke_interval}
Percentage of MCMC draws at the beginning and end of the MCMC chain taken
to compute the @cite{Geweke (1992,1999)} convergence diagnostics (requires @ref{mh_nblocks}=1)
after discarding the first @ref{mh_drop} percent of draws as a burnin. Default: @code{[0.2 0.5]}.
@end table
@customhead{Note}
@ -5048,6 +5067,56 @@ Upper/lower bound of the 90\% HPD interval taking into account both parameter an
@end defvr
@defvr {MATLAB/Octave variable} oo_.convergence.geweke
@anchor{convergence.geweke}
Variable set by the convergence diagnostics of the @code{estimation} command when used with @ref{mh_nblocks}=1 option (@pxref{mh_nblocks}).
Fields are of the form:
@example
@code{oo_.convergence.geweke.@var{VARIABLE_NAME}.@var{DIAGNOSTIC_OBJECT}}
@end example
where @var{DIAGNOSTIC_OBJECT} is one of the following:
@table @code
@item posteriormean
Mean of the posterior parameter distribution
@item posteriorstd
Standard deviation of the posterior parameter distribution
@item nse_iid
Numerical standard error (NSE) under the assumption of iid draws
@item rne_iid
Relative numerical efficiency (RNE) under the assumption of iid draws
@item nse_x
Numerical standard error (NSE) when using an x% taper
@item rne_x
Relative numerical efficiency (RNE) when using an x% taper
@item pooled_mean
Mean of the parameter when pooling the beginning and end parts of the chain
specified in @ref{geweke_interval} and weighting them with their relative precision.
It is a vector containing the results under the iid assumption followed by the ones
using the @ref{taper_steps} (@pxref{taper_steps}).
@item pooled_nse
NSE of the parameter when pooling the beginning and end parts of the chain and weighting them with their relative precision. See @code{pooled_mean}
@item prob_chi2_test
p-value of a chi squared test for equality of means in the beginning and the end
of the MCMC chain. See @code{pooled_mean}. A value above 0.05 indicates that
the null hypothesis of equal means and thus convergence cannot be rejected
at the 5 percent level. Differing values along the @ref{taper_steps} signal
the presence of significant autocorrelation in draws. In this case, the
estimates using a higher tapering are usually more reliable.
@end table
@end defvr
@deffn Command model_comparison @var{FILENAME}[(@var{DOUBLE})]@dots{};
@deffnx Command model_comparison (marginal_density = laplace | modifiedharmonicmean) @var{FILENAME}[(@var{DOUBLE})]@dots{};
@ -8692,6 +8761,16 @@ Fernández-Villaverde, Jesús and Juan Rubio-Ramírez (2005): ``Estimating
Dynamic Equilibrium Economies: Linear versus Nonlinear Likelihood,''
@i{Journal of Applied Econometrics}, 20, 891--910
@item
Geweke, John (1992): ``Evaluating the accuracy of sampling-based approaches
to the calculation of posterior moments'', in J.O. Berger, J.M. Bernardo,
A.P. Dawid, and A.F.M. Smith (eds.) Proceedings of the Fourth Valencia
International Meeting on Bayesian Statistics, pp. 169--194, Oxford University Press
@item
Geweke, John (1999): ``Using simulation methods for Bayesian econometric models:
Inference, development and communication,'' @i{Econometric Reviews}, 18(1), 1--73
@item
Ireland, Peter (2004): ``A Method for Taking Models to the Data,''
@i{Journal of Economic Dynamics and Control}, 28, 1205--26

View File

@ -1,4 +1,4 @@
function McMCDiagnostics(options_, estim_params_, M_)
function oo_ = McMCDiagnostics(options_, estim_params_, M_, oo_)
% function McMCDiagnostics
% Computes convergence tests
%
@ -8,7 +8,7 @@ function McMCDiagnostics(options_, estim_params_, M_)
% M_ [structure]
%
% OUTPUTS
% none
% oo_ [structure]
%
% SPECIAL REQUIREMENTS
% none
@ -38,10 +38,6 @@ MhDirectoryName = CheckPath('metropolis',M_.dname);
TeX = options_.TeX;
nblck = options_.mh_nblck;
% Brooks and Gelman tests need more than one block
if nblck == 1
return;
end
npar = estim_params_.nvx;
npar = npar + estim_params_.nvn;
npar = npar + estim_params_.ncx;
@ -56,6 +52,57 @@ NumberOfMcFilesPerBlock = size(dir([MhDirectoryName ,filesep, M_.fname '_mh*_blc
% check if all previous files are there for block 1
check_presence_consecutive_MC_files(MhDirectoryName,M_.fname,1)
if nblck == 1 % Brooks and Gelman tests need more than one block
convergence_diagnostics_geweke=zeros(npar,4+2*length(options_.convergence.geweke.taper_steps));
if any(options_.convergence.geweke.geweke_interval<0) || any(options_.convergence.geweke.geweke_interval>1) || length(any(options_.convergence.geweke.geweke_interval<0))~=2 ...
|| (options_.convergence.geweke.geweke_interval(2)-options_.convergence.geweke.geweke_interval(1)<0)
fprintf('\nCONVERGENCE DIAGNOSTICS: Invalid option for geweke_interval. Using the default of [0.2 0.5].\n')
options_.convergence.geweke.geweke_interval=[0.2 0.5];
end
first_obs_begin_sample = max(1,ceil(options_.mh_drop*options_.mh_replic));
last_obs_begin_sample = first_obs_begin_sample+round(options_.convergence.geweke.geweke_interval(1)*options_.mh_replic*options_.mh_drop);
first_obs_end_sample = first_obs_begin_sample+round(options_.convergence.geweke.geweke_interval(2)*options_.mh_replic*options_.mh_drop);
param_name=[];
for jj=1:npar
param_name = strvcat(param_name,get_the_name(jj,options_.TeX,M_,estim_params_,options_));
end
fprintf('\nGeweke (1992) Convergence Tests, based on means of draws %d to %d vs %d to %d.\n',first_obs_begin_sample,last_obs_begin_sample,first_obs_end_sample,options_.mh_replic);
fprintf('p-values are for Chi2-test for equality of means.\n');
Geweke_header={'Parameter', 'Post. Mean', 'Post. Std', 'p-val No Taper'};
print_string=['%',num2str(size(param_name,2)+3),'s \t %12.3f \t %12.3f \t %12.3f'];
print_string_header=['%',num2str(size(param_name,2)+3),'s \t %12s \t %12s \t %12s'];
for ii=1:length(options_.convergence.geweke.taper_steps)
Geweke_header=[Geweke_header, ['p-val ' num2str(options_.convergence.geweke.taper_steps(ii)),'% Taper']];
print_string=[print_string,'\t %12.3f'];
print_string_header=[print_string_header,'\t %12s'];
end
print_string=[print_string,'\n'];
print_string_header=[print_string_header,'\n'];
fprintf(print_string_header,Geweke_header{1,:});
for jj=1:npar
startline=0;
for n = 1:NumberOfMcFilesPerBlock
load([MhDirectoryName '/' M_.fname '_mh',int2str(n),'_blck1.mat'],'x2');
nx2 = size(x2,1);
param_draws(startline+(1:nx2),1) = x2(:,jj);
startline = startline + nx2;
end
[results_vec, results_struct] = geweke_moments(param_draws,options_);
convergence_diagnostics_geweke(jj,:)=results_vec;
param_draws1 = param_draws(first_obs_begin_sample:last_obs_begin_sample,:);
param_draws2 = param_draws(first_obs_end_sample:end,:);
[results_vec1] = geweke_moments(param_draws1,options_);
[results_vec2] = geweke_moments(param_draws2,options_);
results_struct = geweke_chi2_test(results_vec1,results_vec2,results_struct,options_);
eval(['oo_.convergence.geweke.',param_name(jj,:),'=results_struct;'])
fprintf(print_string,param_name(jj,:),results_struct.posteriormean,results_struct.posteriorstd,results_struct.prob_chi2_test)
end
skipline(2);
return;
end
for blck = 2:nblck
tmp = size(dir([MhDirectoryName ,filesep, M_.fname '_mh*_blck' int2str(blck) '.mat']),1);
if tmp~=NumberOfMcFilesPerBlock

View File

@ -611,8 +611,8 @@ if (any(bayestopt_.pshape >0 ) && options_.mh_replic) || ...
CutSample(M_, options_, estim_params_);
return
else
if ~options_.nodiagnostic && options_.mh_replic > 2000 && options_.mh_nblck > 1
McMCDiagnostics(options_, estim_params_, M_);
if ~options_.nodiagnostic && options_.mh_replic > 2000
oo_= McMCDiagnostics(options_, estim_params_, M_,oo_);
end
%% Here i discard first half of the draws:
CutSample(M_, options_, estim_params_);

74
matlab/geweke_chi2_test.m Normal file
View File

@ -0,0 +1,74 @@
function results_struct = geweke_chi2_test(results1,results2,results_struct,options)
% results_struct = geweke_chi2_test(results1,results2,results_struct,options)
% PURPOSE: computes Geweke's chi-squared test for two sets of MCMC sample draws
%
% INPUTS
% results1 [1 by (4+n_taper*2) vector] vector with post. mean,
% std, NSE_iid, RNE_iid, and tapered NSE and RNE
% for chain part 1
% results2 [1 by (4+n_taper*2) vector] vector with post. mean,
% std, NSE_iid, RNE_iid, and tapered NSE and RNE
% for chain part 2
% results_struct [structure] results structure generated by geweke_moments
% Dynareoptions [structure]
%
% OUTPUTS
% results_struct [structure] containing the following fields:
% pooled_mean Pooled mean of the chain parts, weighted
% with precision
% rpooled_nse Pooled NSE of the chain parts, weighted
% with precision
% prob_chi2_test p-value of Chi2-test for equality of
% means in both chain parts
% -----------------------------------------------------------------
%
% SPECIAL REQUIREMENTS
% None.
% Copyright (C) 2013 Dynare Team
%
% This file is part of Dynare.
%
% Dynare is free software: you can redistribute it and/or modify
% it under the terms of the GNU General Public License as published by
% the Free Software Foundation, either version 3 of the License, or
% (at your option) any later version.
%
% Dynare is distributed in the hope that it will be useful,
% but WITHOUT ANY WARRANTY; without even the implied warranty of
% MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
% GNU General Public License for more details.
%
% You should have received a copy of the GNU General Public License
% along with Dynare. If not, see <http://www.gnu.org/licenses/>.
%
% ------------------------------------------------
% REFERENCES: Geweke (1992), `Evaluating the accuracy of sampling-based
% approaches to the calculation of posterior moments', in J.O. Berger,
% J.M. Bernardo, A.P. Dawid, and A.F.M. Smith (eds.) Proceedings of
% the Fourth Valencia International Meeting on Bayesian Statistics,
% pp. 169-194, Oxford University Press
% Geweke (1999): `Using simulation methods for Bayesian econometric models:
% Inference, development and communication', Econometric Reviews, 18(1),
% 1-73
% written by: Johannes Pfeifer,
% based on code by James P. LeSage, who in turn
% drew on MATLAB programs written by Siddartha Chib
for k=1:length(options.convergence.geweke.taper_steps)+1;
NSE=[results1(:,3+(k-1)*2) results2(:,3+(k-1)*2)];
means=[results1(:,1) results2(:,1)];
diff_Means=means(:,1)-means(:,2);
sum_of_weights=sum(1./(NSE.^2),2);
pooled_mean=sum(means./(NSE.^2),2)./sum_of_weights;
pooled_NSE=1./sqrt(sum_of_weights);
test_stat=diff_Means.^2./sum(NSE.^2,2);
p = 1-chi2cdf(test_stat,1);
results_struct.pooled_mean(:,k) = pooled_mean;
results_struct.pooled_nse(:,k) = pooled_NSE;
results_struct.prob_chi2_test(:,k) = p;
end;

109
matlab/geweke_moments.m Normal file
View File

@ -0,0 +1,109 @@
function [results_vec, results_struct] = geweke_moments(draws,Dynareoptions)
%[results_vec, results_struct] = geweke_moments(draws,Dynareoptions)
% PURPOSE: computes Gewke's convergence diagnostics NSE and RNE
% (numerical std error and relative numerical efficiencies)
% INPUTS
% draws [ndraws by 1 vector]
% Dynareoptions [structure]
%
% OUTPUTS
% results_vec
% results_struct [structure] containing the following fields:
% posteriormean= posterior parameter mean
% posteriorstd = posterior standard deviation
% nse_iid = nse assuming no serial correlation for variable i
% rne_iid = rne assuming no serial correlation for variable i
% nse_x = nse using x% autocovariance tapered estimate
% rne_x = rne using x% autocovariance taper
% -----------------------------------------------------------------
%
% SPECIAL REQUIREMENTS
% None.
% Copyright (C) 2013 Dynare Team
%
% This file is part of Dynare.
%
% Dynare is free software: you can redistribute it and/or modify
% it under the terms of the GNU General Public License as published by
% the Free Software Foundation, either version 3 of the License, or
% (at your option) any later version.
%
% Dynare is distributed in the hope that it will be useful,
% but WITHOUT ANY WARRANTY; without even the implied warranty of
% MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
% GNU General Public License for more details.
%
% You should have received a copy of the GNU General Public License
% along with Dynare. If not, see <http://www.gnu.org/licenses/>.
% REFERENCES: Geweke (1992), `Evaluating the accuracy of sampling-based
% approaches to the calculation of posterior moments', in J.O. Berger,
% J.M. Bernardo, A.P. Dawid, and A.F.M. Smith (eds.) Proceedings of
% the Fourth Valencia International Meeting on Bayesian Statistics,
% pp. 169-194, Oxford University Press
% Geweke (1999): `Using simulation methods for Bayesian econometric models:
% Inference, development and communication', Econometric Reviews, 18(1),
% 1-73
% -----------------------------------------------------------------
% written by: Johannes Pfeifer,
% based on code by James P. LeSage, who in turn
% drew on MATLAB programs written by Siddartha Chib
ndraw = size(draws,1);
n_groups=100;
taper_steps=Dynareoptions.convergence.geweke.taper_steps;
results_vec=zeros(1,4+2*length(taper_steps));
ns = floor(ndraw/n_groups); %step_size
n_draws_used = ns*n_groups; %effective number of draws used after rounding down
window_means= zeros(n_groups,1);
window_uncentered_variances= zeros(n_groups,1);
for ig=1:n_groups;
window_means(ig,1)=sum(draws((ig-1)*ns+1:ig*ns,1))/ns;
window_uncentered_variances(ig,1)=sum(draws((ig-1)*ns+1:ig*ns,1).^2)/ns;
end; %for ig
total_mean=mean(window_means);
total_variance=mean(window_uncentered_variances)-total_mean^2;
% save posterior means and std deviations to results_struct structure
results_vec(1,1)=total_mean;
results_vec(1,2)=sqrt(total_variance);
results_struct.posteriormean = total_mean;
results_struct.posteriorstd = results_vec(1,2);
% numerical standard error assuming no serial correlation
NSE=std(draws(1:n_draws_used,1),1)/sqrt(n_draws_used);
% save to results_struct structure
results_vec(1,3)=NSE;
results_vec(1,4)=total_variance/(n_draws_used*NSE^2);
results_struct.nse_iid = NSE;
results_struct.rne_iid = results_vec(1,4);
%get autocovariance of grouped means
centered_window_means=window_means-total_mean;
autocov_grouped_means=zeros(n_groups,1);
for lag=0:n_groups-1;
autocov_grouped_means(lag+1)=centered_window_means(lag+1:n_groups,1)'*centered_window_means(1:n_groups-lag,1)/100;
end;
% numerical standard error with tapered autocovariance functions
for taper_index=1:length(taper_steps)
taper=taper_steps(taper_index);
taper_lags=(1:taper-1)';
taper_lag_weight=1-taper_lags/taper;
tapered_sum_of_covariances=autocov_grouped_means(1)+sum(2*taper_lag_weight.*autocov_grouped_means(1+taper_lags));
NSE_taper=sqrt(tapered_sum_of_covariances/n_groups);
% save results_struct in structure
results_vec(1,4+taper_index*2-1)=NSE_taper;
results_vec(1,4+taper_index*2)=total_variance/(n_draws_used*NSE_taper^2);
eval(['results_struct.nse_taper_',num2str(taper),'= NSE_taper;']);
eval(['results_struct.rne_taper_',num2str(taper),'= total_variance/(n_draws_used*NSE_taper^2);']);
end; % end of for mm loop

View File

@ -577,6 +577,10 @@ options_.osr.verbose=2;
% use GPU
options_.gpu = 0;
%Geweke convergence diagnostics
options_.convergence.geweke.taper_steps=[4 8 15];
options_.convergence.geweke.geweke_interval=[0.2 0.5];
% initialize persistent variables in priordens()
priordens([],[],[],[],[],[],1);
% initialize persistent variables in dyn_first_order_solver()

View File

@ -113,7 +113,7 @@ class ParsingDriver;
%token LYAPUNOV_FIXED_POINT_TOL LYAPUNOV_DOUBLING_TOL LYAPUNOV_SQUARE_ROOT_SOLVER_TOL LOG_DEFLATOR LOG_TREND_VAR LOG_GROWTH_FACTOR MARKOWITZ MARGINAL_DENSITY MAX MAXIT
%token MFS MH_DROP MH_INIT_SCALE MH_JSCALE MH_MODE MH_NBLOCKS MH_REPLIC MH_RECOVER MIN MINIMAL_SOLVING_PERIODS SOLVE_MAXIT
%token MODE_CHECK MODE_CHECK_NEIGHBOURHOOD_SIZE MODE_CHECK_SYMMETRIC_PLOTS MODE_CHECK_NUMBER_OF_POINTS MODE_COMPUTE MODE_FILE MODEL MODEL_COMPARISON MODEL_INFO MSHOCKS ABS SIGN
%token MODEL_DIAGNOSTICS MODIFIEDHARMONICMEAN MOMENTS_VARENDO DIFFUSE_FILTER SUB_DRAWS
%token MODEL_DIAGNOSTICS MODIFIEDHARMONICMEAN MOMENTS_VARENDO DIFFUSE_FILTER SUB_DRAWS TAPER_STEPS GEWEKE_INTERVAL
%token <string_val> NAME
%token NAN_CONSTANT NO_STATIC NOBS NOCONSTANT NODISPLAY NOCORR NODIAGNOSTIC NOFUNCTIONS
%token NOGRAPH NOMOMENTS NOPRINT NORMAL_PDF
@ -1555,7 +1555,9 @@ estimation_options : o_datafile
| o_ar
| o_endogenous_prior
| o_use_univariate_filters_if_singularity_is_detected
| o_qz_zero_threshold
| o_qz_zero_threshold
| o_taper_steps
| o_geweke_interval
;
list_optim_option : QUOTED_STRING COMMA QUOTED_STRING
@ -2388,6 +2390,8 @@ o_noprint : NOPRINT { driver.option_num("noprint", "1"); };
o_xls_sheet : XLS_SHEET EQUAL symbol { driver.option_str("xls_sheet", $3); };
o_xls_range : XLS_RANGE EQUAL range { driver.option_str("xls_range", $3); };
o_filter_step_ahead : FILTER_STEP_AHEAD EQUAL vec_int { driver.option_vec_int("filter_step_ahead", $3); };
o_taper_steps : TAPER_STEPS EQUAL vec_int { driver.option_vec_int("taper_steps", $3); };
o_geweke_interval : GEWEKE_INTERVAL EQUAL vec_value { driver.option_num("geweke_interval",$3); };
o_constant : CONSTANT { driver.option_num("noconstant", "0"); };
o_noconstant : NOCONSTANT { driver.option_num("noconstant", "1"); };
o_mh_recover : MH_RECOVER { driver.option_num("mh_recover", "1"); };

View File

@ -222,6 +222,8 @@ string eofbuff;
<DYNARE_STATEMENT>presample {return token::PRESAMPLE;}
<DYNARE_STATEMENT>lik_algo {return token::LIK_ALGO;}
<DYNARE_STATEMENT>lik_init {return token::LIK_INIT;}
<DYNARE_STATEMENT>taper_steps {return token::TAPER_STEPS;}
<DYNARE_STATEMENT>geweke_interval {return token::GEWEKE_INTERVAL;}
<DYNARE_STATEMENT>graph {return token::GRAPH;}
<DYNARE_STATEMENT>nograph {return token::NOGRAPH;}
<DYNARE_STATEMENT>nodisplay {return token::NODISPLAY;}