www.delorie.com/gnu/docs/mcsim/mcsim_42.html | search |
Buy GNU books! | |
[ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
Statistical models are defined in the simulation specification file, rather than in the model definition file. It is necessary to define a statistical model (with parameter dependencies, prior distributions and likelihood) if you want to use MCMC sampling. MCMC sampling will then give you in output a sample of parameters drawn from their joint posterior distribution. Take for example the simple linear regression model:
y_i = N(Mu_i, Sigma^2), Mu_i = Alpha + Beta * (x_i - x_bar).
where the observed (x,y) pairs are (1,1), (2,3), (3,3), (4,3) and (5,5). The quantities Alpha and Beta are given N(0,10000) priors, and 1/Sigma^2 is given a Gamma(1e-2,1e-2) prior. x_bar is the average of the above values for x. We want the posterior distributions of Alpha, Beta, and Sigma^2.
The first thing to do is to define a model to compute y as a function of x. Here is such a model (quite similar to the one distributed with MCSim source code (see section B.1 `linear.model'):
# --------------------------------------------- # Model definition file for a linear model # --------------------------------------------- Outputs = {y}; # Model parameters Alpha = 0; Beta = 0; Sigma2 = 1; x_bar = 0; CalcOutputs { y = Alpha + Beta * (t - x_bar); } # --------------------------------------------- |
The parameters' initialization values are arbitrary, and could be
anything reasonable. They will be changed or sampled through the input
file. Note that
Sigma2is not used in the model equations, but still
needs to be defined here in order to be part of the statistical
model. On the other hand,
Mu is not defined, since we do not really need it.Finally x is replaced by the time, t
, for convenience.
An alternative would be to define an input `x' and use it instead of
t
.
We now need to write an input file specifying the distribution of y (i.e., the likelihood), and the prior distributions of the various parameters. Here is what such a file could look like:
# --------------------------------------------------------------- # Simulation input file for a linear regression # --------------------------------------------------------------- SimType (MCMC); MCMC ("linear.MCMC.out", "", "", 50000, 0, 5, 40000, 63453.1961); Level { Distrib(Alpha, Normal_v, 0, 10000); Distrib(Beta, Normal_v, 0, 10000); Distrib(Sigma2, InvGamma, 0.01, 0.01); Distrib(y, Normal_v, Prediction(y), Sigma2); Experiment { x_bar = 3.0; PrintStep (y, 1, 5, 1); Data (y, 1, 3, 3, 3, 5); } } # end Level End # --------------------------------------------------------------- |
The file begins the obvious SimType()
(see section 6.3.1 SimType()
specification) and MCMC()
(see section 6.3.6 MCMC()
specification)
keywords. The keyword Level
comes next. Level
is used to
specify the dependence between model parameters in a hierarchy. There
should be at least one Level
in every MCMC input file, even for a
non-hierarchical model like the one above (actually, "non-hierarchical"
models can be thought of has having only one level of hierarchy). See
below for further discussion of the Level
keyword. You can also
look at the MCMC input files provided as examples with MCSim source
code. The Distrib()
statements define the parameter priors.
Normal_v
specifications are used since we use variances instead
of standard deviations. The inverse-Gamma distribution is used for the
variance component, since the precision is supposed to be
Gamma-distributed. The likelihood is the distribution of the data,
given the model: it is also specified by a Distrib()
statement,
valid for every y data point. Again, note that the
Muvariable is not used. Instead the Prediction(y)
specification is used to signify the linear model output. These
distributions are in effect for every sub-level or every
Experiment
included in the current level.
The "simulations" to perform, and the corresponding data values, are
specified by the Experiment
section. Only one Experiment
is needed here, but several could be specified. In this section, the
value of
x_baris provided. The different values of x (time in our model) can be
specified via a PrintStep()
specification (see section 6.4.4 PrintStep()
specification), since they are equally spaced. More generally, a
Print()
specification could have been used (see section 6.4.3 Print()
specification). The data values are given in a Data()
statement.
[ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
webmaster | delorie software privacy |
Copyright © 2003 by The Free Software Foundation | Updated Jun 2003 |