34.0 PISPLINE Command
PISPLINE allows the user to estimate an underlying smooth function
of M variables (x(1),...,x(M)) from noisy data, using methods
suggested by Leo Breiman. The basic reference is:
- Breiman, Leo, "The PI Method for Estimating Multivariate
Functions From Noisy Data," Technometrics, May 1991, Vol. 33, No. 2.
pp 125 - 160. Note that this citation includes comments by Friedman,
Gu, Hastie & Tibshirani and a reply by Breiman.
The PISPLINE command can be run in 'batch mode' as is documented here or
as part of the MATRIX command.
The general form of the batch PISPLINE command is:
B34SEXEC PISPLINE options parameters$
MODEL Yvar = Xvar1 Xvar2 $
BISPEC options parameters$
TRISPEC options parameters$
POLYSPEC options parameters$
REVERSE options parameters$
FORECAST Xvar1( ) Xvar2( ) $
B34SEEND$
MODEL is a required sentence.
The PISPLINE command allows the user to optionally save or reread an
estimated model. The advantage of saving a model is that forecasts
can be calculated in subsequent steps without estimating the model
again, provided the GETMODEL option is used. To preserve
variable storage, the order and number of the variables on the MODEL
sentence and the estimation options MUST be the same as in the
saved model for a saved model to be used.
Options for PISPLINE sentence.
OUTPUTXG If present, places XG on unit ICUNIT. XG(N,M,IT) is indexed
by N in the range 1 to NG, M in the range 1 to the number of
exogenous variables (MV), and IT in the range 1 to the number of
products selected. Each right-hand-side variable is
discretized into NG equispaced values. XG(i,j,k) gives
the value at the ith point of the transform of the
jth variable in the kth product. With K products selected,
yhat = XG(i1,1,1)*XG(i2,2,1)*...*XG(iMV,MV,1) + ... +
XG(i1,1,K)*XG(i2,2,K)*...*XG(iMV,MV,K). XG
is written in a form that can be read by the HRGRAPHICS
MARSPISP command. File name is PISPXG.
IRES List Y, predicted Y (Yhat) and the residual.
SPUNCHRES Saves OBSNUM, Y, YHAT, RESIDUAL in SCA FSAVE format
on unit 44 having file name RESIDUAL.
GRAPHRES Produces a space-saving graph of the residual.
PLOTRES Produces a lined up plot of Y, Yhat, and the Residual
OUTPUTYG Produces the NG by NG YG matrix on unit ISUNIT by
rows. YG measures the surface fit. This option is
only possible if there are exactly 2 variables on
the right. This matrix can be read by the
HRGRAPHICS MARSPISP command. File name is PISPYG.
PMODEL Produces model description matrices.
SAVEMODEL Saves the estimated model on unit MODELUNIT.
MUREWIND Rewinds MODELUNIT before the model is saved.
GETMODEL Rereads a saved model off unit MODELUNIT.
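The sum of products described under OUTPUTXG can be sketched in Python.
This is an illustration only and is not B34S code. The array name xg, the
helper yhat_from_xg and the use of NumPy are assumptions made for the
example, indices are 0-based, and any centering constant (see CENTER) is
ignored.

```python
import numpy as np

def yhat_from_xg(xg, idx):
    """Sum over products of the per-variable grid transforms.

    xg  : array of shape (NG, MV, K), a hypothetical stand-in for XG(N,M,IT)
    idx : length-MV sequence; idx[j] is the 0-based grid point for variable j
    """
    ng, mv, k = xg.shape
    idx = np.asarray(idx)
    # picked[j, p] = xg[idx[j], j, p]: one grid value per variable and product
    picked = xg[idx, np.arange(mv), :]
    # multiply across variables within each product, then add the products
    return picked.prod(axis=0).sum()

# toy example: NG=3 grid points, MV=2 variables, K=2 products
xg = np.arange(1.0, 13.0).reshape(3, 2, 2)
value = yhat_from_xg(xg, [0, 2])
```

In the toy array the two products are 1*11 and 2*12, so value is 35.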
PISPLINE parameters
IBEGIN=n1 Sets beginning observation. Defaults to 1.
IEND=n2 Sets ending observation. Defaults to NOOB.
CENTER=r1 This value is subtracted from each Y-value before the
fitting process and added back later in the
evaluation. If CENTER is not set, the mean of Y is used.
KMB=n3 Lower bound on number of knots to try fitting. Must
be > 1. Default=2.
KMT=n4 Upper bound on number of knots to try fitting. Default=KMB+5.
MNFIT=n5 Maximum number of products to be fitted. Default=3.
NG=n6 Number of equispaced values at which the unidimensional
fits are evaluated. Default = 50. Minimum = 20. The
larger NG, the better the forecast approximation.
JRDF=n7 Deletion is terminated when the remaining degrees of
freedom fall to or below JRDF. Default=-1.
TH=r2 Parameter in the criterion for convergence of the
iteration. A smaller TH leads to more iterations.
Default = .02D+00.
EDTH=r3 A parameter used in deletion. The smaller EDTH the less
likely multiple knots will be deleted in one pass.
Default = .1D+00.
CPTH=r4 A parameter used in selecting models. Must be in range
0-10. A higher CPTH causes more deleted models to be
selected. Default = 0.0D+00.
RADD=r5 A parameter that governs how many products are
selected. Larger values favor selection of fewer
products. Default = 1.0D+00.
ICUNIT=n8 Sets unit for XG output if OUTPUTXG set. Default = 6.
ISUNIT=n9 Sets unit for YG output if OUTPUTYG set. Default = 6.
MODELUNIT=n10 Sets save/get model unit. Default = 60.
SMODELN=k1 Sets the model name. Default = 'PISPMODEL'. A
max of 10 characters can be supplied.
MCOMMENTS= (' ',
(' ')
Allows user to set model comments when the model is saved.
A maximum of 10 lines of up to 80 characters each is
allowed.
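The effect of the CENTER parameter can be sketched in Python as follows.
This is an illustration under stated assumptions, not the PISPLINE
implementation: the helper center_y and the data are hypothetical, and
the fitting step is replaced by a stand-in.

```python
import numpy as np

def center_y(y, center=None):
    """Return (centered y, constant used); the mean of y is the default."""
    y = np.asarray(y, dtype=float)
    c = y.mean() if center is None else float(center)
    return y - c, c

y = [2.0, 4.0, 9.0]             # made-up data
y_c, c = center_y(y)            # CENTER not set, so c is the mean of y
fitted_centered = y_c           # stand-in for a fit on the centered scale
yhat = fitted_centered + c      # the constant is added back at evaluation
```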
BISPEC sentence.
The BISPEC sentence performs various nonlinearity, Gaussianity and
martingale tests suggested by Hinich. The form of the BISPEC sentence in
the BTIDEN, BTEST and MARS commands is the same. To save space, detail
for this sentence is given only under the BTIDEN command help file. If
the BISPEC sentence is given with no options or parameters, Gaussianity
and nonlinearity tests will be performed using default settings. The
setting
BISPEC IAUTO ITURNO $
will perform tests for Gaussianity and nonlinearity over a grid of
admissible values for the bandwidth.
TRISPEC sentence
The TRISPEC sentence performs 4th order nonlinearity tests suggested
by Hinich. Further detail on this sentence is listed under the BTIDEN
command.
POLYSPEC sentence
The POLYSPEC sentence performs various nonlinearity tests suggested
by Hinich within the sample. Further detail on this sentence is listed
under the BTIDEN command.
FORECAST sentence.
The FORECAST sentence allows users to supply observations on the
right-hand-side variables outside the sample period so that forecasts
can be calculated. The same number of observations must be supplied for
all right-hand-side series. Due to the way that splines are calculated, it is
imperative that values of the x variables NOT lie outside the ranges
of the original data. Values of the right-hand-side variables can be read
from unit FOREIUNIT or entered directly by variable name.
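Because forecast inputs should stay inside the ranges of the estimation
data, it may help to check them before they are placed on the FORECAST
sentence. The following Python sketch shows one way to do so. The
function name out_of_range and all data values are hypothetical, and the
check is done outside B34S.

```python
import numpy as np

def out_of_range(x_train, x_new):
    """Boolean mask: True where a forecast input lies outside the
    min/max of the estimation sample for the same variable."""
    x_train = np.asarray(x_train, dtype=float)
    x_new = np.asarray(x_new, dtype=float)
    return (x_new < x_train.min()) | (x_new > x_train.max())

x1_train = [6.5, 8.0, 10.5, 12.0]    # made-up estimation data
x1_fore = [10.0, 11.0, 9.0, 7.0]     # candidate FORECAST values for x1
mask = out_of_range(x1_train, x1_fore)
```

Any True entry in mask marks a value that falls outside the estimation
range.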
Forecast sentence options.
FUREWIND Rewinds forecast OUTPUT unit FOREOUNIT.
NOINTERPOL The default setting is to interpolate the XG(N,M,IT)
values before the products indicated in the
discussion of OUTPUTXG are performed. If NOINTERPOL
is specified, then no interpolation is performed.
In general the larger NG, the less interpolation is
needed. Since forecasts are produced from the XG
matrix, if actual values are supplied, the
"forecasts" will differ from the "residuals" for the
same observation because of the use of the XG matrix.
NOCORNER The default is to set right-hand-side variables that lie
outside their ranges in the training dataset to
their upper or lower bounds, give a message, and
calculate a forecast. If NOCORNER is set, a
message is given and no forecast is calculated.
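The interaction of the interpolation and corner settings can be sketched
in Python for a single unidimensional fit stored on an equispaced grid.
This is a rough illustration, not the PISPLINE algorithm. The function
grid_transform is hypothetical. Linear interpolation and the use of the
nearest grid point when interpolation is off are assumptions, since the
text does not state which rules are used.

```python
import numpy as np

def grid_transform(x, x_min, x_max, g, interpolate=True, corner=True):
    """Evaluate one unidimensional fit stored on an equispaced grid.

    g holds the fitted values at NG equispaced points on [x_min, x_max].
    Returns None when x is out of range and corner is False.
    """
    g = np.asarray(g, dtype=float)
    if x < x_min or x > x_max:
        if not corner:
            return None                  # analogous to NOCORNER: no forecast
        x = min(max(x, x_min), x_max)    # default: move x to the nearer bound
    grid = np.linspace(x_min, x_max, g.size)
    if interpolate:
        return float(np.interp(x, grid, g))   # assumed linear interpolation
    nearest = int(np.argmin(np.abs(grid - x)))
    return float(g[nearest])             # analogous to NOINTERPOL (assumed rule)

g = [0.0, 10.0, 20.0]                    # toy grid values, NG=3 for brevity
a = grid_transform(0.25, 0.0, 2.0, g)                      # interpolated
b = grid_transform(0.25, 0.0, 2.0, g, interpolate=False)   # nearest grid value
c = grid_transform(3.0, 0.0, 2.0, g)                       # moved to upper bound
d = grid_transform(3.0, 0.0, 2.0, g, corner=False)         # no forecast
```

With a finer grid the two evaluation rules give closer answers, which is
consistent with the remark that a larger NG needs less interpolation.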
Forecast sentence parameters.
FOREIUNIT=n1 Sets forecast input unit. If this parameter is
passed, forecasts cannot be input directly. The
number of forecasts produced equals the number of observations
on the SCA FSAVE file. The data on this file must
be in SCA FSAVE format.
FOREOUNIT=n2 Sets forecast output unit. If this parameter is
passed, forecasts will be placed on the indicated
unit using the SCA FSAVE format.
FNAME = k1 Sets forecast variable name. Default = 'FORECAST'.
SCAFNAME=k2 Sets SCA FSAVE file name for input forecasts.
Default = 'INFORE'.
SCAFONAME=k3 Sets SCA FSAVE file name for output forecasts.
Default = 'PISPFORE'.
Direct forecast input syntax options.
Xvar1(r1, r2, r3,.....)
......
Xvark(r1, r2, r3,.....)
Sample job using a PISPLINE model with 3 exogenous variables.
Hinich tests are performed and forecasts for 4 periods are produced.
b34sexec pispline graphres ires $
model y = x1 x2 x3$
bispec iauto iturno $
trispec $
forecast x1( 10. 11. 9. 7. )
x2(.55 .77 .88 .66)
x3(.01 .11 .15 .70 )$
b34seend$
The job TESTMARS_P in b34stest.mac illustrates rereading models and
other advanced capability.
The job SIMPISP illustrates simulation of PISPLINE forecasts.