34.0 PISPLINE Command

PISPLINE allows the user to estimate an underlying smooth function of
M variables (x(1),...,x(M)) from noisy data, using methods suggested
by Leo Breiman. The basic reference is:

  - Breiman, Leo, "The PI Method for Estimating Multivariate
    Functions From Noisy Data," Technometrics, May 1991, Vol. 33,
    No. 2, pp. 125-160.

Note that this citation includes comments by Friedman, Gu, Hastie &
Tibshirani and a reply by Breiman.

The PISPLINE command can be run in 'batch mode', as documented here,
or as part of the MATRIX command.

The general form of the batch PISPLINE command is:

   B34SEXEC PISPLINE options parameters$
     MODEL Yvar = Xvar1 Xvar2 $
     BISPEC   options parameters$
     TRISPEC  options parameters$
     POLYSPEC options parameters$
     REVERSE  options parameters$
     FORECAST Xvar1( ) Xvar2( ) $
     B34SEEND$

MODEL is a required sentence.

The PISPLINE command allows the user to optionally save or reread an
estimated model. The advantage of saving a model is that, if the
GETMODEL option is used in subsequent steps, forecasts can be
calculated without having to estimate the model again. In order to
preserve variable storage, the order and number of the variables on
the MODEL sentence and the estimation options MUST be the same as in
the initially saved model for a saved model to be used.

Options for PISPLINE sentence.

OUTPUTXG      If present, places XG on unit ICUNIT. XG is dimensioned
              XG(N,M,IT), where N runs from 1 to NG, M runs from 1 to
              the number of exogenous variables (MV) and IT runs from
              1 to the number of products selected. Each right-hand
              side variable is discretized into NG equispaced values.
              XG(i,j,k) gives the value at the ith point of the
              transform of the jth variable in the kth product. With
              K products selected,

                yhat = XG(i1,1,1)*XG(i2,2,1)*...*XG(iMV,MV,1)
                       + ... +
                       XG(i1,1,K)*XG(i2,2,K)*...*XG(iMV,MV,K)

              XG is written in a form that can be read by the
              HRGRAPHICS MARSPISP command. The file name is PISPXG.

IRES          Lists Y, predicted Y (Yhat) and the residual.
SPUNCHRES     Saves OBSNUM, Y, YHAT and RESIDUAL in SCA FSAVE format
              on unit 44 with file name RESIDUAL.

GRAPHRES      Produces a space-saving graph of the residual.

PLOTRES       Produces a lined-up plot of Y, Yhat and the residual.

OUTPUTYG      Writes the NG by NG YG matrix on unit ISUNIT by rows.
              YG measures the surface fit. This option is only
              possible if there are exactly 2 variables on the right.
              This matrix can be read by the HRGRAPHICS MARSPISP
              command. The file name is PISPYG.

PMODEL        Produces model description matrices.

SAVEMODEL     Saves the estimated model on unit MODELUNIT.

MUREWIND      Rewinds MODELUNIT before the model is saved.

GETMODEL      Rereads a saved model from unit MODELUNIT.

PISPLINE parameters

IBEGIN=n1     Sets beginning observation. Default = 1.

IEND=n2       Sets ending observation. Default = NOOB.

CENTER=r1     This value is subtracted from each Y value before the
              fitting process and added back in later in the
              evaluation. If CENTER is not set, the mean of Y is
              used.

KMB=n3        Lower bound on the number of knots to try fitting.
              Must be > 1. Default = 2.

KMT=n4        Upper bound on the number of knots to try fitting.
              Default = KMB+5.

MNFIT=n5      Maximum number of products to be fitted. Default = 3.

NG=n6         Number of equispaced values at which the unidimensional
              fits are evaluated. Default = 50. Minimum = 20. The
              larger NG, the better the forecast approximation.

JRDF=n7       Deletion is terminated when the remaining degrees of
              freedom fall to or below JRDF. Default = -1.

TH=r2         Parameter in the criterion for convergence of the
              iteration. A smaller TH leads to more iterations.
              Default = .02D+00.

EDTH=r3       A parameter used in deletion. The smaller EDTH, the
              less likely it is that multiple knots will be deleted
              in one pass. Default = .1D+00.

CPTH=r4       A parameter used in selecting models. Must be in the
              range 0-10. A higher CPTH causes more deleted models
              to be selected. Default = 0.0D+00.

RADD=r5       A parameter that governs how many products are
              selected. Larger values favor selection of fewer
              products. Default = 1.0D+00.
ICUNIT=n8     Sets unit for XG output if OUTPUTXG is set.
              Default = 6.

ISUNIT=n9     Sets unit for YG output if OUTPUTYG is set.
              Default = 6.

MODELUNIT=n10 Sets save/get model unit. Default = 60.

SMODELN=k1    Sets the model name. Default = 'PISPMODEL'. A maximum
              of 10 characters can be supplied.

MCOMMENTS=(' ',' ')
              Allows the user to set model comments when the model
              is saved. A maximum of 10 lines of at most 80
              characters each is allowed.

BISPEC sentence.

The BISPEC sentence performs various nonlinearity, gaussianity and
martingale tests suggested by Hinich. The form of the BISPEC sentence
in the BTIDEN, BTEST and MARS commands is the same. To save space,
detail for this sentence is only given under the BTIDEN command help
file. If the BISPEC sentence is given with no options or parameters,
gaussianity and nonlinearity tests will be performed using default
settings. The setting

   BISPEC IAUTO ITURNO $

will perform tests for gaussianity and nonlinearity over a grid of
admissible values for the bandwidth.

TRISPEC sentence.

The TRISPEC sentence performs 4th order nonlinearity tests suggested
by Hinich. Further detail on this sentence is listed under the BTIDEN
command.

POLYSPEC sentence.

The POLYSPEC sentence performs various nonlinearity tests suggested
by Hinich within the sample. Further detail on this sentence is
listed under the BTIDEN command.

FORECAST sentence.

The FORECAST sentence allows users to supply observations on the
right-hand side variables outside the sample period so that forecasts
can be calculated. The same number of observations must be supplied
for all right-hand side series. Due to the way that splines are
calculated, it is imperative that the values supplied for the x
variables NOT lie outside the ranges of the original data. Values of
the right-hand side variables can be read from FOREIUNIT or entered
directly by variable name.

Forecast sentence options.

FUREWIND      Rewinds forecast OUTPUT unit FOREOUNIT.
NOINTERPOL    The default setting is to interpolate the XG(N,M,IT)
              values before the products indicated in the discussion
              of OUTPUTXG are formed. If NOINTERPOL is specified,
              no interpolation is performed. In general, the larger
              NG, the less interpolation is needed. Because forecasts
              are produced from the XG matrix, "forecasts" computed
              from actual in-sample values will differ from the
              values implied by the "residuals" for the same
              observations.

NOCORNER      The default is to set right-hand side variables that
              lie outside their ranges in the training dataset to
              their upper or lower bounds, give a message and
              calculate a forecast. If NOCORNER is set, a message is
              given and the forecast is not calculated.

Forecast sentence parameters.

FOREIUNIT=n1  Sets forecast input unit. If this parameter is set,
              forecast values cannot be entered directly. The number
              of forecasts produced equals the number of observations
              on the SCA FSAVE file. The data on this file must be in
              SCA FSAVE format.

FOREOUNIT=n2  Sets forecast output unit. If this parameter is set,
              forecasts will be placed on the indicated unit using
              the SCA FSAVE format.

FNAME=k1      Sets forecast variable name. Default = 'FORECAST'.

SCAFNAME=k2   Sets SCA FSAVE file name for input forecasts.
              Default = 'INFORE'.

SCAFONAME=k3  Sets SCA FSAVE file name for output forecasts.
              Default = 'PISPFORE'.

Direct forecast input syntax options.

   Xvar1(r1, r2, r3,.....) ...... Xvark(r1, r2, r3,.....)

Sample job using a PISPLINE model with 3 exogenous variables. Hinich
tests are performed and forecasts for 4 periods are produced.

   b34sexec pispline graphres ires $
     model y = x1 x2 x3$
     bispec iauto iturno $
     trispec $
     forecast x1( 10. 11. 9. 7. )
              x2(.55 .77 .88 .66)
              x3(.01 .11 .15 .70 )$
     b34seend$

The job TESTMARS_P in b34stest.mac illustrates rereading models and
other advanced capability. The job SIMPISP illustrates simulation of
PISPLINE forecasts.
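
Illustrative sketch of a forecast built from XG.

The discussion of OUTPUTXG, NOINTERPOL and NOCORNER describes a
forecast as a sum over products of the per-variable transforms stored
on an NG-point grid. The Python sketch below illustrates that idea
only; it is not the B34S code. It assumes, without confirmation from
this help file, that the grid for each variable spans its minimum to
its maximum in the training data, that interpolation is linear, that
NOINTERPOL corresponds to taking the nearest grid point, and that
CENTER is added back to the sum. All function and argument names are
hypothetical.

```python
def grid_position(x, lo, hi, ng, corner=True):
    """Fractional position of x on an ng-point equispaced grid over [lo, hi].

    Assumption: out-of-range x is clamped to the nearest bound when
    corner=True (the documented default); with corner=False, None is
    returned to signal that no forecast is made (NOCORNER-like).
    """
    if x < lo or x > hi:
        if not corner:
            return None
        x = min(max(x, lo), hi)
    return (x - lo) / (hi - lo) * (ng - 1)


def transform_value(column, pos, interpolate=True):
    """Value of one variable's transform at fractional grid position pos.

    Assumption: linear interpolation between neighbouring grid points,
    or the nearest grid point when interpolate=False (NOINTERPOL-like).
    """
    if not interpolate:
        return column[int(pos + 0.5)]
    i = min(int(pos), len(column) - 2)  # left neighbour, kept in range
    frac = pos - i
    return column[i] + frac * (column[i + 1] - column[i])


def forecast(xg, ranges, x, center=0.0, interpolate=True, corner=True):
    """Sum over products of the product over variables of the transforms.

    xg[k][j][i] plays the role of XG(i,j,k): product k, variable j,
    grid point i. ranges[j] is the assumed (min, max) of variable j.
    """
    total = 0.0
    for product in xg:
        term = 1.0
        for column, (lo, hi), xj in zip(product, ranges, x):
            pos = grid_position(xj, lo, hi, len(column), corner)
            if pos is None:
                return None
            term *= transform_value(column, pos, interpolate)
        total += term
    return center + total


# Toy model: one product, two variables, 3 grid points each
# (far below the documented minimum NG of 20, to keep the example short).
toy_xg = [[[0.0, 1.0, 2.0], [1.0, 2.0, 3.0]]]
toy_ranges = [(0.0, 2.0), (0.0, 1.0)]
print(forecast(toy_xg, toy_ranges, (1.5, 0.25), center=10.0))
```

With a coarse grid such as this one, the interpolated and
non-interpolated results can differ noticeably, which is consistent
with the remark under NG that a larger NG gives a better forecast
approximation.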