3.0 REGRESSION Command
The B34S REGRESSION command supports a comprehensive regression
option which includes GLS models and various BLUS and other tests of
equation specification. If only OLS is desired, see the REG, RR or QR
commands. The REG command gives a number of options, especially with
pooled data. If a large number of lags is desired, the REG command
does not require that the lags be explicitly built. Balanced error
component models can be estimated with the ECOMP command or, for fixed
effects models, with the panel_lib routines in staging2.mac.
A basic reference for some of the equation specification testing
in the REGRESSION command is:
- Theil, H., Principles of Econometrics, Wiley 1971.
Form of the REGRESSION command:
B34SEXEC REGRESSION options parameters $
MODEL Yvar = Xvar1 Xvar2 Xvar3 $
COMMENT=(' ') $
ORDER Varn1 Varn2 ... $
RA options parameters $
DM var1 var2 $
B34SEEND$
The MODEL sentence is required.
REGRESSION sentence options.
NOINT - Will estimate model without an intercept, otherwise
an intercept will be used.
STEPWISE - Gives stepwise output. If stepwise is not given, only
the last step is printed.
RESIDUALA - Gives residual analysis. This is recommended.
RESIDUALP - Gives residual analysis and plots. If GLS and/or BLUS
residuals are calculated, the options NOBLUSPLOT and
NOGLSPLOT will turn off plots for BLUS and GLS
residuals respectively.
NOBLUSPLOT - Turns off BLUS residual plots if RESIDUALP has been
set.
NOGLSPLOT - Turns off GLS plots if RESIDUALP has been set.
PUNCHRES - Gives residual analysis and plot and punches
on unit 37 RESIDUAL, NPROB, NSUB, KOUNT in format
(E15.8,5X,3I5)
SPUNCHRES - Places residuals on unit 44 in SCA FSAVE
format to a file with name RESIDUAL. Series listed
are OBSNUM, Y, YHAT, RESIDUAL. NPROB and NSUB are
listed as comments in the file. File is not rewound
prior to saving. Use SCAINPUT to rename this file if
SPUNCHRES is used for subsequent REGRESSION
commands. If this is not done, only the first file will be
read. The regression model is listed in comment cards.
ZPUNCHRES - This option is no longer supported.
MANYDIGITS - Will give extra digits for regression coefficient
printing.
PCOEF - Will output coefficients on unit 37. Note: MANYDIGITS
option must not have been set if PCOEF is set. Format
used is: 'REGRESSION COEFFICIENTS',I5,2I6 which passes
NPROB,NSUB,IGLS. Next is placed the subheader card
YNAME ICODE VAL1 VAL2 ESS TSS DFDR TDF NPROB NSUB
using format (A8,I3,4E12.5,I4,I7,I3,I6) where ICODE=0,
VAL1=VAL2 = 0.0, ESS = explained sum of squares,
TSS = total sum of squares, DFDR = degrees of freedom
regression, TDF = total degrees of freedom.
Subsequent cards contain XNAME ICODE COEFF SE PCOR
ELAST ID1 ID2 NPROB NSUB using same format. ID1=ID2=0.
If multiple REGRESSION commands have the PCOEF option,
all coefficients will be saved in the same file. NPROB
and NSUB can be used to determine which coefficients
go with which regression. If the constant is forced
into the regression the values of PCOR and ELAST
will be set to 0.0. After the coefficients are
saved, another header/trailer card is placed in the
file. By the use of header/trailer cards unit 37
can concurrently be used to save coefficients and
covariance matrices for many regressions in one
file. Users are encouraged to see the GENMOD
command for an example where the unit 37 file is
used.
PCOV - Will output covariance matrix of regression
coefficients on unit 37. Header and trailer card of
the form 'id. text' NPROB,NSUB,IGLS using format
('COV. MATRIX OF COEF ',I5,2I6)
The covariance matrix is punched by rows in
lower triangular form using format (3G24.16).
NOCOV - Will delete covariance matrix of regression
coefficients.
BRTEST - Prints sum of adjusted residuals, mean adjusted
residual, and other tests useful in BLUS analysis.
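The unit 37 card layouts described under PCOEF and PCOV can be read back
outside B34S. The Python sketch below is illustrative only and is not part
of B34S. It assumes that each coefficient card is a single fixed-width
record laid out exactly as (A8,I3,4E12.5,I4,I7,I3,I6) and that the
covariance values written with (3G24.16) are separated by blanks. The
function names parse_coef_card, read_cov_values and
lower_triangle_to_symmetric are hypothetical helpers introduced for this
example.

```python
# Field widths implied by the documented format (A8,I3,4E12.5,I4,I7,I3,I6).
WIDTHS = [8, 3, 12, 12, 12, 12, 4, 7, 3, 6]
NAMES = ["name", "icode", "coeff", "se", "pcor", "elast",
         "id1", "id2", "nprob", "nsub"]
REAL_FIELDS = ("coeff", "se", "pcor", "elast")


def parse_coef_card(line):
    """Split one fixed-width coefficient card into a dict of typed fields."""
    line = line.rstrip("\n")
    if len(line) < sum(WIDTHS):
        raise ValueError("card is shorter than the documented format")
    fields, pos = [], 0
    for width in WIDTHS:
        fields.append(line[pos:pos + width])
        pos += width
    record = {"name": fields[0].strip()}
    for key, text in zip(NAMES[1:], fields[1:]):
        record[key] = float(text) if key in REAL_FIELDS else int(text)
    return record


def read_cov_values(lines):
    """Collect the numbers from the covariance records in file order.

    Assumption: values are blank-separated, so a whitespace split is enough.
    """
    return [float(token) for line in lines for token in line.split()]


def lower_triangle_to_symmetric(values, k):
    """Rebuild a k x k symmetric matrix from row-wise lower-triangular values."""
    if len(values) != k * (k + 1) // 2:
        raise ValueError("expected k*(k+1)/2 values")
    matrix = [[0.0] * k for _ in range(k)]
    stream = iter(values)
    for i in range(k):
        for j in range(i + 1):
            matrix[i][j] = matrix[j][i] = next(stream)
    return matrix
```

The same parse_coef_card call can be applied to the subheader card, in
which case the four real fields hold VAL1, VAL2, ESS and TSS and the two
integer fields that follow hold DFDR and TDF. The header/trailer cards use
a different format and must be skipped or handled separately.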
REGRESSION sentence parameters.
MAXGLS = n - Sets maximum order of GLS estimation using Goldberger
method. B34S will estimate up to order n, depending
on value of TOLG. The GLSGRID command is an
alternative to the MAXGLS approach. If Bayes analysis
is performed and the max value of MAXGLS is greater
than 1, it will be reset to 1.
TOLG = r1 - Sets the convergence tolerance for smoothing data.
This tolerance is applied to the maximum of the
absolute value of the first n autocorrelation
coefficients. If this tolerance is not specified, the
program sets it to (1.0 / sqrt(NOOB)) which is an
estimate of the SE of the autocorrelation
coefficient. If in core BLUS analysis is done,
the last set of BLUS residuals are used in
place of the OLS residuals to check if GLS should
be done. If the heteroskedasticity BLUS base was
used, this may not be appropriate. To force GLS
set TOLG=.1E-9.
NTAC=n1 - Set the number of terms in the autocorrelation
function of the residuals. The range is 1-30. The
default is MAX(4,(MAXGLS+1))
BLUS= key - Sets out of core BLUS residual option.
FIRST - Use the first K observations as BLUS base.
MIDDLE - Use middle K observations as the base (this is
the best base to test for heteroskedasticity).
LAST - Use the last K observations as the base.
(N1,N2,...,NK) specifies the BLUS observation
base. User must specify K distinct observations.
BEST - Allow the program to choose from among the K+1
possible adjacent bases by choosing that base
which maximizes the sum of square roots of the
eigenvalues. The K+1 possible bases are
1 First K and last 0 observations.
2 First K-1 and last 1 observations.
.
HET - Choose K observations at equal intervals from the
middle third of the sample.
BOTH - Calculate BLUS residuals first using option 5,
then option 2.
Note: The RA card allows for incore BLUS options
which support sorted data. Currently both
incore and out of core BLUS options have
a limit of 20 variables on the right of
any equation.
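The K+1 adjacent bases considered by BLUS=BEST can be listed explicitly.
The Python sketch below is illustrative only and is not part of B34S. It
enumerates the candidate bases as 1-based observation numbers, following
the pattern "first K-i and last i observations" for i = 0,...,K. It does
not implement the selection rule itself (the sum of square roots of the
eigenvalues). The function name adjacent_blus_bases is a hypothetical
helper.

```python
def adjacent_blus_bases(n_obs, k):
    """Return the K+1 candidate bases as lists of 1-based observation numbers."""
    if not 0 < k < n_obs:
        raise ValueError("need 0 < K < number of observations")
    bases = []
    for n_last in range(k + 1):
        n_first = k - n_last
        first = list(range(1, n_first + 1))                  # first K-i observations
        last = list(range(n_obs - n_last + 1, n_obs + 1))    # last i observations
        bases.append(first + last)
    return bases
```

For example, with 10 observations and K=3 the four candidates are
(1,2,3), (1,2,10), (1,9,10) and (8,9,10).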
TOLL=r2 Checks prospective variables for multicollinearity
with variables presently in the equation (via
inspection of the reduced diagonal element) by
calculating a regression of prospective variables
against all presently included variables. If 1 - this
Rsquare is less than TOLL (whose default = .00001),
the prospective variable will not enter. In addition
a computational error estimate is presented. Users
lower TOLL at their own risk.
EFIN=r3 Minimum F level for inclusion of a variable.
Default=.01.
FOUT=r4 Minimum F level for variables in the equation before
they will be thrown out. Default=.005.
BAYES=key Will give Bayesian regression output.
key=BAYPLOT Will plot Bayesian output.
key=BAYLISTP Will list and plot Bayesian output.
NBE=n2 Sets number of points in plotting grid for Bayes.
Default = 5. Max = 100.
NRO=n3 Sets number of points for plotting Bayes estimate of
p. Default = 0 (marginal is suppressed). Max =100.
RLO=r5 Lower limit of integration for estimating p by
Bayesian methods. Default = p - 4(T-K)**(-.5)
RHI=r6 Upper limit of integration for estimating p by
Bayesian methods. Default = p + 4(T-K)**(-.5)
NRS=n4 # of points in plotting grid for R**2 estimated with
Bayesian methods. Max= 100. If the cumulative
density does not sum to 1.0, increase NRS.
ILDPV=Variable Sets the name of the lagged dependent variable to
be used in the calculation of the corrected DW test.
Note: The Bayesian regression options are very computer intensive and
should be used with care. The R**2 calculation takes time. The
number of points used increases accuracy and computational cost.
If MAXGLS > 0 and BAYES is set, Bayesian analysis will be done
on GLS only and the maximum of MAXGLS will be 1. The Bayes
option is experimental. Users are warned to use caution in the
interpretation and use of the results.
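The default integration limits given for RLO and RHI can be checked by
hand. The Python sketch below is illustrative only and is not part of
B34S. It assumes that p is the point estimate of the autocorrelation
parameter, T the number of observations and K the number of right hand
variables; these readings are not stated in the text above. The function
name default_rho_limits is a hypothetical helper.

```python
import math


def default_rho_limits(p_hat, t_obs, k_vars):
    """Return (RLO, RHI) as p -/+ 4(T-K)**(-.5)."""
    if t_obs <= k_vars:
        raise ValueError("T must exceed K")
    half_width = 4.0 / math.sqrt(t_obs - k_vars)
    return p_hat - half_width, p_hat + half_width
```

With p = .5, T = 104 and K = 4 the half width is 4/sqrt(100) = .4, so the
limits are approximately .1 and .9.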
GLSGRID=n5 Perform a GLS grid search in n5+1 steps between PHO
and PHI. The maximum number allowed is 99. If n5=0,
GLS is performed using PHO.
PHO=r7 Lower limit for GLS grid search. Default=.4. The
maximum number of digits is 3.
PHI=r8 Upper limit for GLS grid search. Default=.95. The
maximum number of digits is 3.
NUMPROB=n6 Sets problem # for identification purposes only.
Valid values must be in the range 0-999.
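The GLS related defaults described above (TOLG, NTAC and the GLSGRID
search points) can be reproduced with a few lines of arithmetic. The
Python sketch below is illustrative only and is not part of B34S. Two
points are assumptions: that the n5+1 grid points are equally spaced
between PHO and PHI, and that GLS proceeds when the largest absolute
autocorrelation exceeds TOLG (inferred from the remark that a very small
TOLG forces GLS). All function names are hypothetical helpers.

```python
import math


def default_tolg(noob):
    """Default TOLG: 1/sqrt(NOOB), an estimate of the autocorrelation SE."""
    if noob <= 0:
        raise ValueError("NOOB must be positive")
    return 1.0 / math.sqrt(noob)


def default_ntac(maxgls):
    """Default NTAC: MAX(4, MAXGLS+1)."""
    return max(4, maxgls + 1)


def exceeds_tolg(autocorrs, tolg):
    """True when the largest absolute autocorrelation is above TOLG."""
    return max(abs(r) for r in autocorrs) > tolg


def gls_grid(n5, pho=0.4, phi=0.95):
    """Return n5+1 search points from PHO to PHI (equal spacing assumed)."""
    if not 0 <= n5 <= 99:
        raise ValueError("GLSGRID must be in the range 0-99")
    if n5 == 0:
        return [pho]  # documented case: GLS is performed using PHO only
    step = (phi - pho) / n5
    return [pho + i * step for i in range(n5 + 1)]
```

For 100 observations the default TOLG is .1, and MAXGLS=5 gives a default
NTAC of 6.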
MODEL sentence.
MODEL Y = X1 X2 X3 $
Where Y is the left hand variable and X1 X2 X3 are right hand variables.
Y, X1, X2 and X3 must have been passed to B34S. The maximum number of right
hand variables = 68.
ORDER sentence.
ORDER Xi Xj $
Specifies the order of variables that must be in the equation. This
option is useful only with the STEPWISE option. If it is desired to
force in the constant, use the name CONSTANT on the ORDER sentence.
COMMENT Sentence.
COMMENT=(' ') $
The COMMENT sentence allows printing of a regression comment. Place
comment (up to 72 characters) between (' and '). The delimiter
character ($) or the keywords B34SEXEC or B34SEEND must not be placed in
the comment. There is no limit on the number of COMMENT sentences.
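The restrictions on comment text can be checked before a setup file is
generated. The Python sketch below is illustrative only and is not part
of B34S. It tests the three rules stated above (length, the $ delimiter
and the two keywords); the keyword test is case-insensitive, which is an
assumption. The function names check_comment and comment_sentence are
hypothetical helpers.

```python
def check_comment(text):
    """Return a list of rule violations for a proposed comment string."""
    problems = []
    if len(text) > 72:
        problems.append("comment is longer than 72 characters")
    if "$" in text:
        problems.append("comment contains the delimiter character $")
    upper = text.upper()
    for keyword in ("B34SEXEC", "B34SEEND"):
        if keyword in upper:
            problems.append("comment contains the keyword " + keyword)
    return problems


def comment_sentence(text):
    """Build a COMMENT sentence, refusing text that breaks a stated rule."""
    problems = check_comment(text)
    if problems:
        raise ValueError("; ".join(problems))
    return "COMMENT=('%s') $" % text
```

For example, comment_sentence('Main Model') returns the sentence used in
the GLS setup shown later in this section.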
RA sentence.
The RA sentence allows calculation of specialized equation
specification tests to test for the correct functional form of the
equation.
Options on the RA sentence.
GRAPH - Graphs the residual against appropriate X variable.
LIST - Lists resorted residual against first X variable only.
LISTA - Lists resorted residual against all X variables.
CROSS - Performs Cross correlation analysis on OLS residuals
only. For this option the number of series listed with
the VARS parameter must be even.
AUTO - Performs Autocorrelation analysis with OLS residuals.
NONONLIN - Turns off the nonlinearity tests for OLS residuals.
Parameters on RA sentence (Required VARS)
VARS=(X1,X2,...,Xk) specifies the variables to test for
misspecification. Max = 8.
RESID=KEY where KEY is set to determine what residuals are to be
used. Options for KEY include.
OUTBLUS = BLUS residuals from REGRESSION out of core BLUS
procedure. OUTBLUS would be used with cross
section data that would not fit in core
OLS = OLS out of core residuals used. This is the
default value for KEY.
ALL = Both OUTBLUS and OLS residuals used.
Note: the following 4 options instruct the RA option to
calculate INCORE BLUS residuals for various tests. These
tests are the most powerful available in the RA option.
For further detail, see Theil (1971) Chapter 5. Currently
there is a limit of 20 variables on the right of any
equation for which incore BLUS tests are requested.
CONVEX = Calculate BLUS residuals for convexity test
using the MVN ratio tests.
HET = Calculate BLUS residuals for heteroskedasticity
test.
PARAB = Calculates the parabola convexity test.
ALLBLUS = Calculates CONVEX, HET and PARAB tests.
DIF=n Sets differencing. This option should only be used if BLUS
residuals are not calculated (RESID=OLS).
n=0 No differencing. This is the default.
n=1 Up to first differencing.
n=2 Up to second differencing.
n=3 Only first differencing.
n=4 Only second differencing.
PERIOD=j Number of periods of autocorrelations and cross
correlations. The max value of PERIOD = 60. If PERIOD is
not set, B34S sets it to the smaller of 60 and (NOOBS/4).
Examples of RA card.
Battery of BLUS tests done on Model
b34sexec regression residuala$
comment=('Incore blus tests done on Model')$
model y = x1 x2$
ra resid=allblus vars=(x1,x2)$
b34seend$
Residuals Autocorrelated
b34sexec regression residuala$
comment=('Autocorrelation analysis of OLS residuals')$
model y = x1 x2$
ra resid=ols auto vars=(x1,x2)$
b34seend$
DM sentence:
The DM sentence is optionally used with the BAYES parameter to
delete the marginals for selected variables. The form of the DM
sentence is DM Var1 Var2 $. A maximum of 30 variables can be specified.
Sample setup.
B34SEXEC REGRESSION$
MODEL Y = X1 X2 X3 $ * runs 3 variable model $
B34SEEND$
More complex setup showing GLS
B34SEXEC REGRESSION MAXGLS=3 STEPWISE$
MODEL Y = X1 X2 X3 X4 $
COMMENT=('Main Model') $
ORDER CONSTANT X1 $
B34SEEND$