44.0 REG Command

The REG command allows estimation of OLS models where lags of the
variables do not have to be explicitly set. Unlike the REGRESSION
command, the REG command loads data into memory. The size of the
largest problem is limited by the size of memory that can be
allocated. The REG command allows panel data models which are not
rectangular to be estimated by use of an identifier variable that
may be a character variable. The REG command allows saving of the
estimated coefficients, t scores, e'e, DW, number of observations
and R**2 in a DMF file along with an identifier variable. Residuals
can also be saved in an SCA FSAV file.

The REG command allows estimation of models for the complete sample
in two important situations: without (usual case) and with panel
data. With panel data, B34S will automatically handle the deletion
of the appropriate number of observations to handle lags as the
estimation moves across the panel.

The panel options in the REG command are NOT designed to perform
fixed and random effects error component models. This can be done in
the ECOMP command or by using the SAS or RATS systems. The REG panel
options are designed to investigate systematically the pattern of
the coefficients and t's inside a panel before any fixed or random
effects models are estimated. Balanced error component models can be
estimated with the ECOMP command or, for fixed effects, with the
panel_lib routines in staging2.mac. Often times a "rogue" panel will
result in a substantial bias being placed on the final estimated
models that is unknown to the user. In the example shown below the
Grunfeld data is used to illustrate this issue. The REG command
first estimates models and saves results in a formatted dmf file and
a fsave file. The dmf file is read back into b34s and selected
series are loaded and passed to the MATRIX command for further
processing.
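The idea of inspecting coefficients and t scores panel by panel can
be sketched outside B34S. The Python fragment below is only an
illustration under stated assumptions: the function name panel_ols,
the firm labels and the synthetic data are all hypothetical, and the
code is not the REG implementation. It fits a separate OLS model with
a constant for each value of an identifier variable, so a panel whose
slope differs sharply from the others stands out.

```python
import numpy as np

def panel_ols(y, X, key):
    """Fit a separate OLS model (constant added) for each value of key.

    Returns a dict mapping each key value to (coefficients, t statistics).
    Hypothetical helper for illustration only.
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    key = np.asarray(key)
    results = {}
    for k in np.unique(key):
        rows = key == k
        Xk = np.column_stack([np.ones(rows.sum()), X[rows]])
        yk = y[rows]
        beta, *_ = np.linalg.lstsq(Xk, yk, rcond=None)
        resid = yk - Xk @ beta
        dof = Xk.shape[0] - Xk.shape[1]          # n - k within the panel
        sigma2 = resid @ resid / dof
        cov = sigma2 * np.linalg.inv(Xk.T @ Xk)  # classical OLS covariance
        results[str(k)] = (beta, beta / np.sqrt(np.diag(cov)))
    return results

# Synthetic data: three made-up firms with 20 observations each.
rng = np.random.default_rng(0)
firm = np.repeat(["a", "b", "c"], 20)
x = rng.normal(size=60)
slope = np.where(firm == "c", -3.0, 2.0)   # firm "c" plays the rogue panel
y = 1.0 + slope * x + 0.1 * rng.normal(size=60)

res = panel_ols(y, x.reshape(-1, 1), firm)
for k, (beta, t) in res.items():
    print(k, np.round(beta, 2), np.round(t, 1))
```

Sorting or plotting the per-panel slopes, as example 2 does with the
saved BETA and TSTAT series, then shows which panels drive a pooled
estimate.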
As noted, one and two way fixed effects models can be estimated for
balanced panel data with the routines in the panel_lib member of
staging2.mac for modest sized datasets. These commands illustrate
the calculations involved. For unbalanced datasets the RATS PREGRESS
command offers both fixed and random effects options and is quite
fast. Unlike fixed effects models, there are a large number of ways
to estimate random effects models. The SAS TSCSREG procedure
provides a number of these options for balanced data. A number of
sample jobs are shown.

If high accuracy and PC models are desired, use the QR command or
use the call olsq( ) command in the MATRIX command which can do QR
estimation without setting up lags.

If very large datasets are run and specialized diagnostic tests are
desired, use REGRESSION.

If simple regressions are desired where recursive residuals are
needed, use the RR command or the RR option in the call olsq( )
matrix command. The RR command can also run a simple OLS model where
all the variables are explicitly built.

The ROBUST command can be used to test models with L1, MINIMAX and
OLS. Similar capability is in call olsq( ). The ROBUST command is
similar to the REG command except that the TEST sentence is not
supported. If just OLS is desired, the REG and REGRESSION commands
should be used unless the matrix command is employed.

Under the MATRIX command call olsq( ) allows RR models, minimax and
L1 models as well as GLS with various options. The advantage is that
the output of the estimation can easily be further processed with
the capability built into the matrix programming language.

The general form of the REG command is:

   B34SEXEC REG options parameters$
      MODEL    Yvar = Xvar1 Xvar2 $
      TEST     xvar $
      BISPEC   options parameters$
      TRISPEC  options parameters$
      POLYSPEC options parameters$
      REVERSE  options parameters$
      B34SEEND$

REG options:

   NOINT     - Suppress constant.

   PRINT     - Print panel OLS results.
   CPRINT    - Prints panel OLS results without a new page for each
               panel to save space.

   RESIDUALP - List residuals with lineprinter plot for complete
               sample.

   PANEL     - Data is in panel form. If data is in rectangular
               form, NREG must be set or SUBKEY must be set. It is
               assumed that the data is in the form of observations
               for subset1, subset2 ... If this is not the case, use
               the SORT command to put the data in the correct form
               prior to running the REG command. If the data is NOT
               in rectangular form, SUBKEY must be used to delineate
               the panels.

   SAVERES   - Saves the residuals in an SCA FSAVE file on unit
               FSAVUNIT. For the complete sample the FSAV dataset
               name is RESIDUAL. For panels, the default is
               RES0001... The residual is saved as RESIDUAL along
               with OBSNUM, Y and YHAT. The file is not rewound
               prior to saving. Use the SCAINPUT command to rename
               these files. The keyword SPUNCHRES can be used in
               place of SAVERES. For panel data SAVERES takes a
               great deal of time doing I/O.

   SAVECOEF  - Saves panel coefficients and associated statistics in
               a DMF file. The default dataset name is PCOEF. The
               panel regression number is saved in IDENT. If a
               SUBKEY is specified, it is saved. The DMF unit is
               COEFUNIT. The coefficients are saved with names
               BETA0001 BETA0002 BETA0003. The t scores are saved
               with names TSTAT001. Linkages between these names,
               which are needed because of the possibility of lags,
               and the underlying variables are listed in variable
               labels. e'e, R**2, N, the variance of Y and the
               Durbin-Watson values are saved with names EPE, RSQ,
               NOOB, VARY and DW.

   ONLYSUB   - Specifies that only subsample regressions will be
               calculated. This option is only used with PANEL data
               and will save space since the complete dataset will
               not be loaded.

   ONLYFULL  - Specifies that OLS models on the complete dataset are
               to be run for panel data but that panel regressions
               are not going to be run.

   DMF       - Sets the DMF save format as UNFORMATTED. This is the
               default. This can also be set as FILEF=DMF.
               Note that if DMF is used, the DMF file must be
               allocated as unformatted.

   FDMF      - Sets the DMF save format as FORMATTED. This makes a
               more portable file but requires more time and makes
               files that are 3 times bigger. This can also be set
               as FILEF=FDMF.

   ACOV      - Same as WHITE.

   WHITE     - If set, uses the White (1980) formula to calculate
               the SE. For further detail see Greene (2003) page
               199. This option is similar to the ACOV option on SAS
               PROC REG or the ROBUSTERRORS option on the RATS
               command LINREG. The keyword ACOV can be used in place
               of WHITE. Davidson-MacKinnon (2004) pages 199-200
               show alternative formulas. These are implemented in
               the matrix command call olsq as :white1, :white2 and
               :white3. See also Greene (2003) page 220.

Note: The REG command will start writing DMF files at the current
      position of the file. If you wish to add to datasets already
      in the DMF file, use the POSITION( ) parameter which is
      documented in the OPTIONS command. If the desire is to reuse
      the file, the CLEAN( ) command should be used.

REG parameters:

   IBEGIN=n1          - If the dataset is not panel, sets the first
                        observation to use in the analysis. If the
                        dataset is a panel, sets the first
                        observation to use in the panel.

   IEND=n2            - If the dataset is not panel, sets the last
                        observation to use in the analysis. If the
                        dataset is a panel, sets the last
                        observation to use in the panel.

   NREG=n3            - Number of observations in each region (sub
                        regression).

   SUBKEY=Vname       - Sets the variable, possibly character, that
                        identifies the subregression.

   DMFUNIT=n4         - Sets the DMF coefficient save unit number.
                        The default is 60.

   DMFNAME=k          - Sets the DMF coefficient dataset name. The
                        default is PCOEF. The keyword DMFMEMBER can
                        be used in place of DMFNAME. Up to 10
                        characters can be specified.

   Note: The following parameters set frequency and starting dates
         for DMF files.

   SETFREQ(R)         - Sets base frequency. 1. = annual data.
                        .1 = data once per decade. R can be set as
                        real OR integer.
                        If SETFREQ is passed -1, the Julian internal
                        date is reset to unused.

   SETYEAR(NN)        - Sets base year for annual data. Frequency
                        assumed = 1.

   SETMY(M1,Y1)       - Sets base year for monthly data. Frequency
                        assumed = 12.

   SETQY(Q1,Y1)       - Sets base year for quarterly data. Frequency
                        assumed = 4.

   SETDMY(D1,M1,Y1)   - Sets base year for daily data. Frequency
                        assumed = 365.

   FSAVUNIT=n5        - Sets the SCA FSAV residual save unit. The
                        default is 44. DMFUNIT and FSAVUNIT cannot
                        be set to the same unit.

   FSAVNAME=k         - Sets the SCA FSAVE residual dataset name.
                        For the complete sample, the name is
                        RESIDUAL. For panels the default is RES0001.
                        The keyword FSAVMEMBER can be used in place
                        of FSAVNAME.

   CCOMMENTS(' ',' ') - Sets comments for the DMF file saving
                        coefficients. Any number of 72 col comments
                        can be supplied.

   RCOMMENTS(' ',' ') - Sets comments for the FSAV file saving
                        residuals. Any number of 72 col comments can
                        be supplied.

The MODEL sentence is required.

If PANEL is not in effect, the Hinich tests which are called by the
BISPEC, TRISPEC and POLYSPEC sentences can be used.

MODEL sentence.

   MODEL Y = X1 X2 X3 X4$

The MODEL sentence lists the left hand variable and the right hand
side variables. Unless NOINT is supplied, a constant will be
automatically added to the model.

In addition to the usual specification, the MODEL sentence in the
REG command allows the lags to be set in the command. The command

   MODEL Y = Y{1} X{0 to 3} Z{1}$

is the same as

   MODEL Y = LAGY X LAG1X LAG2X LAG3X LAG1Z$

except that in the former case the lag variables do not have to be
built. The advantage of this setup is that the 98 variable limit of
B34S is effectively lifted if the added variables are lags.

TEST sentence

The TEST sentence allows the user to specify coefficients to be set
to zero so that exclusion restrictions can be tested. There can be
up to 99 TEST sentences.
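An exclusion restriction of this kind is usually tested by comparing
the sum of squared residuals of the full model (u'u) with that of the
model refit without the tested variables (v'v), giving
F(g,n-k) = ((v'v - u'u)/g) / (u'u/(n-k)) for g restrictions, n
observations and k coefficients. The Python fragment below is a
minimal sketch of that arithmetic on made-up data; the helper names
ssr and exclusion_f are hypothetical and nothing here is B34S code.
It also checks the familiar special case that, for a single excluded
variable, F equals the square of that variable's t statistic.

```python
import numpy as np

def ssr(y, X):
    """Sum of squared OLS residuals of y regressed on the columns of X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ beta
    return float(e @ e)

def exclusion_f(y, X, drop):
    """F statistic for setting the coefficients of columns `drop` to zero."""
    n, k = X.shape
    keep = [j for j in range(k) if j not in drop]
    uu = ssr(y, X)            # unrestricted sum of squares, u'u
    vv = ssr(y, X[:, keep])   # restricted sum of squares, v'v
    g = len(drop)             # number of restrictions
    return ((vv - uu) / g) / (uu / (n - k))

# Made-up data: a constant plus two regressors, the second irrelevant.
rng = np.random.default_rng(1)
n = 50
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X @ np.array([1.0, 0.5, 0.0]) + rng.normal(size=n)

# t statistic of the first regressor from the full model.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
e = y - X @ beta
cov = (e @ e / (n - X.shape[1])) * np.linalg.inv(X.T @ X)
t1 = beta[1] / np.sqrt(cov[1, 1])

f1 = exclusion_f(y, X, [1])        # one restriction
f12 = exclusion_f(y, X, [1, 2])    # two joint restrictions
print(f1, t1 ** 2, f12)
```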
Given the setup

   B34SEXEC REG$
      MODEL Y = LAGY X LAG1X LAG2X LAG3X LAG1Z$
      TEST X LAG1X$
      B34SEEND$

the TEST sentence tests the joint exclusion restriction of setting
the coefficients of X and LAG1X to zero. If the sentence

   TEST X$

were given, the sqrt of the F value would be the absolute value of
the t of the X coefficient.

Let u be the residual vector of the original model, v the residual
vector of the restricted model, g the number of restrictions, n the
number of observations and k the number of coefficients in the
original model. Then

   F(g,n-k) = ((v'v - u'u)/g) / (u'u/(n-k))

BISPEC sentence.

The BISPEC sentence performs various nonlinearity, gaussianity and
martingale tests suggested by Hinich. The form of the BISP sentence
in the BTIDEN, BTEST and MARS commands is the same. To save space,
detail for this sentence is only given under the BTIDEN command help
file. If the BISPEC sentence is given with no options or parameters,
gaussianity and nonlinearity tests will be performed using default
settings. The setting

   BISPEC IAUTO ITURNO $

will perform tests for gaussianity and nonlinearity over a grid of
admissible values for the bandwidth.

TRISPEC sentence

The TRISPEC sentence performs 4th order nonlinearity tests suggested
by Hinich. Further detail on this sentence is listed under the
BTIDEN command.

POLYSPEC sentence

The POLYSPEC sentence performs various nonlinearity tests suggested
by Hinich within the sample. Further detail on this sentence is
listed under the BTIDEN command.

REVERSE sentence

The REVERSE sentence performs various time reversibility tests
suggested by Hinich and Rothman. Further detail on this sentence is
listed under the BTIDEN command.

References, especially useful when using TSCSREG:

Baltagi, B. H. and Chang, Y. (1994), "Incomplete Panels: A
Comparative Study of Alternative Estimators for the Unbalanced
One-way Error Component Regression Model," Journal of Econometrics,
62(2), 67-89.

Da Silva, J.G.C. (1975), "The Analysis of Cross-Sectional Time
Series Data," Ph.D. dissertation, Department of Statistics, North
Carolina State University.

SAS Institute Inc.
(1979), SAS Technical Report S-106, TSCSREG: A SAS Procedure for the
Analysis of Time-Series Cross-Section Data, Cary, NC: SAS Institute
Inc.

Fuller, W.A. and Battese, G.E. (1974), "Estimation of Linear Models
with Crossed-Error Structure," Journal of Econometrics, 2, 67-78.

Hausman, J.A. (1978), "Specification Tests in Econometrics,"
Econometrica, 46, 1251-1271.

Hausman, J.A. and Taylor, W.E. (1982), "A Generalized Specification
Test," Economics Letters, 8, 239-245.

Hsiao, C. (1986), Analysis of Panel Data, Cambridge: Cambridge
University Press.

Parks, R.W. (1967), "Efficient Estimation of a System of Regression
Equations when Disturbances Are Both Serially and Contemporaneously
Correlated," Journal of the American Statistical Association, 62,
500-509.

Baltagi, B. H. (2005), Econometric Analysis of Panel Data, Third
Edition, Hoboken, NJ: John Wiley and Sons.

Examples.

1. User wants to run a regression on the complete sample and do
   nonlinearity tests. Autocorrelations of the residuals are
   calculated using the ACF( ) parameter of the BISPEC sentence.

   b34sexec reg$
      model y= x z{1 to 20}$
      bispec iturno iauto acf(24)$
      b34seend$

2. User wants to run regressions on subsamples that are marked by
   the variable FN. Output of the regressions is saved in the DMF
   file myruns.dmf with the name runone. A formatted DMF file is
   being used and any data in the file is erased prior to the run.
   The saved betas are reread into B34S and the results are sorted
   and listed. Residuals are also saved.
   b34sexec options ginclude('panel_data.mac') member(grunfeld);
            b34srun;

   b34sexec options open('myruns.dmf') unit(60) disp=unknown$
            b34seend$
   b34sexec options clean(60)$ b34seend$

   b34sexec options open('myres.fsv') unit(44) disp=unknown$
            b34seend$
   b34sexec options clean(44)$ b34seend$

   /; simple case shows list
   /;
   b34sexec reg panel subkey=fn print;
      model invest = f c;
      b34srun;

   b34sexec reg panel subkey=fn dmfunit=60 dmfmember=runone fdmf
                fsavunit=44 fsavname=rone savecoef saveres$
      model invest = f c;
      b34srun;

   /; browse dmf;
   b34sexec dmf;
      browse listnames;
      b34srun;

   b34sexec data filef=fdmf dmfmember=runone unit(60)$
      input ident beta0001 beta0002 tstat001 tstat002
            rsq epe dw n$
      b34seend$

   b34sexec sort$
      by beta0001$
      b34seend$

   b34sexec list $ b34seend$

   b34sexec matrix;
   call loaddata;
   call graph(beta0001 :heading 'Plot of f coef');
   call graph(beta0002 :heading 'Plot of c coef');
   call graph(tstat001, tstat002 :nolabel
              :heading 'Plot of t stats inside the panels');
   b34srun;

3. User wants to run a regression on the complete sample and test if
   gasout{5 to 6} gasin{3} and gasin{1 to 6} are significant using
   three tests.

   b34sexec options ginclude('gas.b34'); b34srun;

   b34sexec reg$
      model gasout= gasin{1 to 6} gasout{1 to 6}$
      test gasout{5 to 6} $
      test gasin{3} $
      test gasin{1 to 6} $
      b34seend$

4. Error Component Example using a range of software.
%b34slet runb34s1=1;
%b34slet runb34s2=1;
%b34slet runrats =1;
%b34slet runsas =1;

b34sexec options ginclude('panel_data.mac') member(grunfeld);
         b34srun;

/$ Shows ECOMP and REG

%b34sif(&runb34s1.ne.0)%then;
b34sexec reg $
   model invest = f c$
   b34srun;

b34sexec ecomp regfirst bothp nreg=10 nper=20$
   model invest = f c$
   b34srun$

b34sexec ecomp regfirst bothp iswith nreg=10 nper=20$
   model invest = f c$
   b34srun$
%b34sendif;

/;
/; Matrix Command Options
/;

%b34sif(&runb34s2.ne.0)%then;
b34sexec matrix;
call loaddata;
call load(panel_lib :staging);
call olsq(invest f c :print :savex);
/;
/; hold data
/;
call echooff;
%xx=%x;
%yy=%y;
/; call names;
itest1=1;
itest2=1;
itest3=1;

if(itest1.eq.1)then;
call print(' ':);
call print('+++++++++++++++++++++++++++++++++++++++++++++++':);
call print(' ':);
call panel2fe(%yy,%xx,10,20,%ynew,%xnew,2);
call print('Two Way Fixed Effects Estimator panel2fe=2':);
call deletecol(%xnew,nocols(%xnew));
call olsq(%ynew %xnew :print :noint);
endif;

if(itest2.eq.1)then;
call print(' ':);
call print('+++++++++++++++++++++++++++++++++++++++++++++++':);
call print(' ':);
call print('Data sorted Before Time fixed effect':);
call panel_t( %yy,%xx,10,20,%ynew1,%xnew1);
call panel2fe(%ynew1,%xnew1,20,10,%ynew,%xnew,0);
call print('Fixed Effects Estimator time correction panel2fe=0':);
call deletecol(%xnew,nocols(%xnew));
call olsq(%ynew %xnew :print :noint);

call print(' ':);
call print('+++++++++++++++++++++++++++++++++++++++++++++++':);
call print(' ':);
call print('Data not sorted Time Fixed Effect. panel_t not used':);
call panel2fe(%yy,%xx,10,20,%ynew,%xnew,1);
call print('Fixed Effects Estimator time correction panel2fe=1':);
call deletecol(%xnew,nocols(%xnew));
call olsq(%ynew %xnew :print :noint);
endif;

if(itest3.eq.1)then;
call print(' ':);
call print('+++++++++++++++++++++++++++++++++++++++++++++++':);
call print(' ':);
call print('Using panel2fe code':);
call panel2fe(%yy,%xx,10,20,%ynew,%xnew,0);
call print('Fixed Effects Estimator individual lsvd':);
call deletecol(%xnew,nocols(%xnew));
call olsq(%ynew %xnew :print :noint);

call print(' ':);
call print('+++++++++++++++++++++++++++++++++++++++++++++++':);
call print(' ':);
call print('Using panel_fe code':);
call panel_fe(%yy,%xx,10,20,%ynew,%xnew);
call print('Fixed Effects Estimator individual lsvd':);
call deletecol(%xnew,nocols(%xnew));
call olsq(%ynew %xnew :print :noint);
endif;

b34srun;
%b34sendif;

%b34sif(&runrats.ne.0)%then;
b34sexec options open('rats.dat') unit(28) disp=unknown$ b34srun$
b34sexec options open('rats.in') unit(29) disp=unknown$ b34srun$
b34sexec options clean(28)$ b34srun$
b34sexec options clean(29)$ b34srun$

b34sexec pgmcall$
   rats passasts
   pcomments('* ',
             '* Data passed from B34S system to RATS',
             '* ',
   "display @1 %dateandtime() @33 ' Rats Version ' %ratsversion()"
             '* ') $
   PGMCARDS$
*
cal(panel=20) 1935 1 1
* print
linreg invest / resids
# constant f c
pstats(tests,effect=indiv) resids
pstats(tests,effect=time ) resids
pstats(spread) resids
pregress(method=fixed,effects=both) invest
# f c
pregress(method=fixed,effects=individual) invest
# f c
pregress(method=fixed,effects=time) invest
# f c
pregress(method=fixed,effects=both) invest
# constant f c
pregress(method=random,effects=individual) invest
# constant f c
pregress(method=random,effects=time) invest
# constant f c
pregress(method=fd,effects=individual) invest
# constant f c
pregress(method=fd,effects=time) invest
# constant f c
* pregress(method=sur) invest
* # constant f c
b34sreturn$
b34srun $

b34sexec options close(28)$ b34srun$
b34sexec options close(29)$ b34srun$

b34sexec options
/$     dodos(' rats386 rats.in rats.out ')
       dodos('start /w /r rats32s rats.in /run')
       dounix('rats rats.in rats.out')$
       B34SRUN$

b34sexec options npageout
       WRITEOUT('Output from RATS',' ',' ')
       COPYFOUT('rats.out')
       dodos('ERASE rats.in','ERASE rats.out','ERASE rats.dat')
       dounix('rm rats.in','rm rats.out','rm rats.dat')
       $
       B34SRUN$
%b34sendif;

%b34sif(&runsas.ne.0)%then;
b34sexec options open('testsas.sas') unit(29) disp=unknown$ b34srun$
b34sexec options clean(29) $ b34seend$

b34sexec pgmcall idata=29 icntrl=29$
   sas $
* sas commands next ;
pgmcards$
proc reg;
   model invest= f c;
run;
proc tscsreg;
   id fn yr;
   model invest = f c /fixone ;
   model invest = f c /fixtwo ;
   model invest = f c /ranone ;
   model invest = f c /rantwo ;
   model invest = f c /fuller ;
   model invest = f c /parks ;
   model invest = f c /dasilva;
   model invest = f c /dasilva m=1;
   model invest = f c /dasilva m=2;
run;
b34sreturn$
b34srun $

b34sexec options close(29)$ b34srun$

/$ the next card has to be modified to point to sas location
/$ be sure and wait until sas gets done before letting b34s resume
/$ ***************************************************************
b34sexec options dodos('start /w /r sas testsas' )
                 dounix('sas testsas' )
                 $
         b34srun$

b34sexec options npageout noheader
       writeout(' ','output from sas',' ',' ')
       writelog(' ','output from sas',' ',' ')
       copyfout('testsas.lst')
       copyflog('testsas.log')
       dodos('erase testsas.sas','erase testsas.lst',
             'erase testsas.log')
       dounix('rm testsas.sas','rm testsas.lst','rm testsas.log')
       $
       b34srun$

b34sexec options header$ b34srun$
%b34sendif;
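
As a point of reference for what the individual fixed effects
estimator in example 4 computes, the Python fragment below sketches
the generic textbook calculation on made-up balanced data with 10
units and 20 periods. It is an assumption-laden illustration, not the
panel_lib, ECOMP, RATS or SAS code, and the names within_estimator
and lsdv are hypothetical. It removes each unit's mean from y and the
regressors and runs OLS on the demeaned data, then confirms that the
slopes match a regression with one dummy variable per unit.

```python
import numpy as np

def within_estimator(y, X, groups):
    """Slopes from OLS after removing each group's mean from y and X."""
    yd = np.array(y, dtype=float)
    Xd = np.array(X, dtype=float)
    for g in np.unique(groups):
        rows = groups == g
        yd[rows] -= yd[rows].mean()
        Xd[rows] -= Xd[rows].mean(axis=0)
    beta, *_ = np.linalg.lstsq(Xd, yd, rcond=None)
    return beta

def lsdv(y, X, groups):
    """Slopes from OLS of y on X plus one dummy variable per group."""
    dummies = (groups[:, None] == np.unique(groups)[None, :]).astype(float)
    Z = np.column_stack([X, dummies])
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return beta[: X.shape[1]]

# Made-up balanced panel: 10 units, 20 periods, two regressors.
rng = np.random.default_rng(2)
unit = np.repeat(np.arange(10), 20)
alpha = rng.normal(scale=5.0, size=10)         # unit-specific intercepts
X = rng.normal(size=(200, 2))
y = alpha[unit] + X @ np.array([1.5, -0.7]) + 0.1 * rng.normal(size=200)

b_within = within_estimator(y, X, unit)
b_lsdv = lsdv(y, X, unit)
print(np.round(b_within, 3), np.round(b_lsdv, 3))
```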