Economic Forecasting Questions

posted in: Research Paper | 0
  1. Remind me if Xs are highly correlated (scatterplot, VIF, Correlations) if you have a sign switch, correct the situation by throwing one of the variables out of the model. Consider R-squared or adj R-squared when making the decision.

My Y = Revenue, my company name is Amazon

X variables = Earnings, Revenue, and Industry Revenue.

As shown in the first graph above, The X variables are significant and show some linear pattern however it is not a perfect linear line so we can say it is non-linear. The P- values are zero which means there is an evidence of a relationship between the variables. X variables are also statistically significant and co-related as the correlation matrix table shows the amount of correlation between the variables. Only Industry Revenue and Earnings has a low correlation as it has 0.447.

Comparing my tables I do see a sign switch in the Employment as it has a negative sign (-0.000573).

Interpreting the regression analysis in my second graph, we want there to be a relationship between our X and Y variable. X is the independent variable and Y is our dependent variable. So, the relation between X and Y is if the X changes our Y changes as well. The constants are not statistically significant as the P-values are greater than 1. Let’s assume that the unit is 1, My coefficient in Earnings is 114.5 that means is if my Earnings increases by $1, the Revenue will increase by $114.5. In my second X variable, if my Industry Revenue is increased by $1 then my Revenue also increases by $370.1. Also, for the Employment it has a negative coefficient so that means they will move on the opposite direction. Which means if the company employs more than 1 person the revenue will decrease by 0.000573. There is a very low impact so for the company to see a huge difference in the decrease of the revenue the company will need to hire more than 10,000 employs which would decrease by$5.73.

GDP information and what are the biggest sources of revenue in the economy

WORKSHEET 1

Regression Analysis: Revenue versus Earnings

Regression Equation

Revenue = -67933 + 114.91 Earnings

Coefficients

Term Coef SE Coef T-Value P-Value VIF
Constant -67933 5923 -11.47 0.000  
Earnings 114.91 8.45 13.60 0.000 1.00

Model Summary

S R-sq R-sq(adj) R-sq(pred)
9225.39 67.27% 66.91% 65.08%

Analysis of Variance

Source DF Adj SS Adj MS F-Value P-Value
Regression 1 15742520392 15742520392 184.97 0.000
  Earnings 1 15742520392 15742520392 184.97 0.000
Error 90 7659698627 85107763    
  Lack-of-Fit 84 7637913547 90927542 25.04 0.000
  Pure Error 6 21785079 3630847    
Total 91 23402219018      
Fits and Diagnostics for Unusual Observations
Obs Revenue Fit Resid Std Resid
88 60450 32387 28063 3.10 R
90 52890 34685 18205 2.02 R
91 56580 35145 21435 2.38 R
92 72380 35259 37121 4.12 R

R  Large residual

Durbin-Watson Statistic

Durbin-Watson Statistic = 0.186035

Considering the R-squared and the VIF table I have decided to take my Employment and industry revenue out. There was a low R-squared in all my three X variables when I ran the Regression analysis, so I decided to choose the one with the higher R-Square and R-sq(adj). The highest R-sq is the Earnings so I will be keeping only one X variable. Also, The P-value is O which is good and VIF is one which is low however this is the best model I have comparatively. Also looking at my VIF which is 1 there is no sign of multicollinearity as it is not greater than 5 or 10.

  1. Using the scatter plots you generated, identify any nonlinear relationships between Y and X variables.

Try to correct nonlinearity through transformation (page 233-237). If it works, keep the transformed version of the variable. Otherwise, use the original variable, acknowledge the nonlinearity, and move on to the next test. Use 2 different transformations (ex: Log X, 1/X, X^2 or SQRT(X)).

I ran a transformation to see if my nonlinearity changes. Transforming a data means to change its functional form, so I ran Y and X variable to see if my Scatterplot give me a linear or non-linear pattern. Transformation is one way to keep the information content, but we change the functional form also it is not necessary to always work. However, I do not see drastic changes. I ran Log X and SQRT (X) as my transformation test, but I got same results. The first graph is my Earnings and the second is transformation with Log X and third is SQRT (X). So, I have no choice to acknowledge it and to move on with my non-linearity.

3.Once you correct for nonlinearity and multicollinearity, check for autocorrelation using DW test. Do you have autocorrelation? Correct for autocorrelation if you have any.

Based on my data since I have decided to keep just one of my x variables that is Earnings. The DW = 0.186035 Durbin-Watson Statistic

Durbin-Watson Statistic = 0.186035

In my 90 row I have lower bound of 1.64 and the upper bound 1.69. My DW is lower than the lower bound based on my decision rule I am going to reject the null and I have an auto correlation. So now I am attempting to fix the auto correlation so I will have one more coefficient. I am going to add a lag value of Yt to attempt to replicate assuming it is coming from the Yt.

So my row is 90 and K=1DL

WORKSHEET 1

Regression Analysis: Revenue (yt) versus Earnings, Revenue(yt-1), trend

Method

Rows unused 1

Regression Equation

Revenue (yt) = -8952 + 16.4 Earnings + 1.0062 Revenue(yt-1) – 37 trend

Coefficients

Term Coef SE Coef T-Value P-Value VIF
Constant -8952 19638 -0.46 0.650  
Earnings 16.4 39.8 0.41 0.681 117.98
Revenue(yt-1) 1.0062 0.0517 19.47 0.000 3.39
trend -37 174 -0.21 0.833 123.53

Model Summary

S R-sq R-sq(adj) R-sq(pred)
3933.62 94.21% 94.01% 93.00%

Analysis of Variance

Source DF Adj SS Adj MS F-Value P-Value
Regression 3 21921168164 7307056055 472.24 0.000
  Earnings 1 2638051 2638051 0.17 0.681
  Revenue(yt-1) 1 5864694800 5864694800 379.02 0.000
  trend 1 692178 692178 0.04 0.833
Error 87 1346180110 15473335    
Total 90 23267348275      

Fits and Diagnostics for Unusual Observations

Obs Revenue (yt) Fit Resid Std Resid
77 22720 30943 -8223 -2.16 R  
80 35750 27150 8600 2.23 R  
81 29130 37634 -8504 -2.22 R  
84 43740 34958 8782 2.28 R  
85 35710 46101 -10391 -2.75 R  
88 60450 46171 14279 3.77 R  
89 51049 63046 -11997 -3.41 R X
90 52890 53781 -891 -0.24   X
91 56580 55662 918 0.25   X
92 72380 59354 13026 3.64 R X

R  Large residual X Unusual X

Durbin-Watson Statistic

Durbin-Watson Statistic = 2.46865

I ran the regression adding a lag, Trend and Revenue (yt-1) which gave me a higher VIF which is more than 117.98 in my X variable. It fixed my DW which is 2.46865 which means there is no autocorrelation detected in the sample. However, I have detected Multicollinearity due to higher VIF. It shows the multicollinearity between my X variable and trend. The T-value from both Earnings and trend are not significant as we can see the P-values which are above 0.05 for both Earnings and trend. This is a consequence of multicollinearity between them two. If I drop the Trend the T-value for earnings will be significant. DW close to close to 0 indicates positive autocorrelation and close to 4 indicates negative autocorrelation. When its close to 2 there is no autocorrelation and that is exactly what we want in the model.

So again, I decided to remove trend from my data and run the regression analysis again. This gave me a perfect model which is shown in the graph below.

Regression Analysis: Revenue (yt) versus Earnings, Revenue(yt-1)

Method

Rows unused 1

Regression Equation

Revenue (yt) = -4892 + 8.14 Earnings + 1.0037 Revenue(yt-1)

Coefficients

Term Coef SE Coef T-Value P-Value VIF
Constant -4892 4117 -1.19 0.238  
Earnings 8.14 6.54 1.24 0.216 3.21
Revenue(yt-1) 1.0037 0.0500 20.07 0.000 3.21

Model Summary

S R-sq R-sq(adj) R-sq(pred)
3912.21 94.21% 94.08% 93.27%

Analysis of Variance

Source DF Adj SS Adj MS F-Value P-Value
Regression 2 21920475987 10960237993 716.10 0.000
  Earnings 1 23720949 23720949 1.55 0.216
  Revenue(yt-1) 1 6165533565 6165533565 402.83 0.000
Error 88 1346872288 15305367    
Total 90 23267348275      

Fits and Diagnostics for Unusual Observations

Obs Revenue (yt) Fit Resid Std Resid
77 22720 31089 -8369 -2.17 R  
80 35750 27259 8491 2.20 R  
81 29130 37719 -8589 -2.25 R  
84 43740 34912 8828 2.30 R  
85 35710 46023 -10313 -2.73 R  
88 60450 46113 14337 3.80 R  
89 51049 62933 -11884 -3.36 R X
90 52890 53611 -721 -0.19   X
91 56580 55492 1088 0.30   X
92 72380 59203 13177 3.63 R X

R  Large residual X Unusual X

Durbin-Watson Statistic

Durbin-Watson Statistic = 2.46554
  1. Incorporate seasonal dummies and trend into your model. Identify if you have seasonality, trend by checking their significance? Is that consistent with your previous findings?

Dummy variable are also called the indicator variables it is how we incorporate non continuous or qualitative date in our analysis. So, incorporating trend and seasonality in our regression model, Yt which is our dependent variable and is also a function of X1 which is Earnings. The coefficient in front of X1 is the marginal effects and beta zero is the intercept and errors. Anything that is not explained in the model, that will go to the error term. So, let us incorporate it in the model so error can look much better. Trend is going to keep a track of the number of years. My sample size is 92 so the trend will increase in the increments of 1. Similarly, seasonality would be into four quarters Q1,Q2,Q3 and Q4.

WORKSHEET 1

Regression Analysis: Revenue versus Earnings, T, q1, q2, q3

Method

Categorical predictor coding (1, 0)

Regression Equation

q1 q2 q3
0 0 0 Revenue = 27041 – 71.5 Earnings + 800 T
           
0 0 1 Revenue = 22892 – 71.5 Earnings + 800 T
           
0 1 0 Revenue = 22473 – 71.5 Earnings + 800 T
           
0 1 1 Revenue = 18324 – 71.5 Earnings + 800 T
           
1 0 0 Revenue = 22774 – 71.5 Earnings + 800 T
           
1 0 1 Revenue = 18625 – 71.5 Earnings + 800 T
           
1 1 0 Revenue = 18206 – 71.5 Earnings + 800 T
           
1 1 1 Revenue = 14057 – 71.5 Earnings + 800 T

Coefficients

Term Coef SE Coef T-Value P-Value VIF
Constant 27041 44784 0.60 0.548  
Earnings -71.5 90.6 -0.79 0.432 120.16
T 800 388 2.06 0.042 120.14
q1          
  1 -4267 2664 -1.60 0.113 1.50
q2          
  1 -4568 2662 -1.72 0.090 1.50
q3          
  1 -4149 2661 -1.56 0.123 1.50

Model Summary

S R-sq R-sq(adj) R-sq(pred)
9022.42 70.09% 68.35% 64.94%

Analysis of Variance

Source DF Adj SS Adj MS F-Value P-Value
Regression 5 16401476827 3280295365 40.30 0.000
  Earnings 1 50676370 50676370 0.62 0.432
  T 1 345468005 345468005 4.24 0.042
  q1 1 208856960 208856960 2.57 0.113
  q2 1 239764768 239764768 2.95 0.090
  q3 1 197858706 197858706 2.43 0.123
Error 86 7000742191 81403979    
Total 91 23402219018      

Fits and Diagnostics for Unusual Observations

Obs Revenue Fit Resid Std Resid
88 60450 35030 25420 2.92 R
89 51049 31134 19915 2.30 R
90 52890 30632 22258 2.60 R
91 56580 31565 25015 2.92 R
92 72380 36443 35937 4.17 R

R  Large residual

Durbin-Watson Statistic

Durbin-Watson Statistic = 0.126943
  1. Once you corrected for all possible problems, rewrite your final equation, INTERPRET the equation, and forecast y, for 35th in sample observation.
  2. Analyze the resulting residuals (4-in-11 plot in MINITAB)
  3. How does regression analysis perform compared to univariate methods you have learned? Create a table that includes the MSD for the univariate models (Trend, Smoothing, Decomposition) and MSE of regression model. (HINT 1: You don’t have to try ALL univariate models. Use your knowledge of your data. For example: If your revenue variable is trending, no need to run single smoothing. Or, if it is linear, no need to run nonlinear trend models etc. HINT 2: Your final is around the corner, no harm in reviewing the previous material ahead of time, either.)

Last Updated on November 16, 2020 by EssayPro