Economic Forecasting Questions

Remind me: if the Xs are highly correlated (check the scatterplot, VIF, and correlations) and you have a sign switch, correct the situation by throwing one of the variables out of the model. Consider R-squared or adjusted R-squared when making the decision.

My Y = Revenue, my company name is Amazon

X variables = Earnings, Employment, and Industry Revenue.

As shown in the first graph, the X variables are significant and show a roughly linear pattern with Revenue. The points do not fall on a straight line, however, so there is some nonlinearity. The P-values are 0.000, which is evidence of a relationship between the variables. The X variables are also correlated with one another, as the correlation matrix shows. Only Industry Revenue and Earnings have a low correlation, at 0.447.

Comparing my tables, I do see a sign switch in Employment: its coefficient is negative (-0.000573).

Interpreting the regression output in my second graph, we want there to be a relationship between our X and Y variables. X is the independent variable and Y is the dependent variable, so when X changes, Y changes as well. The constant is not statistically significant because its P-value is greater than 0.05. Assuming the unit is $1, my Earnings coefficient of 114.5 means that if Earnings increases by $1, Revenue increases by $114.5, holding the other variables constant. For my second X variable, if Industry Revenue increases by $1, Revenue increases by $370.1. Employment has a negative coefficient, so it moves in the opposite direction from Revenue: if the company employs one more person, Revenue decreases by $0.000573. This is a very small effect. The company would need to hire 10,000 more employees to decrease Revenue by just $5.73.
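The coefficient arithmetic above can be checked with a short sketch. It uses only the three coefficients quoted in this paragraph. The function name `revenue_change` is made up for illustration, and the constant of the full model is not used because only changes in Revenue are computed.

```python
# Coefficients quoted in the paragraph above (each one holds the other Xs fixed).
COEF = {"Earnings": 114.5, "Industry Revenue": 370.1, "Employment": -0.000573}

def revenue_change(variable, delta):
    """Predicted change in Revenue when one X changes by `delta` units."""
    return COEF[variable] * delta

print(revenue_change("Earnings", 1))                    # 114.5
print(round(revenue_change("Employment", 10_000), 2))   # -5.73
```

The second line reproduces the 10,000-employee example: the effect of Employment is tiny compared with the other two variables.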

GDP information and what are the biggest sources of revenue in the economy

WORKSHEET 1

Regression Analysis: Revenue versus Earnings

Regression Equation

Revenue = -67933 + 114.91 Earnings

Coefficients

Term	Coef	SE Coef	T-Value	P-Value	VIF
Constant	-67933	5923	-11.47	0.000
Earnings	114.91	8.45	13.60	0.000	1.00

Model Summary

S	R-sq	R-sq(adj)	R-sq(pred)
9225.39	67.27%	66.91%	65.08%

Analysis of Variance

Source	DF	Adj SS	Adj MS	F-Value	P-Value
Regression	1	15742520392	15742520392	184.97	0.000
Earnings	1	15742520392	15742520392	184.97	0.000
Error	90	7659698627	85107763
Lack-of-Fit	84	7637913547	90927542	25.04	0.000
Pure Error	6	21785079	3630847
Total	91	23402219018

Fits and Diagnostics for Unusual Observations

Obs	Revenue	Fit	Resid	Std Resid
88	60450	32387	28063	3.10	R
90	52890	34685	18205	2.02	R
91	56580	35145	21435	2.38	R
92	72380	35259	37121	4.12	R

R Large residual

Durbin-Watson Statistic

Durbin-Watson Statistic = 0.186035

Considering the R-squared and the VIF table, I have decided to take Employment and Industry Revenue out of the model. R-squared was low for all three of my X variables when I ran the regressions, so I chose the one with the highest R-sq and R-sq(adj). The highest R-sq belongs to Earnings, so I will keep it as my only X variable. Its P-value is 0.000, which is good. Its VIF is 1, which is well below the usual cutoffs of 5 or 10, so there is no sign of multicollinearity. R-sq is still only 67.27%, but this is the best model I have by comparison.
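As a check on the numbers used in this decision, the model summary can be recomputed from the Analysis of Variance table for Revenue versus Earnings. This is a minimal sketch: the sums of squares and degrees of freedom are copied from the Minitab output, while the variable names and the `vif` helper are my own.

```python
import math

# Sums of squares and degrees of freedom from the ANOVA table
# for Revenue versus Earnings.
ss_regression = 15742520392
ss_error = 7659698627
ss_total = 23402219018
df_regression, df_error, df_total = 1, 90, 91

r_sq = ss_regression / ss_total
adj_r_sq = 1 - (ss_error / df_error) / (ss_total / df_total)
s = math.sqrt(ss_error / df_error)  # standard error of the regression
f_value = (ss_regression / df_regression) / (ss_error / df_error)

def vif(r_sq_j):
    """VIF of predictor j, where r_sq_j is the R-squared obtained by
    regressing predictor j on the other predictors."""
    return 1 / (1 - r_sq_j)

print(f"R-sq = {r_sq:.2%}, R-sq(adj) = {adj_r_sq:.2%}")
print(f"S = {s:.2f}, F = {f_value:.2f}")
# With a single predictor there are no other Xs, so r_sq_j = 0 and VIF = 1.
print(f"VIF with no other predictors = {vif(0.0):.2f}")
```

The results agree with the Model Summary (67.27%, 66.91%, S = 9225.39) and the F-value of 184.97. A VIF of exactly 1 is automatic in a one-variable model, so it does not by itself say anything about the variables that were dropped.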

Using the scatter plots you generated, identify any nonlinear relationships between Y and X variables.

Try to correct nonlinearity through transformation (page 233-237). If it works, keep the transformed version of the variable. Otherwise, use the original variable, acknowledge the nonlinearity, and move on to the next test. Use 2 different transformations (ex: Log X, 1/X, X^2 or SQRT(X)).

I ran transformations to see whether the nonlinearity changes. Transforming a variable means changing its functional form while keeping its information content, and it is not guaranteed to work. I applied Log X and SQRT(X) to Earnings and re-plotted Y against the transformed X to see whether the scatterplot looks more linear. The first graph is Earnings, the second is the Log X transformation, and the third is SQRT(X). I do not see any drastic change: both transformations gave me about the same pattern. So I have no choice but to acknowledge the nonlinearity, keep the original variable, and move on.
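One way to compare transformations numerically, rather than only by eye, is to look at the correlation between Y and each version of X. The sketch below does this in plain Python. The data are made up for illustration (they are not my Amazon data), and the names `pearson` and `compare_transforms` are hypothetical helpers. Log X and SQRT(X) require X to be positive, which is assumed here.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Candidate functional forms for X (all need X > 0 except the identity).
TRANSFORMS = {"X": lambda x: x, "Log X": math.log, "SQRT(X)": math.sqrt}

def compare_transforms(xs, ys):
    """Correlation of Y with each transformed version of X."""
    return {name: pearson([f(x) for x in xs], ys) for name, f in TRANSFORMS.items()}

# Made-up example in which Y really is linear in Log X.
x = [1, 2, 4, 8, 16, 32, 64]
y = [2 + 3 * math.log(v) for v in x]

for name, r in compare_transforms(x, y).items():
    print(f"{name:8s} r = {r:.4f}")
```

In this made-up case Log X clearly wins. If all three correlations come out about the same, as my scatterplots suggest for Earnings, the transformation is not helping and the original variable can be kept.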

3. Once you correct for nonlinearity and multicollinearity, check for autocorrelation using the DW test. Do you have autocorrelation? Correct for autocorrelation if you have any.

Since I have decided to keep just one X variable, Earnings, I check that model for autocorrelation. Its Durbin-Watson statistic is DW = 0.186035.

In the n = 90 row of the Durbin-Watson table with k = 1, the lower bound (dL) is 1.64 and the upper bound (dU) is 1.69. My DW is below the lower bound, so by my decision rule I reject the null hypothesis of no autocorrelation: I have positive autocorrelation. To attempt to fix it, I am going to add a lagged value of Y (Yt-1) as a predictor, on the assumption that the autocorrelation is coming from Yt. This adds one more coefficient to the model.
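The decision rule used here can be written out as a small sketch. The bounds 1.64 and 1.69 are the ones I read from the table. The function names `durbin_watson` and `dw_decision` are my own, and the residuals in the last line are made up only to show how the statistic is calculated.

```python
def durbin_watson(residuals):
    """DW = sum of squared successive differences / sum of squared residuals."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den

def dw_decision(d, d_lower, d_upper):
    """Bounds test for first-order autocorrelation."""
    if d < d_lower:
        return "positive autocorrelation"
    if d > 4 - d_lower:
        return "negative autocorrelation"
    if d_upper <= d <= 4 - d_upper:
        return "no autocorrelation"
    return "inconclusive"

# DW for Revenue versus Earnings, with the bounds from the n = 90, k = 1 row.
print(dw_decision(0.186035, 1.64, 1.69))   # positive autocorrelation

# Rough implied first-order correlation of the residuals: r is about 1 - DW/2.
print(round(1 - 0.186035 / 2, 3))          # 0.907

# Made-up residuals that alternate in sign give a DW above 2.
print(durbin_watson([1, -1, 1, -1]))       # 3.0
```

The bounds depend on the number of predictors, so a model with more X variables needs the row for its own k rather than the k = 1 values used here.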

WORKSHEET 1

Regression Analysis: Revenue (yt) versus Earnings, Revenue(yt-1), trend

Method

Rows unused

Regression Equation

Revenue (yt) = -8952 + 16.4 Earnings + 1.0062 Revenue(yt-1) - 37 trend

Coefficients

Term	Coef	SE Coef	T-Value	P-Value	VIF
Constant	-8952	19638	-0.46	0.650
Earnings	16.4	39.8	0.41	0.681	117.98
Revenue(yt-1)	1.0062	0.0517	19.47	0.000	3.39
trend	-37	174	-0.21	0.833	123.53

Model Summary

S	R-sq	R-sq(adj)	R-sq(pred)
3933.62	94.21%	94.01%	93.00%

Analysis of Variance

Source	DF	Adj SS	Adj MS	F-Value	P-Value
Regression	3	21921168164	7307056055	472.24	0.000
Earnings	1	2638051	2638051	0.17	0.681
Revenue(yt-1)	1	5864694800	5864694800	379.02	0.000
trend	1	692178	692178	0.04	0.833
Error	87	1346180110	15473335
Total	90	23267348275

Fits and Diagnostics for Unusual Observations

Obs	Revenue (yt)	Fit	Resid	Std Resid
77	22720	30943	-8223	-2.16	R
80	35750	27150	8600	2.23	R
81	29130	37634	-8504	-2.22	R
84	43740	34958	8782	2.28	R
85	35710	46101	-10391	-2.75	R
88	60450	46171	14279	3.77	R
89	51049	63046	-11997	-3.41	R	X
90	52890	53781	-891	-0.24		X
91	56580	55662	918	0.25		X
92	72380	59354	13026	3.64	R	X

R Large residual X Unusual X

Durbin-Watson Statistic

Durbin-Watson Statistic = 2.46865

I ran the regression again after adding the lag, Revenue(yt-1), and trend. It raised DW to 2.46865, which is close to 2, so the positive autocorrelation from the first model is no longer detected. (DW close to 0 indicates positive autocorrelation, close to 4 indicates negative autocorrelation, and close to 2 indicates no autocorrelation, which is what we want in the model.) However, I now detect multicollinearity: the VIF is 117.98 for Earnings and 123.53 for trend, far above 10. The multicollinearity is between my X variable and trend. As a consequence, the T-values for both Earnings and trend are not significant; their P-values (0.681 and 0.833) are well above 0.05. Dropping trend should remove this multicollinearity.
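To see how severe these VIFs are, each one can be converted back into the R-squared between that predictor and the other predictors, using VIF = 1 / (1 - R-sq). This is a minimal sketch: the VIF values are copied from the coefficients table, the cutoffs of 5 and 10 are the rule of thumb used earlier, and the helper name `implied_r_sq` is my own.

```python
def implied_r_sq(vif):
    """R-squared of a predictor regressed on the other predictors,
    recovered from its VIF via VIF = 1 / (1 - R-sq)."""
    return 1 - 1 / vif

# VIFs from the model with Earnings, Revenue(yt-1), and trend.
VIFS = {"Earnings": 117.98, "Revenue(yt-1)": 3.39, "trend": 123.53}

for term, v in VIFS.items():
    if v > 10:
        flag = "severe"
    elif v > 5:
        flag = "moderate"
    else:
        flag = "ok"
    print(f"{term:14s} VIF = {v:7.2f}  implied R-sq = {implied_r_sq(v):.4f}  {flag}")
```

A VIF near 118 means that about 99% of the variation in Earnings is explained by the other predictors in the model, which is why its standard error is so large.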

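So I removed trend from my model and ran the regression analysis again. This gave me my best model, shown in the output below: the VIFs fall to 3.21 and DW stays close to 2 at 2.46554, although Earnings (P-value 0.216) is still not significant at the 5% level.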

Regression Analysis: Revenue (yt) versus Earnings, Revenue(yt-1)

Method

Rows unused

Regression Equation

Revenue (yt) = -4892 + 8.14 Earnings + 1.0037 Revenue(yt-1)

Coefficients

Term	Coef	SE Coef	T-Value	P-Value	VIF
Constant	-4892	4117	-1.19	0.238
Earnings	8.14	6.54	1.24	0.216	3.21
Revenue(yt-1)	1.0037	0.0500	20.07	0.000	3.21

Model Summary

S	R-sq	R-sq(adj)	R-sq(pred)
3912.21	94.21%	94.08%	93.27%

Analysis of Variance

Source	DF	Adj SS	Adj MS	F-Value	P-Value
Regression	2	21920475987	10960237993	716.10	0.000
Earnings	1	23720949	23720949	1.55	0.216
Revenue(yt-1)	1	6165533565	6165533565	402.83	0.000
Error	88	1346872288	15305367
Total	90	23267348275

Fits and Diagnostics for Unusual Observations

Obs	Revenue (yt)	Fit	Resid	Std Resid
77	22720	31089	-8369	-2.17	R
80	35750	27259	8491	2.20	R
81	29130	37719	-8589	-2.25	R
84	43740	34912	8828	2.30	R
85	35710	46023	-10313	-2.73	R
88	60450	46113	14337	3.80	R
89	51049	62933	-11884	-3.36	R	X
90	52890	53611	-721	-0.19		X
91	56580	55492	1088	0.30		X
92	72380	59203	13177	3.63	R	X

R Large residual X Unusual X

Durbin-Watson Statistic

Durbin-Watson Statistic = 2.46554
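The fitted equation for the model without trend can be applied as in the sketch below. The coefficients, the Obs 92 values, and the error mean square come from the output above. The function name `forecast_revenue` and the input values passed to it are hypothetical and are not taken from my data set.

```python
import math

def forecast_revenue(earnings, prev_revenue):
    """Fitted value from the model without trend:
    Revenue(yt) = -4892 + 8.14 Earnings + 1.0037 Revenue(yt-1)."""
    return -4892 + 8.14 * earnings + 1.0037 * prev_revenue

# Hypothetical inputs, NOT taken from the data set.
example = forecast_revenue(earnings=100, prev_revenue=50_000)
print(f"Forecast for the made-up inputs: {example:.0f}")

# Residual = actual - fit, checked against Obs 92 in the diagnostics table.
actual_92, fit_92 = 72380, 59203
print("Residual for Obs 92:", actual_92 - fit_92)

# MSE of the regression (Adj MS for Error) and its square root, S.
mse = 15305367
print(f"MSE = {mse}, S = {math.sqrt(mse):.2f}")
```

The MSE printed here is the figure to set beside the MSD of the univariate models when building the comparison table.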

Incorporate seasonal dummies and trend into your model. Identify if you have seasonality, trend by checking their significance? Is that consistent with your previous findings?

Dummy variables, also called indicator variables, are how we incorporate non-continuous or qualitative data into our analysis. To incorporate trend and seasonality into the regression model, Yt, our dependent variable, is still a function of X1, which is Earnings. The coefficient on X1 is its marginal effect, beta zero is the intercept, and anything that is not explained by the model goes to the error term. So let us add trend and seasonality to the model so that the errors look better. Trend keeps track of the time period: my sample size is 92, so the trend runs from 1 to 92 in increments of 1. Similarly, seasonality is split into four quarters, Q1, Q2, Q3, and Q4, with Q4 as the baseline because the model includes dummies for q1, q2, and q3 only.
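The trend and dummy columns described above can be built as in this sketch. It assumes the series starts in Q1, which is an assumption of the example (the `first_quarter` argument changes it), and the function name `trend_and_dummies` is made up. Q4 is the baseline, so a Q4 row has all three dummies equal to 0.

```python
def trend_and_dummies(n_obs, first_quarter=1):
    """Return one row per observation with trend T and dummies q1, q2, q3.

    Q4 is the omitted (baseline) quarter. `first_quarter` is the quarter
    of the first observation; starting in Q1 is an assumption.
    """
    rows = []
    for i in range(n_obs):
        quarter = (first_quarter - 1 + i) % 4 + 1
        rows.append({
            "T": i + 1,                # trend: 1, 2, ..., n_obs
            "q1": int(quarter == 1),
            "q2": int(quarter == 2),
            "q3": int(quarter == 3),
        })
    return rows

rows = trend_and_dummies(92)
for row in rows[:5]:
    print(row)
print("Last trend value:", rows[-1]["T"])
```

Only three dummies are needed for four quarters. Adding a q4 column as well would make the dummies sum to 1 in every row and duplicate the constant.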

WORKSHEET 1

Regression Analysis: Revenue versus Earnings, T, q1, q2, q3

Method

Categorical predictor coding

(1, 0)

Regression Equation

q1	q2	q3
0	0	0	Revenue = 27041 - 71.5 Earnings + 800 T
0	0	1	Revenue = 22892 - 71.5 Earnings + 800 T
0	1	0	Revenue = 22473 - 71.5 Earnings + 800 T
0	1	1	Revenue = 18324 - 71.5 Earnings + 800 T
1	0	0	Revenue = 22774 - 71.5 Earnings + 800 T
1	0	1	Revenue = 18625 - 71.5 Earnings + 800 T
1	1	0	Revenue = 18206 - 71.5 Earnings + 800 T
1	1	1	Revenue = 14057 - 71.5 Earnings + 800 T

Coefficients

Term	Coef	SE Coef	T-Value	P-Value	VIF
Constant	27041	44784	0.60	0.548
Earnings	-71.5	90.6	-0.79	0.432	120.16
T	800	388	2.06	0.042	120.14
q1
1	-4267	2664	-1.60	0.113	1.50
q2
1	-4568	2662	-1.72	0.090	1.50
q3
1	-4149	2661	-1.56	0.123	1.50

Model Summary

S	R-sq	R-sq(adj)	R-sq(pred)
9022.42	70.09%	68.35%	64.94%

Analysis of Variance

Source	DF	Adj SS	Adj MS	F-Value	P-Value
Regression	5	16401476827	3280295365	40.30	0.000
Earnings	1	50676370	50676370	0.62	0.432
T	1	345468005	345468005	4.24	0.042
q1	1	208856960	208856960	2.57	0.113
q2	1	239764768	239764768	2.95	0.090
q3	1	197858706	197858706	2.43	0.123
Error	86	7000742191	81403979
Total	91	23402219018

Fits and Diagnostics for Unusual Observations

Obs	Revenue	Fit	Resid	Std Resid
88	60450	35030	25420	2.92	R
89	51049	31134	19915	2.30	R
90	52890	30632	22258	2.60	R
91	56580	31565	25015	2.92	R
92	72380	36443	35937	4.17	R

R Large residual

Durbin-Watson Statistic

Durbin-Watson Statistic = 0.126943

Once you have corrected for all possible problems, rewrite your final equation, INTERPRET the equation, and forecast y for the 35th in-sample observation.
Analyze the resulting residuals (4-in-1 plot in MINITAB).
How does regression analysis perform compared to univariate methods you have learned? Create a table that includes the MSD for the univariate models (Trend, Smoothing, Decomposition) and MSE of regression model. (HINT 1: You don’t have to try ALL univariate models. Use your knowledge of your data. For example: If your revenue variable is trending, no need to run single smoothing. Or, if it is linear, no need to run nonlinear trend models etc. HINT 2: Your final is around the corner, no harm in reviewing the previous material ahead of time, either.)

Last Updated on November 16, 2020