Create Residual and Y Predicted in Excel

Residual and predicted Y have an important role in linear regression analysis using the Ordinary Least Square (OLS) method. Several OLS linear regression method assumptions were conducted to test the residuals, including normally distributed residuals and constant residual variance.

To obtain the residual value, we must first find the value of Y predicted. When we calculate the predicted Y, we must also calculate the parameter estimates, including the intercept and the estimated parameter coefficients of each independent variable.

Therefore, calculating residual and predicted Y becomes important to understand well. On this occasion, Kanda Data will discuss how to create residual, and Y predicted in Excel.

Residual

As I conveyed earlier, this residual has an important function in linear regression. In the assumption test, both in simple linear regression and multiple linear regression, we will test the residual value.

Before going any further, let’s discuss what residual is? In statistics, you may have read that there are errors and residual values.

Actually, the two are almost the same but have differences. The difference is that the residual value is obtained from sample data, while the error value is obtained from population data.

After we understand the difference between error and residual, the next question is, what is the definition of residual? Residual is the difference between the actual observed value and the predicted value.

It should be underlined here for this residual value using sample data. Therefore, we can test the assumption of normality by testing whether the residuals are normally distributed or not.

Furthermore, the variance of the residuals is used to test the non-heteroscedasticity assumption. The non-heteroscedasticity assumption requires that the residual variance is constant.

The constant residual variance is often referred to as homoscedasticity. If, based on the test results, the residual variance assumption is not constant, it can be concluded that there is a heteroscedasticity problem.

Y Predicted

As I conveyed in the previous paragraph, we must first find the predicted Y value to obtain the residual value. We can say that the residual is the difference between the actual Y and the predicted Y.

To understand how to find the predicted value of Y, suppose we create a multiple linear regression equation with two independent variables.

In the multiple linear regression equation, we want to know how the influence of income and population on rice consumption. The variables used can be arranged in the regression equation as follows:

The dependent variable (Y) is rice consumption based on the above equation. The observation data from the rice consumption collected are named Y actual.

Next, to calculate the predicted value of Y, we need to find the intercept value (b0), the estimated parameter for the income variable (b1), and the estimated parameter for the population variable (b2).

To calculate the required parameter values according to the regression equation we have compiled, we must first compute multiple linear regression analysis. Analysis can be conducted either through manual calculations or using statistical software.

To get the predicted value of Y in the 1st observation, we need to enter the values of all the estimated parameters into the equation. The X1 and X2 that we enter into the equation come from the observed values of the independent variables (income and population variables).

Create residual and Y Predicted in Excel

To create the residual and Y predicted in the previous article, I have provided a tutorial on calculating it manually. I will use excel to create residuals and excel very quickly on this occasion.

I will use the data analysis tools provided in excel. Data analysis tools in excel can be seen in the “Data” menu, and then you will find “Data Analysis” in the upper right corner of your excel.

To enable “Data Analysis” in excel, you can follow the tutorial I wrote in the article entitled: “How to Activate and Load the Data Analysis Toolpak in Excel.

To create residual and Y predicted, you have to do multiple linear regression analysis like the article I wrote earlier with the title: “Multiple Linear Regression Analysis and Interpreting the Output in Excel.

Additional steps you need to do, include enabling “Residuals.” More clearly, you can see these stages in the image below:

After you activate the residuals, then you click OK. The residual output will immediately appear at the location you choose to save the result in excel.

You will find three columns in the residual output consisting of observations, Y predicted, and Residual. In the residual output, you will find the results of the calculations carried out for each observation.

To get the residual value for each of these observations, you can easily get it with a few clicks in Excel. This way is very helpful for those of you who need the residual and predicted Y values and don’t want to take a long time to get the residual and predicted Y values immediately.

Well, that’s all I can convey on this occasion. I hope this helps those of you here trying to find information on how to create residual and Y predicted in Excel. See you in the next article!

Leave a Comment