How to Calculate Y Predicted and Residual Values in Simple Linear Regression

February 18, 2022

In linear regression analysis, the residual values must be calculated before the residual variance can be computed. In addition, a regression estimated with the ordinary least squares (OLS) method is usually checked against the assumption that the residuals are normally distributed. Before calculating the residual values, however, you must first calculate the predicted Y values. This article therefore discusses how to calculate both the predicted Y value and the residual value.

First, we will look at how to get the predicted Y value in simple linear regression. This model consists of only one dependent variable and one independent variable. Last week's article gave a tutorial on calculating the regression coefficients, namely the intercept (bo) and the b1 coefficient. These two values are used to calculate the predicted Y value.

As we already know, the general equation for simple linear regression is:

Y = bo + b1X

Where

Y = dependent variable

X = independent variable

bo = intercept

b1 = regression coefficient

Based on this equation, the predicted Y value can be calculated for each observation. To do so, multiply the estimated coefficient b1 by the observed value of the independent variable X, then add the intercept bo.

For example, suppose the intercept is 218.38, the estimated coefficient for the X variable is -0.0014, and the first observation of X is 6000. The predicted Y for this first observation is then calculated as follows:

Y = bo + b1X

Y = 218.38 + (-0.0014)*6000

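Y = 210.003

Note that the coefficients shown above are rounded. With exactly 218.38 and -0.0014 the arithmetic gives 209.98, so the value 210.003 presumably reflects coefficients carried to more decimal places.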
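The same calculation can be sketched in Python. This is only an illustration: the function name `predict_y` and the variable names are assumptions made for this example, not something taken from the original tutorial. With the coefficients exactly as printed (218.38 and -0.0014), the code returns about 209.98.

```python
def predict_y(b0: float, b1: float, x: float) -> float:
    """Return the predicted Y for one observation: b0 + b1 * x."""
    return b0 + b1 * x


# Values quoted in the worked example (coefficients as displayed, i.e. rounded).
intercept = 218.38
slope = -0.0014
x_first = 6000

y_pred_first = predict_y(intercept, slope, x_first)
print(round(y_pred_first, 3))
```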

You can use Microsoft Excel to simplify the calculation of predicted Y and save time. In the same way, you can calculate the predicted Y value for every observation in your sample. Congratulations, you have now calculated the predicted Y value.
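As an alternative to a spreadsheet, the calculation for all observations can be sketched in a few lines of Python. Only the first X value (6000) comes from the example; the remaining X values are hypothetical and are there purely to show the loop over observations.

```python
# Coefficients as displayed in the example (rounded).
B0 = 218.38
B1 = -0.0014

# First value is from the example; the others are hypothetical illustration data.
x_values = [6000, 6500, 7000, 7500]

# Apply Y = b0 + b1 * X to every observation.
y_predicted = [B0 + B1 * x for x in x_values]

for x, y_hat in zip(x_values, y_predicted):
    print(f"X = {x}: predicted Y = {y_hat:.3f}")
```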

As mentioned at the beginning of this article, the predicted Y value is used to calculate the residual value in a simple linear regression analysis. Before calculating it, you should know how the residual is defined: the residual value is the difference between the actual observed value of the dependent variable (Y) and the predicted Y value. The formula is:

Residual = Y Actual – Y Predicted

For example, if the Actual Y value is 213, then you can calculate the residual value as follows:

Residual = Y Actual – Y Predicted

Residual = 213 – 210.003

Residual = 2.997
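The subtraction above can be written as a small Python sketch. The function name `residual` is an assumption chosen for this illustration; the two input values are the ones used in the example.

```python
def residual(y_actual: float, y_predicted: float) -> float:
    """Return the residual: actual Y minus predicted Y."""
    return y_actual - y_predicted


# Values from the worked example for the first observation.
first_residual = residual(213, 210.003)
print(round(first_residual, 3))
```

A positive residual means the model underestimates the actual value for that observation; a negative residual means it overestimates.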

With these calculations, you have obtained the residual value for the first observation. Next, you need to calculate the residual values for all observations in your study.
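A minimal Python sketch of that step follows, extended to the residual variance mentioned at the start of the article. Apart from the first pair (X = 6000, Y = 213), the data are hypothetical, and the coefficients are the rounded ones from the example rather than values fitted to these data, so the numbers only illustrate the mechanics. The residual variance is computed as the sum of squared residuals divided by n - 2, the usual degrees of freedom in simple linear regression. Because the rounded coefficients are used, the first residual comes out near 3.02 rather than 2.997.

```python
# Coefficients as displayed in the example (rounded).
B0 = 218.38
B1 = -0.0014

# First pair is from the example; the rest are hypothetical illustration data.
x_values = [6000, 6500, 7000, 7500, 8000]
y_actual = [213, 209, 208, 209, 206]

# Predicted Y and residual for every observation.
y_predicted = [B0 + B1 * x for x in x_values]
residuals = [ya - yp for ya, yp in zip(y_actual, y_predicted)]

# Sum of squared residuals and residual variance (n - 2 degrees of freedom).
n = len(residuals)
sse = sum(e ** 2 for e in residuals)
residual_variance = sse / (n - 2)

for x, ya, yp, e in zip(x_values, y_actual, y_predicted, residuals):
    print(f"X = {x}, actual = {ya}, predicted = {yp:.3f}, residual = {e:.3f}")
print(f"Residual variance = {residual_variance:.4f}")
```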

Simple linear regression can be estimated on time series (historical) data as well as cross-sectional data. From the calculations above, we can determine how far the actual values of the dependent variable lie from the values predicted by the regression. For example, based on two decades of annual sales data, we can forecast sales for the next few years using simple linear regression.
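The sales forecasting idea can be sketched as follows. The sales figures are invented for illustration (20 annual observations with a rough upward trend), and the helper name `fit_simple_ols` is an assumption made for this example. The coefficients are estimated with the standard closed-form least squares formulas, then used to predict the next three years.

```python
def fit_simple_ols(x, y):
    """Estimate intercept b0 and slope b1 by ordinary least squares."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1


# Hypothetical data: 20 years coded 1..20, sales with a trend and a small wiggle.
years = list(range(1, 21))
sales = [100 + 5 * t + (1 if t % 2 else -1) for t in years]

b0, b1 = fit_simple_ols(years, sales)

# In-sample predicted values and residuals.
predicted = [b0 + b1 * t for t in years]
residuals = [s - p for s, p in zip(sales, predicted)]

# Forecast the next three years by plugging future X values into the equation.
future_years = [21, 22, 23]
forecast = [b0 + b1 * t for t in future_years]

print(f"b0 = {b0:.3f}, b1 = {b1:.3f}")
for t, f in zip(future_years, forecast):
    print(f"Year {t}: forecast sales = {f:.2f}")
```

When the coefficients are fitted by least squares with an intercept, the in-sample residuals sum to zero up to rounding error, which is a quick sanity check on the calculation.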

Based on the calculation for the first observation, we get a residual value of 2.997. After calculating all residual values, we can test them for normality. Normally distributed residuals are one of the assumptions commonly checked in simple linear regression, because the usual t and F tests rely on it. The Best Linear Unbiased Estimator (BLUE) property itself rests on the other classical assumptions, such as errors with zero mean, constant variance, and no correlation between them. For those who prefer to learn through video, “Kanda Data” has prepared a video tutorial. The video is delivered in Indonesian, so please turn on the English subtitles.

Hopefully, the video is clear. If you still have questions, please leave them in the comments below this article. Before we end this topic, allow me to recap. We have learned that the residual value is the difference between the actual Y value and the predicted Y value. The predicted Y value is obtained by multiplying the estimated coefficient by the observed value of the independent variable and adding the intercept.

For those who want more video tutorials on statistics, econometrics, and data, please visit the “Kanda Data” YouTube channel. See you in the next article!

