Thursday, July 25, 2024
HomeMultiple Linear RegressionFinding Coefficients bo, b1, b2, and R Squared Manually in Multiple Linear...

Finding Coefficients bo, b1, b2, and R Squared Manually in Multiple Linear Regression

Researchers can choose to use multiple linear regression if the independent variables are at least 2 variables. On this occasion, Kanda Data will write a tutorial on manually calculating the coefficients bo, b1, b2, and the coefficient of determination (R Squared) in multiple linear regression.

In this article, I will write a calculation formula based on a book I have read and write how to calculate manually using Excel. For how to manually calculate the estimated coefficients in simple linear regression, you can read my previous article entitled: “Calculate Coefficients bo, b1, and R Squared Manually in Simple Linear Regression

In multiple linear regression, the number of independent variables can consist of 2, 3, 4 and > 4 independent variables. The researcher must test the required assumptions to obtain the best linear unbiased estimator.

This article does not write a tutorial on how to test assumptions on multiple linear regression using the OLS method but focuses more on calculating the estimated coefficients b0, b1, and b2 and the coefficient of determination manually using Excel.

Multiple linear regression analysis mini-research example

Manually calculating using multiple linear regression is different from simple linear regression. Calculating the estimated coefficient on multiple linear regression is more complex than simple linear regression.

However, researchers can still easily calculate the estimated coefficients manually with Excel. Therefore, because the calculation is conducted manually, the accuracy in calculating is still prioritized.

I have prepared a mini-research example of multiple linear regression analysis as exercise material. Data were collected over 15 quarters at a company. A researcher conducts observations to determine the influence of the advertising cost and marketing staff on product sales.

Data collection has been carried out every quarter on product sales, advertising costs, and marketing staff variables. The company has recorded the number of product unit sales for the last quarter. Based on this background, the specifications of the multiple linear regression equation created by the researcher are as follows:

Y = b0 + b1X1 + b2X2 + e

Description:

Y = product sales (units)

X1 = advertising cost (USD)

X2 = staff marketing (person)

b0, b1, b2 = regression estimation coefficient

e = disturbance error

Data has been collected from quarter 1 of 2018 to quarter 3 of 2021. In this case, the data used is quarterly time series data from product sales, advertising costs, and marketing staff. The data that researchers have collected can be seen in the table below:

The formula for calculating the estimated coefficients of bo, b1, and b2

Following what I have written in the previous paragraph, to avoid errors in calculating manually, I am here using Excel. Using Excel will avoid mistakes in calculations. There are two ways to calculate the estimated coefficients b0, b1 and b2: using the original sample observation and the deviation of the variables from their means.

To simplify the calculation of R squared, I use the variable’s deviation from their means. The formula used to calculate b0, b1 and b2 based on the book Koutsoyiannis (1977) can be seen as follows:

Finding the Estimation Coefficient of X1 Variable (b1)

Calculating the values of b0, b1 and b2 cannot be conducted simultaneously. We must calculate the estimated coefficients b1 and b2 first and then calculate the bo. On this occasion, I will first calculate the estimated coefficient of b1.

Furthermore, to calculate the value of b1, it is necessary to calculate the difference between the actual X1 variable and the average X1 variable and the actual Y variable and the average Y variable. In Excel, researchers can create a table consisting of components for calculating b1, as shown in the image below:

After creating a formula template in Excel, we need to calculate the average of the product sales variable (Y) and the advertising cost variable (X1). Furthermore, find the difference between the actual Y and the average Y and between the actual X1 and the average X1.

Next, you calculate according to the Excel table’s formula. In the next step, multiply x1y and square x1. In detail, the calculation stages can be seen in the image below:

Next, copy and paste the Excel formula from the 2nd quarter’s data to the last quarter’s data. To copy and paste formulas in Excel, you must pay attention to the absolute values of the average Y and the average X.

Absolute values can be applied by pressing F4 on the keyboard until a dollar sign appears. Next, please copy and paste the formula until you get the results as shown in the image below:

To find b1, use the formula I have written in the previous paragraph. The calculation results can be seen below:

Finding the Estimation Coefficient of X2 Variable (b2)

Furthermore, finding the estimation coefficient of the X2 variable (b2) is calculated the same as calculating the estimation coefficient of the X1 variable (b1). To find b2, use the formula I have written in the previous paragraph. The calculation results can be seen below:

Finding the Intercept Estimation Coefficient (b0)

Based on the order in which the estimation coefficients are calculated, finding the intercept estimation coefficient is carried out at the last stage. It is because to calculate bo, and it takes the values of b1 and b2. Based on the formula I wrote in the previous paragraph, finding the Intercept Estimation Coefficient (b0) can be seen as follows:

Finding the Coefficient of Determination (R Squared)

R Squared in multiple linear regression shows the goodness of fit of a model. Therefore, the calculation of R Squared is very important in multiple linear regression analysis. The higher R Squared indicates that the independent variable’s variance can explain the variance of the dependent variable well.

The value of R Squared is 0 to 1; the closer to 1, the better model can be. To manually calculate the R squared, you can use the formula that I cited from Koutsoyiannis (1977) as follows:

The last step is calculating the R squared using the formula I wrote in the previous paragraph. Based on the calculation results, the coefficient of determination value is 0.9285. In detail, it can be seen as follows:

Based on what has been calculated in the previous paragraphs, we have manually calculated the coefficients of bo, b1 and the coefficient of determination (R squared) using Excel.

We need to compare the analysis results using statistical software to crosscheck. If the output is similar, we can conclude that the calculations performed are correct.

Ok, this is the article I can write for you. Hopefully, it will be helpful for you. Thank you!

RELATED ARTICLES

Assumption of Residual Normality in Regression Analysis

1. […] It is necessary to calculate the predicted value of the dependent variable and the average value of the dependent variable to calculate the sum of squares. Therefore, it is necessary to calculate the value of the estimated regression coefficient first. To calculate the regression estimation coefficient, you can read the previous article entitled: “Finding Coefficients bo, b1, b2, and R Squared Manually in Multiple Linear Regression.“ […]