One of the assumptions in linear regression using the ordinary least squares (OLS) method is that there is no strong correlation between the independent variables. To obtain the Best Linear Unbiased Estimator in a linear regression with two or more independent variables, the non-multicollinearity assumption must be fulfilled.
On this occasion, Kanda Data will discuss multicollinearity tests and how to interpret their output in linear regression. The objective of the multicollinearity test is to determine whether there is a strong correlation between the independent variables.
For the multicollinearity test, it is recommended to formulate the hypotheses first. The hypotheses for the multicollinearity test can be stated as follows:
Ho: There is no multicollinearity
H1: There is multicollinearity
On this occasion, I will test for multicollinearity using the Variance Inflation Factor (VIF). Following a common rule of thumb, the criteria for testing the hypothesis are:
VIF ≤ 10: Ho is accepted
VIF > 10: Ho is rejected (H1 is accepted)
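For reference, the VIF of each independent variable comes from an auxiliary regression of that variable on the other independent variables. The notation below ($R_j^2$ for the R-squared of that auxiliary regression) is introduced here for illustration:

```latex
\mathrm{VIF}_j = \frac{1}{1 - R_j^2}
\qquad\Longrightarrow\qquad
\mathrm{VIF}_j > 10 \iff R_j^2 > 0.9
```

In other words, the cutoff of 10 flags a variable when more than 90% of its variance is explained by the other independent variables. The cutoff is a widely used rule of thumb rather than a formal test statistic.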
Multicollinearity Test Using Variance Inflation Factor (VIF) in STATA
The mini-research example used on this occasion aims to determine the effect of income and population on rice consumption. Here we will test whether there is a strong correlation between the income variable and the population variable. The data of our mini-research for exercise can be seen in the table below:
In the first step, enter all the data shown above by clicking the Data Editor icon (the table with a pencil) in STATA.
Enter the data for the rice consumption variable (Y) in the first column, then enter the data for the income (X1) and population (X2) variables in the second and third columns.
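If you typed the values straight into the Data Editor, STATA usually assigns default column names such as var1, var2, and var3. The sketch below assumes those default names (an assumption; skip the rename lines if your columns are already named Y, X1, and X2) and then checks that the data look right before running the regression:

```stata
* Assumption: the three columns still carry the default names var1-var3
* Column 1 = rice consumption, column 2 = income, column 3 = population
rename var1 Y
rename var2 X1
rename var3 X2

* Quick sanity checks before estimating the model
* describe shows variable names and storage types
describe
* summarize shows observations, mean, std. dev., min, and max
summarize Y X1 X2
```

The comments use a leading asterisk so the lines can be run either from a do-file or one at a time in the command window.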
To conduct the multicollinearity test, first estimate the regression model by typing the following command in STATA:
regress Y X1 X2
Press Enter to run the regression. Then, to obtain the Variance Inflation Factor (VIF) for the multicollinearity test, type the following command in STATA:
estat vif
Press Enter again, and the multicollinearity test results using the Variance Inflation Factor (VIF) will appear.
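As a cross-check, the VIF reported for X1 can be reproduced by hand from an auxiliary regression of X1 on X2. This is a minimal sketch, assuming the variables are named Y, X1, and X2 as in the commands above; the scalar name vif_x1 is hypothetical:

```stata
* Auxiliary regression: explain X1 with the other independent variable
quietly regress X1 X2

* VIF = 1 / (1 - R-squared of the auxiliary regression)
scalar vif_x1 = 1 / (1 - e(r2))
display "VIF for X1 = " vif_x1

* Pairwise correlation between the two independent variables
correlate X1 X2

* estat vif works on the most recent estimates, so re-run the main model first
quietly regress Y X1 X2
estat vif
```

With only two independent variables, X1 and X2 receive the same VIF, because the auxiliary R-squared equals the squared correlation between them in either direction.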
Interpreting the Output of Multicollinearity Test
The output of the multicollinearity test using STATA can be seen in the table below:
Based on the multicollinearity test output, the VIF value for both X1 and X2 is 2.91. Because this value is lower than 10, the null hypothesis formulated earlier (Ho) is accepted. It can be concluded that there is no multicollinearity.
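With only two independent variables, the reported VIF can be converted back into the correlation between them, because the auxiliary R-squared then equals the squared correlation between X1 and X2. A short worked calculation, using only the VIF value of 2.91 from the output (the symbol $r$ for that correlation is introduced here for illustration):

```latex
\mathrm{VIF} = \frac{1}{1 - r^2} = 2.91
\;\Longrightarrow\;
r^2 = 1 - \frac{1}{2.91} \approx 0.656
\;\Longrightarrow\;
|r| \approx 0.81
```

By the same formula, the cutoff VIF = 10 corresponds to |r| ≈ 0.95, so the correlation here sits below the level that the rule of thumb treats as problematic.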
Furthermore, we can conclude that the income and population variables are not correlated strongly enough to cause a multicollinearity problem. Because there is no multicollinearity, the regression model has fulfilled this OLS assumption. Well, that’s the article on this occasion from Kanda Data. See you in the following article!