Multicollinearity Test in R Studio for Multiple Linear Regression Using Time Series Data

When time series data are analyzed using multiple linear regression with the ordinary least squares (OLS) method, it is also necessary to test for multicollinearity. The multicollinearity test is one of the classical assumption tests used to check that the OLS estimates can be treated as the best linear unbiased estimator (BLUE).

The purpose of the multicollinearity test is to detect whether there is a strong correlation between independent variables. If there is strong correlation among the independent variables in the regression equation, the regression equation is said to have multicollinearity.

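One of the OLS assumptions is that there is no multicollinearity. Strictly speaking, the assumption rules out perfect linear relationships among the independent variables; in practice, strong but imperfect correlation is also a concern because it inflates the variances of the coefficient estimates.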

To detect multicollinearity, we can perform a correlation test between independent variables. However, the most popular approach among researchers is to examine the Variance Inflation Factor (VIF) values.
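As a minimal sketch of the correlation approach, the base R functions cor() and cor.test() can be applied to the two independent variables. The data frame below is simulated purely for illustration: the variable names follow this article's case study, but the values and the seed are assumptions and not the researcher's data.

```r
# Simulated stand-in data (an assumption for illustration only)
set.seed(123)
n <- 40  # e.g. 40 quarterly observations
Inflation_Rate <- rnorm(n, mean = 3, sd = 1)
Unemployment_Rate <- 5 + 0.5 * Inflation_Rate + rnorm(n, sd = 0.8)
data <- data.frame(Inflation_Rate, Unemployment_Rate)

# Pearson correlation between the two independent variables
r <- cor(data$Inflation_Rate, data$Unemployment_Rate)

# Significance test of that correlation
ct <- cor.test(data$Inflation_Rate, data$Unemployment_Rate)

print(r)
print(ct$p.value)
```

A correlation coefficient close to 1 or -1 would point to a strong linear relationship between the two independent variables.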

In this article, I will share a tutorial on how to test for multicollinearity in a multiple linear regression equation using time series data in R Studio. We will use the same case study as in the previous article.

Example Case Study for Multicollinearity Testing with Time Series Data

As practice material for this article, I used a case study from research aimed at determining the effects of inflation and unemployment rates on economic growth.

The researcher has collected quarterly time series data from a country. The specification for the multiple linear regression equation can be formulated as follows:

𝑌=𝛽0+𝛽1𝑋1+𝛽2𝑋2+𝜖

Where:

𝑌 is economic growth (%) as the dependent variable,

𝑋1 is the inflation rate (%) as the first independent variable,

𝑋2 is the unemployment rate (%) as the second independent variable,

𝛽0 is the intercept (constant),

𝛽1 and 𝛽2 are the regression coefficients,

𝜖 is the error or residual.

The collected data are then entered and tabulated in Excel, as shown in the following table:

Steps for Multicollinearity Testing in Multiple Linear Regression with Time Series Data

The first step in multicollinearity testing is to run the multiple linear regression analysis in R Studio. The method for importing data from an Excel file into R Studio was covered in my previous article.
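Readers who do not have the original spreadsheet can still follow along with a stand-in data frame. The sketch below simulates 40 quarters of data. The column names match the regression command used in this article, but every value is synthetic and the coefficients used to generate it are arbitrary assumptions, so the output will not reproduce the results reported here.

```r
# Synthetic stand-in for the imported Excel data (NOT the article's data)
set.seed(2024)
n <- 40  # assumed number of quarterly observations

Inflation_Rate <- round(rnorm(n, mean = 3, sd = 1), 2)
Unemployment_Rate <- round(5 + 0.6 * Inflation_Rate + rnorm(n, sd = 0.8), 2)
Economic_Growth <- round(
  6 - 0.4 * Inflation_Rate - 0.3 * Unemployment_Rate + rnorm(n, sd = 0.5), 2
)

# The object is named "data" so that the lm() call in this article runs unchanged
data <- data.frame(Economic_Growth, Inflation_Rate, Unemployment_Rate)
str(data)
```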

To conduct the analysis, the following command can be written in R Studio:

model <- lm(Economic_Growth ~ Inflation_Rate + Unemployment_Rate, data = data)

summary(model)

After typing the command correctly in R Studio, the analysis output will appear as follows:

The next step is to obtain the Variance Inflation Factor (VIF) values. In R Studio, you need to install the “car” package first if you are using it for the first time. If you have installed it previously, this step can be skipped.

The command to install the “car” package in R Studio is as follows:

install.packages("car")
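If the script will be run more than once, the installation can be made conditional so that it is skipped when the package is already present. The helper below is a hypothetical convenience function written for this illustration (it is not part of any package); it relies only on the base R functions requireNamespace() and install.packages().

```r
# Hypothetical helper: install a package only if it is not yet installed
ensure_package <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  # Report (invisibly) whether the package is available afterwards
  invisible(requireNamespace(pkg, quietly = TRUE))
}

# For this tutorial one would call: ensure_package("car")
# Demonstrated here on "stats", which ships with every R installation,
# so no download is triggered.
already_there <- ensure_package("stats")
```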

Next, the command for the multicollinearity test in R Studio is as follows:

library(car)
vif(model)

After typing the command correctly and pressing Enter, the VIF output will appear as follows:

   Inflation_Rate Unemployment_Rate
          1.58248           1.58248

Based on the analysis results, the VIF values for the Inflation Rate and Unemployment Rate variables are both 1.58248. These values are well below the commonly used rule-of-thumb threshold of 10 (and also below the stricter threshold of 5 that some researchers prefer), so we can conclude that the tested multiple linear regression equation with time series data does not exhibit problematic multicollinearity.
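To see where these numbers come from, recall that the VIF of predictor j is 1 / (1 - R_j^2), where R_j^2 is the R-squared from regressing predictor j on the other predictors. With exactly two predictors, both auxiliary regressions share the same R-squared (the squared correlation between the two variables), which is why the two VIF values are identical. The sketch below reproduces the calculation by hand in base R on simulated data (an assumption for illustration, not the article's data), then backs out the correlation implied by the reported VIF of 1.58248.

```r
# Simulated stand-in data (illustration only, not the article's data)
set.seed(42)
n <- 40
Inflation_Rate <- rnorm(n, mean = 3, sd = 1)
Unemployment_Rate <- 5 + 0.6 * Inflation_Rate + rnorm(n, sd = 0.8)
Economic_Growth <- 6 - 0.4 * Inflation_Rate - 0.3 * Unemployment_Rate +
  rnorm(n, sd = 0.5)
data <- data.frame(Economic_Growth, Inflation_Rate, Unemployment_Rate)

# Auxiliary regressions: each predictor regressed on the other one
r2_inflation <- summary(lm(Inflation_Rate ~ Unemployment_Rate, data = data))$r.squared
r2_unemployment <- summary(lm(Unemployment_Rate ~ Inflation_Rate, data = data))$r.squared

# VIF_j = 1 / (1 - R_j^2)
vif_manual <- c(Inflation_Rate = 1 / (1 - r2_inflation),
                Unemployment_Rate = 1 / (1 - r2_unemployment))
print(vif_manual)

# With two predictors, the same value follows from their correlation
r <- cor(data$Inflation_Rate, data$Unemployment_Rate)
print(1 / (1 - r^2))

# Absolute correlation implied by the VIF of 1.58248 reported above
implied_abs_r <- sqrt(1 - 1 / 1.58248)
print(round(implied_abs_r, 3))
```

Under this relationship, the reported VIF corresponds to an absolute correlation of roughly 0.61 between the inflation and unemployment rates; the sign of the correlation cannot be recovered from the VIF alone.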

The absence of multicollinearity indicates no strong correlation between the independent variables. Thus, the no-multicollinearity assumption required for multiple linear regression using the OLS method is satisfied; the other OLS assumptions still need to be tested separately.

That concludes this tutorial on multicollinearity testing. I hope the information in this article is useful for you. Stay tuned for next week’s article update, which will discuss autocorrelation testing in multiple linear regression. See you!