The Role of Granger Causality in Identifying Causal Relationships for Time Series Forecasting

David Andres

DataDrivenInvestor

We introduced Vector Autoregression (VAR) in a previous article. But does it really make sense to use two (or more) different variables to make a forecast? Not always. Extra variables only help if there is some kind of relationship between them and the series we want to predict. Unrelated variables can introduce noise into the model and make the predictions worse rather than better.

Disclaimer: some parts of this article have been enhanced with the assistance of AI, using AI-generated Python code as a base or paragraph rephrasing for clarity improvement purposes.

A straightforward way to check for a relationship is to compute the correlation between the variables. But correlation is not the only type of relationship that two variables can have.
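As a rough illustration of why a plain correlation check can miss a relationship, the sketch below compares same-day and lagged correlation on synthetic data. The series, the 3-day delay, and the helper `lagged_corr` are all assumptions made up for this example; they are not the article's data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for two daily series (not real data):
# y repeats x with a 3-day delay, plus some noise.
n = 300
x = rng.normal(size=n)
y = np.roll(x, 3) + 0.5 * rng.normal(size=n)

def lagged_corr(a, b, lag):
    """Pearson correlation between a[t - lag] and b[t] (hypothetical helper)."""
    if lag == 0:
        return np.corrcoef(a, b)[0, 1]
    return np.corrcoef(a[:-lag], b[lag:])[0, 1]

same_day = lagged_corr(x, y, 0)
three_days = lagged_corr(x, y, 3)
print(f"lag 0: {same_day:.2f}, lag 3: {three_days:.2f}")
```

In this toy setup the same-day correlation is close to zero even though `y` is almost entirely driven by earlier values of `x`, which is the kind of relationship a lag-aware test is meant to pick up.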

We could also check whether there is some sort of causal relationship between them. However, there is something essential to understand first: when our tests indicate causation, that doesn't necessarily mean that one variable is the cause of the other. It simply means that past values of one variable are correlated with, and so help predict, later values of the other.
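To make the idea concrete, here is a minimal hand-rolled sketch of the comparison behind a Granger causality test: fit one regression of a series on its own lags, fit another that also includes lags of the candidate "cause", and compare the residual sums of squares with an F statistic. The function names and the synthetic data are assumptions for illustration only; in practice a library routine such as `grangercausalitytests` from `statsmodels.tsa.stattools` would typically be used, and it also reports p-values.

```python
import numpy as np

def lag_matrix(series, max_lag):
    """Columns hold series[t-1], ..., series[t-max_lag] for t = max_lag..n-1."""
    n = len(series)
    return np.column_stack(
        [series[max_lag - k : n - k] for k in range(1, max_lag + 1)]
    )

def rss(design, target):
    """Residual sum of squares of an ordinary least squares fit."""
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    resid = target - design @ coef
    return float(resid @ resid)

def granger_f_stat(cause, effect, max_lag):
    """F statistic for 'lags of cause add nothing beyond lags of effect'."""
    target = effect[max_lag:]
    const = np.ones((len(target), 1))
    own = lag_matrix(effect, max_lag)
    other = lag_matrix(cause, max_lag)
    rss_restricted = rss(np.hstack([const, own]), target)
    rss_full = rss(np.hstack([const, own, other]), target)
    dof = len(target) - (2 * max_lag + 1)
    return ((rss_restricted - rss_full) / max_lag) / (rss_full / dof)

# Synthetic example (not real data): y depends on x from two steps earlier.
rng = np.random.default_rng(1)
n = 300
x = rng.normal(size=n)
noise = rng.normal(size=n)
y = 0.3 * noise
y[2:] += 0.8 * x[:-2]

f_xy = granger_f_stat(x, y, max_lag=3)  # do lags of x help predict y?
f_yx = granger_f_stat(y, x, max_lag=3)  # do lags of y help predict x?
print(f"x -> y: F = {f_xy:.1f}, y -> x: F = {f_yx:.1f}")
```

A large F statistic in one direction and a small one in the other says only that past `x` improves the forecast of `y`. It does not rule out a third variable driving both.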

Let’s understand this concept better with an example that aims to find whether COVID cases and deaths are somehow related.

Let’s first import the data and do some basic processing. We will use COVID data from Germany. The data is split by state, county, age, and gender, but for the sake of simplicity we will aggregate it into country-level daily totals.
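The grouping step could look roughly like the sketch below. The column names (`date`, `state`, `age_group`, `gender`, `cases`, `deaths`) and the handful of rows are made up for illustration, since the actual file layout is not shown here; with real data the frame would come from something like `pd.read_csv` instead.

```python
import pandas as pd

# Tiny made-up sample; column names and values are hypothetical.
raw = pd.DataFrame({
    "date": ["2020-03-01", "2020-03-01", "2020-03-02", "2020-03-02", "2020-03-02"],
    "state": ["Bayern", "Berlin", "Bayern", "Berlin", "Berlin"],
    "age_group": ["35-59", "60-79", "35-59", "15-34", "60-79"],
    "gender": ["M", "F", "F", "M", "F"],
    "cases": [5, 3, 8, 2, 4],
    "deaths": [0, 1, 1, 0, 0],
})

# Parse dates so the result can be used as a time series index.
raw["date"] = pd.to_datetime(raw["date"])

# Collapse state, age, and gender: one row per day with summed counts.
daily = raw.groupby("date")[["cases", "deaths"]].sum().sort_index()
print(daily)
```

The result is a frame indexed by date with one `cases` column and one `deaths` column, which is the shape a two-variable analysis needs.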
