
Exploring Gegenbauer Autoregressive Moving Average (GARMA) Models in Time Series Analysis: A Tool for Long Memory Data

 

Introduction

    With the vast amount of time series data being generated across the globe, from financial markets and cryptocurrencies to climate and environmental sciences, effective modeling has become essential. Time series models play a crucial role in identifying future patterns, forecasting values, and uncovering the factors that influence these dynamic processes.

    While modern machine learning and deep learning models are increasingly applied to time series analysis, their lack of interpretability often limits their practical use. As a result, classical time series models such as AR, MA, ARIMA, and SARIMA remain widely used due to their transparency and ease of interpretation. However, these traditional models typically fail to capture long-term dependencies in the data, that is, relationships between observations that are far apart in time. This is where long memory time series models become valuable. Among them, Autoregressive Fractionally Integrated Moving Average (ARFIMA) and Gegenbauer Autoregressive Moving Average (GARMA) models are particularly effective at modeling such dependencies.

    In this article, I will provide a simplified and accessible explanation of the theoretical foundation of GARMA models, making it easy for anyone with a basic understanding of time series to follow along. I will cover ARFIMA models in a separate article.

Why Do We Need GARMA Models?

Figure 1 - ACF plot of a long memory time series with seasonality

If you have ever looked at the autocorrelation function (ACF) plot of a long-memory time series, like Figure 1, you have seen a slowly decaying pattern with periodic fluctuations in the autocorrelation values. This represents a long-memory process with a seasonal or cyclic pattern, where past observations remain strongly correlated with values far into the future, and the correlation recurs periodically. Standard SARIMA models, designed for the quick drop-off of short-term dependencies and seasonality, cannot capture this gradual decay. They often misinterpret this long-range dependence as a non-stationary trend, leading to poor forecasts. This fundamental mismatch is the critical motivation for turning to GARMA models, which are explicitly designed to model this slow-decay and periodic behavior.
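To see this behavior for yourself, you can plot the ACF of a series known to combine long memory with a cycle. The short sketch below uses the annual sunspot numbers shipped with base R, a classic example in the long-memory literature; any long-memory cyclical series of your own would show a similar slowly decaying, oscillating ACF.

# Annual sunspot numbers: a long-memory series with an approximately 11-year cycle
y <- as.numeric(sunspot.year)

# ACF over many lags: the autocorrelations decay slowly and oscillate with the
# cycle instead of cutting off after a few lags
acf(y, lag.max = 100, main = "ACF of annual sunspot numbers")

# For comparison, the PACF is what you would inspect for a short-memory AR order
pacf(y, lag.max = 50, main = "PACF of annual sunspot numbers")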

From SARIMA to GARMA: The Evolution

The SARIMA(p, d, q)(P, D, Q, s) model, short for Seasonal Autoregressive Integrated Moving Average, has long been a cornerstone of time series analysis. It captures the dependence of current values on past observations (the AR part), the influence of past random shocks (the MA part), and the seasonality. However, real-world data often exhibit long-memory behavior, in which correlations decay slowly over time rather than vanishing quickly as SARIMA models assume. To address this, researchers introduced fractional differencing, which, combined with Gegenbauer polynomials, gives rise to the GARMA model. GARMA allows the differencing parameter to take fractional rather than integer values, avoiding over-differencing, and the Gegenbauer polynomials let it capture seasonal and cyclical patterns at the same time, making it ideal for time series with both long memory and periodic behavior.
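To get a feel for why a fractional Gegenbauer filter produces long memory, it helps to expand the factor (1 − 2uB + B²)^(−d) as an infinite moving-average filter. Its weights are the Gegenbauer polynomial coefficients C_j(d, u), which satisfy a simple recursion. The sketch below is my own illustration (not taken from any package) of how slowly these weights die out for a fractional d, which is exactly the slow, oscillating ACF decay discussed above.

# Weights of (1 - 2*u*B + B^2)^(-d) expanded as sum over j of C_j * B^j,
# using the standard Gegenbauer polynomial recursion:
#   C_0 = 1,  C_1 = 2*d*u,
#   C_j = 2*u*(j + d - 1)/j * C_{j-1} - (j + 2*d - 2)/j * C_{j-2}
gegenbauer_weights <- function(d, u, n_lags = 200) {
  C <- numeric(n_lags + 1)
  C[1] <- 1            # C_0
  C[2] <- 2 * d * u    # C_1
  for (j in 2:n_lags) {
    C[j + 1] <- 2 * u * (j + d - 1) / j * C[j] - (j + 2 * d - 2) / j * C[j - 1]
  }
  C
}

# Example: d = 0.3 and Gegenbauer frequency lambda = pi/6 (a 12-period cycle)
w <- gegenbauer_weights(d = 0.3, u = cos(pi / 6))
plot(0:200, w, type = "h", xlab = "lag j", ylab = "weight C_j",
     main = "Slowly decaying, oscillating Gegenbauer weights")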


The GARMA Model Specification

A K-factor GARMA model equation can be specified as follows:
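ϕ(B) (1 − B)^id ∏ᵢ₌₁ᴷ (1 − 2uᵢB + B²)^dᵢ (Yₜ − μ) = θ(B) εₜ

where Yₜ denotes the observed series and: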

  • B : The lag (backshift) operator.
  • μ : The mean of the time series (if the series is not mean-centered).
  • ϕ(B) : The short-memory autoregressive (AR) component of order p (to estimate p, the PACF can be used as in ARIMA models).
  • θ(B) : The short-memory moving-average (MA) component of order q (to estimate q, the ACF can be used as in ARIMA models).
  • (1 − 2uᵢB + B²)^dᵢ : The i-th Gegenbauer factor, with uᵢ = cos(λᵢ), where the λᵢ are the Gegenbauer frequencies (these frequencies can be estimated from the spectral density plot of the time series).
  • dᵢ : The degree of fractional differencing for the i-th Gegenbauer factor (this needs to be estimated by methods such as the Whittle likelihood approximation).
  • id : The degree of integer differencing (usually the order of integration of the time series).
  • εₜ : The white-noise error (residual) term.
In the above equation, you can see a K-factor GARMA model, where K is determined by the number of peaks in the spectral density plot of the time series (you can see a sample spectral density plot in Figure 2). In practice, K represents the number of periodic patterns that can be identified in the time series.


Figure 2 - Spectral Density function Plot
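As a rough illustration of how such a plot is produced and read, the sketch below estimates the spectral density of the sunspot series used earlier with base R's spectrum() function and picks out the frequency of the dominant peak. That frequency f corresponds to a Gegenbauer frequency λ = 2πf (with u = cos(λ)); in a K-factor model you would look for K such peaks.

# Smoothed periodogram estimate of the spectral density
y  <- as.numeric(sunspot.year)
sp <- spectrum(y, spans = c(5, 5), plot = TRUE)

# Frequency of the dominant peak (in cycles per observation)
f_peak <- sp$freq[which.max(sp$spec)]
lambda <- 2 * pi * f_peak   # Gegenbauer (angular) frequency
u      <- cos(lambda)       # the corresponding u = cos(lambda)

cat("Peak at frequency", round(f_peak, 4),
    "=> period of about", round(1 / f_peak, 1), "observations\n")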


Estimation and Implementation

    That, in simple terms, is the theory behind GARMA models. Estimating the parameters involves more advanced mathematical methods, which are not discussed here. To apply a GARMA model, the basic concepts discussed above are what you need to know.

    To fit a GARMA model to a given time series, the "garma" R package can be used. I can write a separate article showing, step by step, how to fit a GARMA model using the "garma" package in R; if you would like that, please comment on this article.
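In the meantime, as a quick preview, a fit with the garma package might look roughly like the sketch below. The argument names shown (order = c(p, id, q) for the ARMA and integer-differencing orders, and k for the number of Gegenbauer factors) are my assumption about the interface and may differ between package versions, so check ?garma before relying on them.

# install.packages("garma")   # if not already installed
library(garma)

y <- as.numeric(sunspot.year)

# Assumed interface: order = c(p, id, q), k = number of Gegenbauer factors;
# the package estimates the Gegenbauer frequency and fractional d
# (by default via a Whittle-type likelihood) -- see ?garma for exact arguments
fit <- garma(y, order = c(1, 0, 1), k = 1)

summary(fit)   # estimated Gegenbauer frequency, d, and ARMA coefficients
# Forecasting methods are also provided; see the package documentation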

References / Further Reading

  • Hunt, M. R. (2025). Package ‘garma’. https://doi.org/10.1007/s00362-022-01290-3

  • Peiris, S., Allen, D. E., & Hunt, R. (2025). Optimal Time Series Forecasting Through the GARMA Model. 1–23.

  • Ferrara, L., & Guégan, D. (2006). Fractional Seasonality: Models and Application to Economic Activity in the Euro Area. 1–24.
