Reproducible Research with Gretl: Scripts, Output, and Best Practices

How to Run Time Series Models in Gretl: Step-by-Step Tutorial

Time series analysis is essential for understanding how variables evolve over time, whether you are forecasting GDP, modeling stock prices, or analyzing seasonal demand. Gretl (Gnu Regression, Econometrics and Time-series Library) is a free, open-source econometrics package well suited to teaching and applied work. This tutorial walks through the full workflow for time series modeling in Gretl: data import, visualization, transformation, stationarity testing, model selection (ARIMA, VAR, and basic state-space concepts), diagnostic checking, and forecasting. Each step includes practical tips and example commands.

Prerequisites and setup

Install the latest Gretl from gretl.sourceforge.net or your OS package manager.
Basic familiarity with time series concepts (stationarity, autocorrelation, lags) is helpful but not required.
Example dataset: monthly series of a hypothetical variable “y” (e.g., log GDP) and an explanatory series “x” (e.g., industrial production).

Open Gretl and create a new session or open a script window for reproducibility. You can run commands interactively or save them as a script (.inp) to reproduce results later.

1. Importing and preparing time series data

Gretl supports many formats: CSV, Excel, Stata, EViews, and its native gretl data files. For time series, specify the sample frequency (yearly, quarterly, monthly, etc.) so Gretl can handle dates and seasonal features correctly.

Example: importing a CSV with a monthly series where the first column is date (YYYY-MM) and subsequent columns are variables.

Use the menu: File → Open data → Import → CSV file (the entry may be labeled ASCII or text/CSV, depending on the version).
In the import dialog, specify that the first column contains dates and choose the date format.
After import, set the time series frequency if needed through the dataset structure dialog (Data → Dataset structure in recent versions).

Command-line import (script):

    # assuming a CSV file with a header row and a date column named "date"
    open data.csv
    # set the sample period manually if needed, e.g., 312 monthly observations starting January 2000:
    setobs 12 2000:01 --time-series

If dates aren’t parsed automatically, use setobs to define frequency and start period. For example, quarterly data starting in 1990Q1:

setobs 4 1990:1 --time-series
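As a quick sanity check after import, the dataset structure can be printed from a script. The sketch below uses an empty placeholder dataset (nulldata) instead of a real file, so the 312-observation length and the January 2000 start are assumptions carried over from the example above.

```hansl
# Stand-in for an imported file: an empty dataset with 312 observations
nulldata 312
# Declare monthly data starting in January 2000
setobs 12 2000:01 --time-series
# $pd is the periodicity, $nobs the number of observations in the sample
printf "periodicity: %d\n", $pd
printf "observations: %d\n", $nobs
# obslabel() returns the date label for a given observation number
printf "range: %s to %s\n", obslabel(1), obslabel($nobs)
```

With a real file, replace the nulldata line with an open command and confirm that the reported range matches the dates in the source.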

2. Visualizing the data

Plotting helps spot trends, seasonality, structural breaks, and outliers.

To plot a series: right-click the variable in the main window and choose “Time series plot”.
For multiple series: select the variables, then View → Graph specified vars → Time series plot.

Gretl script example:

    # plot a single series from the script window or a .inp script
    gnuplot y --time-series --with-lines --output=display

Look for trends (non-stationarity), cycles, and seasonal patterns. Consider logging or differencing if variance or mean is changing.

3. Transformations: logs, differences, seasonality

Common transformations:

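Log: use when the variance scales with the level: genr ly = log(y)
First difference: removes a stochastic trend: genr dy = diff(y)
Seasonal difference (monthly or quarterly data): genr d12y = y - y(-12), or equivalently genr d12y = sdiff(y), which takes the lag from the dataset's periodicity
Detrending via regression on time: genr time creates a trend variable named time; then run ols y const time and keep the residuals with genr y_detr = $uhat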

Commands:

    genr ly = log(y)
    genr dy = diff(y)       # y_t - y_{t-1}
    genr d12y = sdiff(y)    # seasonal difference: y_t - y_{t-12} for monthly data

Always inspect the transformed series graphically and with summary stats.
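The transformations can be checked together on a simulated series. This is a minimal sketch: the data-generating process (the exponential of a random walk with drift) and the seed are assumptions chosen only so that the script runs without an external file.

```hansl
nulldata 312
setobs 12 2000:01 --time-series
set seed 42
# Hypothetical positive series: exponential of a random walk with drift
series y = exp(4 + cum(0.002 + 0.01*normal()))

series ly = log(y)
series dly = diff(ly)       # approximate monthly growth rate
series d12ly = sdiff(ly)    # approximate year-on-year growth rate

# Summary statistics for the log level and both differences
summary ly dly d12ly
# Time series plot of the differenced series
gnuplot dly --time-series --with-lines --output=display
```

The log level should show a clear trend, while both differences should fluctuate around a roughly constant mean.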

4. Stationarity tests

Most time series models require stationarity. Use unit-root tests to decide.

Augmented Dickey-Fuller (ADF): Variable → Unit root tests → Augmented Dickey-Fuller test, or script:
```
# ADF with a constant, testing down from a maximum of 12 lags
adf 12 y --c --test-down
```
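Phillips-Perron (PP): availability depends on the Gretl version; it may be provided as a contributed function package rather than as a menu item.
KPSS (null hypothesis of stationarity, optionally around a trend): Variable → Unit root tests → KPSS test, or kpss 4 y in a script.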

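Interpreting ADF: the null hypothesis is a unit root, so rejecting it is evidence that the series is stationary. KPSS reverses the hypotheses (the null is stationarity), which makes the two tests useful as a cross-check. If the series is non-stationary, difference or detrend it and retest.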

Example script to test log series with trend:

    adf 12 ly --ct --test-down    # constant and trend; 12 is the maximum lag order
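Running ADF and KPSS side by side on the level and on the first difference makes the decision easier to read. The sketch below uses a simulated I(1) series; the lag orders (12 for ADF, 4 for KPSS) are illustrative assumptions, not recommendations.

```hansl
nulldata 312
setobs 12 2000:01 --time-series
set seed 42
series ly = 4 + cum(0.002 + 0.01*normal())   # simulated I(1) series
series dly = diff(ly)

# Level: ADF null is a unit root, KPSS null is trend-stationarity
adf 12 ly --ct --test-down
kpss 4 ly --trend

# First difference: repeat both tests with a constant only
adf 12 dly --c --test-down
kpss 4 dly
```

For a series like this one, the typical pattern is a failure to reject in the ADF test on the level together with a rejection in KPSS, and the reverse for the difference, although any single simulated sample can deviate.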

5. Examining autocorrelation: ACF and PACF

Autocorrelation and partial autocorrelation plots guide ARIMA model selection.

Variable → Correlogram, or in a script:
```
corrgram y 24    # ACF and PACF up to lag 24
```
Look for:
AR(p): PACF cuts off after p lags, ACF tails off.
MA(q): ACF cuts off after q lags, PACF tails off.
ARMA: both tail off.
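The MA(q) pattern can be seen on a simulated MA(1) process. This is a sketch under assumed values: the coefficient 0.6 and the seed are arbitrary, and the theoretical lag-1 autocorrelation is 0.6 / (1 + 0.6^2), about 0.44, with zero autocorrelation beyond lag 1.

```hansl
nulldata 312
setobs 12 2000:01 --time-series
set seed 42
series e = normal()
# MA(1) process: y_t = e_t + 0.6 * e_{t-1} (the first observation is missing)
series y = e + 0.6*e(-1)
# ACF and PACF up to lag 12
corrgram y 12
```

The sample ACF should show one clear spike at lag 1 while the PACF decays gradually; sample values will differ somewhat from the theoretical ones.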

6. ARIMA modeling in Gretl

Gretl can estimate ARIMA/SARIMA models via the ARIMA menu or the arima command.

Basic ARIMA(p,d,q):

p = AR order, d = differencing order, q = MA order.

Example: estimate ARIMA(1,1,1) for y (first-differenced):

    arima 1 1 1 ; y

Seasonal ARIMA (SARIMA): include seasonal orders (P,D,Q,s):

    # SARIMA(1,1,1)(1,1,1)[12] for monthly data; the seasonal period comes from setobs
    arima 1 1 1 ; 1 1 1 ; y

Gretl GUI: Model → Univariate time series → ARIMA (Model → Time series → ARIMA in older versions). After estimation, check coefficients, standard errors, and information criteria (AIC/BIC) for model selection.
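Information criteria can be collected in a script through the $aic and $bic accessors, which refer to the most recently estimated model. The sketch below compares three candidate orders on a simulated I(1) series; the candidate set and the data are assumptions for illustration.

```hansl
nulldata 312
setobs 12 2000:01 --time-series
set seed 42
series ly = 4 + cum(0.002 + 0.01*normal())   # simulated I(1) series

# Estimate each candidate and store its criteria right away
arima 1 1 0 ; ly
scalar aic_110 = $aic
scalar bic_110 = $bic

arima 0 1 1 ; ly
scalar aic_011 = $aic
scalar bic_011 = $bic

arima 2 1 0 ; ly
scalar aic_210 = $aic
scalar bic_210 = $bic

printf "ARIMA(1,1,0): AIC = %.2f, BIC = %.2f\n", aic_110, bic_110
printf "ARIMA(0,1,1): AIC = %.2f, BIC = %.2f\n", aic_011, bic_011
printf "ARIMA(2,1,0): AIC = %.2f, BIC = %.2f\n", aic_210, bic_210
```

Lower values are preferred. Because BIC penalizes extra parameters more heavily than AIC, the two criteria can disagree, in which case the more parsimonious model is usually a sensible default.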

7. Model diagnostics

Essential diagnostics:

Residual autocorrelation: Ljung-Box Q-test (in the model window, Tests → Autocorrelation) or script:
```
arima 1 1 1 ; y
# Ljung-Box test on the residuals of the last estimated model, up to lag 12
modtest 12 --autocorr
```
Residual normality: modtest --normality (Gretl reports the Doornik-Hansen test here rather than Jarque-Bera).
Heteroskedasticity: ARCH test (modtest --arch).

Aim for white-noise residuals: no autocorrelation, mean zero, constant variance.

Plot residuals, ACF of residuals, and histogram/QQ plot.
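The residual checks can be scripted by saving the residuals as a series. This is a sketch on a simulated MA(1) process; the series name res and the lag choices are assumptions.

```hansl
nulldata 312
setobs 12 2000:01 --time-series
set seed 42
series e = normal()
series y = 10 + e + 0.6*e(-1)    # simulated MA(1) around a mean of 10

arima 0 0 1 ; y
# Save the residuals of the model just estimated
series res = $uhat
# Ljung-Box test on the model residuals up to lag 12
modtest 12 --autocorr

corrgram res 24         # residual ACF and PACF
normtest res --jbera    # Jarque-Bera normality test on the residuals
summary res             # the mean should be close to zero
```

Since the fitted model matches the simulated process, the residuals should look like white noise; with real data, significant residual autocorrelation suggests revisiting the AR or MA orders.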

8. Forecasting with ARIMA

Use the Analysis → Forecasts menu in the model window, or the smpl and fcast commands in a script.

Example script to forecast 12 steps ahead:

    # hold out the last 12 observations, estimate, then forecast the held-out periods
    smpl ; -12
    arima 1 1 1 ; y
    fcast --out-of-sample
    # to forecast beyond the end of the data instead, extend the dataset first: dataset addobs 12

Gretl reports point forecasts with standard errors and 95 percent confidence intervals by default. Always compare forecast performance using holdout samples and accuracy metrics (RMSE, MAE).
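RMSE and MAE on a holdout sample can also be computed by hand from the $fcast accessor, which holds the forecasts from the last fcast command. The sketch below assumes simulated monthly data from 2000:01 to 2025:12 and holds out the final year; the dates and the MA(1) specification are illustrative assumptions.

```hansl
nulldata 312
setobs 12 2000:01 --time-series
set seed 7
series e = normal()
series y = 10 + e + 0.6*e(-1)    # simulated MA(1) around a mean of 10

# Estimate on data through December 2024, holding out 2025
smpl ; 2024:12
arima 0 0 1 ; y
fcast --out-of-sample
matrix fc = $fcast               # forecasts for the held-out periods

# Switch to the holdout period and compare forecasts with actual values
smpl 2025:01 2025:12
matrix actual = {y}
matrix err = actual - fc
scalar rmse = sqrt(meanc(err.^2))
scalar mae = meanc(abs(err))
printf "RMSE = %.4f, MAE = %.4f\n", rmse, mae
```

When actual values are available over the forecast range, fcast prints its own evaluation statistics, so the manual calculation mainly serves as a cross-check or as input to a comparison across models.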

9. Multivariate time series: VAR models

Vector Autoregression (VAR) models capture dynamic interactions between multiple series (e.g., y and x).

Steps:

Ensure variables are stationary (difference if necessary).
Choose lag order via AIC/BIC (Model → Multivariate time series → VAR lag selection, or the --lagselect option of the var command).
Estimate VAR and run impulse response functions (IRF) and variance decomposition.

Script example:

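    # assume dy and dx are stationary
    # lag order selection: compare information criteria for lags 1 to 8
    var 8 dy dx --lagselect
    # VAR with 2 lags, with impulse responses and variance decompositions over 20 periods
    set horizon 20
    var 2 dy dx --impulse-responses --variance-decomp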

Interpretation: IRFs show how one variable responds over time to a shock in another variable.
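A small simulation shows what lag selection and the VAR output look like when one variable leads another. In this sketch dx is white noise and dy responds to dx with a one-month delay; the coefficient 0.5 and the maximum lag of 6 are assumptions.

```hansl
nulldata 312
setobs 12 2000:01 --time-series
set seed 42
series dx = normal()
# dy depends on last month's dx plus independent noise
series dy = 0.5*dx(-1) + normal()

# Compare information criteria for lag orders 1 to 6
var 6 dy dx --lagselect
# Estimate the VAR at one lag
var 1 dy dx
```

In the dy equation the coefficient on the first lag of dx should be close to 0.5, and the F-test on all lags of dx that Gretl prints with each equation should reject, while the corresponding test for lags of dy in the dx equation generally should not.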

10. Cointegration and error-correction models (ECM)

If nonstationary series are I(1) but a linear combination is stationary, cointegration exists. Steps in Gretl:

Test for cointegration (Engle-Granger):

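    # step 1: cointegrating regression of y on x
    ols y const x
    # step 2: ADF test on the saved residuals (no constant needed)
    series uh = $uhat
    adf 4 uh --nc
    # residual-based tests need Engle-Granger critical values; "coint 4 y x" runs both steps and reports them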

If cointegrated, estimate an ECM:

    # generate lagged levels and differences, then OLS on ECM form
    genr dy = diff(y)
    genr dx = diff(x)
    genr lqy = y(-1)
    genr lqx = x(-1)
    ols dy const dx lqy lqx

Alternatively, use the Johansen test, which allows for multiple cointegrating vectors: the johansen command in a script, or Model → Multivariate time series → Cointegration test (Johansen) in recent versions.
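Both tests can be tried on a simulated cointegrated pair. In this sketch x is a random walk and y shares its stochastic trend by construction; the coefficients and the lag orders are assumptions for illustration.

```hansl
nulldata 312
setobs 12 2000:01 --time-series
set seed 42
series x = cum(normal())           # random walk, I(1)
series y = 2 + 0.8*x + normal()    # y and x share one stochastic trend

# Engle-Granger: unit-root tests on each series, then on the residuals
coint 4 y x
# Johansen trace and maximum-eigenvalue tests with lag order 2
johansen 2 y x
```

For data built this way, the Engle-Granger residual test should reject a unit root and the Johansen tests should point to a cointegration rank of one.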

11. State-space models and Kalman filter (intro)

Gretl has limited built-in state-space capabilities but supports some Kalman-filter estimation via user scripts or calling external libraries. For standard ARIMA state-space forms, use arima’s built-in functionality. For more advanced state-space work, consider exporting data to R (packages like dlm or KFAS) or Python.

12. Reproducibility: scripting and saving output

Save your work:

Save dataset: File → Save data as → gdt (Gretl data).
Save script: File → Save script (.inp).
Export output to text: save the session log from the GUI, or wrap commands in an outfile block in scripts so that their printed results go to a file.

Example of a reproducible script header:

    open data.csv
    setobs 12 2000:01 --time-series
    genr ly = log(y)
    genr dy = diff(ly)
    adf 12 dy --c --test-down
    smpl ; -12                 # hold out the last 12 observations
    arima 1 0 1 ; dy
    fcast --out-of-sample
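Test results and derived series can be written to disk from the script itself, so that a rerun regenerates every output file. The sketch below assumes a recent Gretl version with the block form of outfile (older versions use a --write and --close form instead); the file names are hypothetical.

```hansl
nulldata 312
setobs 12 2000:01 --time-series
set seed 42
series ly = 4 + cum(0.002 + 0.01*normal())   # simulated stand-in for real data
series dy = diff(ly)

# Redirect the printed test results to a text file
outfile "adf_results.txt"
    adf 12 dy --c --test-down
end outfile

# Save the series in Gretl's native format, selected by the .gdt extension
store "workfile.gdt" ly dy
```

Fixing the seed, as done here, matters only for simulated data; with real data the equivalent step is to keep the raw input file unchanged and derive every transformed series inside the script.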

13. Practical tips and common pitfalls

Always check and set the correct periodicity with setobs.
Use plots and ACF/PACF to guide model choice; don’t rely purely on automatic selection.
Prefer simpler models when performance is similar (parsimony).
When forecasting, reserve a holdout sample for out-of-sample validation.
Beware of structural breaks; consider sub-sample analysis or dummy variables.

14. Example: end-to-end script (monthly data)

A concise example script from import to forecast:

    open mydata.csv
    setobs 12 2000:01 --time-series
    genr lny = log(y)
    genr dy = diff(lny)
    adf 12 dy --c --test-down
    # look at ACF/PACF
    corrgram dy 24
    # hold out the last 12 observations, then estimate ARIMA(1,0,1)
    smpl ; -12
    arima 1 0 1 ; dy
    modtest 18 --autocorr
    fcast --out-of-sample

15. Resources and next steps

Gretl user guide and examples included in the program (Help → Manuals).
For advanced modeling (state-space, advanced volatility models), combine Gretl with R or Python workflows.
