Measuring Accuracy of Time Series Forecasts Using the Forecast Package in R

Introduction

The forecast package in R is a widely used tool for time series forecasting, providing various methods to predict future values based on historical data. A key part of evaluating a forecasting model is assessing its accuracy. In this article, we will look at how to measure the accuracy of a forecasting model using the forecast package, focusing on the seasonal naive method implemented by the snaive function.

Background

Before diving into the solution, let’s briefly review the concepts involved:

Time series analysis: The process of analyzing and understanding the patterns in data that occur over time.
Forecasting: The practice of predicting future values or events based on past data and patterns identified during the analysis phase.
Accuracy measures: Various methods used to evaluate how well a forecasting model performs, such as Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), Mean Absolute Percentage Error (MAPE), and others.
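
As a rough illustration of the three measures named above, the following sketch computes MAE, RMSE, and MAPE by hand in base R. The vectors `actual` and `predicted` are made-up numbers used only for this example; they do not come from the sales dataset.

```r
# Hypothetical actual and predicted values (illustrative only)
actual    <- c(100, 200, 400)
predicted <- c(110, 180, 400)

# Forecast errors: actual minus predicted
errors <- actual - predicted

mae  <- mean(abs(errors))                  # Mean Absolute Error
rmse <- sqrt(mean(errors^2))               # Root Mean Squared Error
mape <- mean(abs(errors / actual)) * 100   # Mean Absolute Percentage Error (%)

c(MAE = mae, RMSE = rmse, MAPE = mape)
```

MAE and RMSE are expressed in the units of the data, while MAPE is a percentage and becomes unstable when actual values are close to zero.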

Setting Up the Data

To demonstrate the process of measuring accuracy with the forecast package, we will use an example dataset. The dataset consists of artificial, randomly generated monthly sales for two products, Sales1 and Sales2, from January 2014 to November 2019 (71 monthly observations).

library(dplyr)
library(ggplot2)
library(forecast)

# Fix the random seed so the example is reproducible
set.seed(123)

# Create a data frame with 71 months of artificial sales data
MY_DATA_input <- data.frame(
    Sales1 = rnorm(71, mean = 1000, sd = 500),
    Sales2 = rnorm(71, mean = 800, sd = 400)
)

# Convert the data into a monthly time series object (ts)
MY_DATA <- ts(MY_DATA_input, start = c(2014, 1), end = c(2019, 11), frequency = 12)
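
Before forecasting, it can help to confirm that the time series object covers the intended period. The following is a minimal sketch using base R only; the object name `sales` and the seed are assumptions made for this illustration.

```r
# Rebuild a comparable two-column monthly series (hypothetical name: sales)
set.seed(123)
sales <- ts(
    data.frame(
        Sales1 = rnorm(71, mean = 1000, sd = 500),
        Sales2 = rnorm(71, mean = 800, sd = 400)
    ),
    start = c(2014, 1), frequency = 12
)

start(sales)      # first period as (year, month)
end(sales)        # last period as (year, month)
frequency(sales)  # observations per year
dim(sales)        # rows (months) and columns (series)
```

With 71 monthly observations starting in January 2014, the series should end in November 2019, which matches the range described above.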

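Forecasting with the snaive Function

Next, we will define the forecasting function using the snaive method from the forecast package. The snaive function implements the seasonal naive method: each forecast is set equal to the last observed value from the same season, so for monthly data the forecast for a given month is the value observed in that month one year earlier. No smoothing or parameter estimation is involved, which makes it a useful benchmark.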
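
To make the seasonal naive idea concrete, here is a small sketch on a separate, made-up series. The series `y` and its values are assumptions for illustration only; the point is that the forecasts simply repeat the last observed year.

```r
library(forecast)

# Three years of hypothetical monthly data (illustrative values only)
set.seed(1)
y <- ts(rnorm(36, mean = 1000, sd = 100), start = c(2014, 1), frequency = 12)

# Seasonal naive forecast for the next 12 months
fc <- snaive(y, h = 12)

# Observations from the final year of the series
last_year <- window(y, start = c(2016, 1))

# Compare the point forecasts with last year's values, month by month
all.equal(as.numeric(fc$mean), as.numeric(last_year))
```

The comparison should confirm that the 12 point forecasts are identical to the 12 observations from 2016.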

# Split the data into training and testing sets
train <- window(MY_DATA, start = c(2014, 1), end = c(2017, 11))
test <- window(MY_DATA, start = c(2017, 12), end = c(2019, 11))

# Set the forecast horizon to the length of the testing set (24 months)
hrz <- nrow(test)

# Define the forecasting function
FORECASTING_FUNCTION_SNAIVE <- function(Z, hrz) {
    # Z is a monthly ts column, so it already carries frequency = 12
    # and can be passed to snaive directly
    snaive(Z, h = hrz)
}

# Apply the forecasting function to each series (column) of the training data
FORECASTING_LIST_SNAIVE <- lapply(
    colnames(train),
    function(nm) FORECASTING_FUNCTION_SNAIVE(train[, nm], hrz)
)
names(FORECASTING_LIST_SNAIVE) <- colnames(train)

# Compute accuracy measures for each series against its testing column
ACCURACY_LIST_SNAIVE <- lapply(
    colnames(train),
    function(nm) accuracy(FORECASTING_LIST_SNAIVE[[nm]], test[, nm])
)
names(ACCURACY_LIST_SNAIVE) <- colnames(train)
ACCURACY_LIST_SNAIVE

Note that the forecast horizon is set to the length of the testing set, so each forecast lines up month for month with the held-out observations. Because the testing set is never used to produce the forecasts, the accuracy computed on it is an out-of-sample estimate rather than a measure of how well the model fits the training data.

Evaluating Accuracy Measures

The accuracy function from the forecast package provides a convenient way to compute several metrics at once. Its first argument is a forecast object and its second argument is the vector or time series of actual values covering the forecast period. We will examine these measures for one of the series from our example.

# Evaluate accuracy measures (ME, RMSE, MAE, MPE, MAPE, MASE, ACF1, and Theil's U)
accuracy(FORECASTING_LIST_SNAIVE[["Sales1"]], test[, "Sales1"])

The result is a matrix with one row for the training set and one row for the testing set. The testing-set row shows how well the forecasting model performed on data it had not seen.
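
The sketch below shows one way to pull individual metrics out of that result and to cross-check one of them by hand. It uses a separate, made-up series so that it runs on its own; the names `y`, `train_y`, `test_y`, and `manual_mae` are assumptions for this illustration.

```r
library(forecast)

# Hypothetical monthly series (illustrative values only)
set.seed(42)
y <- ts(rnorm(71, mean = 1000, sd = 100), start = c(2014, 1), frequency = 12)

# Hold out the last 24 months as a testing set
train_y <- window(y, end = c(2017, 11))
test_y  <- window(y, start = c(2017, 12))

# Seasonal naive forecast over the testing period
fc  <- snaive(train_y, h = length(test_y))
acc <- accuracy(fc, test_y)

# Keep only the out-of-sample row and a few columns
acc["Test set", c("RMSE", "MAE", "MAPE")]

# Cross-check the reported testing-set MAE against a manual calculation
manual_mae <- mean(abs(test_y - fc$mean))
all.equal(unname(acc["Test set", "MAE"]), manual_mae)
```

Indexing by row and column name keeps the code readable and avoids relying on the position of a metric in the matrix.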

Conclusion

Measuring accuracy is an essential step when evaluating the performance of a time series forecasting model. In this article, we demonstrated how to use the forecast package in R to measure the accuracy of a seasonal naive model built with the snaive function. By holding out a testing set and comparing metrics such as RMSE, MAE, and MASE on it, we can judge how well a model is likely to predict future values and compare it against alternatives.

By following these steps and experimenting with different methods, you will be able to develop robust forecasting models that help drive informed decision-making in your time series analysis applications.

Last modified on 2023-05-25