In this paper the tsfknn package for time series forecasting using KNN regression is described. The package allows, with only one function, specifying the KNN model and generating the forecasts. The user can choose among different multi-step ahead strategies and among different functions to aggregate the targets of the nearest neighbors. It is also possible to consult the model used in the prediction and to obtain a graph including the forecast and the nearest neighbors used by KNN.
Time series forecasting has traditionally been performed using statistical methods such as ARIMA models or exponential smoothing. However, the last decades have witnessed the use of computational intelligence techniques to forecast time series. Although artificial neural networks are the most prominent machine learning technique used in time series forecasting, other approaches, such as Gaussian processes or KNN, have also been applied. Compared with classical statistical models, computational intelligence methods exhibit interesting features, such as their nonlinearity or the lack of an underlying model, that is, they are non-parametric.
Statistical methodologies for time series forecasting are available on CRAN as excellent packages. For example, the forecast package includes implementations of ARIMA, exponential smoothing, the theta method and basic techniques, such as the naive approach, that can be used as benchmark methods. On the other hand, although a great variety of computational intelligence approaches for regression are available in R (see, for example, the caret package), these approaches cannot be directly applied to time series forecasting. Fortunately, some new packages are filling this gap. For example, the nnfor package and the nnetar function from the forecast package allow us to predict time series using artificial neural networks.
KNN is a very popular algorithm used in classification and regression. This algorithm simply stores a collection of examples. Each example consists of a vector of features (describing the example) and its associated class (for classification) or numeric value (for prediction). Given a new example, KNN finds its k most similar examples (called nearest neighbors), according to a distance metric (such as the Euclidean distance), and predicts its class as the majority class of its nearest neighbors or, in the case of regression, as an aggregation of the target values associated with its nearest neighbors. In this paper we describe the tsfknn R package for univariate time series forecasting using KNN regression.
The rest of the paper is organized as follows. Section 2 explains how KNN regression can be applied in a time series forecasting context using the tsfknn package. In Section 3 the different multi-step ahead strategies implemented in our package are explained. Section 4 discusses some additional features of our package. Section 5 describes how the forecast accuracy of a KNN model can be assessed using a rolling origin evaluation. Finally, Section 6 draws some conclusions.
In this section we explain how KNN regression can be applied to forecast time series. To this end, we will use some functionality of the package tsfknn. Let us start with a simple time series: \(t = \{ 1, 2, 3, 4, 5, 6, 7, 8 \}\) and suppose that we want to predict its next future value. First, we have to determine how the KNN examples are built, that is, we have to decide what are the features and the targets associated with an example. The target of an example is a value of the time series and its features are lagged values of the target. For example, if we use lags 1-2 as features, the examples associated with the time series \(t\) are:
Features | Target |
---|---|
1, 2 | 3 |
2, 3 | 4 |
3, 4 | 5 |
4, 5 | 6 |
5, 6 | 7 |
6, 7 | 8 |
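The construction of these examples can be sketched in base R, independently of the package. The snippet below is only an illustration under the assumptions of this section (lags 1-2 and a one-step target); the helper name `build_examples` is hypothetical and is not part of tsfknn.

```r
# Hypothetical helper (not part of tsfknn): build KNN examples from a series.
# Each row holds the lagged features (oldest first) followed by the target.
build_examples <- function(x, n_lags) {
  # embed() puts the most recent value first, so the columns are reversed
  m <- embed(as.numeric(x), n_lags + 1)[, (n_lags + 1):1, drop = FALSE]
  colnames(m) <- c(paste0("Lag", n_lags:1), "H1")
  m
}

examples <- build_examples(1:8, n_lags = 2)
print(examples)
```

A series of 8 values with 2 lags yields 8 - 2 = 6 examples, one per row of the table above.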
In our package, you can consult the examples associated with a KNN
model used for time series forecasting with the
knn_examples
function:
library(tsfknn)
pred <- knn_forecasting(ts(1:8), h = 1, lags = 1:2, k = 2, transform = "none")
knn_examples(pred)
## Lag2 Lag1 H1
## [1,] 1 2 3
## [2,] 2 3 4
## [3,] 3 4 5
## [4,] 4 5 6
## [5,] 5 6 7
## [6,] 6 7 8
Before consulting the examples, you have to build the model. This is
done with the function knn_forecasting
that builds a model
associated with a time series and uses the model to predict the future
values of the time series. Let us see the main arguments of this
function:
timeS: the time series to be forecast.
h: the forecast horizon, that is, the number of future values to be predicted.
lags: an integer vector indicating the lagged values of the target used as features in the examples (for instance, 1:2 means that lagged values 1 and 2 should be used).
k: the number of nearest neighbors used by the KNN model.
transform: the kind of transformation applied to the examples and their targets. In general, it is useful for forecasting time series with a trend. It will be explained later.
knn_forecasting
is very handy because, as mentioned
above, it builds the KNN model and then uses the model to predict the
time series. This function returns a knnForecast
object
with information about the model and its prediction. As we have seen above,
you can use the function knn_examples
to see the examples
associated with the model. You can also consult the prediction or get a
plot through the knnForecast
object:
pred$prediction
## Time Series:
## Start = 9
## End = 9
## Frequency = 1
## [1] 7.5
plot(pred)
You can also consult how the prediction was made. That is, you can
consult the instance whose target was predicted and its nearest
neighbors. This information is obtained with the
nearest_neighbors
function applied to a
knnForecast
object:
nearest_neighbors(pred)
## [[1]]
## [[1]]$instance
## Lag 2 Lag 1
## 7 8
##
## [[1]]$nneighbors
## Lag 2 Lag 1 H1
## 1 6 7 8
## 2 5 6 7
Because we have used lags 1-2 as features, the features associated with the next future value of the time series are the last two values of the time series (vector \([7, 8]\)). The two most similar examples (nearest neighbors) of this instance are vectors \([6, 7]\) and \([5, 6]\), whose targets (8 and 7) are averaged to produce the prediction 7.5. You can obtain a nice plot including the instance, its nearest neighbors and the prediction:
library(ggplot2)
autoplot(pred, highlight = "neighbors")
As can be observed, each nearest neighbor has been plotted in a different plot (you can also choose to show all the nearest neighbors in the same plot). The neighbors in the plots are sorted according to their distance to the instance, with the nearest neighbor in the top plot.
By the way, this artificial example of a time series with a constant linear trend illustrates the fact that KNN is not suitable for predicting time series with a global trend. This is because KNN predicts an aggregation of historical values of the time series. Therefore, in order to predict a time series with global trend some detrending scheme should be used.
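This limitation can be checked with a few lines of base R. The following sketch assumes Euclidean distance, k = 2, the mean as aggregation function and no transformation. It reproduces the forecast 7.5 by hand and shows that the forecast cannot exceed the largest target seen in the examples. The variable names are illustrative and are not taken from the package.

```r
x <- as.numeric(1:8)
m <- embed(x, 3)[, 3:1]        # columns: Lag2, Lag1, target
features <- m[, 1:2]
targets  <- m[, 3]
instance <- tail(x, 2)         # the last two values: 7, 8

# Euclidean distance from the instance to every example
d  <- sqrt(rowSums(sweep(features, 2, instance)^2))
nn <- order(d)[1:2]            # indices of the 2 nearest neighbors

prediction <- mean(targets[nn])  # average of the neighbors' targets (8 and 7)
print(prediction)
print(prediction <= max(targets))  # an average of past values stays in their range
```

The true next value of the series is 9, which lies outside the range of the historical targets, so no aggregation of them can reach it.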
To recapitulate, because we use univariate time series, to specify a KNN model in our package you have to set:
the lags used to build the KNN examples. They determine the lagged values used as features or autoregressive explanatory variables.
k: the number of nearest neighbors used in the prediction.
In the previous section we have seen an example of one-step ahead prediction with KNN. Nonetheless, it is very common to forecast more than one value into the future. To this end, a multi-step ahead strategy has to be chosen. Our package implements two common strategies: the MIMO approach and the recursive or iterative approach (when only one future value is predicted both strategies are equivalent). Let us see how they work.
This strategy is commonly applied with KNN and it is characterized by the use of a vector of target values. The length of this vector is equal to the number of periods to forecast. For example, let us suppose that we are working with a time series of hourly electricity demand and we want to forecast the demand for the next 24 hours. In this situation, a good choice for the lags would be 1-24, that is, the demand of 24 consecutive hours. If the MIMO strategy is chosen, then an example consists of a feature vector with the demand of 24 consecutive hours and a target vector with the demand of the following 24 hours.
The new instance would be the demand in the last 24 hours of the time series. This way, we would look for the demands most similar to the last 24 hours in the time series and we would predict an aggregation of their subsequent 24 hours.
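The MIMO idea can be sketched in base R as follows. This is a minimal illustration, not the package's implementation: the helper `knn_mimo` and the toy series are hypothetical, and Euclidean distance with the mean as aggregation function is assumed.

```r
# Hypothetical helper (not part of tsfknn): MIMO forecast with KNN.
knn_mimo <- function(x, n_lags, h, k) {
  x <- as.numeric(x)
  w <- n_lags + h
  m <- embed(x, w)[, w:1, drop = FALSE]          # rows in chronological order
  features <- m[, 1:n_lags, drop = FALSE]
  targets  <- m[, (n_lags + 1):w, drop = FALSE]  # one target vector per example
  instance <- tail(x, n_lags)
  d  <- sqrt(rowSums(sweep(features, 2, instance)^2))
  nn <- order(d)[seq_len(k)]
  colMeans(targets[nn, , drop = FALSE])          # average the target vectors
}

# Toy series with a period of 4, so lags 1-4 capture a full cycle
toy <- rep(c(10, 20, 30, 40), times = 4)
forecast_mimo <- knn_mimo(toy, n_lags = 4, h = 4, k = 2)
print(forecast_mimo)
```

All h future values are produced in a single step, because each neighbor contributes a whole vector of h targets.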
In the next example we predict the next 12 months of a monthly time series using the MIMO strategy:
pred <- knn_forecasting(USAccDeaths, h = 12, lags = 1:12, k = 2, msas = "MIMO")
autoplot(pred, highlight = "neighbors", faceting = FALSE)
The prediction is the average of the target vectors of the two nearest neighbors. As can be observed, we have chosen to see all the nearest neighbors in the same plot. Because we are working with a monthly time series, we have thought that lags 1-12 are a suitable choice for selecting the features of the examples. In this case, the last 12 values of the time series are the new instance whose target has to be predicted. The two sequences of 12 consecutive values most similar to this instance are found (in blue) and their subsequent 12 values (in green) are averaged to obtain the prediction (in red).
The recursive or iterative strategy is the approach used by ARIMA or exponential smoothing to forecast several periods ahead. Basically, a model that only forecasts one-step ahead is used, so that the model is applied iteratively to forecast all the future values. When historical observations to be used as features of the new instance are unavailable, previous predictions are used instead.
Because the recursive strategy uses a one-step ahead model, this means that, in the case of KNN, the target of an example only contains one value. For instance, let us see how the recursive strategy works with the following example in which the next two future quarters of a quarterly time series are predicted:
timeS <- window(UKgas, start = c(1976, 1))
pred <- knn_forecasting(timeS, h = 2, lags = 1:4, k = 2, msas = "recursive")
library(ggplot2)
autoplot(pred, highlight = "neighbors")
In this example we have used lags 1-4 to specify the features of an example. To predict the first future point the last 4 values of the time series are used as “its features”. To predict the second future point “its features” are the last three values of the time series and the prediction for the first future point. In the plot the prediction for the first future point can be seen. If you reproduce this code snippet you will also see the forecast for the second future point.
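The recursive strategy can also be sketched in base R. In the illustration below the examples are built once from the historical data, and each prediction is fed back only into the features of the next instance. The helper `knn_recursive` and the toy series are hypothetical; Euclidean distance and the mean as aggregation function are assumed.

```r
# Hypothetical helper (not part of tsfknn): recursive forecast with KNN.
knn_recursive <- function(x, n_lags, h, k) {
  x <- as.numeric(x)
  m <- embed(x, n_lags + 1)[, (n_lags + 1):1, drop = FALSE]
  features <- m[, 1:n_lags, drop = FALSE]
  targets  <- m[, n_lags + 1]          # one-step model: a single target value
  instance <- tail(x, n_lags)
  forecast <- numeric(h)
  for (i in seq_len(h)) {
    d <- sqrt(rowSums(sweep(features, 2, instance)^2))
    forecast[i] <- mean(targets[order(d)[seq_len(k)]])
    # Drop the oldest feature and append the prediction just made
    instance <- c(instance[-1], forecast[i])
  }
  forecast
}

toy <- rep(c(10, 20, 30, 40), times = 4)
forecast_rec <- knn_recursive(toy, n_lags = 4, h = 2, k = 2)
print(forecast_rec)
```

For the second forecast the instance contains the last three observed values plus the first prediction, as described above.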
In this section several additional features of our package are described.
By default, the targets of the different nearest neighbors are averaged. However, it is possible to combine the targets using other aggregation functions. Currently, our package allows us to choose among the mean, the median and a weighted mean using the cf parameter of the knn_forecasting function. In the weighted mean the targets are weighted by the inverse of their distance. That is, closer neighbors of a query point will have a greater influence than neighbors which are further away.
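The three aggregation functions can be compared with a small base R sketch. The helper `combine_targets` is hypothetical and only mirrors the description above. It assumes strictly positive distances, so an exact match with distance 0 is not handled.

```r
# Hypothetical helper (not part of tsfknn): aggregate the neighbors' targets.
# Assumes all distances are strictly positive.
combine_targets <- function(targets, distances,
                            how = c("mean", "median", "weighted")) {
  how <- match.arg(how)
  switch(how,
    mean     = mean(targets),
    median   = median(targets),
    weighted = sum(targets / distances) / sum(1 / distances)
  )
}

# Neighbors of the instance (7, 8) in the series 1, ..., 8 with lags 1-2
targets   <- c(8, 7)
distances <- c(sqrt(2), sqrt(8))

agg_mean     <- combine_targets(targets, distances, "mean")
agg_weighted <- combine_targets(targets, distances, "weighted")
print(c(mean = agg_mean, weighted = agg_weighted))
```

Since the first neighbor is twice as close as the second, it receives twice the weight and the weighted forecast moves from 7.5 toward 8.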
Regarding the distance function applied to compute the nearest neighbors, our package uses the Euclidean distance, although we can implement other distance metrics in the future.
In order to specify a KNN model the user has to select, among other things, the value of the k parameter. Several strategies can be used to choose this value. A first, fast and straightforward solution is to use a heuristic (a common recommendation is to set k to the square root of the number of training examples). Another approach is to select k using an optimization tool on a validation set, so that k minimizes a forecast accuracy measure. This optimization strategy is very time consuming.
A third strategy is to use several KNN models with different k values. Each KNN model generates its forecasts and the forecasts of the different models are averaged to produce the final forecast. This strategy is based on the success of model combination in time series forecasting. This way, the use of a time consuming optimization tool is avoided and the forecasts are not based on a unique, heuristic k value. In our package you can use this strategy by specifying a vector of k values:
pred <- knn_forecasting(ldeaths, h = 12, lags = 1:12, k = c(2, 4))
pred$prediction
## Jan Feb Mar Apr May Jun Jul Aug
## 1980 2736.719 2901.029 2610.875 2098.239 1765.176 1515.711 1402.958 1305.580
## Sep Oct Nov Dec
## 1980 1211.597 1428.876 1575.126 2256.334
plot(pred)
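The combination of several k values amounts to averaging the forecasts of the individual models. The base R sketch below illustrates this on the simple series 1, ..., 8 with lags 1-2 and no transformation. The helper `knn_forecast_k` is hypothetical and is not the package's implementation.

```r
# Hypothetical helper (not part of tsfknn): one-step KNN forecast for a given k
knn_forecast_k <- function(x, n_lags, k) {
  x <- as.numeric(x)
  m <- embed(x, n_lags + 1)[, (n_lags + 1):1, drop = FALSE]
  features <- m[, 1:n_lags, drop = FALSE]
  targets  <- m[, n_lags + 1]
  d <- sqrt(rowSums(sweep(features, 2, tail(x, n_lags))^2))
  mean(targets[order(d)[seq_len(k)]])
}

k_values <- c(2, 4)
per_k <- sapply(k_values, function(k) knn_forecast_k(1:8, n_lags = 2, k = k))
combined <- mean(per_k)

print(per_k)     # forecasts of the individual models
print(combined)  # final forecast: the average over the k values
```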
KNN is not suitable for forecasting a time series with a trend. The reason is simple: KNN predicts an average of historical values of the time series, so it cannot correctly predict values outside the range of the time series. If your time series has a trend we recommend using the parameter transform to transform the training samples. Use the value "additive" if the trend is additive or "multiplicative" for exponential time series:
set.seed(5)
timeS <- ts(1:10 + rnorm(10, 0, .2))
pred <- knn_forecasting(timeS, h = 3, transform = "none")
plot(pred)
pred2 <- knn_forecasting(timeS, h = 3, transform = "additive")
plot(pred2)
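After a lot of experimentation we have observed that, in general, the additive transformation works better than the multiplicative transformation. The additive transformation works this way: each example is transformed by subtracting the mean of its features from its features and its target; the new instance is transformed by subtracting the mean of its features; and the forecast is back-transformed by adding the mean of the features of the new instance.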
It is easy to see an example of additive transformation using the API of the package. For example, let us see the examples of a model with no transformation:
timeS <- ts(c(1, 3, 7, 9, 10, 12))
model_n <- knn_forecasting(timeS, h = 1, lags = 1:2, k = 2, transform = "none")
knn_examples(model_n)
## Lag2 Lag1 H1
## [1,] 1 3 7
## [2,] 3 7 9
## [3,] 7 9 10
## [4,] 9 10 12
plot(model_n)
And now let us see the effect of the additive transformation:
model_a <- knn_forecasting(timeS, h = 1, lags = 1:2, k = 2, transform = "additive")
knn_examples(model_a)
## Lag2 Lag1 H1
## [1,] -1.0 1.0 5.0
## [2,] -2.0 2.0 4.0
## [3,] -1.0 1.0 2.0
## [4,] -0.5 0.5 2.5
plot(model_a)
The forecast of the additive model is 14.5:
model_a$prediction
## Time Series:
## Start = 7
## End = 7
## Frequency = 1
## [1] 14.5
Let us see how this forecast is built. The last two values of the series, c(10, 12), are the instance or query point. This instance is transformed to c(-1, 1) by subtracting its mean value. Its two nearest neighbors are the first and third examples. Their targets are 5 and 2 respectively. These targets are averaged, obtaining 3.5. Finally, we add 3.5 to the mean of the query point, 11, getting the final forecast 14.5.
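The same computation can be reproduced step by step in base R. The sketch below follows the description in this section (Euclidean distance, k = 2, mean aggregation); the variable names are illustrative and the code is not the package's implementation.

```r
x <- c(1, 3, 7, 9, 10, 12)
m <- embed(x, 3)[, 3:1]            # columns: Lag2, Lag1, target
features <- m[, 1:2]
targets  <- m[, 3]

# Additive transformation: subtract from each example the mean of its features
ex_means   <- rowMeans(features)
features_t <- features - ex_means  # row i is shifted by ex_means[i]
targets_t  <- targets - ex_means

# Transform the instance in the same way
instance      <- tail(x, 2)        # 10, 12
instance_mean <- mean(instance)    # 11
instance_t    <- instance - instance_mean

d  <- sqrt(rowSums(sweep(features_t, 2, instance_t)^2))
nn <- order(d)[1:2]                # first and third examples (distance 0)

# Back-transform: add the mean of the instance to the aggregated targets
forecast_additive <- mean(targets_t[nn]) + instance_mean
print(forecast_additive)
```

The transformed features and targets coincide with the output of knn_examples(model_a) shown above.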
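The multiplicative transformation is similar to the additive transformation, except that an example and the new instance are divided by the mean of their features, and the forecast is back-transformed by multiplying it by the mean of the features of the new instance.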
Sometimes a great number of time series have to be forecast. In that situation, an automatic way of generating the forecasts is very useful. Our package is able to automatically choose all the KNN parameters. If the user only specifies the time series and the forecasting horizon the KNN parameters are selected as follows:
If frequency(ts) == f, where ts is the time series to be forecast and \(f > 1\), then the lags used as autoregressive features are 1:f. For example, the lags for quarterly data are 1:4 and for monthly data 1:12.
If frequency(ts) == 1, then:
The function rolling_origin
uses the rolling origin
technique to assess the forecast accuracy of a KNN model. In order to
use this function a KNN model has to be built previously. Let us see how
rolling_origin
works with the following artificial time
series:
pred <- knn_forecasting(ts(1:20), h = 4, lags = 1:2, k = 2)
ro <- rolling_origin(pred, h = 4)
The function rolling_origin
uses the model generated by
a knn_forecasting
call to apply rolling origin evaluation.
The object returned by rolling_origin
contains the results
of the evaluation. For example, the test sets can be seen this way:
print(ro$test_sets)
## h=1 h=2 h=3 h=4
## [1,] 17 18 19 20
## [2,] 18 19 20 NA
## [3,] 19 20 NA NA
## [4,] 20 NA NA NA
Every row of the matrix contains a different test set. The first row is a test set with the last h values of the time series, the second row a test set with the last h - 1 values of the time series and so on. Each test set has an associated training set with all the data in the time series preceding the test set. For every training set a KNN model with the parameters associated with the original model is built and the test set is predicted. You can see the predictions as follows:
print(ro$predictions)
## h=1 h=2 h=3 h=4
## [1,] 17 18 19 20
## [2,] 18 19 20 NA
## [3,] 19 20 NA NA
## [4,] 20 NA NA NA
and also the errors in the predictions:
print(ro$errors)
## h=1 h=2 h=3 h=4
## [1,] 0 0 0 0
## [2,] 0 0 0 NA
## [3,] 0 0 NA NA
## [4,] 0 NA NA NA
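The mechanics of the rolling origin evaluation can be sketched in base R. The function `rolling_origin_sketch` below is hypothetical and is not the package's implementation. Its `forecast_fun` argument is a stand-in for fitting a KNN model on a training set and forecasting from it; here a simple extrapolation of the last difference is used.

```r
# Hypothetical sketch (not the tsfknn implementation) of rolling origin evaluation.
# forecast_fun(train, h) must return h forecasts; it stands in for a KNN model.
rolling_origin_sketch <- function(x, h, forecast_fun) {
  x <- as.numeric(x)
  n <- length(x)
  test_sets <- predictions <- matrix(NA_real_, nrow = h, ncol = h,
                                     dimnames = list(NULL, paste0("h=", 1:h)))
  for (i in seq_len(h)) {
    origin <- n - h + i - 1        # last observation of the training set
    len    <- n - origin           # size of this test set
    test_sets[i, 1:len]   <- x[(origin + 1):n]
    predictions[i, 1:len] <- forecast_fun(x[1:origin], len)
  }
  list(test_sets = test_sets, predictions = predictions,
       errors = test_sets - predictions)
}

# Stand-in forecaster: extrapolate the last observed difference
drift <- function(train, h) {
  last <- train[length(train)]
  last + (last - train[length(train) - 1]) * seq_len(h)
}

ro_sketch <- rolling_origin_sketch(1:20, h = 4, forecast_fun = drift)
print(ro_sketch$test_sets)
print(ro_sketch$errors)
```

Each row moves the forecast origin one observation forward, so the training set grows by one value while the test set shrinks by one.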
Several forecasting accuracy measures applied to all the errors in the different test sets can be consulted:
ro$global_accu
## RMSE MAE MAPE
## 0 0 0
It is also possible to consult the forecasting accuracy measures for every forecasting horizon:
ro$h_accu
## h=1 h=2 h=3 h=4
## RMSE 0 0 0 0
## MAE 0 0 0 0
## MAPE 0 0 0 0
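Because the errors in the artificial example are all zero, the measures are not very informative there. The following base R sketch shows how such measures can be derived from a matrix of errors. It uses made-up values for illustration only, and it assumes that an error is the actual value minus the prediction and that MAPE is expressed as a percentage. The helper `accuracy` is hypothetical.

```r
# Made-up values for illustration only (rows: test sets, columns: horizons)
actual    <- rbind(c(100, 200), c(200, NA))
predicted <- rbind(c(110, 180), c(190, NA))
errors    <- actual - predicted

# Hypothetical helper: accuracy measures, ignoring the NA cells
accuracy <- function(e, a) {
  c(RMSE = sqrt(mean(e^2, na.rm = TRUE)),
    MAE  = mean(abs(e), na.rm = TRUE),
    MAPE = mean(abs(100 * e / a), na.rm = TRUE))
}

# Measures over all the errors and per forecasting horizon (one column each)
global_accu <- accuracy(errors, actual)
h_accu <- sapply(seq_len(ncol(errors)),
                 function(j) accuracy(errors[, j], actual[, j]))
colnames(h_accu) <- paste0("h=", seq_len(ncol(errors)))

print(global_accu)
print(h_accu)
```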
Finally, a plot with the predictions for a given forecast horizon can be generated:
plot(ro, h = 4)
The rolling origin technique is very time-consuming. If you want a faster assessment of the model, you can disable the rolling and evaluate a single test set:
ro <- rolling_origin(pred, h = 4, rolling = FALSE)
print(ro$test_sets)
## h=1 h=2 h=3 h=4
## [1,] 17 18 19 20
print(ro$predictions)
## h=1 h=2 h=3 h=4
## [1,] 17 18 19 20
In R, just a few packages apply regression methods based on computational intelligence to time series forecasting. In this paper we have presented the tsfknn package, which allows forecasting a time series using KNN regression. The interface of the package is quite simple: with only one function the user can specify a KNN model and predict a time series. Furthermore, several graphs can be generated illustrating how the prediction has been computed, and the forecasting accuracy of the model can be assessed using hold-out data.
If you want to learn more about this package or univariate time series forecasting using KNN we suggest: