This tutorial focuses on, plot_time_series()
, a
workhorse time-series plotting function that:
plotly
plots (great for exploring
& shiny apps)ggplot2
&
plotly
codeplotly
to static
ggplot2
plotsRun the following code to setup for this tutorial.
Let’s start with a popular time series, taylor_30_min
,
which includes energy demand in megawatts at a sampling interval of
30-minutes. This is a single time series.
taylor_30_min
#> # A tibble: 4,032 × 2
#> date value
#> <dttm> <dbl>
#> 1 2000-06-05 00:00:00 22262
#> 2 2000-06-05 00:30:00 21756
#> 3 2000-06-05 01:00:00 22247
#> 4 2000-06-05 01:30:00 22759
#> 5 2000-06-05 02:00:00 22549
#> 6 2000-06-05 02:30:00 22313
#> 7 2000-06-05 03:00:00 22128
#> 8 2000-06-05 03:30:00 21860
#> 9 2000-06-05 04:00:00 21751
#> 10 2000-06-05 04:30:00 21336
#> # ℹ 4,022 more rows
The plot_time_series()
function generates an interactive
plotly
chart by default.
.date_var
) and the numeric variable (.value
)
that changes over time as the first 2 arguments.interactive = TRUE
, the
.plotly_slider = TRUE
adds a date slider to the bottom of
the chart.Next, let’s move on to a dataset with time series groups,
m4_daily
, which is a sample of 4 time series from the M4
competition that are sampled at a daily frequency.
m4_daily %>% group_by(id)
#> # A tibble: 9,743 × 3
#> # Groups: id [4]
#> id date value
#> <fct> <date> <dbl>
#> 1 D10 2014-07-03 2076.
#> 2 D10 2014-07-04 2073.
#> 3 D10 2014-07-05 2049.
#> 4 D10 2014-07-06 2049.
#> 5 D10 2014-07-07 2006.
#> 6 D10 2014-07-08 2018.
#> 7 D10 2014-07-09 2019.
#> 8 D10 2014-07-10 2007.
#> 9 D10 2014-07-11 2010
#> 10 D10 2014-07-12 2002.
#> # ℹ 9,733 more rows
Visualizing grouped data is as simple as grouping the data set with
group_by()
prior to piping into the
plot_time_series()
function. Key points:
group_by()
or by
using the ...
to add groups..facet_ncol = 2
returns a 2-column faceted plot.facet_scales = "free"
allows the x and y-axis of each
plot to scale independently of the other plotsLet’s switch to an hourly dataset with multiple groups. We can showcase:
.value
.color_var
to highlight sub-groups.m4_hourly %>% group_by(id)
#> # A tibble: 3,060 × 3
#> # Groups: id [4]
#> id date value
#> <fct> <dttm> <dbl>
#> 1 H10 2015-07-01 12:00:00 513
#> 2 H10 2015-07-01 13:00:00 512
#> 3 H10 2015-07-01 14:00:00 506
#> 4 H10 2015-07-01 15:00:00 500
#> 5 H10 2015-07-01 16:00:00 490
#> 6 H10 2015-07-01 17:00:00 484
#> 7 H10 2015-07-01 18:00:00 467
#> 8 H10 2015-07-01 19:00:00 446
#> 9 H10 2015-07-01 20:00:00 434
#> 10 H10 2015-07-01 21:00:00 422
#> # ℹ 3,050 more rows
The intent is to showcase the groups in faceted plots, but to
highlight weekly windows (sub-groups) within the data while
simultaneously doing a log()
transformation to the value.
This is simple to do:
.value = log(value)
Applies the Log Transformation.color_var = week(date)
The date column is transformed
to a lubridate::week()
number. The color is applied to each
of the week numbers.All of the visualizations can be converted from interactive
plotly
(great for exploring and shiny apps) to static
ggplot2
visualizations (great for reports).
taylor_30_min %>%
plot_time_series(date, value,
.color_var = month(date, label = TRUE),
# Returns static ggplot
.interactive = FALSE,
# Customization
.title = "Taylor's MegaWatt Data",
.x_lab = "Date (30-min intervals)",
.y_lab = "Energy Demand (MW)",
.color_lab = "Month") +
scale_y_continuous(labels = scales::label_comma())
The plot_time_series_boxplot()
function can be used to
make box plots.
.period
argument.A time series regression plot,
plot_time_series_regression()
, can be useful to quickly
assess key features that are correlated to a time series.
formula
to the
stats::lm()
function.show_summary = TRUE
.Timetk is part of the amazing Modeltime Ecosystem for time series forecasting. But it can take a long time to learn:
Your probably thinking how am I ever going to learn time series forecasting. Here’s the solution that will save you years of struggling.
Become the forecasting expert for your organization
High-Performance Time Series Course
Time series is changing. Businesses now need 10,000+ time series forecasts every day. This is what I call a High-Performance Time Series Forecasting System (HPTSF) - Accurate, Robust, and Scalable Forecasting.
High-Performance Forecasting Systems will save companies by improving accuracy and scalability. Imagine what will happen to your career if you can provide your organization a “High-Performance Time Series Forecasting System” (HPTSF System).
I teach how to build a HPTFS System in my High-Performance Time Series Forecasting Course. You will learn:
Modeltime
- 30+ Models (Prophet, ARIMA, XGBoost, Random
Forest, & many more)GluonTS
(Competition Winners)Become the Time Series Expert for your organization.