The anomalize R package is now available in timetk:

anomalize(): One function that breaks down, identifies, and cleans anomalies.
plot_anomalies(): Visualize the anomalies and anomaly bands.
plot_anomalies_decomp(): Visualize the time series decomposition. Make adjustments as needed.
plot_anomalies_cleaned(): Visualize the before/after of cleaning anomalies.

Note - anomalize(.method): Only .method = "stl" is supported at this time. The "twitter" method is also planned.
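As a rough sketch of how these functions fit together, the example below runs anomalize() on a synthetic daily series and plots the result. The data, the column names date and value, and the two injected spikes are invented for illustration; only .method = "stl" is used, as noted above.

```r
library(dplyr)
library(timetk)

# Synthetic daily series with weekly seasonality (illustrative values only)
set.seed(123)
demo_tbl <- tibble(
  date  = seq(as.Date("2020-01-01"), by = "day", length.out = 365),
  value = 100 + 10 * sin(2 * pi * (1:365) / 7) + rnorm(365, sd = 2)
)

# Inject two obvious outliers so there is something to detect
demo_tbl$value[c(100, 250)] <- c(200, 5)

# Decompose the series, flag anomalies, and add a cleaned series
anomalize_tbl <- demo_tbl %>%
  anomalize(.date_var = date, .value = value, .method = "stl")

# Static plot of the observed values with anomalies highlighted
anomaly_plot <- anomalize_tbl %>%
  plot_anomalies(date, .interactive = FALSE)
```

The returned table can be passed to plot_anomalies_decomp() or plot_anomalies_cleaned() in the same way.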
Update forecasting vignette: Use glmnet for time series forecasting.

CRAN Fixes:
- tzdata time zone fixes:
  - GB -> Europe/London
  - NZ -> Pacific/Auckland
  - US/Eastern -> America/New_York
  - US/Pacific -> America/Los_Angeles
- Add @aliases to timetk-package
robets
tidyquant from examples
tidyverse from examples
FANG dataset to timetk (port from tidyquant)

New Features
plot_time_series(): Gets new arguments to specify .x_intercept and .x_intercept_color. #131

Fixes plot_time_series() when .group_names is not found. #121
recipes >= 1.0.3 #132

facet_trelliscope() plotting parameters.
- plot_time_series()
- plot_time_series_boxplot()
- plot_anomaly_diagnostics()
New Features
Many of the plotting functions have been upgraded for use with trelliscopejs for easier visualization of many time series.

plot_time_series():
- trelliscope: Used for visualizing many time series.
- .facet_strip_remove to remove facet strips since trelliscope is automatically labeled.
- .facet_nrow to adjust grid with trelliscope.
- .facet_collapse = TRUE was changed to FALSE for better compatibility with Trelliscope JS. This may cause some plots to have multiple groups take up extra space in the strip.

plot_time_series_boxplot():
- trelliscope: Used for visualizing many time series.
- .facet_strip_remove to remove facet strips since trelliscope is automatically labeled.
- .facet_nrow to adjust grid with trelliscope.
- .facet_collapse = TRUE was changed to FALSE for better compatibility with Trelliscope JS. This may cause some plots to have multiple groups take up extra space in the strip.

plot_anomaly_diagnostics():
- trelliscope: Used for visualizing many time series.
- .facet_strip_remove to remove facet strips since trelliscope is automatically labeled.
- .facet_nrow to adjust grid with trelliscope.
- .facet_collapse = TRUE was changed to FALSE for better compatibility with Trelliscope JS. This may cause some plots to have multiple groups take up extra space in the strip.

Updates & Bug Fixes
Recipes steps (e.g. step_timeseries_signature()) use the new recipes::print_step() function. Requires recipes >= 0.2.0. #110
Offset parameter in step_log_interval() was not working properly. Now works. #103

Potential Breaking Changes
.facet_collapse = TRUE was changed to FALSE for better compatibility with Trelliscope JS. This may cause some plots to have multiple groups take up extra space in the strip.

New Features
tk_tsfeatures(): A new function that makes it easy to generate a time series feature matrix using tsfeatures. The main benefit is that you can pipe in time series data as tibbles with dplyr groups. The features will be produced by group. #95 #84
plot_time_series_boxplot(): A new function that makes plotting time series boxplots simple using a .period argument for time series aggregation.
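A minimal sketch of the .period argument, assuming the m4_daily dataset that ships with timetk and an arbitrary choice of series ("D10") and aggregation window ("1 month"):

```r
library(dplyr)
library(timetk)

# One box per month of daily observations for a single series
boxplot_monthly <- m4_daily %>%
  filter(id == "D10") %>%
  plot_time_series_boxplot(
    .date_var    = date,
    .value       = value,
    .period      = "1 month",  # aggregation window for each box
    .interactive = FALSE       # return a static ggplot instead of plotly
  )

boxplot_monthly
```

Changing .period (for example to "1 week") changes how many observations are pooled into each box.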
New Vignettes
Time Series Clustering: Uses the new tk_tsfeatures() function to perform time series clustering. #95 #84
Time Series Visualization: Updated to include plot_time_series_boxplot() and plot_time_series_regression().

Improvements
Improvements for point forecasting when the target is n-periods into the future.
time_series_cv(), time_series_split(): New parameter point_forecast. This is useful for testing / assessing the n-th prediction in the future. When set to TRUE, it returns a single point: the last value in assess.

Fixes
plot_time_series(): Smoother no longer fails when time series has 1 observation. #106

Improvements
summarize_by_time(): Added a .week_start argument to allow specifying .week_start = 1 for Monday start. Default is 7 for Sunday start. This can also be changed with lubridate by setting the lubridate.week.start option.
Plotting Functions: .facet_dir argument for adjusting the direction of facet_wrap(dir). #94
plot_acf_diagnostics(): Change default parameter to .show_white_noise_bars = TRUE. #85
plot_time_series_regression(): Can now show_summary for group-wise models when visualizing groups.
Time Series CV (time_series_cv()): Add Label for tune_results.
Improve speed of pad_by_time(). #93
Bug Fixes
tk_make_timeseries() and tk_make_future_timeseries() are now able to handle end of months. #72
tk_tbl.zoo(): Fix an issue when readr::type_convert() produces warning messages about not having character columns in inputs. #89
plot_time_series_regression(): Fixed an issue when lags are added to .formula. Pads lags with NA.
step_fourier() and fourier_vec(): Fixed issue with step_fourier failing with one observation. Added scale_factor argument to override date sequences with the stored scale factor. #77
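For the end-of-month handling in tk_make_timeseries() noted in this list (#72), a minimal call might look like the sketch below. The start date is an arbitrary illustration, and no specific dates are asserted here beyond the length of the sequence.

```r
library(timetk)

# Monthly sequence starting from a month-end date (arbitrary example date)
month_end_idx <- tk_make_timeseries("2011-01-31", by = "month", length_out = 6)

month_end_idx
```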
Improvements
tk_augment_slidify(), tk_augment_lags(), tk_augment_leads(), tk_augment_differences(): Now works with multiple columns (passed via .value) and tidyselect (e.g. contains()).

Fixes
#> New names:
#> * NA -> ...1
lazyeval. #24
select_() used with tk_xts_(). #52

New Functions
filter_period() (#64): Applies filtering expressions within time-based periods (windows).
slice_period() (#64): Applies slices within time-based periods (windows).
condense_period() (#64): Converts a periodicity from a higher (e.g. daily) to lower (e.g. monthly) frequency. Similar to xts::to.period() and tibbletime::as_period().
tk_augment_leads() and lead_vec() (#65): Added to make it easier and more obvious how to create leads.

Fixes
time_series_cv(): Fix bug with Panel Data. Train/Test splits were returning only the 1st observation in the final time stamp. They now return all observations.
future_frame() and tk_make_future_timeseries(): Now sort the incoming index to ensure dates returned go into the future.
tk_augment_lags() and tk_augment_slidify(): Now overwrite column names to match the behavior of tk_augment_fourier() and tk_augment_differences().

Improvements
time_series_cv(): Now works with time series groups. This is great for working with panel data.
future_frame(): Gets a new argument called .bind_data. When set to TRUE, it performs a data binding operation with the incoming data and the future frame.

Miscellaneous
step_slidify_augment() - A variant of step slidify that adds multiple rolling columns inside of a recipe.

Bug Fixes
%+time% and %-time% return missing values
tk_make_timeseries() and tk_make_future_timeseries() providing odd results for regular time series. GitHub Issue 60

New Functionality
tk_time_series_cv_plan() - Now works with k-fold cross validation objects from the vfold_cv() function.
pad_by_time() - Added new argument .fill_na_direction to specify a tidyr::fill() strategy for filling missing data.
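A small sketch of .fill_na_direction, using an invented three-row table with a two-day gap. With "down", the padded rows take the last observed value rather than NA.

```r
library(dplyr)
library(timetk)

# Daily data with 2021-01-03 and 2021-01-04 missing (illustrative values)
gappy_tbl <- tibble(
  date  = as.Date(c("2021-01-01", "2021-01-02", "2021-01-05")),
  value = c(10, 20, 50)
)

# Insert the missing days, then carry the last observed value forward
padded_tbl <- gappy_tbl %>%
  pad_by_time(date, .by = "day", .fill_na_direction = "down")

padded_tbl
```

Leaving .fill_na_direction at its default keeps the inserted rows as NA, which is the behavior before this argument was added.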
Bug Fixes
tk_augment_lags() - Fix bug with grouped functions not being exported
ts class

New Functions
step_log_interval_vec() - Extends the log_interval_vec() for recipes preprocessing.

Parallel Processing
tune and recipes

Bug Fixes
log_interval_vec() - Correct the messaging
complement.ts_cv_split - Helper to show time series cross validation splits in list explorer.

New Functions
mutate_by_time(): For applying mutates by time windows
log_interval_vec() & log_interval_inv_vec(): For constrained interval forecasting.

Improvements
plot_acf_diagnostics(): A new argument, .show_white_noise_bars, for adding white noise bars to an ACF / PACF Plot.
pad_by_time(): New arguments .start_date and .end_date for expanding/contracting the padding windows.

New Functions
plot_time_series_regression(): Convenience function to visualize & explore features using Linear Regression (stats::lm() formula).
time_series_split(): A convenient way to return a single split from time_series_cv(). Returns the split in the same format as rsample::initial_time_split().

Improvements
summarise_by_time(), filter_by_time(), tk_summary_diagnostics
tk_time_series_cv_plan(): Allow a single resample from rsample::initial_time_split or timetk::time_series_split
modeltime and tidymodels.

Plotting Improvements
plot_time_series(): .legend_show to toggle on/off legends.

Breaking Changes
... with .facet_vars or .ccf_vars. This change is needed to improve tab-completion. It affects:
- plot_time_series()
- plot_acf_diagnostics()
- plot_anomaly_diagnostics()
- plot_seasonal_diagnostics()
- plot_stl_diagnostics()
Bug Fixes
fourier_vec() and step_fourier_vec(): Add error if observations have zero difference. Issue #40.

New Interactive Plotting Functions
plot_anomaly_diagnostics(): Visualize Anomalies for One or More Time Series

New Data Wrangling Functions
future_frame(): Make a future tibble from an existing time-based tibble.

New Diagnostic / Data Processing Functions
tk_anomaly_diagnostics() - Group-wise anomaly detection and diagnostics. A wrapper for the anomalize R package functions without importing anomalize.

New Vectorized Functions:
ts_clean_vec() - Replace Outliers & Missing Values in a Time Series
standardize_vec() - Centers and scales a time series to mean 0, standard deviation 1
normalize_vec() - Normalizes a time series to Range: (0, 1)

New Recipes Preprocessing Steps:
step_ts_pad() - Preprocessing for padding time series data. Adds rows to fill in gaps and can be used with step_ts_impute() to interpolate going from low to high frequency!
step_ts_clean() - Preprocessing step for cleaning outliers and imputing missing values in a time series.

New Parsing Functions
parse_date2() and parse_datetime2(): These are similar to readr::parse_date() and lubridate::as_date() in that they parse character vectors to dates and datetimes. The key advantage is SPEED. parse_date2() uses the anytime package to process using the C++ Boost.Date_Time library.

Improvements:
plot_acf_diagnostics(): The .lags argument now handles time-based phrases (e.g. .lags = "1 month").
time_series_cv(): Implements time-based phrases (e.g. initial = "5 years" and assess = "1 year")
tk_make_future_timeseries(): The n_future argument has been deprecated for a new length_out argument that accepts both numeric input (e.g. length_out = 12) and time-based phrases (e.g. length_out = "12 months"). A major improvement is that numeric values define the number of timestamps returned even if weekends are removed or holidays are removed. Thus, you can always anticipate the length. (Issue #19).
diff_vec: Now reports the initial values used in the differencing calculation.

Bug Fixes:
plot_time_series():
- .value = .value.
tk_make_future_timeseries():
time_series_cv():
- skip = 1 default. skip = 0 does not make sense.
- skip adding 1 to stops.
plot_time_series_cv_plan() & tk_time_series_cv_plan():
tk_make_future_timeseries():
- period() returns NA. Fix implemented with ceiling_date().
pad_by_time():
- pad_value so only inserts pad values where new row was inserted.
step_ts_clean(), step_ts_impute():
- lambda = NULL
Breaking Changes:
These should not be of major impact since the 1.0.0 version was just released.
impute_ts_vec() to ts_impute_vec() for consistency with ts_clean_vec()
step_impute_ts() to step_ts_impute() for consistency with underlying function
roll_apply_vec() to slidify_vec() for consistency with slidify() & relationship to slider R package
step_roll_apply to step_slidify() for consistency with slidify() & relationship to slider R package
tk_augment_roll_apply to tk_augment_slidify() for consistency with slidify() & relationship to slider R package
plot_time_series_cv_plan() and tk_time_series_cv_plan(): Changed argument from .rset to .data.

New Interactive Plotting Functions:
plot_time_series() - A workhorse time-series plotting function that generates interactive plotly plots, consolidates 20+ lines of ggplot2 code, and scales well to many time series using dplyr groups.
plot_acf_diagnostics() - Visualize the ACF, PACF, and any number of CCFs in one plot for Multiple Time Series. Interactive plotly by default.
plot_seasonal_diagnostics() - Visualize Multiple Seasonality Features for One or More Time Series. Interactive plotly by default.
plot_stl_diagnostics() - Visualize STL Decomposition Features for One or More Time Series.
plot_time_series_cv_plan() - Visualize the Time Series Cross Validation plan made with time_series_cv().

New Time Series Data Wrangling:
summarise_by_time() - A time-based variant of dplyr::summarise() for flexible summarization using common time-based criteria.
filter_by_time() - A time-based variant of dplyr::filter() for flexible filtering by time-ranges.
pad_by_time() - Insert time series rows with regularly spaced timestamps.
slidify() - Make any function a rolling / sliding function.
between_time() - A time-based variant of dplyr::between() for flexible time-range detection.
add_time() - Add to a time series index. Shifts an index by a period.

New Recipe Functions:
Feature Generators:
step_holiday_signature() - New recipe step for adding 130 holiday features based on individual holidays, locales, and stock exchanges / business holidays.
step_fourier() - New recipe step for adding fourier transforms for adding seasonal features to time series data
step_roll_apply() - New recipe step for adding rolling summary functions. Similar to recipes::step_window() but is more flexible by enabling application of any summary function.
step_smooth() - New recipe step for adding Local Polynomial Regression (LOESS) for smoothing noisy time series
step_diff() - New recipe for adding multiple differenced columns. Similar to recipes::step_lag().
step_box_cox() - New recipe for transforming predictors. Similar to step_BoxCox() with improvements for forecasting including “guerrero” method for lambda selection and handling of negative data.
step_impute_ts() - New recipe for imputing a time series.

New Rsample Functions
time_series_cv() - Create rsample cross validation sets for time series. This function produces a sampling plan starting with the most recent time series observations, rolling backwards.

New Vector Functions:
These functions are useful on their own inside of mutate() and power many of the new plotting and recipes functions.
roll_apply_vec() - Vectorized rolling apply function - wraps slider::slide_vec()
smooth_vec() - Vectorized smoothing function - Applies Local Polynomial Regression (LOESS)
diff_vec() and diff_inv_vec() - Vectorized differencing function. Pads NA’s by default (unlike stats::diff).
lag_vec() - Vectorized lag functions. Returns both lags and leads (negative lags) by adjusting the .lag argument.
box_cox_vec(), box_cox_inv_vec(), & auto_lambda() - Vectorized Box Cox transformation. Leverages forecast::BoxCox.lambda() for automatic lambda selection.
fourier_vec() - Vectorized Fourier Series calculation.
impute_ts_vec() - Vectorized imputation of missing values for time series. Leverages forecast::na.interp().

New Augment Functions:
All of the functions are designed for scale. They respect dplyr::group_by().
tk_augment_holiday_signature() - Add holiday features to a data.frame using only a time-series index.
tk_augment_roll_apply() - Add multiple columns of rolling window calculations to a data.frame.
tk_augment_differences() - Add multiple columns of differences to a data.frame.
tk_augment_lags() - Add multiple columns of lags to a data.frame.
tk_augment_fourier() - Add multiple columns of fourier series to a data.frame.

New Make Functions:
Make date and date-time sequences between start and end dates.
tk_make_timeseries() - Super flexible function for creating daily and sub-daily time series.
tk_make_weekday_sequence() - Weekday sequence that accounts for both stripping weekends and holidays
tk_make_holiday_sequence() - Makes a sequence of dates corresponding to business holidays in calendars from timeDate (common non-working days)
tk_make_weekend_sequence() - Weekend sequence of dates for Saturday and Sunday (common non-working days)

New Get Functions:
tk_get_holiday_signature() - Get 100+ holiday features using only a time-series index.
tk_get_frequency() and tk_get_trend() - Automatic frequency and trend calculation from a time series index.

New Diagnostic / Data Processing Functions
tk_summary_diagnostics() - Group-wise time series summary.
tk_acf_diagnostics() - The data preparation function for plot_acf_diagnostics()
tk_seasonal_diagnostics() - The data preparation function for plot_seasonal_diagnostics()
tk_stl_diagnostics() - Group-wise STL Decomposition (Season, Trend, Remainder). Data prep for plot_stl_diagnostics().
tk_time_series_cv_plan - The data preparation function for plot_time_series_cv_plan()

New Datasets
Improvements:
tk_make_future_timeseries() - Now accepts n_future as a time-based phrase like “12 seconds” or “1 year”.

Bug Fixes:
lubridate::tz<- which now returns POSIXct when used on Date objects. Fixed in PR32 by @vspinu.

Potential Breaking Changes:
tk_augment_timeseries_signature() - Changed from data to .data to prevent name collisions when piping.

New Features:
recipes Integration - Ability to apply time series feature engineering in the tidymodels machine learning workflow.
step_timeseries_signature() - New step_timeseries_signature() for adding date and date-time features.

Bug Fixes:
xts::indexTZ is deprecated. Use tzone instead.
arrange_ with arrange.
tidyquant 1.0.0 upgrade (single stocks now return an extra symbol column).
tidyquant v0.5.7 - Removed dependency on tidyverse
timeSeries to Suggests to satisfy a CRAN issue.
timetk. Was formerly timekit.
robets