When graphing financial data, I prefer a continuous axis scale where weekends and holidays don’t exist. Did the price remain unchanged for three days, or am I looking at labor-day weekend?
Load past NYSE trading dates, taken from Yahoo’s S&P prices:
data(nyse)
Then create some fake prices, put them into a data.frame
alongside the dates:
set.seed(12345)
df <- data.frame(date=nyse, price=cumsum(rnorm(length(nyse))) + 100)
df <- subset(df, as.Date('2014-08-01') <= date & date <= as.Date('2014-10-08'))
date | price | |
---|---|---|
16293 | 2014-10-02 | 87.78927 |
16294 | 2014-10-03 | 88.20406 |
16295 | 2014-10-06 | 88.13665 |
16296 | 2014-10-07 | 88.46860 |
16297 | 2014-10-08 | 88.81567 |
Create a plot:
plot <- ggplot(df, aes(x=date, y=price)) + geom_step() +
theme(axis.title.x=element_blank(), axis.title.y=element_blank())
Note the large gap at the beginning of September, because Labor Day was on the 1st:
plot + ggtitle('calendar dates')
Plotting against scale_x_bd
instead removes weekends and holidays from the graph:
plot + scale_x_bd(business.dates=nyse, labels=date_format("%b '%y")) +
ggtitle('business dates, month breaks')
In the previous chart, the major breaks are pretty far apart.
The package determined that breaks on the first trading day of each month gives me the largest number of breaks weakly less than maximum. I didn’t specify a max, so it defaulted to 5
.
If I tell it to use more, it can put breaks on the first day of each week:
plot + scale_x_bd(business.dates=nyse, max.major.breaks=10, labels=date_format('%d %b')) +
ggtitle('business dates, week breaks')
Say I wanted to put vertical lines on option expiration dates.
Calling as.numeric(...)
on my dates translates them into the the number of calendar days after unix epoch, which is what scale_x_date(...)
uses (see scales:::from_date
):
options <- as.Date(c('2014-08-15', '2014-09-19'))
plot +
geom_vline(xintercept=as.numeric(options), size=2, alpha=0.25) +
ggtitle('calendar dates, option expiry')
This doesn’t work for business-day space because the x-axis now represents the number of business days after the first date in your business.dates
vector:
scale_x_date |
scale_x_bd |
|
---|---|---|
origin | unix epoch: 1970-01-01 |
first date in your business.dates vector |
axis values | number of calendar days after origin | number of business days after origin |
conversion | as.numeric(...) |
bd2t(..., business.dates) |
Instead, use the bdscale::bd2t(...)
function to translate into business-day space:
plot +
geom_vline(xintercept=bd2t(options, business.dates=nyse), size=2, alpha=0.25) +
scale_x_bd(business.dates=nyse) +
ggtitle('business dates, option expiry')