require(neatR)
#> Loading required package: neatR
We use R extensively not just for intensive computation, but also for presentation. Javascript visualization libraries in R and elegant ways to present data using R markdown makes R one stop shop for analytics and data science.
We spend most of the time preparing, cleaning, analyzing and modeling the data. However, the last leg of analytics, which is presentation of results don’t get enough attention most of the times.
neatR package helps in formatting the results by providing simple utility functions covering common use cases.
Often, we encounter dates which are either in mm/dd/yyyy
or dd/mm/yyyy
format and wondering what is the month or
what is the date especially if there are no date values after 12th day
of a month. An unambiguous approach would be to show the date in
mmm dd, yyyy
format with day of week which is easier to
grasp.
ndate(Sys.Date() - 3)
#> [1] "Mar 19, 2023 (Sun)"
ndate(Sys.Date() - 1)
#> [1] "Mar 21, 2023 (Tue)"
ndate(Sys.Date())
#> [1] "Mar 22, 2023 (Wed)"
ndate(Sys.Date() + 1)
#> [1] "Mar 23, 2023 (Thu)"
ndate(Sys.Date() + 4)
#> [1] "Mar 26, 2023 (Sun)"
To just get the date without the day of week, set
display.weekday
to FALSE
ndate(Sys.Date(), display.weekday = FALSE)
#> [1] "Mar 22, 2023"
When we are looking at the monthly data, abbreviating the date to mmm’yy is an elegant way to show the date and often helpful for charts.
ndate(Sys.Date(), display.weekday = FALSE, is.month = TRUE)
#> [1] "Mar'23"
To see the context of the date with respect to current date
(referring dates within 1 week before or after current date), use the
nday
function.
Day of week with context based on current date,
reference.alias
can be directly used on dates or
timestamps.
nday(Sys.Date(), reference.alias = FALSE)
#> [1] "Wed"
nday(Sys.Date(), reference.alias = TRUE)
#> [1] "Today, Wed"
nday(Sys.time(), reference.alias = TRUE)
#> [1] "Tomorrow, Wed"
Below is another example with context based on current date.
<- seq(Sys.Date() - 10, Sys.Date() + 10, by = '1 day')
x nday(x, reference.alias = TRUE)
#> [1] "Sun" "Mon" "Last Tue" "Last Wed"
#> [5] "Last Thu" "Last Fri" "Last Sat" "Last Sun"
#> [9] "Last Mon" "Yesterday, Tue" "Today, Wed" "Tomorrow, Thu"
#> [13] "Coming Fri" "Coming Sat" "Coming Sun" "Coming Mon"
#> [17] "Coming Tue" "Coming Wed" "Coming Thu" "Fri"
#> [21] "Sat"
Timestamps are feature rich representation of date and time.
ntimestamp(Sys.time())
#> [1] "Mar 22, 2023 09H 12M 09S PM PDT (Wed)"
To format only date from the timestamp, we can use ndate
function.
ndate(Sys.time())
#> [1] "Mar 22, 2023 (Wed)"
To extract and format only the time from timestamp, we can do the following,
ntimestamp(Sys.time(), display.weekday = FALSE,
include.date = FALSE, include.timezone = FALSE)
#> [1] "09H 12M 09S PM"
Note: Hours are shown based on 12H clock format with AM / PM suffix.
Components of time can be toggled on or off based on preference.
ntimestamp(Sys.time(), include.date = FALSE, display.weekday = FALSE,
include.hours = TRUE, include.minutes = TRUE,
include.seconds = FALSE, include.timezone = FALSE)
#> [1] "09H 12M PM"
Timezone can be toggled on or off using include.timezone
parameter.
ntimestamp(Sys.time(), include.timezone = FALSE)
#> [1] "Mar 22, 2023 09H 12M 09S PM (Wed)"
Most of the times, we deal with large numbers which are shown in
scientific format from the output of a statistical model or just the raw
data itself. nnumber
can format the numeric data and show
them in easily readable way.
By default, the numbers are formatted in a more appropriate unit that best represents individual values. See the below example,
<- c(10, 100, 1000, 10000, 100000, 1000000, 10000000, 100000000, 1000000000)
x nnumber(x)
#> [1] "10.0" "100.0" "1.0 K" "10.0 K" "100.0 K" "1.0 Mn" "10.0 Mn"
#> [8] "100.0 Mn" "1.0 Bn"
nnumber(x, digits = 0)
#> [1] "10" "100" "1 K" "10 K" "100 K" "1 Mn" "10 Mn" "100 Mn"
#> [9] "1 Bn"
nnumber
can automatically determine best single unit to
display all the numbers by setting unit = 'auto'
. In the
below example the unit of thousand seem to best fit most of the numbers.
Any number lower than 0.1K are displayed as ‘<0.1K’ for easier
reference.
<- c(1e6, 99e3, 76e3, 42e3, 12e3, 789, 53)
x nnumber(x, unit = 'auto')
#> [1] "1,000 K" "99 K" "76 K" "42 K" "12 K" "0.8 K" "0.1 K"
We can specify the units in which the number to be formatted,
nnumber(123456789.123456, unit = 'Mn')
#> [1] "123.5 Mn"
Default units are, ‘K’ for thousand, ‘Mn’ for million, ‘Bn’ for
billion, ‘Tn’ for trillions. The unit labeling can be customized using
unit.labels
which is a list encompassing values and
labels.
nnumber(123456789.123456, unit = 'M', unit.labels = list(million = 'M'))
#> [1] "123.5 M"
Below example, gives customization of all units.
<- c(10, 100, 1000, 10000, 100000, 1000000, 10000000, 100000000, 1000000000)
x nnumber(x, unit.labels =
list(thousand = 'K', million = 'M', billion= 'B', trillion = T))
#> [1] "10.0" "100.0" "1.0 K" "10.0 K" "100.0 K" "1.0 M" "10.0 M"
#> [8] "100.0 M" "1.0 B"
Along with the formatted number, we can (optionally) add a prefix or suffix.
nnumber(123456789.123456, unit = 'M', unit.labels = list(million = 'M'),
prefix = '$ ')
#> [1] "$ 123.5 M"
nnumber(123456789.123456, unit = 'M', unit.labels = list(million = 'M'),
suffix = ' CAD')
#> [1] "123.5 M CAD"
nnumber(123456789.123456, unit = 'M', unit.labels = list(million = 'M'),
prefix = '$ ', suffix = ' CAD')
#> [1] "$ 123.5 M CAD"
Sometimes, we are interested in showing the number as it is, which
can be done by setting unit = ''
.
thousand.separator
parameter is useful in separating the
thousands which makes it easy to read the numbers.
nnumber(123456789.123456, digits = 2, unit = '',
thousand.separator = ',')
#> [1] "123,456,789.1"
thousand.separator
can take the following values
",", ".", "'", " ", "_", ""
The parameter unit
can take any of the following
values,
custom
: Unit is customized for each individual values.
This is the default value to the unit
parameter.
auto
: A single unit that best represents the overall
data is automatically detected and applied based on majority of the
values.
K
: The numbers are displayed in thousands.
Mn
: The number are displayed in millions.
Bn
: The number are displayed in billions.
Tn
: The number are displayed in trillions.
If the unit labels are customized and provided via a list, for an
example: unit.labels = list(thousand = 'k')
then this
string k
to be provided for the unit
.
Percentage data can come in two types, with or without multiplied by 100. For an example, 22.8% can be stored as 22.8 or 0.228
npercent(22.8, is.decimal = FALSE)
#> [1] "+22.8%"
npercent(0.228, is.decimal = TRUE)
#> [1] "+22.8%"
By default, is.decimal
is set as TRUE and decimal digits
is set to 1.
It is also useful to show if the percent is a positive number by adding a prefix of plus sign. This is the default behavior of the npercent function, which can be set to FALSE
npercent(0.228, plus.sign = TRUE)
#> [1] "+22.8%"
npercent(0.228, plus.sign = FALSE)
#> [1] "22.8%"
When the percentages are high (especially while calculating growth from time A to time B), it would be easy to read this as ‘nX’.
<- 20
tesla_2017 <- 200
tesla_2023 <- (tesla_2023 - tesla_2017)/tesla_2017
g npercent(g, plus.sign = TRUE)
#> [1] "+900.0%"
npercent(g, plus.sign = TRUE, factor.out = TRUE)
#> [1] "+900.0% (9x growth)"
Formatting character vectors or string can be done with case type, options to remove special characters and selecting only english characters and numbers from the string.
Below are the available case
conversions,
lower
: converts string to lower case.
upper
: converts string to upper case.
title
: converts string to title case (first letter of
each word is capitalized except stop words. Based on
tools::toTitleCase
).
start
: converts string to start case (first letter of
each word is capitalized and rest of the letters are in lower case).
initcap
: converts string to initcap case (first letter
of first word is capitalized and rest of the letters are in lower
case).
nstring(' All MOdels are wrong. some ARE useful!!! â',
case = 'title', remove.specials = TRUE)
#> [1] "all Models are Wrong some are Useful â"
To exclude any special characters and retain only numbers and english
alphabets, we can set en.only
parameter to
TRUE
nstring(' All MOdels are wrong. some ARE useful!!! â ',
case = 'title', remove.specials = TRUE, en.only = TRUE)
#> [1] "all Models are Wrong some are Useful"
By default, Trailing and leading white spaces are removed and extra white spaces are reduced to single white space.