anytime

Convert Any Input to Parsed Date or Datetime

Motivation

R excels at computing with dates and times. Using a typed representation for your data is highly recommended, not only because of the functionality offered but also because of the added safety stemming from proper representation.

But there is a small nuisance cost in interactive work as well as in programming. Users must have told as.POSIXct() about a million times that the origin is (of course) the epoch. Do we really have to say it a million more times? Similarly, when parsing dates that are in some recognisable form of the YYYYMMDD format, do we really have to manually convert from integer or numeric or factor or ordered to character first? Given one of several common separators and/or date and time forms (YYYY-MM-DD, YYYY/MM/DD, YYYYMMDD, YYYY-mon-DD and so on, with or without times), do we really need a format string?

Or could a smart converter function do this? anytime() aims to be that general-purpose converter, returning a proper POSIXct (or Date) object no matter the input (provided it was somewhat parseable), relying on Boost Date_Time for the (efficient, performant) conversion.
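For comparison, the base R calls alluded to above look roughly like the following. This is a minimal sketch using only base R; the variable names and the sample values are chosen here for illustration and are not taken from the package.

```r
## Numeric seconds since the epoch: the origin is spelled out explicitly
secs <- 86400                      # illustrative value: one day after the epoch
p <- as.POSIXct(secs, origin = "1970-01-01", tz = "UTC")

## A YYYYMMDD value stored as a factor: convert to character first,
## then supply a format string by hand
f <- as.factor(20160101L)
d <- as.Date(as.character(f), format = "%Y%m%d")

print(format(p, usetz = TRUE))
print(format(d))
```

Both steps (the explicit origin, and the manual conversion plus format string) are the boilerplate that anytime() is meant to remove.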

Examples

From Integer or Numeric or Factor or Ordered

library(anytime)
options(digits.secs=6)                ## for fractional seconds below
Sys.setenv(TZ=anytime:::getTZ())      ## helper function to try to get TZ

anytime(20160101L + 0:2)
[1] "2016-01-01 CST" "2016-01-02 CST" "2016-01-03 CST"

anytime(20160101 + 0:2)
[1] "2016-01-01 CST" "2016-01-02 CST" "2016-01-03 CST"

anytime(as.factor(20160101 + 0:2))
[1] "2016-01-01 CST" "2016-01-02 CST" "2016-01-03 CST"

anytime(as.ordered(20160101 + 0:2))
[1] "2016-01-01 CST" "2016-01-02 CST" "2016-01-03 CST"

Character: Simple

## Dates: Character
anytime(as.character(20160101 + 0:2))
[1] "2016-01-01 CST" "2016-01-02 CST" "2016-01-03 CST"

anytime(c("20160101", "2016/01/02", "2016-01-03"))
[1] "2016-01-01 CST" "2016-01-02 CST" "2016-01-03 CST"

Character: ISO

## Datetime: ISO with/without fractional seconds
anytime(c("2016-01-01 10:11:12", "2016-01-01 10:11:12.345678"))
[1] "2016-01-01 10:11:12.000000 CST" "2016-01-01 10:11:12.345678 CST"

anytime(c("20160101T101112", "20160101T101112.345678"))
[1] "2016-01-01 10:11:12.000000 CST" "2016-01-01 10:11:12.345678 CST"

Character: Textual month formats

## ISO style
anytime(c("2016-Sep-01 10:11:12", "Sep/01/2016 10:11:12", "Sep-01-2016 10:11:12"))
[1] "2016-09-01 10:11:12 CDT" "2016-09-01 10:11:12 CDT" "2016-09-01 10:11:12 CDT"

anytime(c("Thu Sep 01 10:11:12 2016", "Thu Sep 01 10:11:12.345678 2016"))
[1] "2016-09-01 10:11:12.000000 CDT" "2016-09-01 10:11:12.345678 CDT"

Character: Dealing with DST

This shows an important aspect. When not working in local time (by overriding to UTC), the changing offset to UTC is correctly covered (which the underlying Boost Date_Time library does not do by itself).

## Datetime: pre/post DST
anytime(c("2016-01-31 12:13:14", "2016-08-31 12:13:14"))
[1] "2016-01-31 12:13:14 CST" "2016-08-31 12:13:14 CDT"
anytime(c("2016-01-31 12:13:14", "2016-08-31 12:13:14"), tz="UTC")  # important: catches change
[1] "2016-01-31 18:13:14 UTC" "2016-08-31 17:13:14 UTC"
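The same shift can be checked with base R alone. The sketch below assumes the local zone in the example above is America/Chicago (suggested by the CST/CDT labels, but not stated in the text) and that the system time zone database is available.

```r
## Interpret both wall-clock times in US Central time (assumed zone)
x <- as.POSIXct(c("2016-01-31 12:13:14", "2016-08-31 12:13:14"),
                tz = "America/Chicago")

## Render the same instants in UTC: winter is six hours behind UTC,
## summer (daylight saving time) only five
utc <- format(x, tz = "UTC", usetz = TRUE)
print(utc)
## [1] "2016-01-31 18:13:14 UTC" "2016-08-31 17:13:14 UTC"
```

The one-hour difference between the two UTC results is the DST change that a fixed-offset conversion would miss.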

Technical Details

The heavy lifting is done by a combination of Boost lexical_cast, to go from anything to a string representation, and Boost Date_Time, which then parses that string. We use the BH package to access Boost, and rely on Rcpp for a seamless C++ interface to and from R.
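The two-stage idea (first turn the input into a string, then try to parse that string) can be illustrated in plain R. The sketch below is only an analogue and not the package's C++ implementation: the function name parse_any and its short list of formats are hypothetical, and the real parser accepts many more forms.

```r
## Hypothetical helper: convert anything to character, then try formats in turn
parse_any <- function(x,
                      formats = c("%Y-%m-%d %H:%M:%S",   # most specific first
                                  "%Y-%m-%d",
                                  "%Y/%m/%d",
                                  "%Y%m%d"),
                      tz = "UTC") {
  s <- as.character(x)             # stage 1: integer, numeric, factor -> string
  out <- as.POSIXct(rep(NA_real_, length(s)), origin = "1970-01-01", tz = tz)
  for (fmt in formats) {           # stage 2: first format that parses wins
    todo <- is.na(out)
    if (!any(todo)) break
    out[todo] <- as.POSIXct(strptime(s[todo], fmt, tz = tz))
  }
  out                              # elements that never parsed stay NA
}

res <- parse_any(c("20160101", "2016/01/02", "2016-01-03 10:11:12"))
print(res)
```

Trying the most specific format first matters in this sketch because strptime() ignores trailing characters, so a date-only format would otherwise match a datetime string and drop its time.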

History

The code and functionality started (in a less complete or polished form) in the function parsePOSIXt, later renamed to toPOSIXt, in the RcppBDT package.

Status

Works as expected. A few extensions planned.

Installation

The package is now on CRAN and can be installed via a standard

install.packages("anytime")

Repository

Code, issue tickets, ... are at the GitHub repo.

Author

Dirk Eddelbuettel

License

GPL (>= 2)

Initially created: Sun Sep 11 11:39:17 CDT 2016
Last modified: Thu Dec 15 08:28:43 CST 2016