Sat, 25 Jul 2015

Rcpp 0.12.0: Now with more Big Data!

big-data image

A new release 0.12.0 of Rcpp arrived on the CRAN network for GNU R this morning, and I also pushed a Debian package upload.

Rcpp has become the most popular way of enhancing GNU R with C++ code. As of today, 423 packages on CRAN depend on Rcpp for making analyses go faster and further. Note that this is 60 more packages since the last release in May! Also, BioConductor adds another 57 packages, and casual searches on GitHub suggests many more.

And according to Andrie De Vries, Rcpp has now page rank of one on CRAN as well!

And with this release, Rcpp also becomes ready for Big Data, or, as they call it in Texas, Data.

Thanks to a lot of work and several pull requests by Qiang Kou, support for R_xlen_t has been added.

That means we can now do stunts like

R> library(Rcpp)
R> big <- 2^31-1
R> bigM <- rep(NA, big)
R> bigM2 <- c(bigM, bigM)
R> cppFunction("double getSz(LogicalVector x) { return x.length(); }")
R> getSz(bigM)
[1] 2147483647
R> getSz(bigM2)
[1] 4294967294
R>

where prior versions of Rcpp would just have said

> getSz(bigM2)
Error in getSz(bigM2) :
  long vectors not supported yet: ../../src/include/Rinlinedfuns.h:137
>

which is clearly not Texas-style. Another wellcome change, also thanks to Qiang Kou, adds encoding support for strings.

A lot of other things got polished. We are still improving exception handling as we still get the odd curveballs in a corner cases. Matt Dziubinski corrected the var() computation to use the proper two-pass method and added better support for lambda functions in Sugar expression using sapply(), Qiang Kou added more pull requests mostly for string initialization, and Romain added a pull request which made data frame creation a little more robust, and JJ was his usual self in tirelessly looking after all aspects of Rcpp Attributes.

As always, you can follow the development via the GitHub repo and particularly the Issue tickets and Pull Requests. And any discussions, questions, ... regarding Rcpp are always welcome at the rcpp-devel mailing list.

Last but not least, we are also extremely pleased to annouce that Qiang Kou has joined us in the Rcpp-Core team. We are looking forward to a lot more awesome!

See below for a detailed list of changes extracted from the NEWS file.

Changes in Rcpp version 0.12.0 (2015-07-24)

  • Changes in Rcpp API:

    • Rcpp_eval() no longer uses R_ToplevelExec when evaluating R expressions; this should resolve errors where calling handlers (e.g. through suppressMessages()) were not properly respected.

    • All internal length variables have been changed from R_len_t to R_xlen_t to support vectors longer than 2^31-1 elements (via pull request 303 by Qiang Kou).

    • The sugar function sapply now supports lambda functions (addressing issue 213 thanks to Matt Dziubinski)

    • The var sugar function now uses a more robust two-pass method, supports complex numbers, with new unit tests added (via pull request 320 by Matt Dziubinski)

    • String constructors now allow encodings (via pull request 310 by Qiang Kou)

    • String objects are preserving the underlying SEXP objects better, and are more careful about initializations (via pull requests 322 and 329 by Qiang Kou)

    • DataFrame constructors are now a little more careful (via pull request 301 by Romain Francois)

    • For R 3.2.0 or newer, Rf_installChar() is used instead of Rf_install(CHAR()) (via pull request 332).

  • Changes in Rcpp Attributes:

    • Use more robust method of ensuring unique paths for generated shared libraries.

    • The evalCpp function now also supports the plugins argument.

    • Correctly handle signature termination characters ('{' or ';') contained in quotes.

  • Changes in Rcpp Documentation:

    • The Rcpp-FAQ vignette was once again updated with respect to OS X issues and Fortran libraries needed for e.g. RcppArmadillo.

    • The included Rcpp.bib bibtex file (which is also used by other Rcpp* packages) was updated with respect to its CRAN references.

Thanks to CRANberries, you can also look at a diff to the previous release As always, even fuller details are on the Rcpp Changelog page and the Rcpp page which also leads to the downloads page, the browseable doxygen docs and zip files of doxygen output for the standard formats. A local directory has source and documentation too. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/rcpp | permanent link