Sun, 25 Jan 2015

RcppArmadillo 0.4.600.4.0

Conrad put up a maintenance release 4.600.4 of Armadillo a few days ago. As in the past, we tested this with number of pre-releases and test builds against the now over one hundred CRAN dependents of our RcppArmadillo package. The tests passed fine as usual, and results are as always in the rcpp-logs repository.

Changes are summarized below based on the NEWS.Rd file.

Changes in RcppArmadillo version 0.4.600.4.0 (2015-01-23)

  • Upgraded to Armadillo release Version 4.600.4 (still "Off The Reservation")

    • Speedups in the transpose operation

    • Small bug fixes

Courtesy of CRANberries, there is also a diffstat report for the most recent release. As always, more detailed information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/rcpp | permanent link

Sat, 24 Jan 2015

Rcpp 0.11.4

A new release 0.11.4 of Rcpp is now on the CRAN network for GNU R, and an updated Debian package will be uploaded in due course.

Rcpp has become the most popular way of enhancing GNU R with C++ code. As of today, 323 packages on CRAN depend on Rcpp for making analyses go faster and further; BioConductor adds another 41 packages, and casual searches on GitHub suggests dozens mores.

This release once again adds a large number of small bug fixes, polishes and enhancements. And like the last time, these changes were made by a group of seven different contributors (counting code commits) plus three more providing concrete suggestions. This shows that the Rcpp development and maintenance rests a large number of (broad) shoulders.

See below for a detailed list of changes extracted from the NEWS file.

Changes in Rcpp version 0.11.4 (2015-01-20)

  • Changes in Rcpp API:

    • The ListOf<T> class gains the .attr and .names methods common to other Rcpp vectors.

    • The [dpq]nbinom_mu() scalar functions are now available via the R:: namespace when R 3.1.2 or newer is used.

    • Add an additional test for AIX before attempting to include execinfo.h.

    • Rcpp::stop now supports improved printf-like syntax using the small tinyformat header-only library (following a similar implementation in Rcpp11)

    • Pairlist objects are now protected via an additional Shield<> as suggested by Martin Morgan on the rcpp-devel list.

    • Sorting is now prohibited at compile time for objects of type List, RawVector and ExpressionVector.

    • Vectors now have a Vector::const_iterator that is 'const correct' thanks to fix by Romain following a bug report in rcpp-devel by Martyn Plummer.

    • The mean() sugar function now uses a more robust two-pass method, and new unit tests for mean() were added at the same time.

    • The mean() and var() functions now support all core vector types.

    • The setequal() sugar function has been corrected via suggestion by Qiang Kou following a bug report by Søren Højsgaard.

    • The macros major, minor, and makedev no longer leak in from the (Linux) system header sys/sysmacros.h.

    • The push_front() string function was corrected.

  • Changes in Rcpp Attributes:

    • Only look for plugins in the package's namespace (rather than entire search path).

    • Also scan header files for definitions of functions to be considerd by Attributes.

    • Correct the regular expression for source files which are scanned.

  • Changes in Rcpp unit tests

    • Added a new binary test which will load a pre-built package to ensure that the Application Binary Interface (ABI) did not change; this test will (mostly or) only run at Travis where we have reasonable control over the platform running the test and can provide a binary.

    • New unit tests for sugar functions mean, setequal and var were added as noted above.

  • Changes in Rcpp Examples:

    • For the (old) examples ConvolveBenchmarks and OpenMP, the respective Makefile was renamed to GNUmakefile to please R CMD check as well as the CRAN Maintainers.

Thanks to CRANberries, you can also look at a diff to the previous release As always, even fuller details are on the Rcpp Changelog page and the Rcpp page which also leads to the downloads page, the browseable doxygen docs and zip files of doxygen output for the standard formats. A local directory has source and documentation too. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/rcpp | permanent link

RcppGSL 0.2.4

A new version of RcppGSL is now on CRAN. This package provides an interface from R to the GNU GSL using our Rcpp package.

This follows on the heels on the recent RcppGSL 0.2.3 release and extends the excellent point made by Qiang Kou in a contributed section of the vignette: We now not only allow to turn the GSL error handler off (to not abort() on error) but do so on package initialisation.

No other user-facing changes were made.

The NEWS file entries follows below:

Changes in version 0.2.4 (2015-01-24)

  • Two new helper function to turn the default GSL error handler off (and to restore it) were added. The default handler is now turned off when the package is attached so that GSL will no longer abort an R session on error. Users will have to check the error code.

  • The RcppGSL-intro.Rnw vignette was expanded with a short section on the GSL error handler (thanks to Qiang Kou).

Courtesy of CRANberries, a summary of changes to the most recent release is available.

More information is on the RcppGSL page. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/rcpp | permanent link

RcppAnnoy 0.0.5

A new version of RcppAnnoy is now on CRAN. RcppAnnoy wraps the small, fast, and lightweight C++ template header library Annoy written by Erik Bernhardsson for use at Spotify. RcppAnnoy uses Rcpp Modules to offer the exact same functionality as the Python module wrapped around Annoy.

This version contains a trivial one-character change requested by CRAN to cleanse the Makevars file of possible GNU Make-isms. Oh well. This release also overcomes an undefined behaviour sanitizer bug noticed by CRAN that took somewhat more effort to deal with. As mentioned recently in another blog post, it took some work to create a proper Docker container with the required compiler and subsequent R setup, but we have one now, and the aforementioned blog post has details on how we replicated the CRAN finding of an UBSAN issue. It also took Erik some extra efforts to set something up for his C++/Python side, but eventually an EC2 instance with Ubuntu 14.10 did the task as my Docker sales skills are seemingly not convincing enough. In any event, he very quickly added the right fix, and I synced RcppAnnoy with his Annoy code.

Courtesy of CRANberries, there is also a diffstat report for this release. More detailed information is on the RcppAnnoy page page.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/rcpp | permanent link

Sun, 18 Jan 2015

Running UBSAN tests via clang with Rocker

Every now and then we get reports from CRAN about our packages failing a test there. A challenging one concerns UBSAN, or Undefined Behaviour Sanitizer. For background on UBSAN, see this RedHat blog post for gcc and this one from LLVM about clang.

I had written briefly about this before in a blog post introducing the sanitizers package for tests, as well as the corresponding package page for sanitizers, which clearly predates our follow-up Rocker.org repo / project described in this initial announcement and when we became the official R container for Docker.

Rocker had support for SAN testing, but UBSAN was not working yet. So following a recent CRAN report against our RcppAnnoy package, I was unable to replicate the error and asked for help on r-devel in this thread.

Martyn Plummer and Jan van der Laan kindly sent their configurations in the same thread and off-list; Jeff Horner did so too following an initial tweet offering help. None of these worked for me, but further trials eventually lead me to the (already mentioned above) RedHat blog post with its mention of -fno-sanitize-recover to actually have an error abort a test. Which, coupled with the settings used by Martyn, were what worked for me: clang-3.5 -fsanitize=undefined -fno-sanitize=float-divide-by-zero,vptr,function -fno-sanitize-recover.

This is now part of the updated Dockerfile of the R-devel-SAN-Clang repo behind the r-devel-ubsan-clang. It contains these settings, as well a new support script check.r for littler---which enables testing right out the box.

Here is a complete example:

docker                              # run Docker (any recent version, I use 1.2.0)
  run                               # launch a container 
    --rm                            # remove Docker temporary objects when dome
    -ti                             # use a terminal and interactive mode 
    -v $(pwd):/mnt                  # mount the current directory as /mnt in the container
    rocker/r-devel-ubsan-clang      # using the rocker/r-devel-ubsan-clang container
  check.r                           # launch the check.r command from littler (in the container)
    --setwd /mnt                    # with a setwd() to the /mnt directory
    --install-deps                  # installing all package dependencies before the test
    RcppAnnoy_0.0.5.tar.gz          # and test this tarball

I know. It is a mouthful. But it really is merely the standard practice of running Docker to launch a single command. And while I frequently make this the /bin/bash command (hence the -ti options I always use) to work and explore interactively, here we do one better thanks to the (pretty useful so far) check.r script I wrote over the last two days.

check.r does about the same as R CMD check. If you look inside check you will see a call to a (non-exported) function from the (R base-internal) tools package. We call the same function here. But to make things more interesting we also first install the package we test to really ensure we have all build-dependencies from CRAN met. (And we plan to extend check.r to support additional apt-get calls in case other libraries etc are needed.) We use the dependencies=TRUE option to have R smartly install Suggests: as well, but only one level (see help(install.packages) for details. With that prerequisite out of the way, the test can proceed as if we had done R CMD check (and additional R CMD INSTALL as well). The result for this (known-bad) package:

edd@max:~/git$ docker run --rm -ti -v $(pwd):/mnt rocker/r-devel-ubsan-clang check.r --setwd /mnt --install-deps RcppAnnoy_0.0.5.tar.gz 
also installing the dependencies ‘Rcpp’, ‘BH’, ‘RUnit’

trying URL 'http://cran.rstudio.com/src/contrib/Rcpp_0.11.3.tar.gz'
Content type 'application/x-gzip' length 2169583 bytes (2.1 MB)
opened URL
==================================================
downloaded 2.1 MB

trying URL 'http://cran.rstudio.com/src/contrib/BH_1.55.0-3.tar.gz'
Content type 'application/x-gzip' length 7860141 bytes (7.5 MB)
opened URL
==================================================
downloaded 7.5 MB

trying URL 'http://cran.rstudio.com/src/contrib/RUnit_0.4.28.tar.gz'
Content type 'application/x-gzip' length 322486 bytes (314 KB)
opened URL
==================================================
downloaded 314 KB

trying URL 'http://cran.rstudio.com/src/contrib/RcppAnnoy_0.0.4.tar.gz'
Content type 'application/x-gzip' length 25777 bytes (25 KB)
opened URL
==================================================
downloaded 25 KB

* installing *source* package ‘Rcpp’ ...
** package ‘Rcpp’ successfully unpacked and MD5 sums checked
** libs
clang++-3.5 -fsanitize=undefined -fno-sanitize=float-divide-by-zero,vptr,function -fno-sanitize-recover -I/usr/local/lib/R/include -DNDEBUG -I../inst/include/ -I/usr/local/include    -fpic  -pipe -Wall -pedantic -
g  -c Date.cpp -o Date.o

[...]
* checking examples ... OK
* checking for unstated dependencies in ‘tests’ ... OK
* checking tests ...
  Running ‘runUnitTests.R’
 ERROR
Running the tests in ‘tests/runUnitTests.R’ failed.
Last 13 lines of output:
  +     if (getErrors(tests)$nFail > 0) {
  +         stop("TEST FAILED!")
  +     }
  +     if (getErrors(tests)$nErr > 0) {
  +         stop("TEST HAD ERRORS!")
  +     }
  +     if (getErrors(tests)$nTestFunc < 1) {
  +         stop("NO TEST FUNCTIONS RUN!")
  +     }
  + }
  
  
  Executing test function test01getNNsByVector  ... ../inst/include/annoylib.h:532:40: runtime error: index 3 out of bounds for type 'int const[2]'
* checking PDF version of manual ... OK
* DONE

Status: 1 ERROR, 2 WARNINGs, 1 NOTE
See/tmp/RcppAnnoy/..Rcheck/00check.logfor details.
root@a7687c014e55:/tmp/RcppAnnoy# 

The log shows that thanks to check.r, we first download and the install the required packages Rcpp, BH, RUnit and RcppAnnoy itself (in the CRAN release). Rcpp is installed first, we then cut out the middle until we get to ... the failure we set out to confirm.

Now having a tool to confirm the error, we can work on improved code.

One such fix currently under inspection in a non-release version 0.0.5.1 then passes with the exact same invocation (but pointing at RcppAnnoy_0.0.5.1.tar.gz):

edd@max:~/git$ docker run --rm -ti -v $(pwd):/mnt rocker/r-devel-ubsan-clang check.r --setwd /mnt --install-deps RcppAnnoy_0.0.5.1.tar.gz
also installing the dependencies ‘Rcpp’, ‘BH’, ‘RUnit’
[...]
* checking examples ... OK
* checking for unstated dependencies in ‘tests’ ... OK
* checking tests ...
  Running ‘runUnitTests.R’
 OK
* checking PDF version of manual ... OK
* DONE

Status: 1 WARNING
See/mnt/RcppAnnoy.Rcheck/00check.logfor details.

edd@max:~/git$

This proceeds the same way from the same pristine, clean container for testing. It first installs the four required packages, and the proceeds to test the new and improved tarball. Which passes the test which failed above with no issues. Good.

So we now have an "appliance" container anybody can download from free from the Docker hub, and deploy as we did here in order to have fully automated, one-command setup for testing for UBSAN errors.

UBSAN is a very powerful tool. We are only beginning to deploy it. There are many more useful configuration settings. I would love to hear from anyone who would like to work on building this out via the R-devel-SAN-Clang GitHub repo. Improvements to the littler scripts are similarly welcome (and I plan on releasing an updated littler package "soon").

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/rocker | permanent link

Mon, 12 Jan 2015

RcppGSL 0.2.3

A new version of RcppGSL is now on CRAN today. This package provides an interface from R to the GNU GSL using our Rcpp.

Similar to the recent RcppClassic release, this update was triggered by the CRAN maintainers desire to keep the Makefile free of GNU make extensions. At the same time we simplified the configure file a little, and refreshed the look and feel of the vignette.

No user-facing changes were made.

The NEWS file entries follows below:

Changes in version 0.2.3 (2015-01-10)

  • The src/Makevars.in was pruned of GNU make features at the request of the CRAN Maintainers

  • configure.ac and configure were updated, and shortened

  • The RcppGSL-intro.Rnw vignette was updated for its look and feel.

Courtesy of CRANberries, a summary of changes to the most recent release is available.

More information is on the RcppGSL page. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/rcpp | permanent link

Sun, 11 Jan 2015

rfoaas 0.1.1

A brand new and shiny version of rfoaas is now on CRAN. The rfoaas package provides an interface for R to the most excellent FOAAS service--which provides a modern, scalable and RESTful web service for the frequent need to tell someone to f$#@ off.

There are two (internal) changes of merit in this version. First off, as FOAAS was refactored upstream, we are now forced to supply an accept: text/plain http request header. Which, sadly enough, is not something the url() function in R can do---so we brought in more cavalry and now depend on the httr package by Hadley, and use its GET() method. A second change is that we added a (simple but effective enough) regression test which simply calls all foaas entry points available throug rfoaas, and compares this to the anticipated result. To run it, you need to set an environment variable RunFOAASTests=yes as eg out Travis script does. Finally, we aligned the version number with upstream to signal that we cover all available entry points of that version.

As usual, CRANberries provides a diff to the previous release. Questions, comments etc should go to the GitHub issue tracker.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/rfoaas | permanent link

Sat, 10 Jan 2015

RcppClassic 0.9.6

A maintenance release of RcppClassic, now at version 0.9.6, went out to CRAN today. This package provides a maintained version of the otherwise deprecated first Rcpp API; no new projects should use it.

No changes were in user-facing code. The Makevars file was change to accomodate a request by the CRAN Maintainer to keep it free of GNU Make extensions. At the same time, we overhauled the look and feel of the (very short) vignette. Build instructions were updated both in the vignette and in the included example package. Other accumulated changes since the last release were updates to the DESCRIPTION and NAMESPACE file as well two namespace-related R code updates.

Courtesy of CRANberries, there is the set of changes relative to the previous release.

Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/rcpp | permanent link

Thu, 08 Jan 2015

random 0.2.3

A new release of my random package for truly (hardware-based) random numbers as provided by random.org is now on CRAN.

The main change is a switch to the curl() function from the eponymous package by Jeroen Ooms. This was caused by random.org now using https instead of http, annd the fact that te url() function from R does not cope well with the redirect. Besides this (enforced) change, everything else remains the same.

Courtesy of CRANberries comes a diffstat report for this release. Current and previous releases are available here as well as on CRAN.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/random | permanent link

Tue, 06 Jan 2015

RcppCNPy 0.2.4

A new release of the RcppCNPy package is now on CRAN.

This release mostly solidifies and fixes things. Support for saving integer objects, which was expanded in release 0.2.3, was not entirely correct. Operations on big-endian systems were not up to snuff either.

Wush Wu helped in getting this right with very diligent testing and patching particularly on big-endian hardware. We also got a pull request from Romain to reflect better const correctness at the Rcpp side of things. Last but not least we obliged by the CRAN Maintainers to not assume one could call gzip from system() call because, well, you guessed it.

Changes in version 0.2.4 (2015-01-05)

  • Support for saving integer objects was not correct and has been fixed.

  • Support for loading and saving on 'big endian' systems was incomplete, has been greatly expanded and corrected, thanks in large part to very diligent testing as well as patching by Wush Wu.

  • The implementation now uses const iterators, thanks to a pull request by Romain Francois.

  • The vignette no longer assumes that one can call gzip via system as the world's leading consumer OS may disagree.

CRANberries also provides a diffstat report for the latest release. As always, feedback is welcome and the rcpp-devel mailing list off the R-Forge page for Rcpp is may be the best place to start a discussion. GitHub issue tickets are also welcome.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/rcpp | permanent link

Mon, 05 Jan 2015

BH release 1.55.0-3

Right on the heels of yesterday's BH release 1.55.0-2 bringing Boost Fusion, we now have release 1.55.0-3 bringing Boost Graph. To recap, BH is our CRAN package providing (a large part of the) Boost C++ libraries as a set of template headers for use by R and of course Rcpp.

And as a small project I am working on--and which should now be so much closer to release--needed not only Boost Fusion but also Boost Graph, I had to bother the CRAN maintainers twice in two days. This new version closes issue ticket 9, and may be of interest to other packages such as the venerable RBGL formerly on CRAN and now a BioConductor package which includes its own copy of this graph library (plus depends).

A brief summary of changes from the NEWS file is below.

Changes in version 1.55.0-3 (2015-01-04)

  • Added Boost Graph requested in GH ticket #9 by Dirk for RcppStreams

Courtesy of CRANberries, there is also a diffstat report for the most recent release.

Comments and suggestions are welcome via the mailing list or the issue tracker at the GitHubGitHub repo.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/bh | permanent link

Sun, 04 Jan 2015

BH release 1.55.0-2

A new release of BH, our package providing (a large part of the) Boost C++ libraries as a set of template headers for use by R, is now on CRAN.

This is a relatively minor change which expands the set of Boost libraries included in the package to Boost Fusion per issue ticket 7. Boost Fusion is a very clever library providing a fusion of both compile-time meta-programming and run-time programming to provide something similar to the STL (i.e. containers, algorithms, ...) for heterogenous tuples. I also added pointers to both the mailing list and the GitHub issue tracker to the DESCRIPTION file, README and main manual page.

A brief summary of changes from the NEWS file is below.

Changes in version 1.55.0-2 (2015-01-03)

  • Added Boost Fusion requested in GH ticket #7 by Dirk for RcppStreams

Courtesy of CRANberries, there is also a diffstat report for the most recent release.

Comments and suggestions are welcome via the mailing list or the issue tracker at the GitHubGitHub repo.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/bh | permanent link

Thu, 01 Jan 2015

(Belated) 2014 Running Recap

Having just blogged about today's 5k, I noticed that I did not blog at all about running in 2014. There wasn't much, but another Ragnar Relay in February, again as an Ultra team. I more-or-less live tweeted this though: Pre-race warning (and that was my route!), Team photo at start, Leg 1, Leg 2, Leg 3, Leg 4, Leg 5, Leg 6, and Chilling

I also ran the Philly Half-Marathon in November following up on the Penn R Workshop, and posted some tweets: Post race selfie, Strava result (thinking it was a 13.5 miler race--WTF), and Post-race chill.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/sports/running | permanent link

New Year's Run 2015

Nice, crisp and pretty cold morning for the 2015 New Year's 5k. I had run this before with friends: 2003, 2005, 2006 and 2007.

Needless to say, I am a little slower now: A hand-stopped 23:13 for a 7:27 min/mile pace is fine by me given the weather, lack of speedwork in the last few months, and generally reduced running---but also almost a minute slower than the last time I ran this.

In case you're a fellow running geek and into this, Strava has my result nicely aggregated.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/sports/running | permanent link

Wed, 31 Dec 2014

digest 0.6.8

Release 0.6.8 of digest package is now on CRAN and will get to Debian shortly.

This release opens the door to also providing the digest functionality at the C level to other R packages. Wush Wu is going to use the murmurHash C implementation in his recently-created FeatureHashing package.

We plan to export the other hashing function as well. Another small change attempts to overcome a build limitation on that other largely-irrelevant-but-still-check-by-CRAN OS.

CRANberries provides the usual summary of changes to the previous version.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/digest | permanent link

Mon, 29 Dec 2014

rfoaas 0.0.5

A new version of rfoaas is now on CRAN. The rfoaas package provides an interface for R to the most excellent FOAAS service--which provides a modern, scalable and RESTful web service for the frequent need to tell someone to eff off.

This version aligns the rfoaas version number with the (at long last) updated upstream version number, and brings a change suggested by Richie Cotton to set the encoding of the returned object.

As usual, CRANberries provides a diff to the previous release. Questions, comments etc should go to the GitHub issue tracker off the GitHub repo.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/rfoaas | permanent link

Sun, 28 Dec 2014

RcppArmadillo 0.4.600.0

Conrad produced another minor release 4.600 of Armadillo. As before, I had created a GitHub-only pre-release(s) of his pre-release(s), and tested a pre-release as well as the actual release against the now over one hundred CRAN dependents of our RcppArmadillo package. The tests passed fine as usual with less than a handful of checks not passing, all for known cases -- and results are as always in the rcpp-logs repository.

Changes are summarized below based on the NEWS.Rd file.

Changes in RcppArmadillo version 0.4.600.0 (2014-12-27)

  • Upgraded to Armadillo release Version 4.600 ("Off The Reservation")

    • added .head() and .tail() to submatrix views

    • faster matrix transposes within compound expressions

    • faster accu() and norm() when compiling with -O3 -ffast-math -march=native (gcc and clang)

    • workaround for a bug in GCC 4.4

Courtesy of CRANberries, there is also a diffstat report for the most recent release. As always, more detailed information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/rcpp | permanent link

Thu, 25 Dec 2014

rfoaas 0.0.4.20141225 -- not on CRAN

A new version of rfoaas was prepared for CRAN, but refused on the grounds of having been updated within 24 hours. Oh well.

To recap, the rfoaas package provides an interface for R to the most excellent FOAAS service -- which provides a modern, scalable and RESTful web service for the frequent need to tell someone to eff off.

And having seen the Christmas Eve (ie December 24) update, upstream immediatly and rather kindly added a new xmas(name, from) function -- so now we could do rfoaas::xmas("CRAN Maintainers") to wish the CRAN Maintainers a very Merry Christmas.

So for once, there is no CRANberries report as the package is not on CRAN :-/ There is of course always GitHub.

Questions, comments etc should go to the GitHub issue tracker off the GitHub repo.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/rfoaas | permanent link

Wed, 24 Dec 2014

rfoaas 0.0.4.20141224

A new version of rfoaas is now on CRAN. The rfoaas package provides an interface for R to the most excellent FOAAS service -- which provides a modern, scalable and RESTful web service for the frequent need to tell someone to eff off.

This is minor update, affecting only the rfoaas side without changes at the FOAAS backend side. We documented that the result may need an encoding update (at least on the World's leading consumer OS), and now provide a default value for from so that many services can be called argument-less, or at least with one argument less than before. Both of these were suggested by Richie Cotton via issue tickets at the GitHub repo.

CRANberries also provides a diff to the previous release. Questions, comments etc should go to the GitHub issue tracker off the GitHub repo.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/rfoaas | permanent link

Tue, 23 Dec 2014

RcppEigen 0.3.2.3.0

A new release of RcppEigen is now on CRAN, and will get to Debian shortly.

This is thanks to a fabulous pull request by Yixuan Qiu who single-handedly managed to upgrade to the new minor release of Eigen -- after having expanded the unit tests a few weeks earlier in another pull request.

The NEWS file entry follows.

Changes in RcppEigen version 0.3.2.3.0 (2014-12-22)

  • Updated to version 3.2.3 of Eigen

  • Added a number of additional unit tests for wrap, transform and sparse

  • Updated one of the examples

Courtesy of CRANberries, there is also a diffstat report for the most recent release.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/rcpp | permanent link

Mon, 22 Dec 2014

BH release 1.55.0-1

A new release of BH, our package providing (a large part of the) Boost C++ libraries as a set of template headers for use by R, is now on CRAN.

There are two changes. First, we upgraded to Boost 1.55. While Boost 1.57 is now current, I am playing it somewhat safe and conservative here by relying on what is the current version in Debian (and Ubuntu). I even started from the Debian tarball. This ensures we include nothing not in accordance with the Debian Free Software Guidelines (as they are a good proxy for what is acceptable to CRAN). The second change was the addition of Boost.Geometry as requested in issue ticket 5.

This may be a good time to recall that best way to get in touch for desired additions in BH are the mailing list or the issue tracker at the GitHub repo.

A brief summary of changes from the NEWS file (which we failed to update for the release; the text below is however now in the repo):

Changes in version 1.55.0-1 (2014-12-21)

Courtesy of CRANberries, there is also a (very long!) diffstat report for the most recent release.

Comments and suggestions are welcome via the mailing list or the issue tracker at the GitHubGitHub repo.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/bh | permanent link

Sun, 21 Dec 2014

If only there was a Romeo somewhere ...

Attention: rant coming. You have been warned, and may want to tune out now.

So the top of my Twitter timeline just had a re-tweet posting to this marvel on the state of Julia. I should have known better than to glance at it as it comes from someone providing (as per the side-bar) Thought leadership in Big Data, systems architecture and more. Reading something like this violates the first rule of airport book stores: never touch anything from the business school section, especially on (wait for it) leadership or, worse yet, thought leadership.

But it is Sunday, my first cup of coffee still warm (after finalising two R package updates on GitHub, and one upload to CRAN) and so I read on. Only to be mildly appalled by the usual comparison to R based on the same old Fibonacci sequence.

Look, I am as guilty as anyone of using it (for example all over chapter one of my Rcpp book), but at least I try to stress each and every time that this is kicking R where it is down as its (fairly) poor performance on functions calls (that is well-known and documented) obviously gets aggravated by recursive calls. But hey, for the record, let me restate those results. So Julia beats R by a factor of 385. But let's take a closer look.

For n=25, I get R to take 241 milliseconds---as opposed to his 6905 milliseconds---simply by using the same function I use in every workshop, eg last used at Penn in November, which does not use the dreaded ifelse operator:

fibR <- function(n) {
  if (n < 2) return(n)
  return(fibR(n-1) + fibR(n-2))
}

Switching that to the standard C++ three-liner using Rcpp

library(Rcpp)
cppFunction('int fibCpp(int n) { 
  if (n < 2) return(n);  
  return(fibCpp(n-1) + fibCpp(n-2));  
  }')

and running a standard benchmark suite gets us the usual result of

R> library(rbenchmark)
R> benchmark(fibR(25),fibCpp(25),order="relative")[,1:4]
        test replications elapsed relative
2 fibCpp(25)          100   0.048    1.000
1   fibR(25)          100  24.674  514.042
R> 

So for the record as we need this later: that is 48 milliseconds for 100 replications, or about 0.48 milliseconds per run.

Now Julia. And of my standard Ubuntu server running the current release 14.10:

edd@max:~$ julia
ERROR: could not open file /home/edd//home/edd//etc/julia/juliarc.jl
 in include at boot.jl:238

edd@max:~$ 

So wait, what? You guys can't even ensure a working release on what is probably the most popular and common Linux installation? And I get to that after reading a post on the importance of "Community, Community, Community" and you can't even make sure this works on Ubuntu? Really?

So a little bit of googling later, I see that julia -f is my friend for this flawed release, and I can try to replicate the original timing

edd@max:~$ julia -f
               _
   _       _ _(_)_     |  A fresh approach to technical computing
  (_)     | (_) (_)    |  Documentation: http://docs.julialang.org
   _ _   _| |_  __ _   |  Type "help()" to list help topics
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 0.2.1 (2014-02-11 06:30 UTC)
 _/ |\__'_|_|_|\__'_|  |  
|__/                   |  x86_64-linux-gnu

julia> fib(n) = n < 2 ? n : fib(n - 1) + fib(n - 2)
fib (generic function with 1 method)

julia> @elapsed fib(25)
0.002299559

julia> 

Interestingly the posts author claims 18 milliseconds. I see 2.3 milliseconds here. Maybe someone is having a hard time comparing things to the right of the decimal point. Or maybe his computer is an order of magnitude slower than mine. The more important thing is that Julia is of course faster than R (no surprise: LLVM at work) but also still a lot slower than a (trivial to write and deploy) C++ function. Nothing new here.

So let's recap. Comparison to R was based on a flawed version of a function we only use when we deliberately want to put R down, can be improved significantly when using a better implementation, results are still off by order of magnitude to what was reported ("math is hard"), and the standard C / C++ way of doing things is still several times faster than our new saviour language---which I can't even launch on the current version of one of the more common free operating systems. Ok then. Someone please wake me up in a few years and I will try again.

Now, coming to the end of the rant I should really stress that of course I too hope that Julia succeeds. Every user pulled away from Matlab is a win for all us. We're in this together and the endless navel gazing between ourselves is so tiresome and irrelevant. And as I argue here, even more so when we among ourselves stick to unfair comparisons as well as badly chosen implementation details.

What matters are wins against the likes of Matlab, Excel, SAS and so on. Let's build on our joint strength. I am sure I will use Julia one day, and I am grateful for everyone helping with it---as a lot of help seems to be needed. In the meantime, and with CRAN at 6130 packages that just work I'll continue to make use of this amazing community and trying my bit to help it grow and prosper. As part of our joint community.

/computers/misc | permanent link

Sat, 20 Dec 2014

digest 0.6.7

Release 0.6.7 of digest package is now on CRAN and in Debian.

Jim Hester was at it again and added murmurHash. I cleaned up several sets of pedantic warnings in some of the source files, updated the test reference out, and that forms version 0.6.7.

CRANberries provides the usual summary of changes to the previous version.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/digest | permanent link

Fri, 19 Dec 2014

Rocker is now the official R image for Docker

big deal

Something happened a little while ago which we did not have time to commensurate properly. Our Rocker image for R is now the official R image for Docker itself. So getting R (via Docker) is now as simple as saying docker pull r-base.

This particular container is essentially just the standard r-base Debian package for R (which is one of a few I maintain there) plus a mininal set of extras. This r-base forms the basis of our other containers as e.g. the rather popular r-studio container wrapping the excellent RStudio Server.

A lot of work went into this. Carl and I also got a tremendous amount of help from the good folks at Docker. Details are as always at the Rocker repo at GitHub.

Docker itself continues to make great strides, and it has been great fun help to help along. With this post I achieved another goal: blog about Docker with an image not containing shipping containers. Just kidding.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/rocker | permanent link

Sat, 13 Dec 2014

rfoaas 0.0.4.20141212

A new version of rfoaas is now on CRAN. The rfoaas package provides an interface for R to the most excellent FOAAS service -- which provides a modern, scalable and RESTful web service for the frequent need to tell someone to eff off.

The FOAAS backend gets updated in spurts, and yesterday a few pull requests were integrated, including one from yours truly. So with that it was time for an update to rfoaas. As the version number upstream did not change (bad, bad, practice) I appended the date the version number.

CRANberries also provides a diff to the previous release. Questions, comments etc should go to the GitHub issue tracker off the GitHub repo.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/rfoaas | permanent link

Thu, 11 Dec 2014

RProtoBuf 0.4.2

A new release 0.4.2 of RProtoBuf is now on CRAN. RProtoBuf provides R bindings for the Google Protocol Buffers ("Protobuf") data encoding library used and released by Google, and deployed as a language and operating-system agnostic protocol by numerous projects.

Murray and Jeroen did almost all of the heavy lifting. Many changes were triggered by two helpful referee reports, and we are slowly getting to the point where we will resubmit a much improved paper. Full details are below.

Changes in RProtoBuf version 0.4.2 (2014-12-10)

  • Address changes suggested by anonymous reviewers for our Journal of Statistical Software submission.

  • Make Descriptor and EnumDescriptor objects subsettable with "[[".

  • Add length() method for Descriptor objects.

  • Add names() method for Message, Descriptor, and EnumDescriptor objects.

  • Clarify order of returned list for descriptor objects in as.list documentation.

  • Correct the definition of as.list for EnumDescriptors to return a proper list instead of a named vector.

  • Update the default print methods to use cat() with fill=TRUE instead of show() to eliminate the confusing [1] since the classes in RProtoBuf are not vectorized.

  • Add support for serializing function, language, and environment objects by falling back to R's native serialization with serialize_pb and unserialize_pb to make it easy to serialize into a Protocol Buffer all of the more than 100 datasets which come with R.

  • Use normalizePath instead of creating a temporary file with file.create when getting absolute path names.

  • Add unit tests for all of the above.

CRANberries also provides a diff to the previous release. RProtoBuf page which has a draft package vignette, a a 'quick' overview vignette, and a unit test summary vignette. Questions, comments etc should go to the GitHub issue tracker off the GitHub repo.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/rprotobuf | permanent link

Wed, 10 Dec 2014

digest 0.6.6 (and 0.6.5)

A new release 0.6.6 of the digest package is now on CRAN and in Debian.

This release brings the xxHash non-cryptographic hash function by Yann Collet, thanks to several pull requests by Jim Hester. After the upload of version 0.6.5 we uncovered another lovely non-standardness of Windoze: you cannot format unsigned long long via printf() format strings. Great. Luckily Jim found a quick (and portable) fix via the inttypes.h header, and that went into the 0.6.6 release.

The release also contains an earlier extension for hmac() to also cover crc32 hashes, kindly provided by Suchen Jin.

I also made a number of small internal changes such as

  • switching (compiled) function registration to package load via a the useDynLib() declaration in NAMESPACE,
  • (finally!!) formating code to proper four-space indentation,
  • adding some documentation around Jim's pull request, and
  • adding a few GPL copyright headers.

CRANberries provides the usual summary of changes to the previous version.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/digest | permanent link

RcppRedis 0.1.3

A very minor bugfix release of RcppRedis is now on CRAN. The zcount function now returns the correct type.

Changes in version 0.1.3 (2014-12-10)

  • Bug fix setting correct return type of zcount

Courtesy of CRANberries, there is also a diffstat report for the most recent release. More information is on the RcppRedis page.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/rcpp | permanent link

Tue, 09 Dec 2014

Wilco!!

Wilco at The Riviera

With a bit of luck due to a collegue having a spare ticket, I managed to make it to an awesome Wilco show at The Riviera in Uptown.

This concert was part of a set a six shows. Tweedy and the band were fast, and loose, and wonderful, and totally beloved by the home crowd. An truly outstanding show, and a great evening.

Also: I should get out more often. Last blog entry about Wilco was from 2005. Ouch.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/music/rock | permanent link

RcppAnnoy 0.0.4

A few weeks ago, RcppAnnoy had its initial release 0.0.2 and subsequent update in release 0.0.3. The latter brought Windows support, thanks to a neat pull request by Qiang Kou.

RcppAnnoy wraps the small, fast, and lightweight C++ template header library Annoy written by Erik Bernhardsson for use at Spotify. RcppAnnoy uses Rcpp Modules to offer the exact same functionality as the Python module wrapped around Annoy.

In the 0.0.3 release, I overlooked one thing: that with builds on Windows, we would also get builds against what CRAN calls R-oldrel: the previous release, which cannot turn on C++11 via the simple CXX_STD = CXX11 declaration in src/Makevars (and which we need because use of Boost brings in long long which R can only cope with under C++11 ...).

So this new release 0.0.4 does nothing more than add a constraint in a Depends: R (>= 3.1.0) to avoid builds not being able to turn on C++11.

Courtesy of CRANberries, there is also a diffstat report for this release. More detailed information is on the RcppAnnoy page page.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/rcpp | permanent link

Tue, 02 Dec 2014

RQuantLib 0.4.0

A new major release of RQuantLib is now on CRAN and getting to Debian.

All C++ source files have been rewritten to take advantage of newer Rcpp features. Several Fixed Income functions have been added, or refreshed, thanks to contributions by Michele Salvadore. Calendar support was greatly expanded thanks to a contribution by Danilo Dias da Silva. All key changes are listed in detail below.

Changes in RQuantLib version 0.4.0 (2014-12-01)

  • Changes in RQuantLib code:

    • All function interfaces have been rewritten using Rcpp Attributes. No SEXP remain in the function signatures. This make the code shorter, more readable and more easily extensible.

    • The header files have been reorganized so that plugin use is possible. An impl.h files is imported once for each compilation unit: for RQuantLib from the file src/dates.cpp directory, from a sourced file via a #define set by the plugin wrapper.

    • as<>() and wrap() converters have added for QuantLib Date types.

    • Plugin support has been added, allowing more ad-hoc use via Rcpp Attributes.

    • Several Fixed Income functions have been added, and/or rewritten to better match the QuantLib signatures; this was done mostly by Michele Salvadore.

    • Several Date and Calendar functions have been added.

    • Calendar support has been greatly expanded thanks to Danilo Dias da Silva.

    • Exported curve objects are now more parsimonious and advance entries in the table object roughly one business month at a time.

    • The DiscountCurve and Bond curve construction has been fixed via a corrected evaluation date and omitted the two-year swap rate, as suggested by Luigi Ballabio.

    • The NAMESPACE file has a tighter rule for export of *.default functions, as suggested by Bill Dunlap

    • Builds now use OpenMP where available.

    • The package now depends on QuantLib 1.4.0 or later.

  • Changes in RQuantLib tests:

    • New unit tests for dates have been added.

    • C++ code for the unit tests has also been converted to Rcpp Attributes use; a helper function unitTestSetup() has been added.

    • Continuous Integration via Travis is now enabled from the GitHub repo.

  • Changes in RQuantLib documentation:

    • This NEWS file has been added. Better late than never, as they say.

Courtesy of CRANberries, there is also a diffstat report for the this release. As always, more detailed information is on the RQuantLib page. Questions, comments etc should go to the rquantlib-devel mailing list off the R-Forge page. Issue tickets can be filed at the GitHub repo.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/rquantlib | permanent link

Sat, 29 Nov 2014

RcppArmadillo 0.4.550.1.0

A week ago, Conrad provided another minor release 4.550.0 of Armadillo which has since received one minor correction in 4.550.1.0. As before, I had created a GitHub-only pre-release of his pre-release which was tested against the almost one hundred CRAN dependents of our RcppArmadillo package. This passed fine as usual, and results are as always in the rcpp-logs repository.

Processing and acceptance at the CRAN took a little longer as around the same time a fresh failure in unit tests had become apparent on an as-of-yet unannounced new architecture (!!) also tested at CRAN. The R-devel release has since gotten a new capabilities() test for long double, and we now only run this test (for our rmultinom()) if the test asserts that the given R build has this capability. Phew, so with all that the new version in now on CRAN; Windows binaries have been built and I also uploaded new Debian binaries.

Changes are summarized below; our end also includes added support for conversion of Field types takes to short pull request by Romain.

Changes in RcppArmadillo version 0.4.550.1.0 (2014-11-26)

  • Upgraded to Armadillo release Version 4.550.1 ("Singapore Sling Deluxe")

    • added matrix exponential function: expmat()

    • faster .log_p() and .avg_log_p() functions in the gmm_diag class when compiling with OpenMP enabled

    • faster handling of in-place addition/subtraction of expressions with an outer product

    • applied correction to gmm_diag relative to the 4.550 release

  • The Armadillo Field type is now converted in as<> conversions

Courtesy of CRANberries, there is also a diffstat report for the most recent release. As always, more detailed information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/rcpp | permanent link

Fri, 28 Nov 2014

CRAN Task Views for Finance and HPC now (also) on GitHub

The CRAN Task View system is a fine project which Achim Zeileis initiated almost a decade ago. It is described in a short R Journal article in Volume 5, Number 1. I have been editor / maintainer of the Finance task view essentially since the very beginning of these CRAN Task Views, and added the High-Performance Computing one in the fall of 2008. Many, many people have helped by sending suggestions or even patches; email continues to be the main venue for the changes.

The maintainers of the Web Technologies task view were, at least as far as I know, the first to make the jump to maintaining the task view on GitHub. Karthik and I briefly talked about this when he was in town a few weeks ago for our joint Software Carpentry workshop at Northwestern.

So the topic had been on my mind, but it was only today that I realized that the near-limitless amount of awesome that is pandoc can probably help with maintenance. The task view code by Achim neatly converts the very regular, very XML, very boring original format into somewhat-CRAN-website-specific html. Pandoc, being as versatile as it is, can then make (GitHub-flavoured) markdown out of this, and with a minimal amount of sed magic, we get what we need.

And hence we now have these two new repos:

Contributions are now most welcome by pull request. You can run the included converter scripts, it differs between both repos only by one constant for the task view / file name. As an illustration, the one for Finance is below.

#!/usr/bin/r
## if you do not have /usr/bin/r from littler, just use Rscript

ctv <- "Finance"

ctvfile  <- paste0(ctv, ".ctv")
htmlfile <- paste0(ctv, ".html")
mdfile   <- "README.md"

## load packages
suppressMessages(library(XML))          # called by ctv
suppressMessages(library(ctv))

r <- getOption("repos")                 # set CRAN mirror
r["CRAN"] <- "http://cran.rstudio.com"
options(repos=r)

check_ctv_packages(ctvfile)             # run the check

## create html file from ctv file
ctv2html(read.ctv(ctvfile), htmlfile)

### these look atrocious, but are pretty straight forward. read them one by one
###  - start from the htmlfile
cmd <- paste0("cat ", htmlfile,
###  - in lines of the form  ^<a href="Word">Word.html</a>
###  - capture the 'Word' and insert it into a larger URL containing an absolute reference to task view 'Word'
  " | sed -e 's|^<a href=\"\\([a-zA-Z]*\\)\\.html|<a href=\"http://cran.rstudio.com/web/views/\\1.html\"|' | ",
###  - call pandoc, specifying html as input and github-flavoured markdown as output
              "pandoc -s -r html -w markdown_github | ",
###  - deal with the header by removing extra ||, replacing |** with ** and **| with **:              
              "sed -e's/||//g' -e's/|\\*\\*/\\*\\*/g' -e's/\\*\\*|/\\*\\* /g' -e's/|$/  /g' ",
###  - make the implicit URL to packages explicit
              "-e's|../packages/|http://cran.rstudio.com/web/packages/|g' ",
###  - write out mdfile
              "> ", mdfile)

system(cmd)                             # run the conversion

unlink(htmlfile)                        # remove temporary html file

cat("Done.\n")

I am quite pleased with this setup---so a quick thanks towards the maintainers of the Web Technologies task view; of course to Achim for creating CRAN Task Views in the first place, and maintaining them all those years; as always to John MacFarlance for the magic that is pandoc; and last but not least of course to anybody who has contributed to the CRAN Task Views.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/snippets | permanent link

Tue, 25 Nov 2014

Rcpp now used by 300 CRAN packages

max-heap image

This morning, Rcpp reached another round milestone: 300 packages on CRAN now depend on it (as measured by Depends, Imports and LinkingTo declarations). The graph is on the left depicts the growth of Rcpp usage over time. There are 41 more on BioConductor (which is not included in the chart).

The first and less detailed part uses manually save entries, the second half of the data set was generated semi-automatically via a short script appending updates to a small file-based backend. A list of user package is kept on this page.

Also displayed in the graph is the relative proportion of CRAN packages using Rcpp. The four per-cent hurdle was cleared just before useR! 2014 where I showed a similar graph (as two distinct graphs) in my invited talk. We may well hit five per-cent before the end of the year.

300 is a pretty humbling and staggering number. Also interesting that we we cleared 200 only at the end of April, and 250 in early August.

So from everybody behind Rcpp, a heartfelt Thank You! to all the users and of course other contributors.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/rcpp | permanent link

Mon, 24 Nov 2014

YATORP -- Yet Another Tutorial on R Packaging

What the world needs right now is yet another tutorial on R packages and their creation. Luckily, this last Friday and Saturday, I had the opportunity to present in a workshop organized by Frank DiTraglia at Penn's shiny new Warren Center, and held at Wharton.

Given the Warren Center's focus, the workshop centered around Big Data and Open Science with R. Yihui Xie and myself alternated on delivering four units on an Introduction to R, Writing R packages, Dynamic Documents with R, and HPC with Rcpp and RcppArmadillo.

So I had to come up with a plan for teaching creating R packages -- and decided to do it from the very bottom up, clearly introducing the underlying R CMD ... commands and only then switching to taking advantage of an environment such as the RStudio IDE.

The resulting slides are now available on my presentations page. The code examples are in a repo subdirectory on GitHub as well. While both were designed to support the parallel live instruction offered in the workshop, I would be interested in feedback (preferably via email) about how useful the slides are by themselves.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/computers/R | permanent link

Tue, 18 Nov 2014

R / Finance 2015 Call for Papers

Earlier today, Josh send the text below to the R-SIG-Finance list, and I updated the R/Finance website, including its Call for Papers page, accordingly.

We are once again very excited about our conference, thrilled about the four confirmed keynotes, and hope that many R / Finance users will not only join us in Chicago in May 2015 -- but also submit an exciting proposal.

So read on below, and see you in Chicago in May!

Call for Papers:

R/Finance 2015: Applied Finance with R
May 29 and 30, 2015
University of Illinois at Chicago, IL, USA

The seventh annual R/Finance conference for applied finance using R will be held on May 29 and 30, 2015 in Chicago, IL, USA at the University of Illinois at Chicago. The conference will cover topics including portfolio management, time series analysis, advanced risk tools, high-performance computing, market microstructure, and econometrics. All will be discussed within the context of using R as a primary tool for financial risk management, portfolio construction, and trading.

Over the past six years, R/Finance has included attendees from around the world. It has featured presentations from prominent academics and practitioners, and we anticipate another exciting line-up for 2015. This year will include invited keynote presentations by Emanuel Derman, Louis Marascio, Alexander McNeil, and Rishi Narang.

We invite you to submit complete papers in pdf format for consideration. We will also consider one-page abstracts (in txt or pdf format) although more complete papers are preferred. We welcome submissions for both full talks and abbreviated "lightning talks." Both academic and practitioner proposals related to R are encouraged.

All slides will be made publicly available at conference time. Presenters are strongly encouraged to provide working R code to accompany the slides. Data sets should also be made public for the purposes of reproducibility (though we realize this may be limited due to contracts with data vendors). Preference may be given to presenters who have released R packages.

The conference will award two (or more) $1000 prizes for best papers. A submission must be a full paper to be eligible for a best paper award. Extended abstracts, even if a full paper is provided by conference time, are not eligible for a best paper award. Financial assistance for travel and accommodation may be available to presenters, however requests must be made at the time of submission. Assistance will be granted at the discretion of the conference committee.

Please make your submission online at this link. The submission deadline is January 31, 2015. Submitters will be notified via email by February 28, 2015 of acceptance, presentation length, and financial assistance (if requested).

Additional details will be announced via the R/Finance conference website as they become available. Information on previous years' presenters and their presentations are also at the conference website.

For the program committee:

Gib Bassett, Peter Carl, Dirk Eddelbuettel, Brian Peterson, Dale Rosenthal,
Jeffrey Ryan, Joshua Ulrich

/computers/R | permanent link

RcppAnnoy 0.0.3

Hours after the initial blog post announcing the first release of the new package RcppAnnoy, Qiang Kou sent us a very nice pull request adding mmap support in Windows.

So a new release with Windows support is on now CRAN, and Windows binaries should be available by this evening as usual.

To recap, RcppAnnoy wraps the small, fast, and lightweight C++ template header library Annoy written by Erik Bernhardsson for use at Spotify. RcppAnnoy uses Rcpp Modules to offer the exact same functionality as the Python module wrapped around Annoy.

Courtesy of CRANberries, there is also a diffstat report for this release. More detailed information is on the RcppAnnoy page page.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/rcpp | permanent link

Sun, 16 Nov 2014

Introducing RcppAnnoy

A new package RcppAnnoy is now on CRAN.

It wraps the small, fast, and lightweight C++ template header library Annoy written by Erik Bernhardsson for use at Spotify.

While Annoy is setup for use by Python, RcppAnnoy offers the exact same functionality from R via Rcpp.

A new page for RcppAnnoy provides some more detail, example code and further links. See a recent blog post by Erik for a performance comparison of different approximate nearest neighbours libraries for Python.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/rcpp | permanent link

Mon, 10 Nov 2014

BH release 1.54.0-5

A new release of BH, our package providing Boost headers for use by R is now on CRAN. This release was triggered via a request by Ben Goodrich as noted below who asked for Boost Circular Buffer which RStan uses.

No other changes were made.

Changes in version 1.54.0-5 (2014-11-09)

  • Added Boost Circular Buffer requested by Ben Goodrich for RStan

Courtesy of CRANberries, there is also a diffstat report for the most recent release

Comments and suggestions are welcome via the mailing list or the issue tracker at the GitHubGitHub repo.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/bh | permanent link

Thu, 06 Nov 2014

RcppRedis 0.1.2

A new release of RcppRedis is now on CRAN. It contains additional commands for hashes and sets, all contributed by John Laing and Whit Armstrong.

Changes in version 0.1.2 (2014-11-06)

  • New commands execv, hset, hget, sadd, srem, and smembers contributed by John Laing and Whit Armstrong over several pull requests.

Courtesy of CRANberries, there is also a diffstat report for the most recent release. As always, more detailed information is on the RcppRedis page page.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/rcpp | permanent link

Wed, 05 Nov 2014

A Software Carpentry workshop at Northwestern

On Friday October 31, 2014, and Saturday November 1, 2014, around thirty-five graduate students and faculty members attended a Software Carpentry workshop. Attendees came primarily from the Economics department and the Kellogg School of Management, which was also the host and sponsor providing an excellent venue with the Allen Center on the (main) Evanston campus of Northwestern University.

The focus of the two-day workshop was an introduction, and practical initiation, to working effectively at the shell, getting introduced and familiar with the git revision control system, as well as a thorough introduction to working and programming in R---from the basics all the way to advanced graphing as well as creating reproducible research documents.

The idea of the workshop had come out of discussion during our R/Finance 2014 conference. Bob McDonald of Northwestern, one of this year's keynote speakers, and I were discussing various topic related to open source and good programming practice --- as well as the lack of a thorough introduction to either for graduate students and researcher. And once I mentioned and explained Software Carpentry, Bob was rather intrigued. And just a few months later we were hosting a workshop (along with outstanding support from Jackie Milhans from Research Computing at Northwestern).

We were extremely fortunate in that Karthik Ram and Ramnath Vaidyanathan were able to come to Chicago and act as lead instructors, giving me an opportunity to get my feet wet. The workshop began with a session on shell and automation, which was followed by three session focusing on R: a core introduction, a session focused on function, and to end the day, a session on the split-apply-combine approach to data transformation and analysis.

The second day started with two concentrated session on git and the git workflow. In the afternoon, one session on visualization with R as well as a capstone-alike session on reproducible research rounded out the second day.

Things that worked

The experience of the instructors showed, as the material was presented and an effective manner. The selection of topics, as well as the pace were seen by most students to be appropriate and just right. Karthik and Ramnath both did an outstanding job.

No students experienced any real difficulties installing software, or using the supplied information. Participants were roughly split between Windows and OS X laptops, and had generally no problem with bash, git, or R via RStudio.

The overall Software Carpentry setup, the lesson layout, the focus on hands-on exercises along with instruction, the use of the electronic noteboard provided by etherpad and, of course, the tried-and-tested material worked very well.

Things that could have worked better

Even more breaks for exercises could have been added. Students had difficulty staying on pace in some of the exercise: once someone fell behind even for a small error (maybe a typo) it was sometimes hard to catch up. That is a general problem for hands-on classes. I feel I could have done better with the scope of my two session.

Even more cohesion among the topics could have been achieved via a single continually used example dataset and analysis.

Acknowledgments

Robert McDonald from Kellogg, and Jackie Milhans from Research Computing IT, were superb hosts and organizers. Their help in preparing for the workshop was tremendous, and the pick of venue was excellent, and allowed for a stress-free two days of classes. We could not have done this without Karthik and Ramnath, so a very big Thank You to both of them. Last but not least the Software Carpentry 'head office' was always ready to help Bob, Jackie and myself during the earlier planning stage, so another big Thank You! to Greg and Arliss.

/computers/misc | permanent link

RPushbullet 0.1.1

A minor bugfix release 0.1.1 of the RPushbullet package (interfacing the neat Pushbullet service) landed on CRAN yesterday morning.

It cleans up a small issue related to the ability to transfer files between devices via the Pushbullet service where the ability to select a (non-default) target device has now been restored.

With that, allow me to borrow one excellent use case of RPushbullet from the blog of Karl Broman: how to use RPushbullet for error notifications from R:

options(error = function() {
    library(RPushbullet)
    pbPost("note", "Error", geterrmessage())
})

This is very clever: should an error occur, you get immediate notification in browser or on your phone. Left as an exercise for the reader is to combine this with the equally excellent rfoaas package (github|cran) to get appropriately colourful error messages...

More details about the package are at the RPushbullet webpage and the RPushbullet GitHub repo.

Courtesy of CRANberries, there is also a diffstat report for this release.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/rpushbullet | permanent link

Sun, 02 Nov 2014

RcppArmadillo 0.4.500.0

A few days ago, Conrad provided another minor release of Armadillo. Once again, I had created a GitHub-only pre-release of his pre-release which was tested against (then) all ninety (!!) CRAN dependents of our RcppArmadillo package, providing a further test for Conrad's code and uploaded RcppArmadillo 0.4.500.0 to CRAN and Debian once his release was finalized.

The few changes from his end are summarized below; our end also includes an update / extension to the sample() method provided by Christian Gunning --- and used to great effect in the excellent Rcpp Gallery post by Jonathan Olmsted.

Changes in RcppArmadillo version 0.4.500.0 (2014-10-30)

  • Upgraded to Armadillo release Version 4.500 ("Singapore Sling")

    • faster handling of complex vectors by norm()

    • expanded chol() to optionally specify output matrix as upper or lower triangular

    • better handling of non-finite values when saving matrices as text files

  • The sample functionality has been extended to provide the Walker Alias method (including new unit tests) via a pull request by Christian Gunning

Courtesy of CRANberries, there is also a diffstat report for the most recent release. As always, more detailed information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/rcpp | permanent link

Thu, 23 Oct 2014

Introducing Rocker: Docker for R

You only know two things about Docker. First, it uses Linux
containers. Second, the Internet won't shut up about it.

-- attributed to Solomon Hykes, Docker CEO

So what is Docker?

Docker is a relatively new open source application and service, which is seeing interest across a number of areas. It uses recent Linux kernel features (containers, namespaces) to shield processes. While its use (superficially) resembles that of virtual machines, it is much more lightweight as it operates at the level of a single process (rather than an emulation of an entire OS layer). This also allows it to start almost instantly, require very little resources and hence permits an order of magnitude more deployments per host than a virtual machine.

Docker offers a standard interface to creation, distribution and deployment. The shipping container analogy is apt: just how shipping containers (via their standard size and "interface") allow global trade to prosper, Docker is aiming for nothing less for deployment. A Dockerfile provides a concise, extensible, and executable description of the computational environment. Docker software then builds a Docker image from the Dockerfile. Docker images are analogous to virtual machine images, but smaller and built in discrete, extensible and reuseable layers. Images can be distributed and run on any machine that has Docker software installed---including Windows, OS X and of course Linux. Running instances are called Docker containers. A single machine can run hundreds of such containers, including multiple containers running the same image.

There are many good tutorials and introductory materials on Docker on the web. The official online tutorial is a good place to start; this post can not go into more detail in order to remain short and introductory.

So what is Rocker?

rocker logo

At its core, Rocker is a project for running R using Docker containers. We provide a collection of Dockerfiles and pre-built Docker images that can be used and extended for many purposes.

Rocker is the the name of our GitHub repository contained with the Rocker-Org GitHub organization.

Rocker is also the name the account under which the automated builds at Docker provide containers ready for download.

Current Rocker Status

Core Rocker Containers

The Rocker project develops the following containers in the core Rocker repository

  • r-base provides a base R container to build from
  • r-devel provides the basic R container, as well as a complete R-devel build based on current SVN sources of R
  • rstudio provides the base R container as well an RStudio Server instance

We have settled on these three core images after earlier work in repositories such as docker-debian-r and docker-ubuntu-r.

Rocker Use Case Containers

Within the Rocker-org organization on GitHub, we are also working on

  • Hadleyverse which extends the rstudio container with a number of Hadley packages
  • rOpenSci which extends hadleyverse with a number of rOpenSci packages
  • r-devel-san provides an R-devel build for "Sanitizer" run-time diagnostics via a properly instrumented version of R-devel via a recent compiler build
  • rocker-versioned aims to provided containers with 'versioned' previous R releases and matching packages

Other repositories will probably be added as new needs and opportunities are identified.

Deprecation

The Rocker effort supersedes and replaces earlier work by Dirk (in the docker-debian-r and docker-ubuntu-r GitHub repositories) and Carl. Please use the Rocker GitHub repo and Rocker Containers from Docker.com going forward.

Next Steps

We intend to follow-up with more posts detailing usage of both the source Dockerfiles and binary containers on different platforms.

Rocker containers are fully functional. We invite you to take them for a spin. Bug reports, comments, and suggestions are welcome; we suggest you use the GitHub issue tracker.

Acknowledgments

We are very appreciative of all comments received by early adopters and testers. We also would like to thank RStudio for allowing us the redistribution of their RStudio Server binary.

Published concurrently at rOpenSci blog and Dirk's blog.

Authors

Dirk Eddelbuettel and Carl Boettiger

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/rocker | permanent link

Sun, 19 Oct 2014

littler 0.2.1

max-heap image

A new maintenance release of littler is available now.

The main change are a few updates and extensions to the examples provided along with littler. Several of those continue to make use of the wonderful docopt package by Edwin de Jonge. Carl Boettiger and I are making good use of these littler examples, particularly to install directly from CRAN or GitHub, in our Rocker builds of R for Docker (about which we should have a bit more to blog soon too).

Full details for the littler release are provided as usual at the ChangeLog page.

The code is available via the GitHub repo, from tarballs off my littler page and the local directory here. A fresh package has gone to the incoming queue at Debian; Michael Rutter will probably have new Ubuntu binaries at CRAN in a few days too.

Comments and suggestions are welcome via the mailing list or issue tracker at the GitHub repo.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/littler | permanent link

Sun, 12 Oct 2014

Seinfeld streak at GitHub

Early last year, I referred to a Seinfeld Streak in a blog post referring to almost two months of updates to the Rcpp Gallery. This is sometimes called Jerry Seinfeld's secret to productivity: Just keep at it. Don't break the streak.

I now have different streak:

github activity october 2013 to october 2014

Now we'll see how far this one will go.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/computers/misc | permanent link

Fri, 10 Oct 2014

RPushbullet 0.1.0 with a lot more awesome

A new release 0.1.0 of the RPushbullet package (interfacing the neat Pushbullet service) landed on CRAN today.

It brings a number of goodies relative to the first release 0.0.2 of a few months ago:

  • pushing of files is now supported thanks to a nice pull request bu Mike Birdgeneau
  • a default device can be designated in the ~/.rpushbullet.json file or options
  • initialization has been rewritten to use recpients which can be indices, device names or, if missing entirely, the (new) default device
  • alternatively, email is supported as another recipient option in which case the Pushbullet service will send an email to the give address
  • pbGetDevices() now returns a proper S3 object with corresponding print() and summary() methods
  • the documentation regarding package initialization, and setting of key, devices, etc has been expanded
  • more examples has been added to the documentation
  • various minor cleanups, fixes, corrections throughout

There is a whole boat load of more wickedness in the Pushbullet API so if anybody feels compelled to add it, fire off pull requests at GitHub.

More details about the package are at the RPushbullet webpage and the RPushbullet GitHub repo.

Courtesy of CRANberries, there is also a diffstat report for this release.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/rpushbullet | permanent link

Mon, 29 Sep 2014

Rcpp 0.11.3

A new release 0.11.3 of Rcpp is now on the CRAN network for GNU R, and an updated Debian package has been uploaded too.

Rcpp has become the most popular way of enhancing GNU R with C++ code. As of today, 273 packages on CRAN depend on Rcpp for making analyses go faster and further.

This release brings a fairly large number of continued enhancements, fixes and polishing to Rcpp. These were provided by a total of seven different contributors---which is a new record as well.

See below for a detailed list of changes extracted from the NEWS file, but some highlights included in this release are

  • Several API cleanups, polishes and a pre-announced code removal
  • New InternalFunction interface, and new Timer functionality.
  • More robust functionality of Rcpp Attributes as well as a new dryRun option.
  • The Rcpp FAQ was updated, as was the main Description: in the DESCRIPTION file.
  • Rcpp.package.skeleton() can now deploy functionality from pkgKitten to create Rcpp packages that purr.

One sore point, however, is that we missed that packages using Rcpp Modules appear to require a rebuild. We are sorry for the inconvenience; this has highlighted a shortcoming in our fairly robust and extensive tests. While we test our packages against all known CRAN dependents, such tests check for the ability to compile and run freshly and not whether previously built packages still run. We intend to augment our testing in this direction to avoid a repeat occurrence of such a misfeature.

Changes in Rcpp version 0.11.3 (2014-09-27)

  • Changes in Rcpp API:

    • The deprecation of RCPP_FUNCTION_* which was announced with release 0.10.5 last year is proceeding as planned, and the file macros/preprocessor_generated.h has been removed.

    • Timer no longer records time between steps, but times from the origin. It also gains a get_timers(int) methods that creates a vector of Timer that have the same origin. This is modelled on the Rcpp11 implementation and is more useful for situations where we use timers in several threads. Timer also gains a constructor taking a nanotime_t to use as its origin, and a origin method. This can be useful for situations where the number of threads is not known in advance but we still want to track what goes on in each thread.

    • A cast to bool was removed in the vector proxy code as inconsistent behaviour between clang and g++ compilations was noticed.

    • A missing update(SEXP) method was added thanks to pull request by Omar Andres Zapata Mesa.

    • A proxy for DimNames was added.

    • A no_init option was added for Matrices and Vectors.

    • The InternalFunction class was updated to work with std::function (provided a suitable C++11 compiler is available) via a pull request by Christian Authmann.

    • A new_env() function was added to Environment.h

    • The return value of range eraser for Vectors was fixed in a pull request by Yixuan Qiu.

  • Changes in Rcpp Sugar:

    • In ifelse(), the returned NA type was corrected for operator[].

  • Changes in Rcpp Attributes:

    • Include LinkingTo in DESCRIPTION fields scanned to confirm that C++ dependencies are referenced by package.

    • Add dryRun parameter to sourceCpp.

    • Corrected issue with relative path and R chunk use for sourceCpp.

  • Changes in Rcpp Documentation:

    • The Rcpp-FAQ vignette was updated with respect to OS X issues.

    • A new entry in the Rcpp-FAQ clarifies the use of licenses.

    • Vignettes build results no longer copied to /tmp to please CRAN.

    • The Description in DESCRIPTION has been shortened.

  • Changes in Rcpp support functions:

    • The Rcpp.package.skeleton() function will now use pkgKitten package, if available, to create a package which passes R CMD check without warnings. A new Suggests: has been added for pkgKitten.

    • The modules=TRUE case for Rcpp.package.skeleton() has been improved and now runs without complaints from R CMD check as well.

  • Changes in Rcpp unit test functions:

    • Functions from the RUnit package are now prefixed with RUnit::

    • The testRcppModule and testRcppClass sample packages now pass R CMD check --as-cran cleanly with NOTES or WARNINGS

Thanks to CRANberries, you can also look at a diff to the previous release As always, even fuller details are on the Rcpp Changelog page and the Rcpp page which also leads to the downloads page, the browseable doxygen docs and zip files of doxygen output for the standard formats. A local directory has source and documentation too. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/rcpp | permanent link

Thu, 25 Sep 2014

R and Docker

r and docker talk picture by @mediafly

Earlier this evening I gave a short talk about R and Docker at the September Meetup of the Docker Chicago group.

Thanks to Karl Grzeszczak for setting the meeting, and for providing a pretty thorough intro talk regarding CoreOS and Docker.

My slides are now up on my presentations page.

/computers/R | permanent link

Mon, 22 Sep 2014

RcppArmadillo 0.4.450.1.0

Continuing with his standard pace of approximately one new version per month, Conrad released a new minor release of Armadillo a few days ago. As before, I had created a GitHub-only pre-release which was tested against all eighty-seven (!!) CRAN dependents of our RcppArmadillo package and then uploaded RcppArmadillo 0.4.450.0 to CRAN.

The CRAN maintainers pointed out that under the R-development release, a NOTE was issued concerning the C-library's rand() call. This is a pretty new NOTE, but it means using the (sometimes poor quality) rand() generator is now a no-no. Now, Armadillo being as robustly engineered as it is offers a new random number generator based on C++11 as well as a fallback generator for those unfortunate enough to live with an older C++98 compiler. (I would like to note here that I find Conrad's continued support for both C++11, offering very useful modern language idioms, as well as the fallback code for continued deployment and usage by those constrained in their choice of compilers rather exemplary --- because contrary to what some people may claim, it is not a matter of one or the other. C++ always was, and continues to be, a multi-paradigm language which can be supported easily by several standard. But I digress...)

In any event, one cannot argue with CRAN about their prescription of a C++98 compiler. So Conrad and I discussed this over email, and came up with a scheme where a user-package (such as RcppArmadillo) can provide an alternate generator which Armadillo then deploys. I implemented a first solution which was then altered / reflected by Conrad in a revised version 4.450.1 of Armadillo. I packaged, and now uploaded, that version as RcppArmadillo 0.4.450.1.0 to both CRAN and into Debian.

Besides the RNG change already discussed, this release brings a few smaller changes from the Armadillo side. These are detailed below in the extract from the NEWS file. On the RcppArmadillo side, we now have support for pkgKitten which is both very exciting and likely the topic of another blog post with an example of creating an RcppArmadillo package that purrs. In the process, I overhauled and polished how new packages are created by RcppArmadillo.package.skeleton(). An upcoming blog post may provide an example.

Changes in RcppArmadillo version 0.4.450.1.0 (2014-09-21)

  • Upgraded to Armadillo release Version 4.450.1 (Spring Hill Fort)

    • faster handling of matrix transposes within compound expressions

    • expanded symmatu()/symmatl() to optionally disable taking the complex conjugate of elements

    • expanded sort_index() to handle complex vectors

    • expanded the gmm_diag class with functions to generate random samples

  • A new random-number implementation for Armadillo uses the RNG from R as a fallback (when C++11 is not selected so the C++11-based RNG is unavailable) which avoids using the older C++98-based std::rand

  • The RcppArmadillo.package.skeleton() function was updated to only set an "Imports:" for Rcpp, but not RcppArmadillo which (as a template library) needs only LinkingTo:

  • The RcppArmadillo.package.skeleton() function will now prefer pkgKitten::kitten() over package.skeleton() in order to create a working package which passes R CMD check.

  • The pkgKitten package is now a Suggests:

  • A manual page was added to provide documentation for the functions provided by the skeleton package.

  • A small update was made to the package manual page.

Courtesy of CRANberries, there is also a diffstat report for the most recent release. As always, more detailed information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/rcpp | permanent link

Thu, 11 Sep 2014

pkgKitten 0.1.2: Still creating R Packages that purr

A brown bag release 0.1.2 of pkgKitten is now on CRAN, following yesterday's 0.1.1 upload

Next time I'll try to remember that when I have parameters name and path, it won't work so well to call them as path and name ...

Changes in version 0.1.2 (2014-09-11)

  • Brown-bag fix of calling the new helper function with then correct order of arguments.

More details about the package are at the pkgKitten webpage and the pkgKitten GitHub repo.

Courtesy of CRANberries, there is also a diffstat report for this release

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/pkgkitten | permanent link

Wed, 10 Sep 2014

pkgKitten 0.1.1: Still creating R Packages that purr

A maintenance release 0.1.1 of pkgKitten is now on CRAN.

It has only one small change: the function playWithPerPackageHelpPage() was factored out of the main function kitten() as I happened to be needing something just like playWithPerPackageHelpPage() to make packages created by the Rcpp function Rcpp.package.skeleton() a little nicer.

We also added a NEWS.Rd file which restates major release features. As it is so short, we include it in its entirety.

Changes in version 0.1.1 (2014-09-09)

  • New (exported) function playWithPerPackageHelpPage() which lets other packages create a non-complaint-generating package help page

Changes in version 0.1.0 (2014-06-13)

  • Initial public version and CRAN upload

More details about the package are at the pkgKitten webpage and the pkgKitten GitHub repo.

Courtesy of CRANberries, there is also a diffstat report for this release

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/pkgkitten | permanent link

Tue, 02 Sep 2014

littler is faster at doing nothing!

With yesterday's announcement of littler 0.2.0, I kept thinking about a few not-so-frequently-asked but recurring questions about littler. And an obvious one if of course the relationship to Rscript.

As we have pointed out before, littler preceded Rscript. Now, with Rscript being present in every R installation, it is of course by now more widely known.

But there is one important aspect which I stressed once or twice in the past and which bears repeating. Due to the lean and mean way in which littler is designed and set-up, it actually starts a lot faster than either Rscript or R. How, you ask? Well we actually query the environement at build time and hardcode a number of settings which R and Rscript re-acquire and learn each time they start. Which is more flexible. But slower.

So consider the following, really simple example. In it, we create a simple worker function f which launches the given (and simplest possible) R command of just quitting 250 times (by launching a shell command which loops). We then let littler, Rscript and R (in its just execute this expression mode) run this, and time it via one of the common benchmarking packages.

R> library(rbenchmark)
R> f <- function(cmd) { system(paste0("bash -c \"for i in \\$(seq 1 250); do ", cmd, " -e 'q()'; done\"")); }
R> res <- benchmark(f("r"), f("Rscript"), f("R --slave"), replications=5, order="relative")
R> res
            test replications elapsed relative user.self sys.self user.child sys.child
1         f("r")            5  32.256    1.000     0.001    0.004     26.538     5.706
2   f("Rscript")            5 180.154    5.585     0.001    0.004    152.751    28.522
3 f("R --slave")            5 302.714    9.385     0.001    0.004    270.569    33.098
R> 

So there: littler does "nothing" about five times as fast as Rscript, and about nine times as fast as R. Not that this matters greatly -- but when you design something for repeated (unsupervised) execution, say in a cronjob, it might as well be lightweight.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/littler | permanent link

Mon, 01 Sep 2014

littler 0.2.0

We are happy to announce a new release of littler.

A few minor things have changes since the last release: max-heap image
  • A few new examples were added or updated, including use of the fabulous new docopt package by Edwin de Jonge which makes command-line parsing a breeze.
  • Other new examples show simple calls to help with sweave, knitr, roxygen2, Rcpp's attribute compilation, and more.
  • We also wrote an entirely new webpage with usage example.
  • A new option -d | --datastdin was added which will read stdin into a data.frame variable X.
  • The repository has been move to this GitHub repo.
  • With that, the build process was updated both throughout but also to reflect the current git commit at time of build.

Full details are provided at the ChangeLog

The code is available via the GitHub repo, from tarballs off my littler page and the local directory here. A fresh package will got to Debian's incoming queue shortly as well.

Comments and suggestions are welcome via the mailing list or issue tracker at the GitHub repo.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/littler | permanent link

Fri, 29 Aug 2014

BH release 1.54.0-4

Another small new release of our BH package providing Boost headers for use by R is now on CRAN. This one brings a one-file change: the file any.hpp comprising the Boost.Any library --- as requested by a fellow package maintainer needing it for a pending upload to CRAN.

No other changes were made.

Changes in version 1.54.0-4 (2014-08-29)

  • Added Boost Any requested by Greg Jeffries for his nabo package

Courtesy of CRANberries, there is also a diffstat report for the most recent release.

Comments and suggestions are welcome via the mailing list or issue tracker at the GitHub repo.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/bh | permanent link

Thu, 21 Aug 2014

RcppEigen 0.3.2.2.0

A new upstream release of the Eigen C++ template library for linear algebra was released a few days ago. And Yixuan Qiu did some really nice work rolling this into a new RcppEigen released and then sent me a nice pull requent. The new version is now on CRAN, and I will prepare a Debian in a moment too.

Upstream changes for Eigen are summarized in their changelog. On the RcppEigen side, Yixuan also rolled in some more changes on as<>() and wrap() converters as noted below in the NEWS entry.

Changes in RcppEigen version 0.3.2.2.0 (2014-08-19)

  • Updated to version 3.2.2 of Eigen

  • Rcpp::as() now supports the conversion from R vector to “row array”, i.e., Eigen::Array<T, 1, Eigen::Dynamic>

  • Rcpp::as() now supports the conversion from dgRMatrix (row oriented sparse matrices, defined in Matrix package) to Eigen::MappedSparseMatrix<T, Eigen::RowMajor>

  • Conversion from R matrix to Eigen::MatrixXd and Eigen::ArrayXXd using Rcpp::as() no longer gives compilation errors

Courtesy of CRANberries, there are diffstat reports for the most recent release.

Questions, comments etc about RcppEigen should go to the rcpp-devel mailing list off the R-Forge page.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/rcpp | permanent link

Tue, 19 Aug 2014

RcppArmadillo 0.4.400.0

After two pre-releases in the last few days, Conrad finalised a new Armadillo version 4.400 today. I had kept up with the pre-releases, tested twice against all eighty (!!) CRAN dependents of RcppArmadillo and have hence uploaded RcppArmadillo 0.4.400.0 to CRAN and into Debian.

This release brings a number of new upstream features which are detailed below. As included is s bugfix for sparse matrix creation at the RcppArmadillo end which was found by the ASAN tests at CRAN --- which are similar to the sanitizers tests I recently blogged. I was able to develop and test the fix in the very docker r-devel-san images I had written about which was nice. Special thanks also to Ryan Curtin for help with the fix.

Changes in RcppArmadillo version 0.4.400.0 (2014-08-19)

  • Upgraded to Armadillo release Version 4.400 (Winter Shark Alley)

    • added gmm_diag class for statistical modelling using Gaussian Mixture Models; includes multi-threaded implementation of k-means and Expectation-Maximisation for parameter estimation

    • added clamp() for clamping values to be between lower and upper limits

    • expanded batch insertion constructors for sparse matrices to add values at repeated locations

    • faster handling of subvectors by dot()

    • faster handling of aliasing by submatrix views

  • Corrected a bug (found by the g++ Address Sanitizer) in sparse matrix initialization where space for a sentinel was allocated, but the sentinel was not set; with extra thanks to Ryan Curtin for help

  • Added a few unit tests for sparse matrices

Courtesy of CRANberries, there is also a diffstat report for the most recent release. As always, more detailed information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/rcpp | permanent link

Thu, 07 Aug 2014

Rcpp now used by 250 CRAN packages

Rcpp growth

Rcpp reached a nice round milestone yesterday: 250 packages on CRAN now depend on it (as measured by Depends, Imports and LinkingTo declarations).

The graph is on the left depicts the growth of Rcpp over time. Or at least since I started to write down some usage numbers: first informally, then via a script.

Also displayed is the relative proportion of CRAN packages using it. Rcpp cleared the four per-cent hurdle just before useR! 2014 where I showed a similar graph (as two distinct graphs) in my invited talk.

250 is a pretty impressive, and rather humbling, number.

From everybody behind Rcpp, I would like to say a heartfelt Thank You! to all the users and of course contributors.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/rcpp | permanent link

Mon, 04 Aug 2014

BH release 1.54.0-3

A new release of our BH package providing Boost headers for use by R is now on the CRAN mirrors. max-heap image This release is the third based on Boost 1.54.0.

At the request of the maintainer of the recent added RcppMLPACK package, it adds the Boost.Heap library. Boost.Heap implements priority queues which extend beyond the corresponding (and somewhat simpler) class in the STL. Key features of the Boost.Heap priority queues are mutability, iterators, ability to merge, stable sort, and comparison.

No other changes were made.

Changes in version 1.54.0-3 (2014-08-03)

  • Added Boost Heap library which will be needed by the next version of RcppMLPACK

Courtesy of CRANberries, there is also a diffstat report for the most recent release.

Comments and suggestions are welcome via the mailing list or issue tracker at the GitHub repo.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/bh | permanent link

Sun, 03 Aug 2014

Introducing sanitizers 0.1.0

A new package sanitizers is now on CRAN. It provides test cases for Address Sanitizers, and Undefined Behaviour Sanitizers. These are two recent features of both g++ and clang++, and described in the Checking Memory Access section of the Writing R Extension manual.

I set up a new web page for the sanitizers package which illustrates their use case via pre-built Docker images, similar to what I presented at the end of my useR! 2014 keynote a few weeks ago.

So instead of repeating this over here, I invite you to read the detailed discussion on the sanitizers page.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/sanitizers | permanent link