Sun, 01 Mar 2015

drat 0.0.2: Improved Support for Lightweight R Repositories

A few weeks ago we introduced the drat package. Its name stands for drat R Archive Template, and it helps with easy-to-create and easy-to-use repositories for R packages. Two early blog posts describe drat: First Steps Towards Lightweight Repositories, and Publishing a Package.

A new version 0.0.2 is now on CRAN. It adds several new features:

  • beginnings of native git support via the excellent new git2r package,
  • a new helper function to prune a repo of older versions of packages (as R repositories only show the newest release of a package),
  • improved core functionality in inserting a package, and adding a repo.

Courtesy of CRANberries, there is a comparison to the previous release. More detailed information is on the drat page.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/drat | permanent link

Sat, 28 Feb 2015

RcppEigen 0.3.2.4.0

A new release of RcppEigen is now on CRAN and in Debian. It synchronizes the Eigen code with the 3.2.4 upstream release, and updates the RcppEigen.package.skeleton() package creation helper to use the kitten() function from pkgKitten for enhanced package creation.

The NEWS file entry follows.

Changes in RcppEigen version 0.3.2.4.0 (2015-02-23)

  • Updated to version 3.2.4 of Eigen

  • Update RcppEigen.package.skeleton() to use pkgKitten if available

Courtesy of CRANberries, there is also a diffstat report for the most recent release.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/rcpp | permanent link

Thu, 26 Feb 2015

RcppArmadillo 0.4.650.1.1 (and also 0.4.650.2.0)

A new Armadillo release 4.650.1 was released by Conrad a few days ago. Armadillo is a powerful and expressive C++ template library for linear algebra aiming towards a good balance between speed and ease of use with a syntax deliberately close to a Matlab.

It turned out that this release had one shortcoming with respect to the C++11 RNG initializations in the R use case (where we need to protect the users from the C++98 RNG deemed unsuitable by the CRAN gatekeepers). And this lead to upstream release 4.650.1 which we wrapped into RcppArmadillo 0.4.650.1.1. As before this, was tested against all 107 reverse dependencies of RcppArmadillo on the CRAN repo.

This version is now on CRAN, and was just uploaded to Debian. Its changes are summarized below based on the NEWS.Rd file.

Changes in RcppArmadillo version 0.4.650.1.1 (2015-02-25)

  • Upgraded to Armadillo release Version 4.650.1 ("Intravenous Caffeine Injector")

    • added randg() for generating random values from gamma distributions (C++11 only)

    • added .head_rows() and .tail_rows() to submatrix views

    • added .head_cols() and .tail_cols() to submatrix views

    • expanded eigs_sym() to optionally calculate eigenvalues with smallest/largest algebraic values fixes for handling of sparse matrices

  • Applied small correction to main header file to set up C++11 RNG whether or not the alternate RNG (based on R, our default) is used

Now, it turns out that another small fix was needed for the corner case of a submatrix within a submatrix, ie V.subvec(1,10).tail(5). I decided not to re-release this to CRAN given the CRAN Repository Policy preference for releases “no more than every 1–2 months”.

But fear not, for we now have drat. I created a drat package repository in the RcppCore account (to not put a larger package into my main drat repository often used via a fork to initialize a drat). So now with these two simple commands

## if needed, first install 'drat' via:   install.packages("drat")
drat:::add("RcppCore")
update.packages()

you will get the newest RcppArmadillo via this drat package repository. And course install.packages("RcppArmadillo") would also work, but takes longer to type :)

Lastly, courtesy of CRANberries, there is also a diffstat report for the most recent CRAN release. As always, more detailed information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/rcpp | permanent link

Sun, 22 Feb 2015

drat Tutorial: Publishing a package

Introduction

The drat package was released earlier this month, and described in a first blog post. I received some helpful feedback about what works and what doesn't. For example, Jenny Bryan pointed out that I was not making a clear enough distinction between the role of using drat to publish code, and using drat to receive/install code. Very fair point, and somewhat tricky as R aims to blur the line between being a user and developer of statistical analyses, and hence packages. Many of us are both. Both the main point is well taken, and this note aims to clarify this issue a little by focusing on the former.

Another point make by Jenny concerns the double use of repository. And indeed, I conflated repository (in the sense of a GitHub code repository) with repository for a package store used by a package manager. The former, a GitHub repository, is something we use to implement a personal drat with: A GitHub repository happens to be uniquely identifiable just by its account name, and given an (optional) gh-pages branch also offers a stable and performant webserver we use to deliver packages for R. A (personal) code repository on the other hand is something we implement somewhere---possibly via drat which supports local directories, possibly on a network share, as well as anywhere web-accessible, e.g. via a GitHub repository. It is a little confusing, but I will aim to make the distinction clearer.

Just once: Setting up a drat repository

So let us for the remainder of this post assume the role of a code publisher. Assume you have a package you would like to make available, which may not be on CRAN and for which you would like to make installation by others easier via drat. The example below will use an interim version of drat which I pushed out yesterday (after fixing a bug noticed when pushing the very new RcppAPT package).

For the following, all we assume (apart from having a package to publish) is that you have a drat directory setup within your git / GitHub repository. This is not an onerous restriction. First off, you don't have to use git or GitHub to publish via drat: local file stores and other web servers work just as well (and are documented). GitHub simply makes it easiest. Second, bootstrapping one is trivial: just fork my drat GitHub repository and then create a local clone of the fork.

There is one additional requirement: you need a gh-pages branch. Using the fork-and-clone approach ensures this. Otherwise, if you know your way around git you already know how to create a gh-pages branch.

Enough of the prerequisities. And on towards real fun. Let's ensure we are in the gh-pages branch:

edd@max:~/git/drat(master)$ git checkout gh-pages
Switched to branch 'gh-pages'
Your branch is up-to-date with 'origin/gh-pages'.
edd@max:~/git/drat(gh-pages)$ 

Publish: Run one drat command to insert a package

Now, let us assume you have a package to publish. In my case this was version 0.0.1.2 of drat itself as it contains a fix for the very command I am showing here. So if you want to run this, ensure you have this version of drat as the CRAN version is currently behind at release 0.0.1 (though I plan to correct that in the next few days).

To publish an R package into a code repository created via drat running on a drat GitHub repository, just run insertPackage(packagefile) which we show here with the optional commit=TRUE. The path to the package can be absolute are relative; the easists is often to go up one directory from the sources to where R CMD build ... has created the package file.

edd@max:~/git$ Rscript -e 'library(drat); insertPackage("drat_0.0.1.2.tar.gz", commit=TRUE)'
[gh-pages 0d2093a] adding drat_0.0.1.2.tar.gz to drat
 3 files changed, 2 insertions(+), 2 deletions(-)
 create mode 100644 src/contrib/drat_0.0.1.2.tar.gz
Counting objects: 7, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (7/7), done.
Writing objects: 100% (7/7), 7.37 KiB | 0 bytes/s, done.
Total 7 (delta 1), reused 0 (delta 0)
To git@github.com:eddelbuettel/drat.git
   206d2fa..0d2093a  gh-pages -> gh-pages
edd@max:~/git$ 

You can equally well run this as insertPackage("drat_0.0.1.2.tar.gz"), then inspect the repo and only then run the git commands add, commit and push. Also note that future versions of drat will most likely support git operations directly by relying on the very promising git2r package. But this just affect package internals, the user-facing call of e.g. insertPackage("drat_0.0.1.2.tar.gz", commit=TRUE) will remain unchanged.

And in a nutshell that really is all there is to it. With the newly drat-ed package pushed to your GitHub repository with a single function call), it is available via the automatically-provided gh-pages webserver access to anyone in the world. All they need to do is to point R's package management code (which is built into R itself and used for e.g._ CRAN and BioConductor R package repositories) to the new repo---and that is also just a single drat command. We showed this in the first blog post and may expand on it again in a follow-up.

So in summary, that really is all there is to it. After a one-time setup / ensuring you are on the gh-pages branch, all it takes is a single function call from the drat package to publish your package to your drat GitHub repository.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/drat | permanent link

Sat, 21 Feb 2015

RcppAPT 0.0.1

Over the last few days I put together a new package RcppAPT which interfaces the C++ library behind the awesome apt, apt-get, apt-cache, ... commands and their GUI-based brethren.

The package currently implements two functions which permit search for package information via a regular expression, as well as a (vectorised) package name-based check. More to come, and contributions would be very welcome.

A few examples just to illustrate follow.

R> hasPackages(c("r-cran-rcpp", "r-cran-rcppapt"))
   r-cran-rcpp r-cran-rcppapt 
          TRUE          FALSE 

This shows that Rcpp is (of course) available as a binary, but this (very new) package is (unsurprisingly) not yet available pre-built.

We can search by regular expression:

R> library(RcppAPT)
R> getPackages("^r-base-c.")
          Package      Installed       Section
1 r-base-core-dbg 3.1.2-1utopic0 universe/math
2 r-base-core-dbg           <NA> universe/math
3     r-base-core 3.1.2-1utopic0 universe/math
4     r-base-core           <NA> universe/math
R> 

With the (default) expression catching everything, we see a lot of packages:

R> dim(getPackages())
[1] 104431      3
R> 

A bit more information is on the package page here as well as as the GitHub repo.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/rcpp | permanent link

Mon, 09 Feb 2015

RPushbullet 0.2.0

A new releases of the RPushbullet package (interfacing the neat Pushbullet service) arrived on CRAN today.

It brings several weeks of extensions, corrections and cleanups---with key contributions by Mike Birdgeneau and Henrik Bengtsson.

RPushbullet now has support for channels (a reasonably new feature upstream). Setup, initialization and tests all got improved as well. Changes are summarized below based in the extract from the NEWS.Rd file.

Changes in version 0.2.0 (2015-02-07)

  • Added support for Pushbullet 'channels' (once again thanks to Mike Birdgeneau for the initial push on this)

  • Support for pushes was solidified: proper choices of either device, email or channel should work in all cases

  • S3 methods are now properly exports (thanks to Henrik Bengtsson)

  • File transfer mode has been improved / corrected (thanks to Mike Birdgeneau)

  • The regression test suite was expanded and robustified

  • This NEWS file was added. Better late than never.

Courtesy of CRANberries, there is also a diffstat report for this release.

More details about the package are at the RPushbullet webpage and the RPushbullet GitHub repo.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/rpushbullet | permanent link

Sat, 07 Feb 2015

rfoaas 0.1.3

A brand new version of rfoaas is now on CRAN. It shadows the 0.1.3 release of FOAAS just how an earlier 0.1.2 had done (but there was something not quite right at the server backend which we coded around with an interim release 0.1.2.1; neither one of these was ever released to CRAN).

The rfoaas package provides an interface for R to the most excellent FOAAS service--which provides a modern, scalable and RESTful web service for the frequent need to tell someone to f$#@ off. Release 0.1.3 of FOAAS brings support for filters which the initial support going to the absolutely outstanding shoutcloud.io service. This can be enabled by adding filter="shoutcloud" as an argument to any of the access functions. And thanks to shoutcloud.io, the result will be LOUD AND CLEAR.

As usual, CRANberries provides a diff to the previous CRAN release. Questions, comments etc should go to the GitHub issue tracker.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/rfoaas | permanent link

drat Tutorial: First Steps towards Lightweight R Repositories

Now that drat is on CRAN and I got a bit of feedback (or typo corrections) in three issue tickets, I thought I could show how to quickly post such an interim version in a drat repository.

Now, I obviously already have a checkout of drat. If you, dear reader, wanted to play along and create your own drat repository, one rather simple way would be to simply clone my repo as this gets you the desired gh-pages branch with the required src/contrib/ directories. Otherwise just do it by hand.

Back to a new interim version. I just pushed commit fd06293 which bumps the version and date for the new interim release based mostly on the three tickets addresses right after the initial release 0.0.1. So by building it we get a new version 0.0.1.1:

edd@max:~/git$ R CMD build drat
* checking for file ‘drat/DESCRIPTION’ ... OK
* preparing ‘drat’:
* checking DESCRIPTION meta-information ... OK
* checking for LF line-endings in source and make files
* checking for empty or unneeded directories
* building ‘drat_0.0.1.1.tar.gz’

edd@max:~/git$ 

Because I want to use the drat repo next, I need to now switch from master to gh-pages; a step I am omitting as we can assume that your drat repo will already be on its gh-pages branch.

Next we simply call the drat function to add the release:

edd@max:~/git$ r -e 'drat:::insert("drat_0.0.1.1.tar.gz")'
edd@max:~/git$ 

As expected, now have two updated PACKAGES files (compressed and plain) and a new tarball:

edd@max:~/git/drat(gh-pages)$ git status
On branch gh-pages
Your branch is up-to-date with 'origin/gh-pages'.
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

        modified:   src/contrib/PACKAGES
        modified:   src/contrib/PACKAGES.gz

Untracked files:
  (use "git add <file>..." to include in what will be committed)

        src/contrib/drat_0.0.1.1.tar.gz

no changes added to commit (use "git add" and/or "git commit -a")
edd@max:~/git/drat(gh-pages)$

All is left to is to add, commit and push---either as I usually via the spectacularly useful editor mode, or on the command-line, or by simply adding commit=TRUE in the call to insert() or insertPackage().

I prefer to use littler's r for command-line work, so I am setting the desired repos in ~/.littler.r which is read on startup (since the recent littler release 0.2.2) with these two lines:


## add RStudio CRAN mirror
drat:::add("CRAN", "http://cran.rstudio.com")

## add Dirk's drat
drat:::add("eddelbuettel")

After that, repos are set as I like them (at home at least):

edd@max:~/git$ r -e'print(options("repos"))'
$repos
                                 CRAN                          eddelbuettel 
            "http://cran.rstudio.com" "http://eddelbuettel.github.io/drat/" 

edd@max:~/git$ 

And with that, we can just call update.packages() specifying the package directory to update:

edd@max:~/git$ r -e 'update.packages(ask=FALSE, lib.loc="/usr/local/lib/R/site-library")'                                                                                                                            
trying URL 'http://eddelbuettel.github.io/drat/src/contrib/drat_0.0.1.1.tar.gz'
Content type 'application/octet-stream' length 5829 bytes
opened URL
==================================================
downloaded 5829 bytes

* installing *source* package ‘drat’ ...
** R
** inst
** preparing package for lazy loading
** help
*** installing help indices
** building package indices
** testing if installed package can be loaded
* DONE (drat)

The downloaded source packages are in
        ‘/tmp/downloaded_packagesedd@max:~/git$

and presto, a new version of a package we have installed (here the very drat interim release we just pushed above) is updated.

Writing this up made me realize I need to update the handy update.r script (see e.g. the littler examples page for more) and it hard-wires just one repo which needs to be relaxed for drat. Maybe in install2.r which already has docopt support...

/code/drat | permanent link

Thu, 05 Feb 2015

Introducing drat: Lightweight R Repositories

A new package of mine just got to CRAN in its very first version 0.0.1: drat. Its name stands for drat R Archive Template, and an introduction is provided at the drat page, the GitHub repository, and below.

drat builds on a core strength of R: the ability to query multiple repositories. Just how one could always query, say, CRAN, BioConductor and OmegaHat---one can now adds drats of one or more other developers with ease. drat also builds on a core strength of GitHub. Every user automagically has a corresponding github.io address, and by appending drat we are getting a standardized URL.

drat combines both strengths. So after an initial install.packages("drat") to get drat, you can just do either one of

library(drat)
addRepo("eddelbuettel")

or equally

drat:::add("eddelbuettel")

to register my drat. Now install.packages() will work using this new drat, as will update.packages(). The fact that the update mechanism works is a key strength: not only can you get a package, but you can gets its updates once its author replaces them into his drat.

How does one do that? Easy! For a package foo_0.1.0.tar.gz we do

library(drat)
insertPackage("foo_0.1.0.tar.gz")

The default git repository locally is taken as the default ~/git/drat/ but can be overriden as both a local default (via options()) or directly on the command-line. Note that this also assumes that you a) have a gh-pages branch and b) have that branch as the currently active branch. Automating this / testing for this is left for a subsequent release. Also available is an alternative unexported short-hand function:

drat:::insert("foo_0.1.0.tar.gz", "/opt/myWork/git")

show here with the alternate use case of a local fileshare you can copy into and query from---something we do at work where we share packages only locally.

The easiest way to obtain the corresponding file system layout may be to just fork the drat repository.

So that is it. Two exported functions, and two unexported (potentially name-clobbering) shorthands. Now drat away!

Courtesy of CRANberries, there is also a copy of the DESCRIPTION file for this initial release. More detailed information is on the drat page.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/drat | permanent link

Mon, 02 Feb 2015

RcppStreams 0.1.0

Streamulus

The new package RcppStreams arrived on CRAN on Saturday. RcppStreams brings the excellent Streamulus C++ template library for event stream processing to R.

Streamulus, written by Irit Katriel, uses very clever template meta-programming (via Boost Fusion) to implement an embedded domain-specific event language created specifically for event stream processing.

The packages wraps the four existing examples by Irit, all her unit tests and includes a slide deck from a workshop presentation. The framework is quite powerful, and it brings a very promising avenue for (efficient !!) stream event processing to R.

The NEWS file entries follows below:

Changes in version 0.1.0 (2015-01-30)

  • First CRAN release

  • Contains all upstream examples and documentation from Streamulus

  • Added Rcpp Attributes wrapping and minimal manual pages

  • Added Travis CI unit testing

Courtesy of CRANberries, there is also a copy of the DESCRIPTION file for this initial release. More detailed information is on the RcppStreams page page and of course on the Streamulus page.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/rcpp | permanent link

Fri, 30 Jan 2015

littler 0.2.2

max-heap image

A new minor release of littler is available now.

Several examples were added or extended:

  • a new script check.r to check a source tarball with R CMD check after loading required packages first (and a good use case was given in the recent UBSAN testing with Rocker post);
  • a new script to launch Shiny apps via runApp();
  • a new feature to install.r and install2.r whereby source tarballs are recognized and installed directly
  • new options to install.r and install2.r to set repos and lib location.

See the littler examples page for more details.

Another useful change is that r now reads either one (or both) of /etc/littler.r and ~/.littler.r. These are interpreted as standard R files, allowing users to provide initialization, package loading and more.

Carl Boettiger and I continue to make good use of these littler examples (to to install directly from CRAN or GitHub, or to run checks) in our Rocker builds of R for Docker.

Full details for the littler release are provided as usual at the ChangeLog page.

The code is available via the GitHub repo, from tarballs off my littler page and the local directory here. A fresh package has gone to the incoming queue at Debian; Michael Rutter will probably have new Ubuntu binaries at CRAN in a few days too.

Comments and suggestions are welcome via the mailing list or issue tracker at the GitHub repo.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/littler | permanent link

Wed, 28 Jan 2015

RInside 0.2.12

A new release 0.2.12 of RInside is now on CRAN. RInside provides a set of convenience classes which facilitate embedding of R inside of C++ applications and programs, using the classes and functions provided by the Rcpp integration package.

This release adds new examples which were contributed by Christian Authmann, plus some updates and fixes including one requested by the CRAN maintainers regarding GNU extensions to Makefile. The NEWS extract below has more details.

Changes in RInside version 0.2.12 (2015-01-27)

  • Several new examples have been added (with most of the work done by Christian Authmann):

    • standard/rinside_sample15.cpp shows how to create a lattice plot (following a StackOverflow question)

    • standard/rinside_sample16.cpp shows object wrapping, and exposing of C++ functions

    • standard/rinside_sample17.cpp does the same via C++11

    • sandboxed_servers/ adds an entire framework of client/server communication outside the main process (but using a subset of supported types)

  • standard/rinside_module_sample9.cpp was repaired following a fix to InternalFunction in Rcpp

  • For the seven example directories which contain a Makefile, the Makefile was renamed GNUmakefile to please R CMD check as well as the CRAN Maintainers.

CRANberries also provides a short report with changes from the previous release. More information is on the RInside page. Questions, comments etc should go to the rcpp-devel mailing list off the Rcpp R-Forge page.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/rinside | permanent link

Sun, 25 Jan 2015

RcppArmadillo 0.4.600.4.0

Conrad put up a maintenance release 4.600.4 of Armadillo a few days ago. As in the past, we tested this with number of pre-releases and test builds against the now over one hundred CRAN dependents of our RcppArmadillo package. The tests passed fine as usual, and results are as always in the rcpp-logs repository.

Changes are summarized below based on the NEWS.Rd file.

Changes in RcppArmadillo version 0.4.600.4.0 (2015-01-23)

  • Upgraded to Armadillo release Version 4.600.4 (still "Off The Reservation")

    • Speedups in the transpose operation

    • Small bug fixes

Courtesy of CRANberries, there is also a diffstat report for the most recent release. As always, more detailed information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/rcpp | permanent link

Sat, 24 Jan 2015

Rcpp 0.11.4

A new release 0.11.4 of Rcpp is now on the CRAN network for GNU R, and an updated Debian package will be uploaded in due course.

Rcpp has become the most popular way of enhancing GNU R with C++ code. As of today, 323 packages on CRAN depend on Rcpp for making analyses go faster and further; BioConductor adds another 41 packages, and casual searches on GitHub suggests dozens mores.

This release once again adds a large number of small bug fixes, polishes and enhancements. And like the last time, these changes were made by a group of seven different contributors (counting code commits) plus three more providing concrete suggestions. This shows that the Rcpp development and maintenance rests a large number of (broad) shoulders.

See below for a detailed list of changes extracted from the NEWS file.

Changes in Rcpp version 0.11.4 (2015-01-20)

  • Changes in Rcpp API:

    • The ListOf<T> class gains the .attr and .names methods common to other Rcpp vectors.

    • The [dpq]nbinom_mu() scalar functions are now available via the R:: namespace when R 3.1.2 or newer is used.

    • Add an additional test for AIX before attempting to include execinfo.h.

    • Rcpp::stop now supports improved printf-like syntax using the small tinyformat header-only library (following a similar implementation in Rcpp11)

    • Pairlist objects are now protected via an additional Shield<> as suggested by Martin Morgan on the rcpp-devel list.

    • Sorting is now prohibited at compile time for objects of type List, RawVector and ExpressionVector.

    • Vectors now have a Vector::const_iterator that is 'const correct' thanks to fix by Romain following a bug report in rcpp-devel by Martyn Plummer.

    • The mean() sugar function now uses a more robust two-pass method, and new unit tests for mean() were added at the same time.

    • The mean() and var() functions now support all core vector types.

    • The setequal() sugar function has been corrected via suggestion by Qiang Kou following a bug report by Søren Højsgaard.

    • The macros major, minor, and makedev no longer leak in from the (Linux) system header sys/sysmacros.h.

    • The push_front() string function was corrected.

  • Changes in Rcpp Attributes:

    • Only look for plugins in the package's namespace (rather than entire search path).

    • Also scan header files for definitions of functions to be considerd by Attributes.

    • Correct the regular expression for source files which are scanned.

  • Changes in Rcpp unit tests

    • Added a new binary test which will load a pre-built package to ensure that the Application Binary Interface (ABI) did not change; this test will (mostly or) only run at Travis where we have reasonable control over the platform running the test and can provide a binary.

    • New unit tests for sugar functions mean, setequal and var were added as noted above.

  • Changes in Rcpp Examples:

    • For the (old) examples ConvolveBenchmarks and OpenMP, the respective Makefile was renamed to GNUmakefile to please R CMD check as well as the CRAN Maintainers.

Thanks to CRANberries, you can also look at a diff to the previous release As always, even fuller details are on the Rcpp Changelog page and the Rcpp page which also leads to the downloads page, the browseable doxygen docs and zip files of doxygen output for the standard formats. A local directory has source and documentation too. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/rcpp | permanent link

RcppGSL 0.2.4

A new version of RcppGSL is now on CRAN. This package provides an interface from R to the GNU GSL using our Rcpp package.

This follows on the heels on the recent RcppGSL 0.2.3 release and extends the excellent point made by Qiang Kou in a contributed section of the vignette: We now not only allow to turn the GSL error handler off (to not abort() on error) but do so on package initialisation.

No other user-facing changes were made.

The NEWS file entries follows below:

Changes in version 0.2.4 (2015-01-24)

  • Two new helper function to turn the default GSL error handler off (and to restore it) were added. The default handler is now turned off when the package is attached so that GSL will no longer abort an R session on error. Users will have to check the error code.

  • The RcppGSL-intro.Rnw vignette was expanded with a short section on the GSL error handler (thanks to Qiang Kou).

Courtesy of CRANberries, a summary of changes to the most recent release is available.

More information is on the RcppGSL page. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/rcpp | permanent link

RcppAnnoy 0.0.5

A new version of RcppAnnoy is now on CRAN. RcppAnnoy wraps the small, fast, and lightweight C++ template header library Annoy written by Erik Bernhardsson for use at Spotify. RcppAnnoy uses Rcpp Modules to offer the exact same functionality as the Python module wrapped around Annoy.

This version contains a trivial one-character change requested by CRAN to cleanse the Makevars file of possible GNU Make-isms. Oh well. This release also overcomes an undefined behaviour sanitizer bug noticed by CRAN that took somewhat more effort to deal with. As mentioned recently in another blog post, it took some work to create a proper Docker container with the required compiler and subsequent R setup, but we have one now, and the aforementioned blog post has details on how we replicated the CRAN finding of an UBSAN issue. It also took Erik some extra efforts to set something up for his C++/Python side, but eventually an EC2 instance with Ubuntu 14.10 did the task as my Docker sales skills are seemingly not convincing enough. In any event, he very quickly added the right fix, and I synced RcppAnnoy with his Annoy code.

Courtesy of CRANberries, there is also a diffstat report for this release. More detailed information is on the RcppAnnoy page page.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/rcpp | permanent link

Sun, 18 Jan 2015

Running UBSAN tests via clang with Rocker

Every now and then we get reports from CRAN about our packages failing a test there. A challenging one concerns UBSAN, or Undefined Behaviour Sanitizer. For background on UBSAN, see this RedHat blog post for gcc and this one from LLVM about clang.

I had written briefly about this before in a blog post introducing the sanitizers package for tests, as well as the corresponding package page for sanitizers, which clearly predates our follow-up Rocker.org repo / project described in this initial announcement and when we became the official R container for Docker.

Rocker had support for SAN testing, but UBSAN was not working yet. So following a recent CRAN report against our RcppAnnoy package, I was unable to replicate the error and asked for help on r-devel in this thread.

Martyn Plummer and Jan van der Laan kindly sent their configurations in the same thread and off-list; Jeff Horner did so too following an initial tweet offering help. None of these worked for me, but further trials eventually lead me to the (already mentioned above) RedHat blog post with its mention of -fno-sanitize-recover to actually have an error abort a test. Which, coupled with the settings used by Martyn, were what worked for me: clang-3.5 -fsanitize=undefined -fno-sanitize=float-divide-by-zero,vptr,function -fno-sanitize-recover.

This is now part of the updated Dockerfile of the R-devel-SAN-Clang repo behind the r-devel-ubsan-clang. It contains these settings, as well a new support script check.r for littler---which enables testing right out the box.

Here is a complete example:

docker                              # run Docker (any recent version, I use 1.2.0)
  run                               # launch a container 
    --rm                            # remove Docker temporary objects when dome
    -ti                             # use a terminal and interactive mode 
    -v $(pwd):/mnt                  # mount the current directory as /mnt in the container
    rocker/r-devel-ubsan-clang      # using the rocker/r-devel-ubsan-clang container
  check.r                           # launch the check.r command from littler (in the container)
    --setwd /mnt                    # with a setwd() to the /mnt directory
    --install-deps                  # installing all package dependencies before the test
    RcppAnnoy_0.0.5.tar.gz          # and test this tarball

I know. It is a mouthful. But it really is merely the standard practice of running Docker to launch a single command. And while I frequently make this the /bin/bash command (hence the -ti options I always use) to work and explore interactively, here we do one better thanks to the (pretty useful so far) check.r script I wrote over the last two days.

check.r does about the same as R CMD check. If you look inside check you will see a call to a (non-exported) function from the (R base-internal) tools package. We call the same function here. But to make things more interesting we also first install the package we test to really ensure we have all build-dependencies from CRAN met. (And we plan to extend check.r to support additional apt-get calls in case other libraries etc are needed.) We use the dependencies=TRUE option to have R smartly install Suggests: as well, but only one level (see help(install.packages) for details. With that prerequisite out of the way, the test can proceed as if we had done R CMD check (and additional R CMD INSTALL as well). The result for this (known-bad) package:

edd@max:~/git$ docker run --rm -ti -v $(pwd):/mnt rocker/r-devel-ubsan-clang check.r --setwd /mnt --install-deps RcppAnnoy_0.0.5.tar.gz 
also installing the dependencies ‘Rcpp’, ‘BH’, ‘RUnit’

trying URL 'http://cran.rstudio.com/src/contrib/Rcpp_0.11.3.tar.gz'
Content type 'application/x-gzip' length 2169583 bytes (2.1 MB)
opened URL
==================================================
downloaded 2.1 MB

trying URL 'http://cran.rstudio.com/src/contrib/BH_1.55.0-3.tar.gz'
Content type 'application/x-gzip' length 7860141 bytes (7.5 MB)
opened URL
==================================================
downloaded 7.5 MB

trying URL 'http://cran.rstudio.com/src/contrib/RUnit_0.4.28.tar.gz'
Content type 'application/x-gzip' length 322486 bytes (314 KB)
opened URL
==================================================
downloaded 314 KB

trying URL 'http://cran.rstudio.com/src/contrib/RcppAnnoy_0.0.4.tar.gz'
Content type 'application/x-gzip' length 25777 bytes (25 KB)
opened URL
==================================================
downloaded 25 KB

* installing *source* package ‘Rcpp’ ...
** package ‘Rcpp’ successfully unpacked and MD5 sums checked
** libs
clang++-3.5 -fsanitize=undefined -fno-sanitize=float-divide-by-zero,vptr,function -fno-sanitize-recover -I/usr/local/lib/R/include -DNDEBUG -I../inst/include/ -I/usr/local/include    -fpic  -pipe -Wall -pedantic -
g  -c Date.cpp -o Date.o

[...]
* checking examples ... OK
* checking for unstated dependencies in ‘tests’ ... OK
* checking tests ...
  Running ‘runUnitTests.R’
 ERROR
Running the tests in ‘tests/runUnitTests.R’ failed.
Last 13 lines of output:
  +     if (getErrors(tests)$nFail > 0) {
  +         stop("TEST FAILED!")
  +     }
  +     if (getErrors(tests)$nErr > 0) {
  +         stop("TEST HAD ERRORS!")
  +     }
  +     if (getErrors(tests)$nTestFunc < 1) {
  +         stop("NO TEST FUNCTIONS RUN!")
  +     }
  + }
  
  
  Executing test function test01getNNsByVector  ... ../inst/include/annoylib.h:532:40: runtime error: index 3 out of bounds for type 'int const[2]'
* checking PDF version of manual ... OK
* DONE

Status: 1 ERROR, 2 WARNINGs, 1 NOTE
See/tmp/RcppAnnoy/..Rcheck/00check.logfor details.
root@a7687c014e55:/tmp/RcppAnnoy# 

The log shows that thanks to check.r, we first download and the install the required packages Rcpp, BH, RUnit and RcppAnnoy itself (in the CRAN release). Rcpp is installed first, we then cut out the middle until we get to ... the failure we set out to confirm.

Now having a tool to confirm the error, we can work on improved code.

One such fix currently under inspection in a non-release version 0.0.5.1 then passes with the exact same invocation (but pointing at RcppAnnoy_0.0.5.1.tar.gz):

edd@max:~/git$ docker run --rm -ti -v $(pwd):/mnt rocker/r-devel-ubsan-clang check.r --setwd /mnt --install-deps RcppAnnoy_0.0.5.1.tar.gz
also installing the dependencies ‘Rcpp’, ‘BH’, ‘RUnit’
[...]
* checking examples ... OK
* checking for unstated dependencies in ‘tests’ ... OK
* checking tests ...
  Running ‘runUnitTests.R’
 OK
* checking PDF version of manual ... OK
* DONE

Status: 1 WARNING
See/mnt/RcppAnnoy.Rcheck/00check.logfor details.

edd@max:~/git$

This proceeds the same way from the same pristine, clean container for testing. It first installs the four required packages, and the proceeds to test the new and improved tarball. Which passes the test which failed above with no issues. Good.

So we now have an "appliance" container anybody can download from free from the Docker hub, and deploy as we did here in order to have fully automated, one-command setup for testing for UBSAN errors.

UBSAN is a very powerful tool. We are only beginning to deploy it. There are many more useful configuration settings. I would love to hear from anyone who would like to work on building this out via the R-devel-SAN-Clang GitHub repo. Improvements to the littler scripts are similarly welcome (and I plan on releasing an updated littler package "soon").

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/rocker | permanent link

Mon, 12 Jan 2015

RcppGSL 0.2.3

A new version of RcppGSL is now on CRAN today. This package provides an interface from R to the GNU GSL using our Rcpp.

Similar to the recent RcppClassic release, this update was triggered by the CRAN maintainers desire to keep the Makefile free of GNU make extensions. At the same time we simplified the configure file a little, and refreshed the look and feel of the vignette.

No user-facing changes were made.

The NEWS file entries follows below:

Changes in version 0.2.3 (2015-01-10)

  • The src/Makevars.in was pruned of GNU make features at the request of the CRAN Maintainers

  • configure.ac and configure were updated, and shortened

  • The RcppGSL-intro.Rnw vignette was updated for its look and feel.

Courtesy of CRANberries, a summary of changes to the most recent release is available.

More information is on the RcppGSL page. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/rcpp | permanent link

Sun, 11 Jan 2015

rfoaas 0.1.1

A brand new and shiny version of rfoaas is now on CRAN. The rfoaas package provides an interface for R to the most excellent FOAAS service--which provides a modern, scalable and RESTful web service for the frequent need to tell someone to f$#@ off.

There are two (internal) changes of merit in this version. First off, as FOAAS was refactored upstream, we are now forced to supply an accept: text/plain http request header. Which, sadly enough, is not something the url() function in R can do---so we brought in more cavalry and now depend on the httr package by Hadley, and use its GET() method. A second change is that we added a (simple but effective enough) regression test which simply calls all foaas entry points available throug rfoaas, and compares this to the anticipated result. To run it, you need to set an environment variable RunFOAASTests=yes as eg out Travis script does. Finally, we aligned the version number with upstream to signal that we cover all available entry points of that version.

As usual, CRANberries provides a diff to the previous release. Questions, comments etc should go to the GitHub issue tracker.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/rfoaas | permanent link

Sat, 10 Jan 2015

RcppClassic 0.9.6

A maintenance release of RcppClassic, now at version 0.9.6, went out to CRAN today. This package provides a maintained version of the otherwise deprecated first Rcpp API; no new projects should use it.

No changes were in user-facing code. The Makevars file was change to accomodate a request by the CRAN Maintainer to keep it free of GNU Make extensions. At the same time, we overhauled the look and feel of the (very short) vignette. Build instructions were updated both in the vignette and in the included example package. Other accumulated changes since the last release were updates to the DESCRIPTION and NAMESPACE file as well two namespace-related R code updates.

Courtesy of CRANberries, there is the set of changes relative to the previous release.

Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/rcpp | permanent link

Thu, 08 Jan 2015

random 0.2.3

A new release of my random package for truly (hardware-based) random numbers as provided by random.org is now on CRAN.

The main change is a switch to the curl() function from the eponymous package by Jeroen Ooms. This was caused by random.org now using https instead of http, annd the fact that te url() function from R does not cope well with the redirect. Besides this (enforced) change, everything else remains the same.

Courtesy of CRANberries comes a diffstat report for this release. Current and previous releases are available here as well as on CRAN.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/random | permanent link

Tue, 06 Jan 2015

RcppCNPy 0.2.4

A new release of the RcppCNPy package is now on CRAN.

This release mostly solidifies and fixes things. Support for saving integer objects, which was expanded in release 0.2.3, was not entirely correct. Operations on big-endian systems were not up to snuff either.

Wush Wu helped in getting this right with very diligent testing and patching particularly on big-endian hardware. We also got a pull request from Romain to reflect better const correctness at the Rcpp side of things. Last but not least we obliged by the CRAN Maintainers to not assume one could call gzip from system() call because, well, you guessed it.

Changes in version 0.2.4 (2015-01-05)

  • Support for saving integer objects was not correct and has been fixed.

  • Support for loading and saving on 'big endian' systems was incomplete, has been greatly expanded and corrected, thanks in large part to very diligent testing as well as patching by Wush Wu.

  • The implementation now uses const iterators, thanks to a pull request by Romain Francois.

  • The vignette no longer assumes that one can call gzip via system as the world's leading consumer OS may disagree.

CRANberries also provides a diffstat report for the latest release. As always, feedback is welcome and the rcpp-devel mailing list off the R-Forge page for Rcpp is may be the best place to start a discussion. GitHub issue tickets are also welcome.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/rcpp | permanent link

Mon, 05 Jan 2015

BH release 1.55.0-3

Right on the heels of yesterday's BH release 1.55.0-2 bringing Boost Fusion, we now have release 1.55.0-3 bringing Boost Graph. To recap, BH is our CRAN package providing (a large part of the) Boost C++ libraries as a set of template headers for use by R and of course Rcpp.

And as a small project I am working on--and which should now be so much closer to release--needed not only Boost Fusion but also Boost Graph, I had to bother the CRAN maintainers twice in two days. This new version closes issue ticket 9, and may be of interest to other packages such as the venerable RBGL formerly on CRAN and now a BioConductor package which includes its own copy of this graph library (plus depends).

A brief summary of changes from the NEWS file is below.

Changes in version 1.55.0-3 (2015-01-04)

  • Added Boost Graph requested in GH ticket #9 by Dirk for RcppStreams

Courtesy of CRANberries, there is also a diffstat report for the most recent release.

Comments and suggestions are welcome via the mailing list or the issue tracker at the GitHubGitHub repo.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/bh | permanent link

Sun, 04 Jan 2015

BH release 1.55.0-2

A new release of BH, our package providing (a large part of the) Boost C++ libraries as a set of template headers for use by R, is now on CRAN.

This is a relatively minor change which expands the set of Boost libraries included in the package to Boost Fusion per issue ticket 7. Boost Fusion is a very clever library providing a fusion of both compile-time meta-programming and run-time programming to provide something similar to the STL (i.e. containers, algorithms, ...) for heterogenous tuples. I also added pointers to both the mailing list and the GitHub issue tracker to the DESCRIPTION file, README and main manual page.

A brief summary of changes from the NEWS file is below.

Changes in version 1.55.0-2 (2015-01-03)

  • Added Boost Fusion requested in GH ticket #7 by Dirk for RcppStreams

Courtesy of CRANberries, there is also a diffstat report for the most recent release.

Comments and suggestions are welcome via the mailing list or the issue tracker at the GitHubGitHub repo.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/bh | permanent link

Thu, 01 Jan 2015

(Belated) 2014 Running Recap

Having just blogged about today's 5k, I noticed that I did not blog at all about running in 2014. There wasn't much, but another Ragnar Relay in February, again as an Ultra team. I more-or-less live tweeted this though: Pre-race warning (and that was my route!), Team photo at start, Leg 1, Leg 2, Leg 3, Leg 4, Leg 5, Leg 6, and Chilling

I also ran the Philly Half-Marathon in November following up on the Penn R Workshop, and posted some tweets: Post race selfie, Strava result (thinking it was a 13.5 miler race--WTF), and Post-race chill.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/sports/running | permanent link

New Year's Run 2015

Nice, crisp and pretty cold morning for the 2015 New Year's 5k. I had run this before with friends: 2003, 2005, 2006 and 2007.

Needless to say, I am a little slower now: A hand-stopped 23:13 for a 7:27 min/mile pace is fine by me given the weather, lack of speedwork in the last few months, and generally reduced running---but also almost a minute slower than the last time I ran this.

In case you're a fellow running geek and into this, Strava has my result nicely aggregated.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/sports/running | permanent link

Wed, 31 Dec 2014

digest 0.6.8

Release 0.6.8 of digest package is now on CRAN and will get to Debian shortly.

This release opens the door to also providing the digest functionality at the C level to other R packages. Wush Wu is going to use the murmurHash C implementation in his recently-created FeatureHashing package.

We plan to export the other hashing function as well. Another small change attempts to overcome a build limitation on that other largely-irrelevant-but-still-check-by-CRAN OS.

CRANberries provides the usual summary of changes to the previous version.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/digest | permanent link

Mon, 29 Dec 2014

rfoaas 0.0.5

A new version of rfoaas is now on CRAN. The rfoaas package provides an interface for R to the most excellent FOAAS service--which provides a modern, scalable and RESTful web service for the frequent need to tell someone to eff off.

This version aligns the rfoaas version number with the (at long last) updated upstream version number, and brings a change suggested by Richie Cotton to set the encoding of the returned object.

As usual, CRANberries provides a diff to the previous release. Questions, comments etc should go to the GitHub issue tracker off the GitHub repo.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/rfoaas | permanent link

Sun, 28 Dec 2014

RcppArmadillo 0.4.600.0

Conrad produced another minor release 4.600 of Armadillo. As before, I had created a GitHub-only pre-release(s) of his pre-release(s), and tested a pre-release as well as the actual release against the now over one hundred CRAN dependents of our RcppArmadillo package. The tests passed fine as usual with less than a handful of checks not passing, all for known cases -- and results are as always in the rcpp-logs repository.

Changes are summarized below based on the NEWS.Rd file.

Changes in RcppArmadillo version 0.4.600.0 (2014-12-27)

  • Upgraded to Armadillo release Version 4.600 ("Off The Reservation")

    • added .head() and .tail() to submatrix views

    • faster matrix transposes within compound expressions

    • faster accu() and norm() when compiling with -O3 -ffast-math -march=native (gcc and clang)

    • workaround for a bug in GCC 4.4

Courtesy of CRANberries, there is also a diffstat report for the most recent release. As always, more detailed information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/rcpp | permanent link

Thu, 25 Dec 2014

rfoaas 0.0.4.20141225 -- not on CRAN

A new version of rfoaas was prepared for CRAN, but refused on the grounds of having been updated within 24 hours. Oh well.

To recap, the rfoaas package provides an interface for R to the most excellent FOAAS service -- which provides a modern, scalable and RESTful web service for the frequent need to tell someone to eff off.

And having seen the Christmas Eve (ie December 24) update, upstream immediatly and rather kindly added a new xmas(name, from) function -- so now we could do rfoaas::xmas("CRAN Maintainers") to wish the CRAN Maintainers a very Merry Christmas.

So for once, there is no CRANberries report as the package is not on CRAN :-/ There is of course always GitHub.

Questions, comments etc should go to the GitHub issue tracker off the GitHub repo.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/rfoaas | permanent link

Wed, 24 Dec 2014

rfoaas 0.0.4.20141224

A new version of rfoaas is now on CRAN. The rfoaas package provides an interface for R to the most excellent FOAAS service -- which provides a modern, scalable and RESTful web service for the frequent need to tell someone to eff off.

This is minor update, affecting only the rfoaas side without changes at the FOAAS backend side. We documented that the result may need an encoding update (at least on the World's leading consumer OS), and now provide a default value for from so that many services can be called argument-less, or at least with one argument less than before. Both of these were suggested by Richie Cotton via issue tickets at the GitHub repo.

CRANberries also provides a diff to the previous release. Questions, comments etc should go to the GitHub issue tracker off the GitHub repo.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/rfoaas | permanent link

Tue, 23 Dec 2014

RcppEigen 0.3.2.3.0

A new release of RcppEigen is now on CRAN, and will get to Debian shortly.

This is thanks to a fabulous pull request by Yixuan Qiu who single-handedly managed to upgrade to the new minor release of Eigen -- after having expanded the unit tests a few weeks earlier in another pull request.

The NEWS file entry follows.

Changes in RcppEigen version 0.3.2.3.0 (2014-12-22)

  • Updated to version 3.2.3 of Eigen

  • Added a number of additional unit tests for wrap, transform and sparse

  • Updated one of the examples

Courtesy of CRANberries, there is also a diffstat report for the most recent release.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/rcpp | permanent link

Mon, 22 Dec 2014

BH release 1.55.0-1

A new release of BH, our package providing (a large part of the) Boost C++ libraries as a set of template headers for use by R, is now on CRAN.

There are two changes. First, we upgraded to Boost 1.55. While Boost 1.57 is now current, I am playing it somewhat safe and conservative here by relying on what is the current version in Debian (and Ubuntu). I even started from the Debian tarball. This ensures we include nothing not in accordance with the Debian Free Software Guidelines (as they are a good proxy for what is acceptable to CRAN). The second change was the addition of Boost.Geometry as requested in issue ticket 5.

This may be a good time to recall that best way to get in touch for desired additions in BH are the mailing list or the issue tracker at the GitHub repo.

A brief summary of changes from the NEWS file (which we failed to update for the release; the text below is however now in the repo):

Changes in version 1.55.0-1 (2014-12-21)

Courtesy of CRANberries, there is also a (very long!) diffstat report for the most recent release.

Comments and suggestions are welcome via the mailing list or the issue tracker at the GitHubGitHub repo.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/bh | permanent link

Sun, 21 Dec 2014

If only there was a Romeo somewhere ...

Attention: rant coming. You have been warned, and may want to tune out now.

So the top of my Twitter timeline just had a re-tweet posting to this marvel on the state of Julia. I should have known better than to glance at it as it comes from someone providing (as per the side-bar) Thought leadership in Big Data, systems architecture and more. Reading something like this violates the first rule of airport book stores: never touch anything from the business school section, especially on (wait for it) leadership or, worse yet, thought leadership.

But it is Sunday, my first cup of coffee still warm (after finalising two R package updates on GitHub, and one upload to CRAN) and so I read on. Only to be mildly appalled by the usual comparison to R based on the same old Fibonacci sequence.

Look, I am as guilty as anyone of using it (for example all over chapter one of my Rcpp book), but at least I try to stress each and every time that this is kicking R where it is down as its (fairly) poor performance on functions calls (that is well-known and documented) obviously gets aggravated by recursive calls. But hey, for the record, let me restate those results. So Julia beats R by a factor of 385. But let's take a closer look.

For n=25, I get R to take 241 milliseconds---as opposed to his 6905 milliseconds---simply by using the same function I use in every workshop, eg last used at Penn in November, which does not use the dreaded ifelse operator:

fibR <- function(n) {
  if (n < 2) return(n)
  return(fibR(n-1) + fibR(n-2))
}

Switching that to the standard C++ three-liner using Rcpp

library(Rcpp)
cppFunction('int fibCpp(int n) { 
  if (n < 2) return(n);  
  return(fibCpp(n-1) + fibCpp(n-2));  
  }')

and running a standard benchmark suite gets us the usual result of

R> library(rbenchmark)
R> benchmark(fibR(25),fibCpp(25),order="relative")[,1:4]
        test replications elapsed relative
2 fibCpp(25)          100   0.048    1.000
1   fibR(25)          100  24.674  514.042
R> 

So for the record as we need this later: that is 48 milliseconds for 100 replications, or about 0.48 milliseconds per run.

Now Julia. And of my standard Ubuntu server running the current release 14.10:

edd@max:~$ julia
ERROR: could not open file /home/edd//home/edd//etc/julia/juliarc.jl
 in include at boot.jl:238

edd@max:~$ 

So wait, what? You guys can't even ensure a working release on what is probably the most popular and common Linux installation? And I get to that after reading a post on the importance of "Community, Community, Community" and you can't even make sure this works on Ubuntu? Really?

So a little bit of googling later, I see that julia -f is my friend for this flawed release, and I can try to replicate the original timing

edd@max:~$ julia -f
               _
   _       _ _(_)_     |  A fresh approach to technical computing
  (_)     | (_) (_)    |  Documentation: http://docs.julialang.org
   _ _   _| |_  __ _   |  Type "help()" to list help topics
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 0.2.1 (2014-02-11 06:30 UTC)
 _/ |\__'_|_|_|\__'_|  |  
|__/                   |  x86_64-linux-gnu

julia> fib(n) = n < 2 ? n : fib(n - 1) + fib(n - 2)
fib (generic function with 1 method)

julia> @elapsed fib(25)
0.002299559

julia> 

Interestingly the posts author claims 18 milliseconds. I see 2.3 milliseconds here. Maybe someone is having a hard time comparing things to the right of the decimal point. Or maybe his computer is an order of magnitude slower than mine. The more important thing is that Julia is of course faster than R (no surprise: LLVM at work) but also still a lot slower than a (trivial to write and deploy) C++ function. Nothing new here.

So let's recap. Comparison to R was based on a flawed version of a function we only use when we deliberately want to put R down, can be improved significantly when using a better implementation, results are still off by order of magnitude to what was reported ("math is hard"), and the standard C / C++ way of doing things is still several times faster than our new saviour language---which I can't even launch on the current version of one of the more common free operating systems. Ok then. Someone please wake me up in a few years and I will try again.

Now, coming to the end of the rant I should really stress that of course I too hope that Julia succeeds. Every user pulled away from Matlab is a win for all us. We're in this together and the endless navel gazing between ourselves is so tiresome and irrelevant. And as I argue here, even more so when we among ourselves stick to unfair comparisons as well as badly chosen implementation details.

What matters are wins against the likes of Matlab, Excel, SAS and so on. Let's build on our joint strength. I am sure I will use Julia one day, and I am grateful for everyone helping with it---as a lot of help seems to be needed. In the meantime, and with CRAN at 6130 packages that just work I'll continue to make use of this amazing community and trying my bit to help it grow and prosper. As part of our joint community.

/computers/misc | permanent link

Sat, 20 Dec 2014

digest 0.6.7

Release 0.6.7 of digest package is now on CRAN and in Debian.

Jim Hester was at it again and added murmurHash. I cleaned up several sets of pedantic warnings in some of the source files, updated the test reference out, and that forms version 0.6.7.

CRANberries provides the usual summary of changes to the previous version.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/digest | permanent link

Fri, 19 Dec 2014

Rocker is now the official R image for Docker

big deal

Something happened a little while ago which we did not have time to commensurate properly. Our Rocker image for R is now the official R image for Docker itself. So getting R (via Docker) is now as simple as saying docker pull r-base.

This particular container is essentially just the standard r-base Debian package for R (which is one of a few I maintain there) plus a mininal set of extras. This r-base forms the basis of our other containers as e.g. the rather popular r-studio container wrapping the excellent RStudio Server.

A lot of work went into this. Carl and I also got a tremendous amount of help from the good folks at Docker. Details are as always at the Rocker repo at GitHub.

Docker itself continues to make great strides, and it has been great fun help to help along. With this post I achieved another goal: blog about Docker with an image not containing shipping containers. Just kidding.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/rocker | permanent link

Sat, 13 Dec 2014

rfoaas 0.0.4.20141212

A new version of rfoaas is now on CRAN. The rfoaas package provides an interface for R to the most excellent FOAAS service -- which provides a modern, scalable and RESTful web service for the frequent need to tell someone to eff off.

The FOAAS backend gets updated in spurts, and yesterday a few pull requests were integrated, including one from yours truly. So with that it was time for an update to rfoaas. As the version number upstream did not change (bad, bad, practice) I appended the date the version number.

CRANberries also provides a diff to the previous release. Questions, comments etc should go to the GitHub issue tracker off the GitHub repo.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/rfoaas | permanent link

Thu, 11 Dec 2014

RProtoBuf 0.4.2

A new release 0.4.2 of RProtoBuf is now on CRAN. RProtoBuf provides R bindings for the Google Protocol Buffers ("Protobuf") data encoding library used and released by Google, and deployed as a language and operating-system agnostic protocol by numerous projects.

Murray and Jeroen did almost all of the heavy lifting. Many changes were triggered by two helpful referee reports, and we are slowly getting to the point where we will resubmit a much improved paper. Full details are below.

Changes in RProtoBuf version 0.4.2 (2014-12-10)

  • Address changes suggested by anonymous reviewers for our Journal of Statistical Software submission.

  • Make Descriptor and EnumDescriptor objects subsettable with "[[".

  • Add length() method for Descriptor objects.

  • Add names() method for Message, Descriptor, and EnumDescriptor objects.

  • Clarify order of returned list for descriptor objects in as.list documentation.

  • Correct the definition of as.list for EnumDescriptors to return a proper list instead of a named vector.

  • Update the default print methods to use cat() with fill=TRUE instead of show() to eliminate the confusing [1] since the classes in RProtoBuf are not vectorized.

  • Add support for serializing function, language, and environment objects by falling back to R's native serialization with serialize_pb and unserialize_pb to make it easy to serialize into a Protocol Buffer all of the more than 100 datasets which come with R.

  • Use normalizePath instead of creating a temporary file with file.create when getting absolute path names.

  • Add unit tests for all of the above.

CRANberries also provides a diff to the previous release. RProtoBuf page which has a draft package vignette, a a 'quick' overview vignette, and a unit test summary vignette. Questions, comments etc should go to the GitHub issue tracker off the GitHub repo.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/rprotobuf | permanent link

Wed, 10 Dec 2014

digest 0.6.6 (and 0.6.5)

A new release 0.6.6 of the digest package is now on CRAN and in Debian.

This release brings the xxHash non-cryptographic hash function by Yann Collet, thanks to several pull requests by Jim Hester. After the upload of version 0.6.5 we uncovered another lovely non-standardness of Windoze: you cannot format unsigned long long via printf() format strings. Great. Luckily Jim found a quick (and portable) fix via the inttypes.h header, and that went into the 0.6.6 release.

The release also contains an earlier extension for hmac() to also cover crc32 hashes, kindly provided by Suchen Jin.

I also made a number of small internal changes such as

  • switching (compiled) function registration to package load via a the useDynLib() declaration in NAMESPACE,
  • (finally!!) formating code to proper four-space indentation,
  • adding some documentation around Jim's pull request, and
  • adding a few GPL copyright headers.

CRANberries provides the usual summary of changes to the previous version.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/digest | permanent link

RcppRedis 0.1.3

A very minor bugfix release of RcppRedis is now on CRAN. The zcount function now returns the correct type.

Changes in version 0.1.3 (2014-12-10)

  • Bug fix setting correct return type of zcount

Courtesy of CRANberries, there is also a diffstat report for the most recent release. More information is on the RcppRedis page.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/rcpp | permanent link

Tue, 09 Dec 2014

Wilco!!

Wilco at The Riviera

With a bit of luck due to a collegue having a spare ticket, I managed to make it to an awesome Wilco show at The Riviera in Uptown.

This concert was part of a set a six shows. Tweedy and the band were fast, and loose, and wonderful, and totally beloved by the home crowd. An truly outstanding show, and a great evening.

Also: I should get out more often. Last blog entry about Wilco was from 2005. Ouch.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/music/rock | permanent link

RcppAnnoy 0.0.4

A few weeks ago, RcppAnnoy had its initial release 0.0.2 and subsequent update in release 0.0.3. The latter brought Windows support, thanks to a neat pull request by Qiang Kou.

RcppAnnoy wraps the small, fast, and lightweight C++ template header library Annoy written by Erik Bernhardsson for use at Spotify. RcppAnnoy uses Rcpp Modules to offer the exact same functionality as the Python module wrapped around Annoy.

In the 0.0.3 release, I overlooked one thing: that with builds on Windows, we would also get builds against what CRAN calls R-oldrel: the previous release, which cannot turn on C++11 via the simple CXX_STD = CXX11 declaration in src/Makevars (and which we need because use of Boost brings in long long which R can only cope with under C++11 ...).

So this new release 0.0.4 does nothing more than add a constraint in a Depends: R (>= 3.1.0) to avoid builds not being able to turn on C++11.

Courtesy of CRANberries, there is also a diffstat report for this release. More detailed information is on the RcppAnnoy page page.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/rcpp | permanent link

Tue, 02 Dec 2014

RQuantLib 0.4.0

A new major release of RQuantLib is now on CRAN and getting to Debian.

All C++ source files have been rewritten to take advantage of newer Rcpp features. Several Fixed Income functions have been added, or refreshed, thanks to contributions by Michele Salvadore. Calendar support was greatly expanded thanks to a contribution by Danilo Dias da Silva. All key changes are listed in detail below.

Changes in RQuantLib version 0.4.0 (2014-12-01)

  • Changes in RQuantLib code:

    • All function interfaces have been rewritten using Rcpp Attributes. No SEXP remain in the function signatures. This make the code shorter, more readable and more easily extensible.

    • The header files have been reorganized so that plugin use is possible. An impl.h files is imported once for each compilation unit: for RQuantLib from the file src/dates.cpp directory, from a sourced file via a #define set by the plugin wrapper.

    • as<>() and wrap() converters have added for QuantLib Date types.

    • Plugin support has been added, allowing more ad-hoc use via Rcpp Attributes.

    • Several Fixed Income functions have been added, and/or rewritten to better match the QuantLib signatures; this was done mostly by Michele Salvadore.

    • Several Date and Calendar functions have been added.

    • Calendar support has been greatly expanded thanks to Danilo Dias da Silva.

    • Exported curve objects are now more parsimonious and advance entries in the table object roughly one business month at a time.

    • The DiscountCurve and Bond curve construction has been fixed via a corrected evaluation date and omitted the two-year swap rate, as suggested by Luigi Ballabio.

    • The NAMESPACE file has a tighter rule for export of *.default functions, as suggested by Bill Dunlap

    • Builds now use OpenMP where available.

    • The package now depends on QuantLib 1.4.0 or later.

  • Changes in RQuantLib tests:

    • New unit tests for dates have been added.

    • C++ code for the unit tests has also been converted to Rcpp Attributes use; a helper function unitTestSetup() has been added.

    • Continuous Integration via Travis is now enabled from the GitHub repo.

  • Changes in RQuantLib documentation:

    • This NEWS file has been added. Better late than never, as they say.

Courtesy of CRANberries, there is also a diffstat report for the this release. As always, more detailed information is on the RQuantLib page. Questions, comments etc should go to the rquantlib-devel mailing list off the R-Forge page. Issue tickets can be filed at the GitHub repo.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/rquantlib | permanent link

Sat, 29 Nov 2014

RcppArmadillo 0.4.550.1.0

A week ago, Conrad provided another minor release 4.550.0 of Armadillo which has since received one minor correction in 4.550.1.0. As before, I had created a GitHub-only pre-release of his pre-release which was tested against the almost one hundred CRAN dependents of our RcppArmadillo package. This passed fine as usual, and results are as always in the rcpp-logs repository.

Processing and acceptance at the CRAN took a little longer as around the same time a fresh failure in unit tests had become apparent on an as-of-yet unannounced new architecture (!!) also tested at CRAN. The R-devel release has since gotten a new capabilities() test for long double, and we now only run this test (for our rmultinom()) if the test asserts that the given R build has this capability. Phew, so with all that the new version in now on CRAN; Windows binaries have been built and I also uploaded new Debian binaries.

Changes are summarized below; our end also includes added support for conversion of Field types takes to short pull request by Romain.

Changes in RcppArmadillo version 0.4.550.1.0 (2014-11-26)

  • Upgraded to Armadillo release Version 4.550.1 ("Singapore Sling Deluxe")

    • added matrix exponential function: expmat()

    • faster .log_p() and .avg_log_p() functions in the gmm_diag class when compiling with OpenMP enabled

    • faster handling of in-place addition/subtraction of expressions with an outer product

    • applied correction to gmm_diag relative to the 4.550 release

  • The Armadillo Field type is now converted in as<> conversions

Courtesy of CRANberries, there is also a diffstat report for the most recent release. As always, more detailed information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/rcpp | permanent link

Fri, 28 Nov 2014

CRAN Task Views for Finance and HPC now (also) on GitHub

The CRAN Task View system is a fine project which Achim Zeileis initiated almost a decade ago. It is described in a short R Journal article in Volume 5, Number 1. I have been editor / maintainer of the Finance task view essentially since the very beginning of these CRAN Task Views, and added the High-Performance Computing one in the fall of 2008. Many, many people have helped by sending suggestions or even patches; email continues to be the main venue for the changes.

The maintainers of the Web Technologies task view were, at least as far as I know, the first to make the jump to maintaining the task view on GitHub. Karthik and I briefly talked about this when he was in town a few weeks ago for our joint Software Carpentry workshop at Northwestern.

So the topic had been on my mind, but it was only today that I realized that the near-limitless amount of awesome that is pandoc can probably help with maintenance. The task view code by Achim neatly converts the very regular, very XML, very boring original format into somewhat-CRAN-website-specific html. Pandoc, being as versatile as it is, can then make (GitHub-flavoured) markdown out of this, and with a minimal amount of sed magic, we get what we need.

And hence we now have these two new repos:

Contributions are now most welcome by pull request. You can run the included converter scripts, it differs between both repos only by one constant for the task view / file name. As an illustration, the one for Finance is below.

#!/usr/bin/r
## if you do not have /usr/bin/r from littler, just use Rscript

ctv <- "Finance"

ctvfile  <- paste0(ctv, ".ctv")
htmlfile <- paste0(ctv, ".html")
mdfile   <- "README.md"

## load packages
suppressMessages(library(XML))          # called by ctv
suppressMessages(library(ctv))

r <- getOption("repos")                 # set CRAN mirror
r["CRAN"] <- "http://cran.rstudio.com"
options(repos=r)

check_ctv_packages(ctvfile)             # run the check

## create html file from ctv file
ctv2html(read.ctv(ctvfile), htmlfile)

### these look atrocious, but are pretty straight forward. read them one by one
###  - start from the htmlfile
cmd <- paste0("cat ", htmlfile,
###  - in lines of the form  ^<a href="Word">Word.html</a>
###  - capture the 'Word' and insert it into a larger URL containing an absolute reference to task view 'Word'
  " | sed -e 's|^<a href=\"\\([a-zA-Z]*\\)\\.html|<a href=\"http://cran.rstudio.com/web/views/\\1.html\"|' | ",
###  - call pandoc, specifying html as input and github-flavoured markdown as output
              "pandoc -s -r html -w markdown_github | ",
###  - deal with the header by removing extra ||, replacing |** with ** and **| with **:              
              "sed -e's/||//g' -e's/|\\*\\*/\\*\\*/g' -e's/\\*\\*|/\\*\\* /g' -e's/|$/  /g' ",
###  - make the implicit URL to packages explicit
              "-e's|../packages/|http://cran.rstudio.com/web/packages/|g' ",
###  - write out mdfile
              "> ", mdfile)

system(cmd)                             # run the conversion

unlink(htmlfile)                        # remove temporary html file

cat("Done.\n")

I am quite pleased with this setup---so a quick thanks towards the maintainers of the Web Technologies task view; of course to Achim for creating CRAN Task Views in the first place, and maintaining them all those years; as always to John MacFarlance for the magic that is pandoc; and last but not least of course to anybody who has contributed to the CRAN Task Views.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/snippets | permanent link

Tue, 25 Nov 2014

Rcpp now used by 300 CRAN packages

max-heap image

This morning, Rcpp reached another round milestone: 300 packages on CRAN now depend on it (as measured by Depends, Imports and LinkingTo declarations). The graph is on the left depicts the growth of Rcpp usage over time. There are 41 more on BioConductor (which is not included in the chart).

The first and less detailed part uses manually save entries, the second half of the data set was generated semi-automatically via a short script appending updates to a small file-based backend. A list of user package is kept on this page.

Also displayed in the graph is the relative proportion of CRAN packages using Rcpp. The four per-cent hurdle was cleared just before useR! 2014 where I showed a similar graph (as two distinct graphs) in my invited talk. We may well hit five per-cent before the end of the year.

300 is a pretty humbling and staggering number. Also interesting that we we cleared 200 only at the end of April, and 250 in early August.

So from everybody behind Rcpp, a heartfelt Thank You! to all the users and of course other contributors.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/rcpp | permanent link

Mon, 24 Nov 2014

YATORP -- Yet Another Tutorial on R Packaging

What the world needs right now is yet another tutorial on R packages and their creation. Luckily, this last Friday and Saturday, I had the opportunity to present in a workshop organized by Frank DiTraglia at Penn's shiny new Warren Center, and held at Wharton.

Given the Warren Center's focus, the workshop centered around Big Data and Open Science with R. Yihui Xie and myself alternated on delivering four units on an Introduction to R, Writing R packages, Dynamic Documents with R, and HPC with Rcpp and RcppArmadillo.

So I had to come up with a plan for teaching creating R packages -- and decided to do it from the very bottom up, clearly introducing the underlying R CMD ... commands and only then switching to taking advantage of an environment such as the RStudio IDE.

The resulting slides are now available on my presentations page. The code examples are in a repo subdirectory on GitHub as well. While both were designed to support the parallel live instruction offered in the workshop, I would be interested in feedback (preferably via email) about how useful the slides are by themselves.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/computers/R | permanent link

Tue, 18 Nov 2014

R / Finance 2015 Call for Papers

Earlier today, Josh send the text below to the R-SIG-Finance list, and I updated the R/Finance website, including its Call for Papers page, accordingly.

We are once again very excited about our conference, thrilled about the four confirmed keynotes, and hope that many R / Finance users will not only join us in Chicago in May 2015 -- but also submit an exciting proposal.

So read on below, and see you in Chicago in May!

Call for Papers:

R/Finance 2015: Applied Finance with R
May 29 and 30, 2015
University of Illinois at Chicago, IL, USA

The seventh annual R/Finance conference for applied finance using R will be held on May 29 and 30, 2015 in Chicago, IL, USA at the University of Illinois at Chicago. The conference will cover topics including portfolio management, time series analysis, advanced risk tools, high-performance computing, market microstructure, and econometrics. All will be discussed within the context of using R as a primary tool for financial risk management, portfolio construction, and trading.

Over the past six years, R/Finance has included attendees from around the world. It has featured presentations from prominent academics and practitioners, and we anticipate another exciting line-up for 2015. This year will include invited keynote presentations by Emanuel Derman, Louis Marascio, Alexander McNeil, and Rishi Narang.

We invite you to submit complete papers in pdf format for consideration. We will also consider one-page abstracts (in txt or pdf format) although more complete papers are preferred. We welcome submissions for both full talks and abbreviated "lightning talks." Both academic and practitioner proposals related to R are encouraged.

All slides will be made publicly available at conference time. Presenters are strongly encouraged to provide working R code to accompany the slides. Data sets should also be made public for the purposes of reproducibility (though we realize this may be limited due to contracts with data vendors). Preference may be given to presenters who have released R packages.

The conference will award two (or more) $1000 prizes for best papers. A submission must be a full paper to be eligible for a best paper award. Extended abstracts, even if a full paper is provided by conference time, are not eligible for a best paper award. Financial assistance for travel and accommodation may be available to presenters, however requests must be made at the time of submission. Assistance will be granted at the discretion of the conference committee.

Please make your submission online at this link. The submission deadline is January 31, 2015. Submitters will be notified via email by February 28, 2015 of acceptance, presentation length, and financial assistance (if requested).

Additional details will be announced via the R/Finance conference website as they become available. Information on previous years' presenters and their presentations are also at the conference website.

For the program committee:

Gib Bassett, Peter Carl, Dirk Eddelbuettel, Brian Peterson, Dale Rosenthal,
Jeffrey Ryan, Joshua Ulrich

/computers/R | permanent link

RcppAnnoy 0.0.3

Hours after the initial blog post announcing the first release of the new package RcppAnnoy, Qiang Kou sent us a very nice pull request adding mmap support in Windows.

So a new release with Windows support is on now CRAN, and Windows binaries should be available by this evening as usual.

To recap, RcppAnnoy wraps the small, fast, and lightweight C++ template header library Annoy written by Erik Bernhardsson for use at Spotify. RcppAnnoy uses Rcpp Modules to offer the exact same functionality as the Python module wrapped around Annoy.

Courtesy of CRANberries, there is also a diffstat report for this release. More detailed information is on the RcppAnnoy page page.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/rcpp | permanent link

Sun, 16 Nov 2014

Introducing RcppAnnoy

A new package RcppAnnoy is now on CRAN.

It wraps the small, fast, and lightweight C++ template header library Annoy written by Erik Bernhardsson for use at Spotify.

While Annoy is setup for use by Python, RcppAnnoy offers the exact same functionality from R via Rcpp.

A new page for RcppAnnoy provides some more detail, example code and further links. See a recent blog post by Erik for a performance comparison of different approximate nearest neighbours libraries for Python.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/rcpp | permanent link

Mon, 10 Nov 2014

BH release 1.54.0-5

A new release of BH, our package providing Boost headers for use by R is now on CRAN. This release was triggered via a request by Ben Goodrich as noted below who asked for Boost Circular Buffer which RStan uses.

No other changes were made.

Changes in version 1.54.0-5 (2014-11-09)

  • Added Boost Circular Buffer requested by Ben Goodrich for RStan

Courtesy of CRANberries, there is also a diffstat report for the most recent release

Comments and suggestions are welcome via the mailing list or the issue tracker at the GitHubGitHub repo.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/bh | permanent link

Thu, 06 Nov 2014

RcppRedis 0.1.2

A new release of RcppRedis is now on CRAN. It contains additional commands for hashes and sets, all contributed by John Laing and Whit Armstrong.

Changes in version 0.1.2 (2014-11-06)

  • New commands execv, hset, hget, sadd, srem, and smembers contributed by John Laing and Whit Armstrong over several pull requests.

Courtesy of CRANberries, there is also a diffstat report for the most recent release. As always, more detailed information is on the RcppRedis page page.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/rcpp | permanent link

Wed, 05 Nov 2014

A Software Carpentry workshop at Northwestern

On Friday October 31, 2014, and Saturday November 1, 2014, around thirty-five graduate students and faculty members attended a Software Carpentry workshop. Attendees came primarily from the Economics department and the Kellogg School of Management, which was also the host and sponsor providing an excellent venue with the Allen Center on the (main) Evanston campus of Northwestern University.

The focus of the two-day workshop was an introduction, and practical initiation, to working effectively at the shell, getting introduced and familiar with the git revision control system, as well as a thorough introduction to working and programming in R---from the basics all the way to advanced graphing as well as creating reproducible research documents.

The idea of the workshop had come out of discussion during our R/Finance 2014 conference. Bob McDonald of Northwestern, one of this year's keynote speakers, and I were discussing various topic related to open source and good programming practice --- as well as the lack of a thorough introduction to either for graduate students and researcher. And once I mentioned and explained Software Carpentry, Bob was rather intrigued. And just a few months later we were hosting a workshop (along with outstanding support from Jackie Milhans from Research Computing at Northwestern).

We were extremely fortunate in that Karthik Ram and Ramnath Vaidyanathan were able to come to Chicago and act as lead instructors, giving me an opportunity to get my feet wet. The workshop began with a session on shell and automation, which was followed by three session focusing on R: a core introduction, a session focused on function, and to end the day, a session on the split-apply-combine approach to data transformation and analysis.

The second day started with two concentrated session on git and the git workflow. In the afternoon, one session on visualization with R as well as a capstone-alike session on reproducible research rounded out the second day.

Things that worked

The experience of the instructors showed, as the material was presented and an effective manner. The selection of topics, as well as the pace were seen by most students to be appropriate and just right. Karthik and Ramnath both did an outstanding job.

No students experienced any real difficulties installing software, or using the supplied information. Participants were roughly split between Windows and OS X laptops, and had generally no problem with bash, git, or R via RStudio.

The overall Software Carpentry setup, the lesson layout, the focus on hands-on exercises along with instruction, the use of the electronic noteboard provided by etherpad and, of course, the tried-and-tested material worked very well.

Things that could have worked better

Even more breaks for exercises could have been added. Students had difficulty staying on pace in some of the exercise: once someone fell behind even for a small error (maybe a typo) it was sometimes hard to catch up. That is a general problem for hands-on classes. I feel I could have done better with the scope of my two session.

Even more cohesion among the topics could have been achieved via a single continually used example dataset and analysis.

Acknowledgments

Robert McDonald from Kellogg, and Jackie Milhans from Research Computing IT, were superb hosts and organizers. Their help in preparing for the workshop was tremendous, and the pick of venue was excellent, and allowed for a stress-free two days of classes. We could not have done this without Karthik and Ramnath, so a very big Thank You to both of them. Last but not least the Software Carpentry 'head office' was always ready to help Bob, Jackie and myself during the earlier planning stage, so another big Thank You! to Greg and Arliss.

/computers/misc | permanent link

RPushbullet 0.1.1

A minor bugfix release 0.1.1 of the RPushbullet package (interfacing the neat Pushbullet service) landed on CRAN yesterday morning.

It cleans up a small issue related to the ability to transfer files between devices via the Pushbullet service where the ability to select a (non-default) target device has now been restored.

With that, allow me to borrow one excellent use case of RPushbullet from the blog of Karl Broman: how to use RPushbullet for error notifications from R:

options(error = function() {
    library(RPushbullet)
    pbPost("note", "Error", geterrmessage())
})

This is very clever: should an error occur, you get immediate notification in browser or on your phone. Left as an exercise for the reader is to combine this with the equally excellent rfoaas package (github|cran) to get appropriately colourful error messages...

More details about the package are at the RPushbullet webpage and the RPushbullet GitHub repo.

Courtesy of CRANberries, there is also a diffstat report for this release.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/rpushbullet | permanent link

Sun, 02 Nov 2014

RcppArmadillo 0.4.500.0

A few days ago, Conrad provided another minor release of Armadillo. Once again, I had created a GitHub-only pre-release of his pre-release which was tested against (then) all ninety (!!) CRAN dependents of our RcppArmadillo package, providing a further test for Conrad's code and uploaded RcppArmadillo 0.4.500.0 to CRAN and Debian once his release was finalized.

The few changes from his end are summarized below; our end also includes an update / extension to the sample() method provided by Christian Gunning --- and used to great effect in the excellent Rcpp Gallery post by Jonathan Olmsted.

Changes in RcppArmadillo version 0.4.500.0 (2014-10-30)

  • Upgraded to Armadillo release Version 4.500 ("Singapore Sling")

    • faster handling of complex vectors by norm()

    • expanded chol() to optionally specify output matrix as upper or lower triangular

    • better handling of non-finite values when saving matrices as text files

  • The sample functionality has been extended to provide the Walker Alias method (including new unit tests) via a pull request by Christian Gunning

Courtesy of CRANberries, there is also a diffstat report for the most recent release. As always, more detailed information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/rcpp | permanent link

Thu, 23 Oct 2014

Introducing Rocker: Docker for R

You only know two things about Docker. First, it uses Linux
containers. Second, the Internet won't shut up about it.

-- attributed to Solomon Hykes, Docker CEO

So what is Docker?

Docker is a relatively new open source application and service, which is seeing interest across a number of areas. It uses recent Linux kernel features (containers, namespaces) to shield processes. While its use (superficially) resembles that of virtual machines, it is much more lightweight as it operates at the level of a single process (rather than an emulation of an entire OS layer). This also allows it to start almost instantly, require very little resources and hence permits an order of magnitude more deployments per host than a virtual machine.

Docker offers a standard interface to creation, distribution and deployment. The shipping container analogy is apt: just how shipping containers (via their standard size and "interface") allow global trade to prosper, Docker is aiming for nothing less for deployment. A Dockerfile provides a concise, extensible, and executable description of the computational environment. Docker software then builds a Docker image from the Dockerfile. Docker images are analogous to virtual machine images, but smaller and built in discrete, extensible and reuseable layers. Images can be distributed and run on any machine that has Docker software installed---including Windows, OS X and of course Linux. Running instances are called Docker containers. A single machine can run hundreds of such containers, including multiple containers running the same image.

There are many good tutorials and introductory materials on Docker on the web. The official online tutorial is a good place to start; this post can not go into more detail in order to remain short and introductory.

So what is Rocker?

rocker logo

At its core, Rocker is a project for running R using Docker containers. We provide a collection of Dockerfiles and pre-built Docker images that can be used and extended for many purposes.

Rocker is the the name of our GitHub repository contained with the Rocker-Org GitHub organization.

Rocker is also the name the account under which the automated builds at Docker provide containers ready for download.

Current Rocker Status

Core Rocker Containers

The Rocker project develops the following containers in the core Rocker repository

  • r-base provides a base R container to build from
  • r-devel provides the basic R container, as well as a complete R-devel build based on current SVN sources of R
  • rstudio provides the base R container as well an RStudio Server instance

We have settled on these three core images after earlier work in repositories such as docker-debian-r and docker-ubuntu-r.

Rocker Use Case Containers

Within the Rocker-org organization on GitHub, we are also working on

  • Hadleyverse which extends the rstudio container with a number of Hadley packages
  • rOpenSci which extends hadleyverse with a number of rOpenSci packages
  • r-devel-san provides an R-devel build for "Sanitizer" run-time diagnostics via a properly instrumented version of R-devel via a recent compiler build
  • rocker-versioned aims to provided containers with 'versioned' previous R releases and matching packages

Other repositories will probably be added as new needs and opportunities are identified.

Deprecation

The Rocker effort supersedes and replaces earlier work by Dirk (in the docker-debian-r and docker-ubuntu-r GitHub repositories) and Carl. Please use the Rocker GitHub repo and Rocker Containers from Docker.com going forward.

Next Steps

We intend to follow-up with more posts detailing usage of both the source Dockerfiles and binary containers on different platforms.

Rocker containers are fully functional. We invite you to take them for a spin. Bug reports, comments, and suggestions are welcome; we suggest you use the GitHub issue tracker.

Acknowledgments

We are very appreciative of all comments received by early adopters and testers. We also would like to thank RStudio for allowing us the redistribution of their RStudio Server binary.

Published concurrently at rOpenSci blog and Dirk's blog.

Authors

Dirk Eddelbuettel and Carl Boettiger

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/rocker | permanent link

Sun, 19 Oct 2014

littler 0.2.1

max-heap image

A new maintenance release of littler is available now.

The main change are a few updates and extensions to the examples provided along with littler. Several of those continue to make use of the wonderful docopt package by Edwin de Jonge. Carl Boettiger and I are making good use of these littler examples, particularly to install directly from CRAN or GitHub, in our Rocker builds of R for Docker (about which we should have a bit more to blog soon too).

Full details for the littler release are provided as usual at the ChangeLog page.

The code is available via the GitHub repo, from tarballs off my littler page and the local directory here. A fresh package has gone to the incoming queue at Debian; Michael Rutter will probably have new Ubuntu binaries at CRAN in a few days too.

Comments and suggestions are welcome via the mailing list or issue tracker at the GitHub repo.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/littler | permanent link

Sun, 12 Oct 2014

Seinfeld streak at GitHub

Early last year, I referred to a Seinfeld Streak in a blog post referring to almost two months of updates to the Rcpp Gallery. This is sometimes called Jerry Seinfeld's secret to productivity: Just keep at it. Don't break the streak.

I now have different streak:

github activity october 2013 to october 2014

Now we'll see how far this one will go.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/computers/misc | permanent link

Fri, 10 Oct 2014

RPushbullet 0.1.0 with a lot more awesome

A new release 0.1.0 of the RPushbullet package (interfacing the neat Pushbullet service) landed on CRAN today.

It brings a number of goodies relative to the first release 0.0.2 of a few months ago:

  • pushing of files is now supported thanks to a nice pull request bu Mike Birdgeneau
  • a default device can be designated in the ~/.rpushbullet.json file or options
  • initialization has been rewritten to use recpients which can be indices, device names or, if missing entirely, the (new) default device
  • alternatively, email is supported as another recipient option in which case the Pushbullet service will send an email to the give address
  • pbGetDevices() now returns a proper S3 object with corresponding print() and summary() methods
  • the documentation regarding package initialization, and setting of key, devices, etc has been expanded
  • more examples has been added to the documentation
  • various minor cleanups, fixes, corrections throughout

There is a whole boat load of more wickedness in the Pushbullet API so if anybody feels compelled to add it, fire off pull requests at GitHub.

More details about the package are at the RPushbullet webpage and the RPushbullet GitHub repo.

Courtesy of CRANberries, there is also a diffstat report for this release.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/rpushbullet | permanent link

Mon, 29 Sep 2014

Rcpp 0.11.3

A new release 0.11.3 of Rcpp is now on the CRAN network for GNU R, and an updated Debian package has been uploaded too.

Rcpp has become the most popular way of enhancing GNU R with C++ code. As of today, 273 packages on CRAN depend on Rcpp for making analyses go faster and further.

This release brings a fairly large number of continued enhancements, fixes and polishing to Rcpp. These were provided by a total of seven different contributors---which is a new record as well.

See below for a detailed list of changes extracted from the NEWS file, but some highlights included in this release are

  • Several API cleanups, polishes and a pre-announced code removal
  • New InternalFunction interface, and new Timer functionality.
  • More robust functionality of Rcpp Attributes as well as a new dryRun option.
  • The Rcpp FAQ was updated, as was the main Description: in the DESCRIPTION file.
  • Rcpp.package.skeleton() can now deploy functionality from pkgKitten to create Rcpp packages that purr.

One sore point, however, is that we missed that packages using Rcpp Modules appear to require a rebuild. We are sorry for the inconvenience; this has highlighted a shortcoming in our fairly robust and extensive tests. While we test our packages against all known CRAN dependents, such tests check for the ability to compile and run freshly and not whether previously built packages still run. We intend to augment our testing in this direction to avoid a repeat occurrence of such a misfeature.

Changes in Rcpp version 0.11.3 (2014-09-27)

  • Changes in Rcpp API:

    • The deprecation of RCPP_FUNCTION_* which was announced with release 0.10.5 last year is proceeding as planned, and the file macros/preprocessor_generated.h has been removed.

    • Timer no longer records time between steps, but times from the origin. It also gains a get_timers(int) methods that creates a vector of Timer that have the same origin. This is modelled on the Rcpp11 implementation and is more useful for situations where we use timers in several threads. Timer also gains a constructor taking a nanotime_t to use as its origin, and a origin method. This can be useful for situations where the number of threads is not known in advance but we still want to track what goes on in each thread.

    • A cast to bool was removed in the vector proxy code as inconsistent behaviour between clang and g++ compilations was noticed.

    • A missing update(SEXP) method was added thanks to pull request by Omar Andres Zapata Mesa.

    • A proxy for DimNames was added.

    • A no_init option was added for Matrices and Vectors.

    • The InternalFunction class was updated to work with std::function (provided a suitable C++11 compiler is available) via a pull request by Christian Authmann.

    • A new_env() function was added to Environment.h

    • The return value of range eraser for Vectors was fixed in a pull request by Yixuan Qiu.

  • Changes in Rcpp Sugar:

    • In ifelse(), the returned NA type was corrected for operator[].

  • Changes in Rcpp Attributes:

    • Include LinkingTo in DESCRIPTION fields scanned to confirm that C++ dependencies are referenced by package.

    • Add dryRun parameter to sourceCpp.

    • Corrected issue with relative path and R chunk use for sourceCpp.

  • Changes in Rcpp Documentation:

    • The Rcpp-FAQ vignette was updated with respect to OS X issues.

    • A new entry in the Rcpp-FAQ clarifies the use of licenses.

    • Vignettes build results no longer copied to /tmp to please CRAN.

    • The Description in DESCRIPTION has been shortened.

  • Changes in Rcpp support functions:

    • The Rcpp.package.skeleton() function will now use pkgKitten package, if available, to create a package which passes R CMD check without warnings. A new Suggests: has been added for pkgKitten.

    • The modules=TRUE case for Rcpp.package.skeleton() has been improved and now runs without complaints from R CMD check as well.

  • Changes in Rcpp unit test functions:

    • Functions from the RUnit package are now prefixed with RUnit::

    • The testRcppModule and testRcppClass sample packages now pass R CMD check --as-cran cleanly with NOTES or WARNINGS

Thanks to CRANberries, you can also look at a diff to the previous release As always, even fuller details are on the Rcpp Changelog page and the Rcpp page which also leads to the downloads page, the browseable doxygen docs and zip files of doxygen output for the standard formats. A local directory has source and documentation too. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/rcpp | permanent link