Mon, 17 Jan 2011

Keeping simple things simple

My friend Jeff deserves a sincere congratulation for finally unveiling his rebranded R consultancy Lemnica. One notable feature of the new website is a section called esoteric R which discusses less frequently-visited corners of the R world. It even boasts its own CRAN package esotericR with the example sources. esoteric R currently holds two articles. Jeff had sent me the one about introducing closures a while back, and I like it and may comment at another time. What caught me by surprise when Lemnica finally opened was the other article: R calling C.

It is a fine article motivated by all the usual reasons that are e.g. mentioned in the Google Tech Talk which Romain and I gave last October about our work around Rcpp. But it is just not simple.

Allow me to explain. When Jeff showed this C language file

#include <R.h>
#include <Rinternals.h>

SEXP esoteric_rev (SEXP x) {
  SEXP res;
  int i, r, P=0;
  PROTECT(res = allocVector(REALSXP, length(x))); P++;

  for(i=length(x), r=0; i>0; i--, r++) {
     REAL(res)[r] = REAL(x)[i-1];
  }

  copyMostAttrib(x, res);
  UNPROTECT(P);
  return res;
}
and then needs several paragraphs to explain what is going on, what is needed to compile and then how to load it --- I simply could not resist. Almost immediately, I emailed back to him something as simple as this using both our Rcpp package as well as the wonderful inline package by Oleg which Romain and I more or less adopted:
library(inline)  ## for cxxfunction()
src <- 'Rcpp::NumericVector x = Rcpp::NumericVector(xs);
        std::reverse(x.begin(), x.end());
        return(x);'
fun <- cxxfunction(signature(xs="numeric"), body=src, plugin="Rcpp")
fun( seq(0, 1, 0.1) )
Here we load inline, and then define a three-line C++ program using facilities from our Rcpp package. All we need to revert a vector is to first access its R object in C++ by instantiating the R vector as a NumericVector. These C++ classes then provide iterators which are compatible with the Standard Template Library (STL). So we simply call the STL function reverse pointing the beginning and end of the vector, and are done! Rcpp then allows us the return the C++ vector which it turns into an R vector. Efficient in-place reversal, just like Jeff had motivated, in three lines. Best of all, we can execute this from within R itself:
R> library(inline)  ## for cxxfunction()
R> src <- 'Rcpp::NumericVector x = Rcpp::NumericVector(xs);
+         std::reverse(x.begin(), x.end());
+         return(x);'
R> fun <- cxxfunction(signature(xs="numeric"), body=src, plugin="Rcpp")
R> fun( seq(0, 1, 0.1) )
 [1] 1.0 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0.0
R>

Lastly, Jeff shows a more complete example wherein a new vector is created, and any potential attributes are copied as well. Naturally, we can do that too. First, we used clone() to make a deep copy (ie forcing creation of a new object rather than a mere proxy) and use the same R API function he accessed---but it our case both prefixed with ::Rf_ for R remapping (to protect clashed with other functions with identical names) and a global namespace identifier (as it is a global C function from R).

R> library(inline)
R> src <- 'Rcpp::NumericVector x = Rcpp::clone<Rcpp::NumericVector>(xs);
+         std::reverse(x.begin(), x.end());
+         ::Rf_copyMostAttrib(xs, x);
+         return(x);'
R> fun <- cxxfunction(signature(xs="numeric"), body=src, plugin="Rcpp")
R> obj <- structure(seq(0, 1, 0.1), obligatory="hello, world!")
R> fun(obj)
 [1] 1.0 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0.0
attr(,"obligatory")
[1] "hello, world!"
R> obj
 [1] 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
attr(,"obligatory")
[1] "hello, world!"
R>
Both the obj variable and the new copy contain the desired data attribute, the new copy is reversed, the original is untouched---and all in four lines of C++ called via one inline call. I have now been going on for over one hundred lines yet I never had to mention memory management, pointers, PROTECT or other components of the R API for C. Hopefully, this short writeup provided an idea of why Romain and I think Rcpp is the way to go for creating C/C++ functions for extending and enhancing R.

/code/snippets | permanent link