The post also fell short of a good *aggregated timing comparison* for which we love the
rbenchmark package. So in order to rectify this, and to see
what we can do here with Rcpp, a quick post revisiting the
issue.

As preliminaries, we need to load three packages: inline to create compiled code on the fly (which, I should mention, is also used together with Rcpp by the Stan / RStan MCMC sampler which is creating some buzz this week), the compiler package included with R to create byte-compiled code and lastly the aforementioned rbenchmark package to do the timings. We also set row and column dimension, and set them a little higher than the original example to actually have something measurable:

The first suggestion was the one by Scott himself. We will wrap this one, and all the following ones, in a function so that all approaches are comparable as being in a function of two dimension arguments:library(inline) library(compiler) library(rbenchmark) n <- 500 k <- 100

We also immediatly compute a byte-compiled version (just because we now can) to see if this helps at all with the code. As there are no (explicit !) loops, we do not expect a big pickup. Scott's function works, but sweeps thescott <- function(N, K) { mm <- matrix(0, N, K) apply(mm, c(1, 2), function(x) sample(c(0, 1), 1)) } scottComp <- cmpfun(scott)

`sample()`

function across all rows and columns which is probably going to be (relatively) expensive.
Next is the first improvement suggested to Scott which came from Ted Hart.

This is quite a bit smarter as it vectorises the approach, generating N times K elements at once which are then reshaped into a matrix.ted <- function(N, K) { matrix(rbinom(N * K, 1, 0.5), ncol = K, nrow = N) }

Another suggestion came from David Smith as well as Rafael Maia. We rewrite it slightly to make it a function with two arguments for the desired dimensions:

This is very clever as it usesdavid <- function(m, n) { matrix(sample(0:1, m * n, replace = TRUE), m, n) }

`sample()`

over zero and one rather than making (expensive) draws from random
number generator.
Next we have a version from Luis Apiolaza:

It draws from a random uniform and rounds to zero and one, rather than deploying the binomial.luis <- function(m, n) { round(matrix(runif(m * n), m, n)) }

Then we have the version using RcppArmadillo hinted at by Scott, but with actual arguments and a correction for row/column dimensions. Thanks to inline we can write the C++ code as an R character string; inline takes care of everything and we end up with C++-based solution directly callable from R:

This works, and is pretty fast. The only problem is thatarma <- cxxfunction(signature(ns="integer", ks="integer"), plugin = "RcppArmadillo", body=' int n = Rcpp::as<int>(ns); int k = Rcpp::as<int>(ks); return wrap(arma::randu(n, k)); ')

which uses the the old rounding approximation of adding 1/2 before truncating.armaFloor <- cxxfunction(signature(ns="integer", ks="integer"), plugin = "RcppArmadillo", body=' int n = Rcpp::as<int>(ns); int k = Rcpp::as<int>(ks); return wrap(arma::floor(arma::randu(n, k) + 0.5)); ')

With Armadillo in the picture, we do wonder how Rcpp sugar would do. Rcpp sugar, described in one of the eight vignettes
of the Rcpp package, is using template meta-programming to
provide R-like expressiveness (aka "syntactic sugar") at the C++ level. In particular, it gives access to R's RNG
functions *using the exact same RNGs as R* making the results directly substitutable (whereas Armadillo uses its
own RNG).

Heresugar <- cxxfunction(signature(ns="integer", ks="integer"), plugin = "Rcpp", body=' int n = Rcpp::as<int>(ns); int k = Rcpp::as<int>(ks); Rcpp::RNGScope tmp; Rcpp::NumericVector draws = Rcpp::runif(n*k); return Rcpp::NumericMatrix(n, k, draws.begin()); ')

`Rcpp::RNGScope`

deals with setting/resetting the R RNG state. This draws a vector of N time K uniforms
similar to Luis' function -- and just like Luis' R function does so without looping -- and then shapes a matrix of dimension N by K from it.
And it does of course have the same problem as the RcppArmadillo approach earlier and we can use the same solution:

sugarFloor <- cxxfunction(signature(ns="integer", ks="integer"), plugin = "Rcpp", body=' int n = Rcpp::as<int>(ns); int k = Rcpp::as<int>(ks); Rcpp::RNGScope tmp; Rcpp::NumericVector draws = Rcpp::floor(Rcpp::runif(n*k)+0.5); return Rcpp::NumericMatrix(n, k, draws.begin()); ')

Now that we have all the pieces in place, we can compare:

With all the above code example in a small R script we call via littler, we getres <- benchmark(scott(n, k), scottComp(n,k), ted(n, k), david(n, k), luis(n, k), arma(n, k), sugar(n,k), armaFloor(n, k), sugarFloor(n, k), order="relative", replications=100) print(res[,1:4])

We can see several takeaways:edd@max:~/svn/rcpp/pkg$ r /tmp/scott.r Loading required package: methods test replications elapsed relative 7 sugar(n, k) 100 0.072 1.000000 9 sugarFloor(n, k) 100 0.088 1.222222 6 arma(n, k) 100 0.126 1.750000 4 david(n, k) 100 0.136 1.888889 8 armaFloor(n, k) 100 0.138 1.916667 3 ted(n, k) 100 0.384 5.333333 5 luis(n, k) 100 0.410 5.694444 1 scott(n, k) 100 33.045 458.958333 2 scottComp(n, k) 100 33.767 468.986111

- Rcpp sugar wins, which is something we have seen in previous posts on this blog. One hundred replication take only 72 milliseconds (or 88 in the corrected version) --- less than one millisecond per matrix creation.
- RcppArmadillo does well too, and I presume that the small difference is due not to code in Armadillo but the fact that we need one more 'mapping' of data types on the way back to R
- The
`sample()`

idea by David and Rafael is very, very fast too. This proves once again that*well-written*R code can be competitive. It also suggest how to make the C++ solution by foregoing (expensive) RNG draws in favour of sampling - The approaches by Ted and Luis are also pretty good. In practice, the are probably good enough.
- Scott's function is not looking so hot (particularly as we increased the problem dimensions) and byte-compilation does not help at all.

`rbinom()`

Rcpp sugar
function, or implementing `sample()`

at the C++ level is, as the saying goes, left as an exercise to the reader.