Overview
The Rcpp package provides C++ classes that greatly facilitate interfacing
C or C++ code in R packages using the .Call() interface provided by R.
Rcpp provides matching C++ classes for a large number of basic R data
types. Hence, a package author can keep his data in normal R data structures
without having to worry about translation or transferring to C++. At the same
time, the data structures can be accessed as easily at the C++ level, and
used in the normal manner.
The mapping of data types works in both directions. It is as
straightforward to pass data from R to C++ as it is to return data
from C++ to R. The following two sections list supported data types.
Transfer from R to C++, and from C++ to R
R data types (SEXP) are matched to C++ objects in a class hierarchy. All R
types are supported (vectors, functions, environments, etc.) and each
type is mapped to a dedicated class. For example, numeric vectors are
represented as instances of the Rcpp::NumericVector class, environments are
represented as instances of Rcpp::Environment, functions are represented as
Rcpp::Function, etc.
The underlying C++ library also offers the Rcpp::wrap function which is a
templated function that transforms an arbitrary object into a SEXP. This
makes it straightforward to implement C++ logic in terms of standard C++
types such as STL containers and then wrap them when they need to be
returned to R. Internally, wrap uses advanced template metaprogramming
techniques and currently supports these data types: primitive types (bool, int, double,
size_t, Rbyte, Rcomplex, std::string), STL containers (e.g. std::vector<T>)
where T is wrappable, STL maps (e.g. std::map<std::string,T>) where T is
wrappable, and arbitrary types that support implicit conversion to SEXP.
The reverse conversion (from R into C++) is performed by the Rcpp::as function
template offering a similar degree of flexibility.
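As a minimal sketch of this division of labour (not taken from the Rcpp sources), the logic below is written purely against STL types. The function names cumulative_sum, count_labels and check_examples are hypothetical, and the Rcpp glue appears only as a comment, assuming a build with Rcpp.h available; it relies on Rcpp::as and Rcpp::wrap handling std::vector and std::map as described above.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Plain C++ logic: running total of a numeric vector.
std::vector<double> cumulative_sum(const std::vector<double>& x) {
    std::vector<double> out;
    out.reserve(x.size());
    double total = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        total += x[i];
        out.push_back(total);
    }
    return out;
}

// Plain C++ logic: count how often each label occurs.
std::map<std::string, int> count_labels(const std::vector<std::string>& labels) {
    std::map<std::string, int> counts;
    for (std::size_t i = 0; i < labels.size(); ++i) {
        ++counts[labels[i]];
    }
    return counts;
}

// With Rcpp available, the .Call entry points could be thin wrappers,
// roughly along these lines (assumption: not compiled here):
//
//   #include <Rcpp.h>
//   extern "C" SEXP cumulative_sum_R(SEXP x) {
//       return Rcpp::wrap(cumulative_sum(Rcpp::as<std::vector<double> >(x)));
//   }
//   extern "C" SEXP count_labels_R(SEXP x) {
//       return Rcpp::wrap(count_labels(Rcpp::as<std::vector<std::string> >(x)));
//   }

// Small self-check of the plain C++ part.
void check_examples() {
    const double raw[] = {1.0, 2.0, 3.5};
    const std::vector<double> sums =
        cumulative_sum(std::vector<double>(raw, raw + 3));
    assert(sums.size() == 3);
    assert(sums[0] == 1.0 && sums[1] == 3.0 && sums[2] == 6.5);

    const char* names[] = {"a", "b", "a"};
    std::map<std::string, int> counts =
        count_labels(std::vector<std::string>(names, names + 3));
    assert(counts["a"] == 2 && counts["b"] == 1);
}
```

Keeping the computation free of R types means it can be compiled and tested on its own; only the thin entry points would depend on Rcpp.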
New features
Starting with release 0.7.1, a namespace Rcpp is provided. It contains a main
class RObject as well as other classes that derive from RObject: Environment
for environments (ENVSXP), Language for calls (LANGSXP), and the template XPtr
for external pointers.
Releases 0.7.2 and later extend this to a number of additional R types along
with a number of facilities for automatic conversion thanks to clever use of
templates.
Release 0.8.1 adds support for exposing code in C++ directly to R using
modules. The corresponding Rcpp-modules
vignette has more details.
Release 0.8.3 adds sugar: expression templates that allow compact
vectorised expressions just like in R but at compiled speed; see the Rcpp-sugar
vignette.
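The idea behind expression templates can be sketched in plain C++. This is a toy illustration of the technique, not Rcpp's sugar implementation: the types Expr, Vec, Sum and Prod and the function check_lazy_expression are hypothetical names chosen here. An arithmetic expression builds a lightweight tree of node types at compile time, and the element-wise work happens in a single loop when the result is constructed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// CRTP base: every expression type E provides size() and operator[].
template <typename E>
struct Expr {
    const E& self() const { return static_cast<const E&>(*this); }
    std::size_t size() const { return self().size(); }
    double operator[](std::size_t i) const { return self()[i]; }
};

// Concrete vector that owns its data.
class Vec : public Expr<Vec> {
    std::vector<double> data_;
public:
    explicit Vec(std::size_t n, double value = 0.0) : data_(n, value) {}

    // Building a Vec from any expression evaluates it in one loop,
    // without temporaries for the intermediate results.
    template <typename E>
    Vec(const Expr<E>& e) : data_(e.size()) {
        for (std::size_t i = 0; i < data_.size(); ++i) data_[i] = e[i];
    }

    std::size_t size() const { return data_.size(); }
    double operator[](std::size_t i) const { return data_[i]; }
    double& operator[](std::size_t i) { return data_[i]; }
};

// Lazy node for element-wise addition: stores references, computes on demand.
template <typename L, typename R>
class Sum : public Expr<Sum<L, R> > {
    const L& lhs_;
    const R& rhs_;
public:
    Sum(const L& l, const R& r) : lhs_(l), rhs_(r) {}
    std::size_t size() const { return lhs_.size(); }
    double operator[](std::size_t i) const { return lhs_[i] + rhs_[i]; }
};

// Lazy node for element-wise multiplication.
template <typename L, typename R>
class Prod : public Expr<Prod<L, R> > {
    const L& lhs_;
    const R& rhs_;
public:
    Prod(const L& l, const R& r) : lhs_(l), rhs_(r) {}
    std::size_t size() const { return lhs_.size(); }
    double operator[](std::size_t i) const { return lhs_[i] * rhs_[i]; }
};

// The operators only build nodes; no arithmetic happens here.
template <typename L, typename R>
Sum<L, R> operator+(const Expr<L>& l, const Expr<R>& r) {
    return Sum<L, R>(l.self(), r.self());
}

template <typename L, typename R>
Prod<L, R> operator*(const Expr<L>& l, const Expr<R>& r) {
    return Prod<L, R>(l.self(), r.self());
}

// Usage: z = x + y * y, evaluated element by element in the Vec constructor.
void check_lazy_expression() {
    Vec x(3, 1.0);
    Vec y(3);
    y[0] = 1.0; y[1] = 2.0; y[2] = 3.0;
    Vec z(x + y * y);
    assert(z.size() == 3);
    assert(z[0] == 2.0 && z[1] == 5.0 && z[2] == 10.0);
}
```

Because the nodes hold references, an expression should be consumed within the statement that creates it, as in the constructor call above.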
Release 0.8.6 adds special functions cherished for statistics: d/p/q/r-style
for most relevant distributions, in a form that is very close to what we'd use
in R.
Release 0.8.7 adds support for ReferenceClasses in R 2.12.0; this now brings
S4-based ReferenceClasses in the OO-style of Java or C++ to the R language.
Release 0.9.0 split support for the legacy classic API into its own
package RcppClassic.
Inline use
As of version 0.7.0, Rcpp also contains a modified function 'cfunction' taken
from the excellent 'inline' package by Oleg Sklyar. This allows the user to
define the body of a C++ function as a standard R character vector -- which
is passed to 'cfunction' along with a few other parameters. The function
then builds a complete C++ source file containing a function with the given
body --- and then compiles, links and loads it for us. Together with the
Rcpp interface classes this makes for very easy use of C++ from R --- as
everything can be done from the R prompt without any need for Makefiles,
configuration settings etc pp.
As of version 0.8.1, an extended function 'cxxfunction' is used (which
requires inline 0.3.5). This function makes it easier to use C++ code with Rcpp. In
particular, it enforces use of the .Call interface, adds the Rcpp namespace,
and sets up exception forwarding. It employs the macros BEGIN_RCPP and
END_RCPP to enclose the user code.
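The pattern behind such enclosing macros can be sketched in plain C++. The real BEGIN_RCPP and END_RCPP hand the exception over to R's error handling; the sketch below does not reproduce them. It uses hypothetical macros BEGIN_GUARD and END_GUARD and a hypothetical report_error function that merely records the message, so that the idea can be compiled without R.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for the host's error channel (R's error mechanism
// in the case of Rcpp): here we only remember the last message.
static std::string last_error;
void report_error(const std::string& msg) { last_error = msg; }

// Hypothetical guard macros: open a try block, then catch everything so
// that no C++ exception escapes across the C-level boundary.
#define BEGIN_GUARD try {
#define END_GUARD \
    } catch (const std::exception& ex) { \
        report_error(ex.what()); \
        return -1; \
    } catch (...) { \
        report_error("unknown C++ exception"); \
        return -1; \
    }

// User code sits between the macros and may throw freely.
int element_at(const std::vector<int>& v, std::size_t i) {
    BEGIN_GUARD
    if (i >= v.size()) throw std::out_of_range("index out of range");
    return v[i];
    END_GUARD
}

// Usage: a valid call returns the element, an invalid one is reported
// through the error channel instead of propagating the exception.
void check_guard() {
    last_error.clear();
    std::vector<int> v(3, 7);
    assert(element_at(v, 1) == 7);
    assert(last_error.empty());
    assert(element_at(v, 5) == -1);
    assert(last_error == "index out of range");
}
```

Generating this wrapper automatically means the user-supplied function body can stay free of error-handling boilerplate.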
Moreover, with cfunction (and cxxfunction), we can even call external
libraries and have them linked as well.
Several examples of this are included with the package; one has also been posted on my
blog.
This even works on Windows if you have the working 'R tools' installed along
with R. See the R-on-Windows FAQ and additional documentation.
Unit testing
As of version 0.9.9, over 750 unit tests called from over 330 unit test
functions are included in the package to ensure that no regressions are
introduced in terms of API compatibility. The unit tests also serve as a
(arguably somewhat raw) form of examples for usage. A vignette is
auto-generated with the results of the unit tests.
Usage for package building
Rcpp provides a main header file Rcpp.h and a library inside the installed
package in the directory lib. From within R, you can compute
the directory location via
system.file("lib", "Rcpp.h", package="Rcpp"), but both are
provided for your use via the functions Rcpp::RcppCxxFlags()
and Rcpp::RcppLdFlags(). So we can just use the following as a
file src/Makevars (or src/Makevars.win on Windows):
PKG_CXXFLAGS=`${R_HOME}/bin/Rscript -e "Rcpp:::CxxFlags()"`
PKG_LIBS=`${R_HOME}/bin/Rscript -e "Rcpp:::LdFlags()"`
See the help page for Rcpp-package for details.
Also note that starting with version 0.8.0, the 'LinkingTo' argument can also be employed in
packages using Rcpp. This will let R determine the location of the header
files and users only need to use Rcpp::RcppLdFlags() (as
detailed above) to point to the actual library, and this is clearly the
recommended approach.
Moreover, we added an entire vignette on how
to use Rcpp in your package with a detailed discussion.
Demo package
The RcppExamples package (on CRAN) provides a simple illustration of how to
use Rcpp, and can also be used as a framework for deploying Rcpp. This
package is however somewhat incomplete in terms of examples, so please see
below for examples provided by several dozen packages using Rcpp.
Class documentation
We now have Doxygen-generated documentation of all the classes in
browseable and searchable html
and as a
pdf file.
We no longer include the Doxygen-generated documentation in the source
tarball as it is simply too big. But we have zip archives of the
html,
latex, and
man documentation.
Other documentation
Besides the doxygen-generated reference manual we also have these eight vignettes:
- The Rcpp-introduction
vignette provides a short overview of Rcpp and an introduction (and has
also been published as Volume 40, Issue 8 of the Journal of Statistical Software),
- the Rcpp-package
vignette shows how to write your own package using Rcpp,
- the Rcpp-FAQ
vignette addresses several frequently asked questions,
- the Rcpp-modules
vignette discusses how to expose C++ functions and modules with ease
using an idea borrowed from Boost::Python,
- the Rcpp-extending
vignette details the steps needed to extend Rcpp with user-provided or third-party
classes,
- the Rcpp-sugar
vignette provides an introduction to the Rcpp sugar features
inspired by vectorised R code,
- the Rcpp-quickref
vignette provides a quick reference cheat sheet (but is still mostly incomplete), and
- the Rcpp-unitTests
vignette contains a summary of the (by now over two hundred) unit tests for Rcpp.
All vignettes are also installed with the package, and available at the CRAN page.
Google Tech Talk
In late October 2010, the R intergrouplet at Google was kind enough
to invite us for a talk on Rcpp. The resulting talk was recorded and is now
available on YouTube.
Example usage
The following CRAN, R-Forge or
BioConductor packages use Rcpp:
- RQuantLib, an R
interface to QuantLib quantitative
finance libraries
- RInside, a set of C++
classes that make it easy to embed R in your C++ applications
- EarthMoveDist,
an implementation of the Earth Mover's distance metric for R
- RProtoBuf,
an interface from R to the Google ProtoBuf library
- mvabund, a
set of tools for displaying, modeling and analysing multivariate abundance
data in community ecology.
- sdcTable, a
package for statistical disclosure control for tabular data.
- highlight, a
syntax highlighting utility based on an R parser that can render to latex
and html.
- bifactorial, a
package for global and multiple inference for given bi- and tri-factorial
clinical trial designs.
- RcppExamples, an
example package illustrating use of Rcpp and providing concrete examples.
- RcppArmadillo, an
interface from R to the Armadillo C++ linear algebra library using Rcpp.
- minqa which
provides derivative-free optimization by
quadratic approximation based on an interface to Fortran implementations by
M. J. D. Powell.
- pcaMethods
provides Bayesian PCA, Probabilistic PCA, Nipals PCA, Inverse Non-Linear PCA and the conventional SVD PCA.
- termstrc
offers a wide range of functions for term structure estimation based on
static and dynamic coupon bond and yield data.
- phylobase
implements a base S4 class for comparison of phylogenetic structures and data.
- RSNNS
wraps the Stuttgart Neural Network Simulator (SNNS), a library containing
many standard implementations of neural networks, and brings these to R.
- parser
implements a detailed source code parser based on the R parser and grammar
with a different representation of the parsed expressions.
- RcppGSL which
provides an interface from R to the
GNU GSL vector and matrix
types.
- orQA can be used
to assess repeatability, accuracy and cross-platform agreement of titration
microarray data.
- RcppDE provides
differential evolution optimization (just like
DEoptim on which it
is based) and serves as a small case study in porting from plain C to the
combination of C++ and Rcpp.
- RcppBDT provides
(parts of) Boost Date.Time
by using Rcpp modules to easily expose the Boost functionality to R.
- unmarked uses
Rcpp and RcppArmadillo to provide code to fit hierarchical models of animal
abundance and occurrence to data collected using survey methods such as
point counts, site occupancy sampling, distance sampling, removal sampling,
and double observer sampling.
- The simFrame package
provides an object-oriented general framework for statistical simulations.
- The rgam package
also uses Rcpp and RcppArmadillo to provide an outlier-robust fit for
Generalized Additive (gam) Models.
- The spacodiR package
implements an interface to SPACoDi which is primarily designed to
characterise the structure and phylogenetic diversity of communities using
abundance or presence-absence data of species among community plots.
- The VIM package
provides visualization for missing values.
- NetworkAnalysis
provides statistical inference on populations of weighted or unweighted
networks.
- The SBSA package
uses RcppArmadillo to provide functions for simplified Bayesian sensitivity analysis.
- GUTS contains
functions for the fast calculation of the likelihood of a stochastic
survival model.
- FABIA implements a model-based technique for biclustering, that is
clustering rows and columns simultaneously. Biclusters are found by factor
analysis where both the factors and the loading matrix are sparse.
- The wordcloud package
uses Rcpp to accelerate rendering word clouds from text.
- auteur
implements a Bayesian sampler of the trait-evolutionary process to
identify shifts in process of continuous-trait evolution on phylogenetic trees.
- The cds
package uses Rcpp modules and RcppArmadillo to model coupled dipole
approximations: Given a set of ellipsoidal nanoparticles, it calculates the
polarizability tensor for the dipoles associated with each particle, and
solves the coupled-dipole equations by direct inversion of the interaction
matrix.
- The planar
package uses Rcpp modules and RcppArmadillo to solve the electromagnetic
problem of reflection and transmission at a multilayer planar
interface. Also computes the decay rates for a dipolar emitter near a
multilayer structure.
- The maxent
package provides tools for text classification using multinomial logistic
regression, also known as maximum entropy. The focus of this maximum
entropy classifier is to minimize memory consumption on very large
datasets, particularly sparse document-term matrices represented by the tm
package.
- fdaMixed
offers functional data analysis in a mixed-model framework via a
likelihood-based analysis; it uses Rcpp and RcppArmadillo.
- KernSmoothIRT
fits nonparametric item and option characteristic curves using kernel
smoothing, and allows for optimal selection of the smoothing bandwidth using
cross-validation and a variety of exploratory plotting tools.
- The rugarch
package can estimate a variety of univariate GARCH models including ARFIMA,
in-mean effects, use of external regressors and various other GARCH
flavours using both Rcpp and RcppArmadillo.
- bcp provides an
implementation of an approximation to the product partition model for the
normal errors change point problem using Markov Chain Monte Carlo, and also
extends the methodology to independent multivariate series with an assumed
common change point structure.
- RVowpalWabbit
provides an interface to the Vowpal Wabbit fast on-line learner by John
Langford et al.
- The rococo
package provides a robust gamma rank correlation coefficient
along with a permutation-based rank correlation test both of which
are explicitly designed for dealing with noisy numerical data.
- The LaF
package provides methods to efficiently access data from large ascii files,
including subsetting and block-wise access.
- The ANN
package implements a feedforward Artificial Neural Network (ANN) optimized
by Genetic Algorithm (GA), using the Rcpp and RcppClassic packages.
- The Rclusterpp
package provides flexible native clustering routines that can be linked
against in downstream packages, and uses Rcpp and RcppEigen.
- The bfa
package provides model fitting for several Bayesian factor models
including Gaussian, ordinal probit, mixed and semiparametric Gaussian
copula factor models; it uses Rcpp and RcppArmadillo.
- The nfda
package implements nonparametric functional data analysis; it also uses
Rcpp and RcppArmadillo.
- RSofia
provides an R interface to the sofia-ml suite of fast
incremental algorithms for machine learning suitable for training
models for classification or ranking.
- The fastGHQuad package
implements functions for fast (and numerically stable) Gauss-Hermite quadrature.
- The SpatialTools package
provides tools for spatial analysis with an emphasis on kriging using Rcpp
and RcppArmadillo.
- acer implements the
ACER method for extreme value estimation which finds return levels of extreme values.
- The psgp package provides
projected spatial Gaussian process methods for sparse spatial kriging; it
uses Rcpp and RcppArmadillo.
History
Rcpp was initially written by Dominick Samperi to ease contributions to the
RQuantLib
project, and then released as a project in its own right. During 2006,
Dominick made several releases under the RCpp name (versions 1.0 to 1.4)
before he changed the name to RCppTemplate and made more releases (1.5 to
5.2). His project saw no public releases for the thirty-five-month period
from November 2006 to November 2009.
As a user of Rcpp, I (Dirk) chose to adopt Rcpp during 2008, made a first release
0.6.0 in November 2008 and have made a number of new releases since -- see
the ChangeLog for details. Rcpp is
open for contributions and patches, some of which have already been integrated.
Romain Francois
joined the effort just before the 0.7.0 release and brought
along a lot of energy and new ideas. We now have a
mailing list
for discussions around Rcpp. If you have ideas or suggested changes, send an
email there.
Download
A local archive is available
here and at
CRAN;
SVN access is provided at
R-Forge.
License
Rcpp is licensed under the GNU GPL version 2 or later.
Last modified: Tue Jan 24 07:32:23 CST 2012