Overview
The Rcpp package provides C++ classes that greatly facilitate interfacing
C or C++ code in R packages using the .Call() interface provided by R.
Rcpp provides matching C++ classes for a large number of basic R data
types. Hence, a package author can keep their data in normal R data structures
without having to worry about translation or transferring to C++. At the same
time, the data structures can be accessed as easily at the C++ level, and
used in the normal manner.
The mapping of data types works in both directions. It is as
straightforward to pass data from R to C++ as it is to return data
from C++ to R. The following two sections list supported data types.
Transfer from R to C++, and from C++ to R
R data types (SEXP) are matched to C++ objects in a class hierarchy. All R
types are supported (vectors, functions, environment, etc ...) and each
type is mapped to a dedicated class. For example, numeric vectors are
represented as instances of the Rcpp::NumericVector class, environments are
represented as instances of Rcpp::Environment, functions are represented as
Rcpp::Function, etc ...
The underlying C++ library also offers the Rcpp::wrap function which is a
templated function that transforms an arbitrary object into a SEXP. This
makes it straightforward to implement C++ logic in terms of standard C++
types such as STL containers and then wrap them when they need to be
returned to R. Internally, wrap uses advanced template metaprogramming
techniques and currently supports these data types: primitive types (bool, int, double,
size_t, Rbyte, Rcomplex, std::string), STL containers (e.g. std::vector<T>)
where T is wrappable, STL maps (e.g. std::map<std::string,T>) where T is
wrappable, and arbitrary types that support implicit conversion to SEXP.
The reverse conversion (from R into C++) is performed by the Rcpp::as function
template offering a similar degree of flexibility.
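To give a flavour of how such a recursive, type-driven conversion can work, here is a minimal self-contained sketch in plain C++. It is a toy analogue and not how Rcpp itself is implemented: ToyObject, toy_wrap and toy_as are hypothetical names standing in for SEXP, Rcpp::wrap and Rcpp::as, and only a handful of types are covered.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for an R object (SEXP); not an Rcpp type.
struct ToyObject {
    std::string kind;                  // "numeric", "character" or "list"
    std::vector<double> numbers;       // payload of a numeric vector
    std::vector<std::string> strings;  // payload of a character vector
    std::vector<std::string> names;    // element names of a named list
    std::vector<ToyObject> children;   // elements of a list
};

// Arithmetic scalars (bool, int, double, size_t, ...) -> numeric of length one.
template <typename T>
typename std::enable_if<std::is_arithmetic<T>::value, ToyObject>::type
toy_wrap(const T& x) {
    ToyObject out;
    out.kind = "numeric";
    out.numbers.push_back(static_cast<double>(x));
    return out;
}

// std::string -> character vector of length one.
inline ToyObject toy_wrap(const std::string& s) {
    ToyObject out;
    out.kind = "character";
    out.strings.push_back(s);
    return out;
}

// Declared early so that vectors of maps can be wrapped as well.
template <typename T>
ToyObject toy_wrap(const std::map<std::string, T>& m);

// std::vector of arithmetic T -> a single numeric vector.
template <typename T>
typename std::enable_if<std::is_arithmetic<T>::value, ToyObject>::type
toy_wrap(const std::vector<T>& v) {
    ToyObject out;
    out.kind = "numeric";
    for (std::size_t i = 0; i < v.size(); ++i)
        out.numbers.push_back(static_cast<double>(v[i]));
    return out;
}

// std::vector of any other wrappable T -> list, wrapping each element in turn.
template <typename T>
typename std::enable_if<!std::is_arithmetic<T>::value, ToyObject>::type
toy_wrap(const std::vector<T>& v) {
    ToyObject out;
    out.kind = "list";
    for (std::size_t i = 0; i < v.size(); ++i)
        out.children.push_back(toy_wrap(v[i]));
    return out;
}

// std::map<std::string, T> -> named list, provided T is itself wrappable.
template <typename T>
ToyObject toy_wrap(const std::map<std::string, T>& m) {
    ToyObject out;
    out.kind = "list";
    typename std::map<std::string, T>::const_iterator it;
    for (it = m.begin(); it != m.end(); ++it) {
        out.names.push_back(it->first);
        out.children.push_back(toy_wrap(it->second));
    }
    return out;
}

// Reverse direction: explicit specialisations pick the conversion per target type.
template <typename T> T toy_as(const ToyObject& obj);
template <> inline double toy_as<double>(const ToyObject& obj) {
    return obj.numbers.at(0);
}
template <> inline std::vector<double>
toy_as<std::vector<double> >(const ToyObject& obj) {
    return obj.numbers;
}

// Usage: a map of numeric vectors becomes a named list and converts back.
inline void toy_wrap_demo() {
    std::map<std::string, std::vector<double> > columns;
    columns["x"].push_back(1.5);
    columns["x"].push_back(2.5);
    columns["y"].push_back(4.0);
    ToyObject lst = toy_wrap(columns);
    assert(lst.kind == "list" && lst.names.size() == 2);
    assert(toy_as<std::vector<double> >(lst.children[0]).size() == 2);
}
```

The point of the sketch is the "where T is wrappable" clause: the container overloads simply call toy_wrap again on each element, so nesting works to any depth the compiler can resolve.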
New features
Starting with release 0.7.1, a namespace Rcpp is provided. It contains a main
class RObject as well as other classes that derive from RObject to deal with
environments (ENVSXP), "Language" for calls (LANGSXP) and the template XPtr
for external pointers.
Releases 0.7.2 and later extend this to a number of additional R types along
with a number of facilities for automatic conversion thanks to clever use of
templates.
Release 0.8.1 adds support for exposing code in C++ directly to R using
modules. The corresponding Rcpp-modules
vignette has more details.
Release 0.8.3 adds sugar: expression templates that allow compact
vectorised expressions just like in R but at compiled speed; see the Rcpp-sugar
vignette.
Release 0.8.6 adds special functions cherished for statistics: d/p/q/r-style
functions for most relevant distributions, in a form that is very close to what we'd use
in R.
Release 0.8.7 adds support for ReferenceClasses in R 2.12.0; this now brings
S4-based ReferenceClasses in the OO-style of Java or C++ to the R language.
Release 0.9.0 split support for the legacy classic API into its own
package RcppClassic.
Release 0.10.0 brings Rcpp attributes, enhanced modules support and
more.
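The sugar feature mentioned for release 0.8.3 rests on expression templates. As a rough illustration of that technique only, the following self-contained sketch in plain C++ builds a tiny vector type whose + and * operators return lightweight expression objects that are evaluated element by element on assignment. All names (ToyExpr, ToyVec, ToySum, ToyProd) are hypothetical and this is not Rcpp's own code; the sketch also assumes both operands have the same length.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <vector>

// CRTP base: every expression type E derives from ToyExpr<E>.
template <typename E>
struct ToyExpr {
    const E& self() const { return static_cast<const E&>(*this); }
    double operator[](std::size_t i) const { return self()[i]; }
    std::size_t size() const { return self().size(); }
};

// Concrete vector holding the data.
class ToyVec : public ToyExpr<ToyVec> {
    std::vector<double> data_;
public:
    ToyVec(std::initializer_list<double> init) : data_(init) {}

    // Evaluating any expression happens here, in a single loop,
    // without intermediate vectors for the sub-expressions.
    template <typename E>
    ToyVec(const ToyExpr<E>& e) : data_(e.size()) {
        for (std::size_t i = 0; i < data_.size(); ++i) data_[i] = e[i];
    }

    double operator[](std::size_t i) const { return data_[i]; }
    std::size_t size() const { return data_.size(); }
};

// Lazy element-wise sum; only stores references to its operands.
template <typename L, typename R>
class ToySum : public ToyExpr<ToySum<L, R> > {
    const L& lhs_;
    const R& rhs_;
public:
    ToySum(const L& l, const R& r) : lhs_(l), rhs_(r) {}
    double operator[](std::size_t i) const { return lhs_[i] + rhs_[i]; }
    std::size_t size() const { return lhs_.size(); }
};

// Lazy element-wise product.
template <typename L, typename R>
class ToyProd : public ToyExpr<ToyProd<L, R> > {
    const L& lhs_;
    const R& rhs_;
public:
    ToyProd(const L& l, const R& r) : lhs_(l), rhs_(r) {}
    double operator[](std::size_t i) const { return lhs_[i] * rhs_[i]; }
    std::size_t size() const { return lhs_.size(); }
};

template <typename L, typename R>
ToySum<L, R> operator+(const ToyExpr<L>& l, const ToyExpr<R>& r) {
    return ToySum<L, R>(l.self(), r.self());
}

template <typename L, typename R>
ToyProd<L, R> operator*(const ToyExpr<L>& l, const ToyExpr<R>& r) {
    return ToyProd<L, R>(l.self(), r.self());
}

// Usage: reads like vectorised R code, runs as one compiled loop.
inline void toy_sugar_demo() {
    ToyVec x{1.0, 2.0, 3.0};
    ToyVec y{4.0, 5.0, 6.0};
    ToyVec z = x * y + x;   // evaluated element by element on construction
    assert(z.size() == 3);
    assert(z[0] == 5.0 && z[1] == 12.0 && z[2] == 21.0);
}
```

Because the expression objects hold references, they should be consumed within the same full expression that creates them, as in the demo; storing them for later use would leave dangling references in this simplified version.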
Inline use
As of version 0.7.0, Rcpp also contains a modified function 'cfunction' taken
from the excellent 'inline' package by Oleg Sklyar. This allows the user to
define the body of a C++ function as a standard R character vector -- which
is passed to 'cfunction' along with a few other parameters. The function
then builds a complete C++ source file containing a function with the given
body --- and then compiles, links and loads it for us. Together with the
Rcpp interface classes this makes for very easy use of C++ from R --- as
everything can be done from the R prompt without any need for Makefiles,
configuration settings etc pp.
As of version 0.8.1, an extended function 'cxxfunction' is used (which
requires inline 0.3.5). This function makes it easier to use C++ code with Rcpp. In
particular, it enforces use of the .Call interface, adds the Rcpp namespace,
and sets up exception forwarding. It employs the macros BEGIN_RCPP and
END_RCPP to enclose the user code.
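The idea of enclosing user code between a pair of macros so that C++ exceptions never escape can be sketched in plain C++ as follows. This is a simplified stand-in and not the definition of BEGIN_RCPP and END_RCPP: the names TOY_BEGIN, TOY_END, toy_last_error and checked_divide are hypothetical, and instead of forwarding to R's error mechanism the sketch merely records the message and returns a sentinel value.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for R's error channel: the last forwarded message.
static std::string toy_last_error;

// Opens a try block around the user code.
#define TOY_BEGIN try {

// Closes the try block, turns any exception into a recorded message,
// and returns a sentinel so that the function always returns normally.
#define TOY_END                                             \
    } catch (const std::exception& ex) {                    \
        toy_last_error = ex.what();                         \
    } catch (...) {                                         \
        toy_last_error = "c++ exception (unknown reason)";  \
    }                                                       \
    return -1;

// User code is written between the two macros and may throw freely.
int checked_divide(int a, int b) {
TOY_BEGIN
    if (b == 0) throw std::invalid_argument("division by zero");
    return a / b;
TOY_END
}

// Usage: the exception is caught inside the function and reported instead.
inline void toy_forwarding_demo() {
    toy_last_error.clear();
    assert(checked_divide(6, 3) == 2);
    assert(toy_last_error.empty());
    assert(checked_divide(1, 0) == -1);
    assert(toy_last_error == "division by zero");
}
```

The benefit of this pattern is that the function body stays free of error-handling clutter while no exception can propagate across the language boundary.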
Moreover, with cfunction (and cxxfunction), we can even call external
libraries and have them linked as well.
Several examples of this are included with the packages; one has also been posted on my
blog.
This even works on Windows if you have the working 'R tools' installed along
with R. See the R-on-Windows FAQ and additional documentation.
With version 0.10.0, this has been complemented by Rcpp attributes,
which are even easier and more powerful than inline --- see the corresponding
vignette for details.
Unit testing
As of version 0.10.2, over 870 unit tests called from over 390 unit test
functions are included in the package to ensure that no regressions are
introduced in terms of API compatibility. The unit tests also serve as a
(arguably somewhat raw) form of examples for usage. A vignette is
auto-generated with the results of the unit tests.
Usage for package building
Rcpp provides a main header file Rcpp.h and a library inside the installed
package in the directory lib. From within R, you can compute
the directory location via
system.file("lib", "Rcpp.h", package="Rcpp"), but both are
provided for your use via the functions Rcpp::RcppCxxFlags()
and Rcpp::RcppLdFlags(). So we can just use the following as a
file src/Makevars (or src/Makevars.win on Windows):
PKG_CXXFLAGS=`${R_HOME}/bin/Rscript -e "Rcpp:::CxxFlags()"`
PKG_LIBS=`${R_HOME}/bin/Rscript -e "Rcpp:::LdFlags()"`
See the help page for Rcpp-package for details.
Also note that starting with version 0.8.0, the 'LinkingTo' argument can also be employed in
packages using Rcpp. This will let R determine the location of the header
files and users only need to use Rcpp::RcppLdFlags() (as
detailed above) to point to the actual library, and this is clearly the
recommended approach.
Moreover, we added an entire vignette on how
to use Rcpp in your package with a detailed discussion.
Demo package
The RcppExamples package (on CRAN) provides a simple illustration of how to
use Rcpp, and can also be used as a framework for deploying Rcpp. This
package is however somewhat incomplete in terms of examples, so please see
below for examples provided by several dozen packages using Rcpp.
Class documentation
We now have Doxygen-generated documentation of all the classes in
browseable and searchable html
and as a
pdf file.
We no longer include the Doxygen-generated documentation in the source
tarball as it is simply too big. But we have zip archives of the
html,
latex, and
man documentation.
Other documentation
Besides the Doxygen-generated reference manual we also have these vignettes:
- The Rcpp-introduction
vignette provides a short overview of Rcpp and an introduction (and has
also been published as Volume 40, Issue 8 of the Journal of Statistical Software),
- the Rcpp-package
vignette shows how to write your own package using Rcpp,
- the Rcpp-FAQ
vignette addresses several frequently asked questions,
- the Rcpp-modules
vignette discusses how to expose C++ functions and modules with ease
using an idea borrowed from Boost::Python,
- the Rcpp-extending
vignette details the steps needed to extend Rcpp with user-provided or third-party
classes,
- the Rcpp-sugar
vignette provides an introduction to the Rcpp sugar features
inspired by vectorised R code,
- the Rcpp-quickref
vignette provides a quick reference cheat sheet (but is still mostly incomplete),
- the Rcpp-attributes
vignette details the high-level syntax for declaring C++ functions as
callable from R and shows how to automatically generate the code required
to invoke them, and
- the Rcpp-unitTests
vignette contains a summary of the (by now over two hundred) unit tests for Rcpp.
All vignettes are also installed with the package, and available at the CRAN page.
Google Tech Talk
In late October 2010, the R intergrouplet at Google was kind enough
to invite us for a talk on Rcpp. The resulting talk was recorded and is now
available on YouTube.
Example usage
The following CRAN, R-Forge or
BioConductor packages use Rcpp:
- RQuantLib, an R
interface to QuantLib quantitative
finance libraries
- RInside, a set of C++
classes that make it easy to embed R in your C++ applications
- EarthMoveDist,
an implementation of the Earth Move distance metric for R
- RProtoBuf,
an interface from R to the Google ProtoBuf library
- mvabund, a
set of tools for displaying, modeling and analysing multivariate abundance
data in community ecology.
- sdcTable, a
package for statistical disclosure control for tabular data.
- bifactorial, a
package for global and multiple inference for given bi- and tri-factorial
clinical trial designs.
- RcppExamples, an
example package illustrating use of Rcpp and providing concrete examples.
- RcppArmadillo, an
interface from R to the Armadillo C++ linear algebra library using Rcpp.
- minqa which
provides derivative-free optimization by
quadratic approximation based on an interface to Fortran implementations by
M. J. D. Powell.
- pcaMethods
provides Bayesian PCA, Probabilistic PCA, Nipals PCA, Inverse Non-Linear PCA and the conventional SVD PCA.
- termstrc
offers a wide range of functions for term structure estimation based on
static and dynamic coupon bond and yield data.
- phylobase
implements a base S4 class for comparison of phylogenetic structures and data.
- RSNNS
wraps the Stuttgart Neural Network Simulator (SNNS), a library containing
many standard implementations of neural networks, and brings these to R.
- parser
implements a detailed source code parser based on the R parser and grammar
with a different representation of the parsed expressions.
- RcppGSL which
provides an interface from R to the
GNU GSL vector and matrix
types.
- orQA can be used
to assess repeatability, accuracy and cross-platform agreement of titration
microarray data.
- RcppDE provides
differential evolution optimization (just like
DEoptim, on which it
is based) and serves as a small case study in porting from plain C to the
combination of C++ and Rcpp.
- RcppBDT provides
(parts of) Boost Date.Time
by using Rcpp modules to easily expose the Boost functionality to R.
- unmarked uses
Rcpp and RcppArmadillo to provide code to fit hierarchical models of animal
abundance and occurrence to data collected using survey methods such as
point counts, site occupancy sampling, distance sampling, removal sampling,
and double observer sampling.
- The simFrame package
provides an object-oriented general framework for statistical simulations.
- The rgam package
also uses Rcpp and RcppArmadillo to provide an outlier-robust fit for
Generalized Additive (gam) Models.
- The spacodiR package
implements an interface to SPACoDi which is primarily designed to
characterise the structure and phylogenetic diversity of communities using
abundance or presence-absence data of species among community plots.
- The VIM package
provides visualization for missing values.
- NetworkAnalysis
provides statistical inference on populations of weighted or unweighted
networks.
- The SBSA package
uses RcppArmadillo to provide functions for simplified Bayesian sensitivity analysis.
- GUTS contains
functions for the fast calculation of the likelihood of a stochastic
survival model.
- The wordcloud package
uses Rcpp to accelerate rendering word clouds from text.
- auteur
implements a Bayesian sampler of the trait-evolutionary process to
identify shifts in process of continuous-trait evolution on phylogenetic trees.
- The cda
package uses Rcpp modules and RcppArmadillo to model coupled dipole
approximations: Given a set of ellipsoidal nanoparticles, it calculates the
polarizability tensor for the dipoles associated with each particle, and
solves the coupled-dipole equations by direct inversion of the interaction
matrix.
- The planar
package uses Rcpp modules and RcppArmadillo to solve the electromagnetic
problem of reflection and transmission at a multilayer planar
interface. It also computes the decay rates for a dipolar emitter near a
multilayer structure.
- The maxent
package provides tools for text classification using multinomial logistic
regression, also known as maximum entropy. The focus of this maximum
entropy classifier is to minimize memory consumption on very large
datasets, particularly sparse document-term matrices represented by the tm
package.
- fdaMixed
offers functional data analysis in a mixed-model framework via a
likelihood-based analysis; it uses Rcpp and RcppArmadillo.
- KernSmoothIRT
fits nonparametric item and option characteristic curves using kernel
smoothing, and allows for optimal selection of the smoothing bandwidth using
cross-validation and a variety of exploratory plotting tools.
- The rugarch
package can estimate a variety of univariate GARCH models including ARFIMA,
in-mean effects, use of external regressors and various other GARCH
flavours using both Rcpp and RcppArmadillo.
- bcp provides an
implementation of an approximation to the product partition model for the
normal errors change point problem using Markov Chain Monte Carlo, and also
extends the methodology to independent multivariate series with an assumed
common change point structure.
- RVowpalWabbit
provides an interface to the Vowpal Wabbit fast on-line learner by John
Langford et al.
- The rococo
package provides a robust gamma rank correlation coefficient
along with a permutation-based rank correlation test both of which
are explicitly designed for dealing with noisy numerical data.
- The LaF
package provides methods to efficiently access data from large ascii files,
including subsetting and block-wise access.
- The Rclusterpp
package provides flexible native clustering routines that can be linked
against in downstream packages, and uses Rcpp and RcppEigen.
- The bfa
package provides model fitting for several Bayesian factor models
including Gaussian, ordinal probit, mixed and semiparametric Gaussian
copula factor models; it uses Rcpp and RcppArmadillo.
- RSofia
provides an R interface to the sofia-ml suite of fast
incremental algorithms for machine learning suitable for training
models for classification or ranking.
- The fastGHQuad package
implements functions for fast (and numerically stable) Gauss-Hermite quadrature.
- The SpatialTools package
provides tools for spatial analysis with an emphasis on kriging using Rcpp
and RcppArmadillo.
- acer implements the
ACER method for extreme value estimation which finds return levels of extreme values.
- RcppSMC implements several
Sequential Monte Carlo / Particle Filter models using the SMC template
library by Adam Johansen.
- The psgp package provides
projected spatial gaussian process methods for sparse spatial kriging; it
uses Rcpp and RcppArmadillo.
- phom computes
persistent homology of filtered simplicial complexes, and provides
facilities for constructing complexes from geometric data.
- The BioConductor package GRENITS
uses Rcpp and RcppArmadillo to implement network inference statistical
models using Dynamic Bayesian Networks and Gibbs Variable Selection.
- The BioConductor package mosaics
provides functions for fitting MOSAiCS, a statistical framework to analyze one-sample or two-sample ChIP-seq data.
- The BioConductor package mzR
provides a unified API to the common file formats and parsers available
for mass spectrometry data.
- The WideLm package uses Rcpp as well as the NVidia CUDA API (>= 4.1) to
simultaneously estimate a large number of 'tall and skinny' models from the same dataset.
- The forecast
package provides methods and tools for displaying and analysing
univariate time series forecasts including exponential smoothing via state
space models and automatic ARIMA modelling; it uses Rcpp and RcppArmadillo.
- The multmod
package implements functions for testing of multiple outcomes using i.i.d. decompositions.
- The openair
package provides tools to analyse, interpret and understand air pollution
data, typically from hourly time series; both monitoring data and dispersion model output can be analysed.
- The Rmixmod
package implements high-performance model-based cluster analysis for mixture modelling.
- The sdcMicro package contains
statistical disclosure control methods for the generation of anonymized
(micro)data, i.e. for the generation of public- and scientific-use files.
- The BioConductor package Rdisop
uses Rcpp and RcppClassic for the decomposition of isotopic patterns.
- The Rmalchains package implements
an algorithm family for continuous optimization called memetic
algorithms with local search chains (MA-LS-Chains); memetic algorithms are
hybridizations of genetic algorithms with local search methods.
- The growcurves package
provides Bayesian semiparametric growth curve models that additionally
include multiple membership random effects, using both Rcpp and
RcppArmadillo.
- The apcluster package
implements Frey's and Dueck's Affinity Propagation clustering algorithm in
R, and also provides an algorithm for exemplar-based agglomerative
clustering that can also be used to join clusters obtained from affinity
propagation.
- The survSNP
package provides power and sample size calculations for SNP association
studies with right censored time to event outcomes.
- The robustHD
package provides robust methods for high-dimensional data, in particular
linear model selection techniques based on least angle regression and
sparse regression; it uses RcppArmadillo.
- The sparseLTSEigen
package implements an RcppEigen-based back-end for sparse least trimmed squares regression
with an L1 penalty; it uses RcppEigen.
- The waffect
package simulates phenotypic (case or control) datasets under a disease
model H1 such that the total number of cases is constant across all the
simulations.
- The zic
package implements Bayesian inference for zero-inflated count models using
MCMC written in C++; the package uses Rcpp and RcppArmadillo.
- The rcppbugs
package provides R bindings to the CppBugs C++ library for MCMC and
aims to make writing MCMC models as painless as possible by
incorporating features from both WinBugs and PyMC. It uses both Rcpp and
RcppArmadillo.
- The mirt
package implements multidimensional item response theory for the
analysis of dichotomous and polychotomous response data
using latent trait models under the Item Response Theory
paradigm.
- The mets
package helps with the analysis of multivariate event times by implementing
various statistical models for multivariate event history data, including
multivariate cumulative incidence models, and bivariate random effects
probit models (liability models).
- The bfp
package implements the Bayesian paradigm for fractional polynomial models
under the assumption of normally distributed error terms.
- The gof
package implements model-checking techniques for generalized linear
models and linear structural equation models based on cumulative
residuals; it uses Rcpp and RcppArmadillo.
- The RcppOctave
package provides a bidirectional interface to GNU Octave, allowing R to
call Octave functions and script files.
- The blockcluster
package provides co-clustering for binary, contingency and continuous
data sets, along with utility functions to visualize the results.
- The RcppCNPy
package uses the cnpy library by Carl Rogers to read / write files created by / for Numeric
Python (aka "numpy").
- The MVB
package fits log-linear models for multivariate Bernoulli distributions
with mixed effect models and LASSO.
- The surveillance
package provides statistical methods for modeling and change-point
detection in time series of counts, proportions and categorical data for
temporal and spatio-temporal modeling and monitoring of epidemic phenomena.
- The fugeR
package provides "FUzzy GEnetic" machine learning for prediction models.
- classify
provides classification accuracy under IRT models.
- The ccaPP
package implements robust canonical correlation analysis via projection
pursuit; it uses Rcpp and RcppArmadillo.
- trustOptim
provides a trust region algorithm for nonlinear minimization with methods
that are designed to be efficient when the Hessian is sparse; it uses
Rcpp and RcppEigen.
- The tmg
package implements truncated multivariate gaussian sampling using
Hamiltonian Monte Carlo where the truncation is defined using linear
and/or quadratic polynomials; it uses Rcpp and RcppEigen.
- The mRMRe
package implements parallelized mRMR ensemble feature selection
to compute mutual information matrices from continuous, categorical and
survival variables; it also contains a function to perform feature
selection with mRMR and a new ensemble mRMR technique.
- The clusteval
package provides a suite of tools to evaluate clustering algorithms,
clusterings, and individual clusters.
- oem implements
orthogonalizing expectation maximisation to fit penalized regression; it
uses Rcpp and RcppArmadillo.
- The quadrupen package
fits classical sparse regression models with efficient active set
algorithms by solving quadratic problems and also provides a few methods
for model selection purpose (cross-validation, stability selection); it
uses Rcpp and RcppArmadillo.
- The pbdBASE
package implements methods and classes for distributed data types using
MPI, and the
pbdDMAT provides
distributed linear algebra computations; both are part of a set of
packages for Programming with Big Data.
- The EpiContactTrace
package provides routines for epidemiological contact tracing and
visualisation of networks of contacts.
- The transmission
package simulates and fits continuous time infectious disease transmission models.
- The Rchemcpp
package compares sets of molecules and returns a similarity matrix based
on the chemcpp library; it uses Rcpp and RcppClassic.
- The robustgam package
implements robust estimation for generalized additive models by
implementing the fast and stable algorithm in Wong, Yao and Lee (2012).
- The sparseHessianFD
package computes the sparse Hessian using ACM TOMS Algorithm 636; it uses
Rcpp and RcppEigen.
- The gMWT
package provides generalized Mann-Whitney type tests based on
probabilistic indices; it uses Rcpp and RcppArmadillo.
- The ngspatial
package provides tools for analyzing spatial data, especially
non-Gaussian areal data; it uses Rcpp and RcppArmadillo.
- The GeneticTools
package contains a collection of routines for the analysis of expression
and genotype data; it uses Rcpp and RcppArmadillo.
- RcppClassicExamples
regroups examples from the deprecated initial API now provided by RcppClassic.
- The jaatha
package provides a fast parameter estimation method for evolutionary biology.
- The ConConPiWiFun
package implements continuous convex piecewise linear functions which are
useful for a large class of optimization problems.
- The RcppRoll
package supplies fast functions for rolling over vectors and matrices,
and provides utility functions 'rollit' and 'rollit_raw' as an interface
for generating C++ backed rolling functions; it uses Rcpp and RcppArmadillo.
- rforensicbatwing
calculates forensic trace-suspect match probabilities using a modified
version of Ian Wilson's BATWING program.
- The RcppXts
package facilitates access to the C API functions of xts from Rcpp.
- The stochvol
package implements efficient algorithms for fully Bayesian estimation of
stochastic volatility (SV) models via MCMC.
- The marked
package provides a framework for handling data and analysis for mark-recapture.
- The RMessenger
package provides R with access to the instant messaging protocol XMPP using an
embedded copy of libstrophe.
- The PReMiuN
package implements Dirichlet process Bayesian clustering also known as
profile regression.
- The ALKr
package provides several algorithms for generating age-length keys for
fish populations from incomplete data.
- The ecp
package computes hierarchical change point analysis through the use of
the energy statistic for multiple change point analysis of multivariate data.
- The ExactNumCI
package computes exact confidence intervals for binomial proportions.
- The rexpokit
package implements wrappers for EXPOKIT, a Fortran library for matrix
exponentiation.
- The amelia
package for missing data imputation now uses Rcpp too.
History
Rcpp was initially written by Dominick Samperi to ease contributions to the
RQuantLib
project, and then released as a project in its own right. During 2006,
Dominick made several releases under the RCpp name (versions 1.0 to 1.4)
before he changed the name to RCppTemplate and made more releases (1.5 to
5.2). His project saw no public releases for the thirty-five month period
from November 2006 to November 2009.
As a user of Rcpp, I (Dirk) chose to adopt Rcpp during 2008, made a first release
0.6.0 in November 2008 and have made a number of new releases since -- see
the ChangeLog for details. Rcpp is
open for contributions and patches, some of which have already been integrated.
Romain Francois
joined the effort just before the 0.7.0 release and brought
along a lot of energy and new ideas. We now have a
mailing list
for discussions around Rcpp. If you have ideas or suggested changes, send an
email there.
Download
A local archive is available
here and at
CRAN;
SVN access is provided at
R-Forge.
License
Rcpp is licensed under the GNU GPL version 2 or later.
Last modified: Sat Feb 16 10:29:47 CST 2013