We ran as a so-called ultra team of six runners, as opposed to a regular team of twelve. The course is cut into 36 segments; on regular teams you get 3, we each had twice that. My first leg was a combined 17 miles in what turned out to be pretty blistering heat in mid-to-late afternoon. My fellow team members were awesome in getting me lots of water and ice, and I managed to hold onto a pace of just over 8 min/miles. One of the harder runs I've had. Next was a wonderful run pretty much exactly at midnight under starry skies---about seven or so miles followed by ten more miles the next morning.
We ended up coming in third (yay!), beating the next team by about six or seven seconds (!!) over a total time of 25 or 26 hours.
It was hard. It was fun. It was exhilarating. It may also have broken me as I haven't really run much since. So good intentions for 2013: get back into the groove.
Changes in version 0.1.1 (2012-12-30)
- Added versioned Depends: in DESCRIPTION to not build under older versions of Rcpp and RcppClassic

Changes in version 0.1.0 (2012-12-27)
- Initial release, split from package RcppExamples
Thanks to CRANberries, you can also look at a diff to the previous release 0.1.0. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.
RcppClassic) out of package RcppExamples and into a new package RcppClassicExamples.
And the updated version 0.1.5 of the RcppExamples package appeared on CRAN and has now been joined by the initial version 0.1.0 of the new package RcppClassicExamples. No code changes were made; manual pages and descriptions were brushed up and that is about it.
Thanks to CRANberries, you can also look at a diff to the previous release 0.1.4 of RcppExamples. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.
Courtesy of CRANberries, there is the set of changes relative to the previous release 0.9.2.
Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.
And they were, of course, awesome. Great mix of standards as well as new stuff from the Grammy-nominated new recording. If you can see Kurt Elling live, go. Now. They return for another gig tonight and I will have to see if I can swing that.
This closed the 2012 series of the Unity Temple concerts and the setting was a little, well, weird. It looked like mostly concert subscribers, with the commensurate age brackets, and not too many jazz folks. Not sure how many Patricia Barber, along with Larry Kohut on bass, won over, but I enjoyed it. I had secured front row (!!) seats in what is already a wonderfully intimate setting for such a duet concert.
See the Google+ link for a YouTube recording by her.
It brings another great set of enhancements and extensions, building on the recent 0.10.0 and 0.10.1 releases. The new Rcpp attributes were rewritten to not require Rcpp modules (as we encountered an issue with exceptions on Windows when built this way), code was reorganized to significantly accelerate compilation, and a couple of new things such as more Rcpp sugar goodies, a new timer class, and a new string class were added. See below for full details.
We also tested this fairly rigorously by checking about two thirds of the over 90 CRAN packages depending on Rcpp (the remainder required even more package installs which we did not do as this was already taking about 12 total cpu hours to test). We are quite confident that no changes are required (besides one in our own RcppClassic package which we will update).
The complete NEWS entry for 0.10.2 is below; more details are in the ChangeLog file in the package and on the Rcpp Changelog page.
Thanks to CRANberries, you can also look at a diff to the previous release 0.10.1. As always, even fuller details are on the Rcpp Changelog page and the Rcpp page which also leads to the downloads, the browseable doxygen docs and zip files of doxygen output for the standard formats. A local directory has source and documentation too. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

Changes in Rcpp version 0.10.2 (2012-12-21)
Changes in Rcpp API:
- Source and header files were reorganized and consolidated so that compile times are now significantly lower
- Added additional check in Rstreambuf deletion
- Added support for clang++ when using libc++, and for icpc in std=c++11 mode, thanks to a patch by Yan Zhou
- New class Rcpp::String to facilitate working with a single element of a character vector
- New utility class sugar::IndexHash inspired from Simon Urbanek's fastmatch package
- Implementation of the equality operator between two Rcomplex
- RNGScope now has an internal counter that enables it to be safely used multiple times in the same stack frame
- New class Rcpp::Timer for benchmarking

Changes in Rcpp sugar:
- More efficient version of match based on IndexHash
- More efficient version of unique based on IndexHash
- More efficient version of in based on IndexHash
- More efficient version of duplicated based on IndexHash
- More efficient version of self_match based on IndexHash
- New function collapse that implements paste(., collapse="")

Changes in Rcpp attributes:
- Use code generation rather than modules to implement sourceCpp and compileAttributes (eliminates problem with exceptions not being able to cross shared library boundaries on Windows)
- Exported functions now automatically establish an RNGScope
- Functions exported by sourceCpp now directly reference the external function pointer rather than rely on dynlib lookup
- On Windows, Rtools is automatically added to the PATH during sourceCpp compilations
- Diagnostics are printed to the console if sourceCpp fails and C++ development tools are not installed
- A warning is printed if compileAttributes detects Rcpp::depends attributes in source files that are not matched by Depends/LinkingTo entries in the package DESCRIPTION
Call for Papers:
R/Finance 2013: Applied Finance with R
May 17 and 18, 2013
University of Illinois, Chicago, IL, USA
The fifth annual R/Finance conference for applied finance using R will be held on May 17 and 18, 2013 in Chicago, IL, USA at the University of Illinois at Chicago. The conference is expected to cover topics including portfolio management, time series analysis, advanced risk tools, high-performance computing, market microstructure, and econometrics. All will be discussed within the context of using R as a primary tool for financial risk management, portfolio construction, and trading.
Over the past four years, R/Finance has included attendees from around the world and featured presentations from prominent academics and practitioners, and we anticipate another exciting line-up for 2013.
We invite you to submit complete papers in pdf format for consideration. We will also consider one-page abstracts (in txt or pdf format) although more complete papers are preferred. We welcome submissions for full talks and abbreviated "lightning talks". Both academic and practitioner proposals related to R are encouraged.
Presenters are strongly encouraged to provide working R code to accompany the presentation/paper. Data sets should also be made public for the purposes of reproducibility (though we realize this may be limited due to contracts with data vendors). Preference may be given to presenters who have released R packages.
The conference will award two (or more) $1000 prizes for best papers. A submission must be a full paper to be eligible for a best paper award. Extended abstracts, even if a full paper is provided by conference time, are not eligible for a best paper award. Financial assistance for travel and accommodation may be available to presenters at the discretion of the conference committee. Requests for assistance should be made at the time of submission.
Please send submissions to: committee at RinFinance.com. The submission deadline is February 15, 2013. Submitters will be notified of acceptance via email by February 28, 2013. Notification of whether a presentation will be a long presentation or a lightning talk will also be made at that time.
Additional details will be announced at this website as they become available. Information on previous years' presenters and their presentations is also at the conference website.
For the program committee:
Gib Bassett, Peter Carl, Dirk Eddelbuettel, Brian Peterson,
Dale Rosenthal, Jeffrey Ryan, Joshua Ulrich
So see you in Chicago in May!
Courtesy of CRANberries, there is also a diffstat report for the most recent release As always, more detailed information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.Changes in RcppArmadillo version 0.3.6.1 (2012-12-17)
- Upgraded to Armadillo release 3.6.1:
  - faster trace()
  - fix for handling sparse matrices by dot()
  - fixes for interactions between sparse and dense matrices
- Now throws compiler error if Rcpp.h is included before RcppArmadillo.h (as the former is included automatically by the latter anyway, but template logic prefers this ordering).
One key aspect is the use of the plugins for the inline package. They provide something akin to a callback mechanism so that compilation and linking steps can be informed about header and library locations and names. We are going to illustrate this with an example from the GNU Scientific Library (GSL). The example I picked uses B-spline estimation from the GSL. This is a little redundant as R has its own spline routines and package, but it serves well as a simple illustration---and, by reproducing an existing example, follows an established path. So we will look at Section 39.7 of the GSL manual which has a complete example as a standalone C program, generating both the data and the fit via cubic B-splines.
We can decompose this into two parts: data generation, and fitting. We will provide one function each, and then use both from R. These two functions will follow the aforementioned example from Section 39.7 somewhat closely.
We start with the first function to generate the data.
We include a few header files, define (in what is common for C programs) a few constants and then define a single function:

// [[Rcpp::depends(RcppGSL)]]
#include <RcppGSL.h>
#include <gsl/gsl_bspline.h>
#include <gsl/gsl_multifit.h>
#include <gsl/gsl_rng.h>
#include <gsl/gsl_randist.h>
#include <gsl/gsl_statistics.h>

const int N = 200;                 // number of data points to fit
const int NCOEFFS = 12;            // number of fit coefficients
const int NBREAK = (NCOEFFS - 2);  // nbreak = ncoeffs + 2 - k = ncoeffs - 2 since k = 4

// [[Rcpp::export]]
Rcpp::List genData() {

    const size_t n = N;
    size_t i;
    double dy;
    gsl_rng *r;
    RcppGSL::vector<double> w(n), x(n), y(n);

    gsl_rng_env_setup();
    r = gsl_rng_alloc(gsl_rng_default);

    /* this is the data to be fitted */
    for (i = 0; i < n; ++i) {
        double sigma;
        double xi = (15.0 / (N - 1)) * i;
        double yi = cos(xi) * exp(-0.1 * xi);

        sigma = 0.1 * yi;
        dy = gsl_ran_gaussian(r, sigma);
        yi += dy;

        gsl_vector_set(x, i, xi);
        gsl_vector_set(y, i, yi);
        gsl_vector_set(w, i, 1.0 / (sigma * sigma));
    }

    Rcpp::DataFrame res = Rcpp::DataFrame::create(Rcpp::Named("x") = x,
                                                  Rcpp::Named("y") = y,
                                                  Rcpp::Named("w") = w);

    x.free();
    y.free();
    w.free();
    gsl_rng_free(r);

    return(res);
}
genData() returns an Rcpp::List as a list object to R. Of primary importance here are the two attributes: one to declare a dependence on the RcppGSL package, and one to declare the export of the data generator function. That is all it takes! The plugin of RcppGSL will provide information about the headers and library, and Rcpp attributes will do the rest.
The core of the function is fairly self-explanatory, and closely follows the original example. Space gets allocated, the RNG is set up and a simple functional form generates some data plus noise (see below). In the original, the data is written to the standard output; here we return it to R as three columns in a data.frame object familiar to R users. We then free the GSL vectors; this manual step is needed as they are implemented as C vectors which do not have a destructor.
Next, we can turn to the fitting function.
// [[Rcpp::export]]
Rcpp::List fitData(Rcpp::DataFrame ds) {

    const size_t ncoeffs = NCOEFFS;
    const size_t nbreak = NBREAK;
    const size_t n = N;
    size_t i, j;

    Rcpp::DataFrame D(ds);                // construct the data.frame object
    RcppGSL::vector<double> y = D["y"];   // access columns by name,
    RcppGSL::vector<double> x = D["x"];   // assigning to GSL vectors
    RcppGSL::vector<double> w = D["w"];

    gsl_bspline_workspace *bw;
    gsl_vector *B;
    gsl_vector *c;
    gsl_matrix *X, *cov;
    gsl_multifit_linear_workspace *mw;
    double chisq, Rsq, dof, tss;

    bw = gsl_bspline_alloc(4, nbreak);    // allocate a cubic bspline workspace (k = 4)
    B = gsl_vector_alloc(ncoeffs);
    X = gsl_matrix_alloc(n, ncoeffs);
    c = gsl_vector_alloc(ncoeffs);
    cov = gsl_matrix_alloc(ncoeffs, ncoeffs);
    mw = gsl_multifit_linear_alloc(n, ncoeffs);

    gsl_bspline_knots_uniform(0.0, 15.0, bw);  // use uniform breakpoints on [0, 15]

    for (i = 0; i < n; ++i) {                  // construct the fit matrix X
        double xi = gsl_vector_get(x, i);
        gsl_bspline_eval(xi, B, bw);           // compute B_j(xi) for all j
        for (j = 0; j < ncoeffs; ++j) {        // fill in row i of X
            double Bj = gsl_vector_get(B, j);
            gsl_matrix_set(X, i, j, Bj);
        }
    }

    gsl_multifit_wlinear(X, w, y, c, cov, &chisq, mw);  // do the fit

    dof = n - ncoeffs;
    tss = gsl_stats_wtss(w->data, 1, y->data, 1, y->size);
    Rsq = 1.0 - chisq / tss;

    Rcpp::NumericVector FX(151), FY(151);      // output the smoothed curve
    double xi, yi, yerr;
    for (xi = 0.0, i = 0; xi < 15.0; xi += 0.1, i++) {
        gsl_bspline_eval(xi, B, bw);
        gsl_multifit_linear_est(B, c, cov, &yi, &yerr);
        FX[i] = xi;
        FY[i] = yi;
    }

    Rcpp::List res = Rcpp::List::create(Rcpp::Named("X") = FX,
                                        Rcpp::Named("Y") = FY,
                                        Rcpp::Named("chisqdof") = Rcpp::wrap(chisq/dof),
                                        Rcpp::Named("rsq") = Rcpp::wrap(Rsq));

    gsl_bspline_free(bw);
    gsl_vector_free(B);
    gsl_matrix_free(X);
    gsl_vector_free(c);
    gsl_matrix_free(cov);
    gsl_multifit_linear_free(mw);
    y.free();
    x.free();
    w.free();

    return(res);
}
The second function closely follows the second part of the GSL example and, given the input data, fits the output data. Data structures are setup, the spline basis is created, data is fit and then the fit is evaluated at a number of points. These two vectors are returned along with two goodness of fit measures.
We only need to load the Rcpp package and source a file containing the two snippets shown above, and we are ready to deploy this:
library(Rcpp)
sourceCpp("bSpline.cpp")   # compile two functions
dat <- genData()           # generate the data
fit <- fitData(dat)        # fit the model, returns matrix and gof measures
And with that, we can generate a chart such as this one via four simple lines of R, about as much as it took to create the C++ functions, generate the data and fit it!
op <- par(mar=c(3,3,1,1))
plot(dat[,"x"], dat[,"y"], pch=19, col="#00000044")
lines(fit[[1]], fit[[2]], col="orange", lwd=2)
par(op)
The RcppArmadillo and RcppEigen packages support plugin use in the same way. Add an attribute to export a function, and an attribute for the depends -- and you're done. Extending R with (potentially much faster) C++ code has never been easier, and opens a whole new set of doors.
Courtesy of CRANberries, there is also a diffstat report for the most recent release. As always, more detailed information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

Changes in RcppArmadillo version 0.3.6.0 (2012-12-07)
- Upgraded to Armadillo release 3.6.0 (Piazza del Duomo):
  - faster handling of compound expressions with submatrices and subcubes
  - added support for loading matrices as text files with NaN and Inf elements
  - added stable_sort_index(), which preserves the relative order of elements with equivalent values
  - added handling of sparse matrices by mean(), var(), norm(), abs(), square(), sqrt()
  - added saving and loading of sparse matrices in arma_binary format
This release helps with an update to stack checking, required by a recent change in R itself. The NEWS extract below has more details.
CRANberries also provides a short report with changes from the previous release. More information is on the RInside page. Questions, comments etc should go to the rcpp-devel mailing list off the Rcpp R-Forge page.

Changes in RInside version 0.2.10 (2012-12-05)
Adjusted to a change in R which requires turning checking of the stack limit off in order to allow for access from multiple threads as in the Wt examples. As there have been no side effects, this is enabled by default on all platforms (with the exception of Windows).
Added new ‘threads’ example directory with a simple example based on a Boost mutex example.
Disabled two examples (passing an external function down) which do not currently work; external pointer use should still work.
Bryan Lewis had suggested enabling another pricing engine for American Options in order to get (at least some) Greeks. This is now supported by picking engine="CrankNicolson" as shown in the default example for the AmericanOption function:
R> library(RQuantLib)
R> example(AmericanOption)

AmrcnOR> # simple call with unnamed parameters
AmrcnOR> AmericanOption("call", 100, 100, 0.02, 0.03, 0.5, 0.4)
Concise summary of valuation for AmericanOption
   value   delta   gamma    vega   theta     rho  divRho
 11.3648      NA      NA      NA      NA      NA      NA

AmrcnOR> # simple call with some explicit parameters
AmrcnOR> AmericanOption("put", strike=100, volatility=0.4, 100, 0.02, 0.03, 0.5)
Concise summary of valuation for AmericanOption
   value   delta   gamma    vega   theta     rho  divRho
 10.9174      NA      NA      NA      NA      NA      NA

AmrcnOR> # simple call with unnamed parameters, using Crank-Nicolson
AmrcnOR> AmericanOption("put", strike=100, volatility=0.4, 100, 0.02, 0.03, 0.5, engine="CrankNicolson")
Concise summary of valuation for AmericanOption
   value   delta   gamma    vega   theta     rho  divRho
 10.9173 -0.4358  0.0140      NA      NA      NA      NA

R>
Thanks to CRANberries, there is also a diff to the previous release 0.3.8. Full changelog details, examples and more details about this package are at my RQuantLib page.
This is a follow-up to the recent 0.10.0 release which extends the exciting new Rcpp-attributes and Rcpp-sugar work further, as well as a number of other areas as detailed below in the NEWS sections.
This release brings a change to some of the binary interfaces. If you have packages using Rcpp, you will most likely have to reinstall them from source. Some changes were made to const correctness as well as other aspects, and it seems that we have temporarily broken the excellent RcppEigen and RcppOctave packages. We are looking into this, and are sorry about the bug.
The complete NEWS entry for 0.10.1 is below; more details are in the ChangeLog file in the package and on the Rcpp Changelog page.
Thanks to CRANberries, you can also look at a diff to the previous release 0.10.0. As always, even fuller details are on the Rcpp Changelog page and the Rcpp page which also leads to the downloads, the browseable doxygen docs and zip files of doxygen output for the standard formats. A local directory has source and documentation too. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

Changes in Rcpp version 0.10.1 (2012-11-26)
Changes in Rcpp sugar:
- New functions: setdiff, union_, intersect, setequal, in, min, max, range, match, table, duplicated
- New function: clamp, which combines pmin and pmax, e.g. clamp(a, x, b) is the same as pmax(b, pmin(x, a))
- New function: self_match, which implements something similar to match(x, unique(x))

Changes in Rcpp API:
- The Vector template class (hence NumericVector ...) gets the is_na and the get_na static methods.
- New helper class no_init that can be used to create a vector without initializing its data, e.g.: IntegerVector out = no_init(n);
- New exception constructor requiring only a message; stop function to throw an exception
- DataFrame gains a nrows method

Changes in Rcpp attributes:
- Ability to embed R code chunks (via specially formatted block comments) in C++ source files.
- Allow specification of argument defaults for exported functions.
- New scheme for more flexible mixing of generated and user composed C++ headers.
- Print warning if no export attributes are found in source file.
- Updated vignette with additional documentation on exposing C++ interfaces from packages and signaling errors.

Changes in Rcpp modules:
- Enclose .External invocations in BEGIN_RCPP / END_RCPP

Changes in R code:
- New function areMacrosDefined
- Additions to Rcpp.package.skeleton:
  - attributes parameter to generate a version of rcpp_hello_world that uses Rcpp::export.
  - cpp_files parameter to provide a list of C++ files to include in the src directory of the package.

Miscellaneous changes:
- New example 'pi simulation' using R and C++ via Rcpp attributes
For this release, Hannes Muehleisen added support for sha-512 using an older standalone function by Aaron D. Gifford which I had to whip into slightly more portable shape to work on Windows. (Hint: uint32_t from stdint.h, not u_int32_t.)
CRANberries provides the usual summary of changes to version 0.5.2.
As usual, our package is available via the R-Forge page leading to svn and tarball access, my digest page and the local directory here.
But because few things beat a nice example, this post tries to build some more excitement. We will illustrate how Rcpp attributes make it really easy to add C++ code to an R session, and that this code is as easy to grasp as R code.
Our motivating example is everybody's favourite introduction to Monte Carlo simulation: estimating π. A common method uses the fact that the unit circle has a surface area equal to π. We draw two uniform random numbers x and y, each between zero and one. We then check the distance of the corresponding point (x,y) from the origin. If it is less than (or equal to) one, the point is in (or on) the circle; if more than one it is outside. As the first quadrant is a quarter of a square of area one, and the area of the whole circle is π, our first quadrant approximates π over four. The following figure, kindly borrowed from Wikipedia with full attribution and credit, illustrates this:
Now, a vectorized version (drawing N such pairs at once) of this approach is provided by the following R function.
piR <- function(N) {
    x <- runif(N)
    y <- runif(N)
    d <- sqrt(x^2 + y^2)
    return(4 * sum(d < 1.0) / N)
}
And in C++ we can write almost exactly the same function thanks to the Rcpp sugar vectorisation available via Rcpp:
#include <Rcpp.h>

using namespace Rcpp;

// [[Rcpp::export]]
double piSugar(const int N) {
    RNGScope scope;        // ensure RNG gets set/reset
    NumericVector x = runif(N);
    NumericVector y = runif(N);
    NumericVector d = sqrt(x*x + y*y);
    return 4.0 * sum(d < 1.0) / N;
}

Sure, there are small differences: C++ is statically typed, R is not. We need one include file for declarations, and we need one instantiation of the RNGScope object to ensure random number draws remain coordinated between the calling R process and the C++ function calling into its (compiled C-code based) random number generators. That way we even get the exact same draws for the same seed.
But the basic approach is identical: draw a vector x and a vector y, compute the distance to the origin and then obtain the proportion within the unit circle -- which we scale by four. Same idea, same vectorised implementation in C++.
But the real key here is the one short line with the [[Rcpp::export]] attribute. This is all it takes (along with sourceCpp() from Rcpp 0.10.0) to get the C++ code into R.
The full example (which assumes the C++ file is saved as piSugar.cpp in the same directory) is now:
#!/usr/bin/r

library(Rcpp)
library(rbenchmark)

piR <- function(N) {
    x <- runif(N)
    y <- runif(N)
    d <- sqrt(x^2 + y^2)
    return(4 * sum(d < 1.0) / N)
}

sourceCpp("piSugar.cpp")

N <- 1e6

set.seed(42)
resR <- piR(N)

set.seed(42)
resCpp <- piSugar(N)

## important: check results are identical with RNG seeded
stopifnot(identical(resR, resCpp))

res <- benchmark(piR(N), piSugar(N), order="relative")
print(res[,1:4])

This script does a few things: it sets up the R function, sources the C++ function (and presto: we have a callable C++ function just like that), computes two simulations given the same seed and ensures they are in fact identical -- and proceeds to compare the timing in a benchmarking exercise. That last aspect is not even that important -- we end up being almost-but-not-quite twice as fast on my machine for different values of N.
The real takeaway here is the ease with which we can get a C++ function into R --- and the new process completely takes care of passing parameters in, results out, and does the compilation, linking and loading.
More details about Rcpp attributes are in the new vignette. Now enjoy the π.
Update: One somewhat bad typo fixed.
Update: Corrected one background tag.
Courtesy of CRANberries, there is also a diffstat report for 0.3.4.4 relative to 0.3.4.3. As always, more detailed information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

Changes in RcppArmadillo version 0.3.4.4 (2012-11-15)
- Upgraded to Armadillo release 3.4.4:
  - fix for handling complex numbers by sparse matrices
  - fix for minor memory leak by sparse matrices
Rmath.h -- and illustrate one of the key features (Rcpp attributes) in passing.
R, as a statistical language and environment, has very well written and tested statistical distribution functions providing probability density, cumulative distribution, quantiles and random number draws for dozens of common and not so common distribution functions. This code is used inside R, and available for use from standalone C or C++ programs via the standalone R math library which Debian / Ubuntu have as a package r-mathlib (and which can be built from R sources).
Users sometimes write code against this interface, and then want to combine that code with other code, possibly even with Rcpp. We allowed for this, but it required a bit of an ugly interface. R provides a C interface; these have no namespaces. Identifiers can clash, and to be safe one can enable a generic prefix Rf_. So functions which could clash such as length or error become Rf_length and Rf_error and are less likely to conflict with symbols from other libraries. Unfortunately, the side effect is that calling, say, the probability distribution function for the Normal distribution becomes Rf_pnorm5() (with the 5 denoting the five parameters: quantile, mean, std.deviation, lowerTail, logValue). Not pretty, and not obvious.
So one of the things we added was another layer of indirection by adding a namespace R with a bunch of inline'd wrapper functions (as well as several handfuls of unit tests to make sure we avoided typos and argument transposition and what not). The short example below shows this for a simple function taking a vector, and returning its pnorm computed three different ways:
#include <Rcpp.h>

// [[Rcpp::export]]
Rcpp::DataFrame mypnorm(Rcpp::NumericVector x) {
    int n = x.size();
    Rcpp::NumericVector y1(n), y2(n), y3(n);

    for (int i=0; i<n; i++) {
        // the way we used to do this
        y1[i] = ::Rf_pnorm5(x[i], 0.0, 1.0, 1, 0);
        // the way we can do it now
        y2[i] = R::pnorm(x[i], 0.0, 1.0, 1, 0);
    }
    // or using Rcpp sugar in one go
    y3 = Rcpp::pnorm(x);

    return Rcpp::DataFrame::create(Rcpp::Named("Rold")  = y1,
                                   Rcpp::Named("Rnew")  = y2,
                                   Rcpp::Named("sugar") = y3);
}

This example also uses the new Rcpp attributes described briefly in the announcement blog post and of course in more detail in the corresponding vignette. Let us just state here that we simply provide a complete C++ function, using standard Rcpp types -- along with one 'attribute' declaration of an export via Rcpp. That's it -- even easier than using inline.
Now in R we simply do

R> sourceCpp("mypnorm.cpp")

to obtain a callable R function with the C++ code just shown behind it. No Makefile, no command-line tool invocation -- nothing but a single call to sourceCpp() which takes care of things --- and brings us a compiled C++ function to R just given the source file with its attribute declaration.
We can now use the new function to compute the probability distribution both the old way, the new way with the 'cleaner' R::pnorm(), and of course the Rcpp sugar way in a single call. We build a data frame in C++, and assert that all three variants are the same:
R> x <- seq(0, 1, length=1e3)
R> res <- mypnorm(x)
R> head(res)
      Rold     Rnew    sugar
1 0.500000 0.500000 0.500000
2 0.500399 0.500399 0.500399
3 0.500799 0.500799 0.500799
4 0.501198 0.501198 0.501198
5 0.501597 0.501597 0.501597
6 0.501997 0.501997 0.501997
R> all.equal(res[,1], res[,2], res[,3])
[1] TRUE
R>

This example hopefully helped to illustrate how Rcpp 0.10.0 brings both something really powerful (Rcpp attributes -- more on this another time, hopefully) and convenient in the new namespace for statistical functions.
This is a new feature release, and we are very excited about the changes, notably Rcpp attributes which make using C++ from R even easier than inline (see below as well as the new vignette for details and first examples), the extensions to Rcpp modules (see below) and more, for example new Rcpp sugar functions, a new error output device syncing to R, and a new namespace R for the statistical functions from Rmath.h.
The complete NEWS entry for 0.10.0 is below; more details are in the ChangeLog file in the package and on the Rcpp Changelog page.
Thanks to CRANberries, you can also look at a diff to the previous release 0.9.15. As always, even fuller details are on the Rcpp Changelog page and the Rcpp page which also leads to the downloads, the browseable doxygen docs and zip files of doxygen output for the standard formats. A local directory has source and documentation too. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

Changes in Rcpp version 0.10.0 (2012-11-13)
Support for C++11 style attributes (embedded in comments) to enable use of C++ within interactive sessions and to automatically generate module declarations for packages:
- Rcpp::export attribute to export a C++ function to R
- sourceCpp() function to source exported functions from a file
- cppFunction() and evalCpp() functions for inline declarations and execution
- compileAttributes() function to generate Rcpp modules from exported functions within a package
- Rcpp::depends attribute for specifying additional build dependencies for sourceCpp()
- Rcpp::interfaces attribute to specify the external bindings compileAttributes() should generate (defaults to R-only but a C++ include file using R_GetCCallable can also be generated)
- New vignette "Rcpp-attributes"

Rcpp modules feature set has been expanded:
- Functions and methods can now return objects from classes that are exposed through modules. This uses the make_new_object template internally. This feature requires that some class traits are declared to indicate to Rcpp's wrap/as system that these classes are covered by modules. The macros RCPP_EXPOSED_CLASS and RCPP_EXPOSED_CLASS_NODECL can be used to declare these type traits.
- Classes exposed through modules can also be used as parameters of exposed functions or methods.
- Exposed classes can declare factories with ".factory". A factory is a C++ function that returns a pointer to the target class. It is assumed that these objects are allocated with new in the factory. On the R side, factories are called just like other constructors, with the "new" function. This feature allows an alternative way to construct objects.
- "converter" can be used to declare a way to convert an object of one type to another type. This gets translated to the appropriate "as" method on the R side.
- Inheritance: a class can now declare that it inherits from another class with the .derives<Parent>("Parent") notation. As a result the exposed class gains methods and properties (fields) from its parent class.

New sugar functions:
- which_min implements which.min, traversing the sugar expression and returning the index of the first time the minimum value is found.
- which_max idem
- unique uses unordered_set to find unique values; in particular, the version for CharacterVector is found to be more efficient than R's version
- sort_unique calculates unique values and then sorts them.

Improvements to output facilities:
- Implemented sync() so that flushing output streams works
- Added Rcerr output stream (forwarding to REprintf)

Provide a namespace 'R' for the standalone Rmath library so that Rcpp users can access those functions too; also added unit tests

Development releases set variable RunAllRcppTests to yes to run all tests (unless it was already set to 'no'); CRAN releases do not and still require setting -- which helps with the desired CRAN default of less testing at the CRAN server farm.
Update: One link corrected.
This release adds a few new features as detailed in the extract from the
NEWS
below.
A key new feature is the added support for resilience to bad user input, based on discussions and an initial (but altered) patch by Theodore Lytras. There are a few ways in which this can be deployed, so we also added two more example programs detailing it. The featured Qt example and the Wt example have been updated to use this too.
Another feature that may be quite useful for some is the additional attempt to
find a value for R_HOME
if none has been set. The value is
typically found at compile time of the RInside package which poses a
problem for those using the Windows build -- which is shipped as a binary
reflecting the value of the build system. One alternative has always been to
build the package locally too to get the local value, or to set it
explicitly. But because the error behaviour – a cryptic message of
Cannot open base package – is confusing to many, we now try to
call the R function which gets this value from the registry. This may need
more tweaking and testing, and if you use
RInside
on the Windows platform I would appreciate feedback.
The complete list of changes since the last release are summarized below in the corresponding NEWS
file entry:
CRANberries also provides a short report with changes from the previous release. More information is on the RInside page. Questions, comments etc should go to the rcpp-devel mailing list off the Rcpp R-Forge page.

Changes in RInside version 0.2.9 (2012-11-04)
Applied (modified) patch by Theodore Lytras which lets RInside recover from some parsing errors and makes RInside applications more tolerant of errors
Added non-throwing variants of parseEval() and parseEvalQ()
Modified Qt and Wt examples of density estimation applications to be much more resilient to bad user input
On Windows, have RInside use R's get_R_HOME() function to get R_HOME value from registry if not set by user
Added note to
examples/standard/Makefile.win
that R_HOME may need to be set to run the executables – so either export your local value, or re-install RInside from source to have it reflected in the library build of libRinside
Updated CMake build support for standard, armadillo and eigen
Improved CMake builds of examples/standard, examples/eigen and examples/armadillo by detecting architecture
In this context, I have a nice new example. And for once, it is work-related. I generally cannot share too much of what we do there as this is, well, proprietary, but I have this nice new example. The other day, I was constructing (large) time series of implied volatilities. Implied volatilities can be thought of as the complement to an option's price: given a price (and all other observables, which can be thought of as fixed), we compute the implied volatility (typically via the standard Black-Scholes model). Given a changed implied volatility, we infer a new price -- see this Wikipedia page for more details. In essence, it opens the door to all sorts of arbitrage and relative value pricing adventures.
Now, we observe prices fairly frequently to create somewhat sizeable time series of option prices. And each price corresponds to one matching implied volatility, and for each such price we have to solve a small and straightforward optimization problem: to compute the implied volatility given the price. This is usually done with an iterative root finder.
The problem comes from the fact that we have to do this (i) over and over and over for large data sets, and (ii) that there are a number of callbacks from the (generic) solver to the (standard) option pricer.
So our first approach was to just call the corresponding function GBSVolatility
from the
fOption package from the trusted
Rmetrics project by Diethelm Wuertz et al. This worked fine, but even with the usual tricks of splitting over
multiple cores/machines, it simply took too long for the resolution and data amount we desired. One of the problems is that this function (which uses the proper uniroot optimizer in R) is not inefficient per se, but simply makes too many function calls back to the option pricer, as can be seen from a quick glance at the code. The helper function .fGBSVolatility gets called time and time again:
R> GBSVolatility
function (price, TypeFlag = c("c", "p"), S, X, Time, r, b,
    tol = .Machine$double.eps, maxiter = 10000)
{
    TypeFlag = TypeFlag[1]
    volatility = uniroot(.fGBSVolatility, interval = c(-10, 10),
        price = price, TypeFlag = TypeFlag, S = S, X = X,
        Time = Time, r = r, b = b, tol = tol, maxiter = maxiter)$root
    volatility
}
<environment: namespace:fOptions>
R>
R> .fGBSVolatility
function (x, price, TypeFlag, S, X, Time, r, b, ...)
{
    GBS = GBSOption(TypeFlag = TypeFlag, S = S, X = X, Time = Time,
        r = r, b = b, sigma = x)@price
    price - GBS
}
<environment: namespace:fOptions>
So the next idea was to try the corresponding function from my RQuantLib package which brings (parts of) QuantLib to R. That was already a lot faster. But QuantLib is pretty big, and so is RQuantLib, and we felt it may not make sense to install it on a number of machines just for this simple problem. So one evening this week I noodled around for an hour or two and combined (i) a basic Black/Scholes calculation and (ii) a standard univariate zero finder (both of which can be found or described in numerous places) to minimize the difference between the observed price and the price given an implied volatility. With about one hundred lines in C++, I had something which felt fast enough. So today I hooked this into R via a two-line wrapper in a quickly-created package using Rcpp.
I had one more advantage here. For our time series problem, the majority of the parameters (strike, time to maturity, rate, ...) are fixed, so we can
structure the problem to be vectorised right from the start. I cannot share the code or more details of my new implementation. However, both
GBSVolatility
and EuropeanOptionImpliedVolatility
are on CRAN (and as I happen to maintain these for
Debian, also just one sudo apt-get install r-cran-foptions r-cran-rquantlib
away if you're
on Debian or Ubuntu). And writing the other solver is really not that
involved.
Anyway, here is the result, courtesy of a quick run via the rbenchmark package. We create a vector of length 500; the implied volatility computation will be performed at each point (and yes, our time series are much longer indeed). This is replicated 100 times (as is the default for rbenchmark) for each of the three approaches:
xyz@xxxxxxxx:~$ r xxxxR/packages/xxxxOptions/demo/timing.R
    test replications elapsed  relative user.self sys.self user.child sys.child
3 zzz(X)          100   0.038     1.000     0.040    0.000          0         0
2 RQL(X)          100   3.657    96.237     3.596    0.060          0         0
1 fOp(X)          100 448.060 11791.053   446.644    1.436          0         0
xyz@xxxxxxxx:~$

The new local solution is denoted by zzz(X). It is already orders of magnitude faster than the RQL(X) function using RQuantLib (which is, I presume, due to my custom solution internalising the loop). And the new approach is a laughable amount faster than the basic approach (shown as fOp) via
fOptions. For one hundred replications of solving implied volatilities for
all elements of a vector of size 500, the slow solution takes about 7.5 minutes --- while the fast solution takes 38 milliseconds. Which comes to a
relative gain of over 11,000.
So sitting down with your C++ compiler to craft a quick one-hundred lines, combining two well-known and tested methods, can reap sizeable benefits. And Rcpp makes it trivial to call this from R.
Martin Morgan provided a clever fix for a header search needed between
clang++ (especially on OS X) and g++ (which still provided libstdc++ and
headers for clang++). This should hopefully put the clang issues to bed.
Ben North noticed an unprotected string conversion when exception messages
are turned into R errors which got fixed, and I expanded the coverage of Date
(and Datetime
) types to deal
properly with non-finite values NA
, NaN
and Inf
.
The complete NEWS
entry for 0.9.15 is below; more details are in the ChangeLog file in the package and on the
Rcpp Changelog page.
Thanks to CRANberries, you can also look at a diff to the previous release 0.9.14. As always, even fuller details are on the Rcpp Changelog page and the Rcpp page which also leads to the downloads, the browseable doxygen docs and zip files of doxygen output for the standard formats. A local directory has source and documentation too. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

Changes in Rcpp version 0.9.15 (2012-10-13)
Untangling the clang++ build issue about the location of the exceptions header by directly checking for the include file – an approach provided by Martin Morgan in a kindly contributed patch, along with unit tests for it.
The Date and Datetime types now correctly handle NA, NaN and Inf representations; the Date type switched to an internal representation via double
Added Date and Datetime unit tests for the new features
An additional PROTECT was added for parsing exception messages before returning them to R, following a report by Ben North
Courtesy of CRANberries, there is also a diffstat report for 0.3.4.3 relative to 0.3.4.2. As always, more detailed information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

Changes in RcppArmadillo version 0.3.4.3 (2012-10-04)
Upgraded to Armadillo release 3.4.3
fix for aliasing issue in
diagmat()
fix for
speye()
signature
This release was once more driven largely by Murray whom we have now added among the list of authors of the package too. The NEWS file entry follows below:
CRANberries also provides a diff to the previous release 0.2.5. More information is at the RProtoBuf page which has a draft package vignette, a 'quick' overview vignette and a unit test summary vignette. Questions, comments etc should go to the rprotobuf mailing list off the RProtoBuf page at R-Forge.

Changes in version 0.2.6 (2012-10-04)
Applied several more patches by Murray to
correct '_' and '__' mismatches in wrapper calls
update a few manual pages for style, and add examples
fix a bug where NAs were silently treated as TRUE for logical/bool types
fix a bug that caused crashes when adding vectors to optional fields
fix bugs in readASCII that returned empty protocol buffers when the file or connection could not be opened
distinguish between non-existent and not-set fields with has() by returning NULL in the former case.
fix a bug that caused non-deterministic behavior when setting a repeated message field in a protobuf to a single Message.
add unit tests for all of the above.
Added Murray to Authors: field in DESCRIPTION
Removed old and unconvincing example on RProtoBuf for storage and serialization in an imagined HighFrequencyFinance context
It addresses yet another issue we had on OS X and should hopefully put
the build issues to rest. Three new (vectorized) sugar
functions
were added, along with some new regression tests and more. The complete NEWS
entry for 0.9.14 is below; more details are in the ChangeLog file in the package and on the
Rcpp Changelog page.
Thanks to CRANberries, you can also look at a diff to the previous release 0.9.13. As always, even fuller details are on the Rcpp Changelog page and the Rcpp page which also leads to the downloads, the browseable doxygen docs and zip files of doxygen output for the standard formats. A local directory has source and documentation too. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

Changes in Rcpp version 0.9.14 (2012-09-30)
Added new Rcpp sugar functions trunc(), round() and signif(), as well as unit tests for them
Be more conservative about where we support clang++ and the inclusion of exception_defines.h and prevent this from being attempted on OS X where it failed for clang 3.1
Corrected a typo in Module.h which now again permits use of finalizers
Small correction for (unexported) bib() function (which provides a path to the bibtex file that ships with Rcpp)
Converted NEWS to NEWS.Rd
Courtesy of CRANberries, there is also a diffstat report for 0.3.4.2 relative to 0.3.4.1. As always, more detailed information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

Changes in RcppArmadillo version 0.3.4.2 (2012-09-25)
Upgraded to Armadillo release 3.4.2
minor fixes for handling sparse submatrix views
minor speedups for sparse matrices
Courtesy of CRANberries, there is also a diffstat report for 0.3.4.1 relative to 0.3.4.0. As always, more detailed information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

Changes in RcppArmadillo version 0.3.4.1 (2012-09-18)
Upgraded to Armadillo release 3.4.1
workaround for a bug in the Mac OS X accelerate framework
fixes for handling empty sparse matrices
added documentation for saving & loading matrices in HDF5 format
faster dot() and cdot() for complex numbers
This release adds no new features but improves the build process a little and should make use on Windows a little easier. All changes since the last release are summarized below in the NEWS file entry:
CRANberries also provides a short report with changes from the previous release. More information is on the RInside page. Questions, comments etc should go to the rcpp-devel mailing list off the Rcpp R-Forge page.

Changes in RInside version 0.2.8 (2012-09-07)
Added CMake build support for armadillo and eigen examples, once again kindly contributed by Peter Aberline
Corrected Windows package build to always generate a 64 bit static library too
Updated package build to no longer require configure and configure.win to update the two header files supplying compile-time information; tightened build dependencies on headers in Makevars / Makevars.win
Improved examples/standard/Makefile.win by detecting architecture
fastLm()
function for faster
linear model fits.
Courtesy of CRANberries, there is also a diffstat report for 0.3.4.0 relative to 0.3.2.4. As always, more detailed information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

Changes in RcppArmadillo version 0.3.4.0 (2012-09-06)
Upgraded to Armadillo release 3.4.0 (Ku De Ta)
added economical QR decomposition: qr_econ()
added .each_col() & .each_row() for vector operations repeated on each column or row
added preliminary support for sparse matrices, contributed by Ryan Curtin et al. (Georgia Institute of Technology)
faster singular value decomposition via divide-and-conquer algorithm
faster .randn()
NEWS file converted to Rd format
Changes in RcppArmadillo version 0.3.3.91 (2012-08-30)
Upgraded to Armadillo release 3.3.91
faster singular value decomposition via "divide and conquer" algorithm
added economical QR decomposition: qr_econ()
added .each_col() & .each_row() for vector operations repeated on each column or row
added preliminary support for sparse matrices, contributed by Ryan Curtin, James Cline and Matthew Amidon (Georgia Institute of Technology)
Corrected summary method to deal with the no intercept case when using a formula; also display residual summary() statistics
Expanded unit tests for fastLm
The post also fell short of a good aggregated timing comparison for which we love the rbenchmark package. So in order to rectify this, and to see what we can do here with Rcpp, a quick post revisiting the issue.
As preliminaries, we need to load three packages: inline to create compiled code on the fly (which, I should mention, is also used together with Rcpp by the Stan / RStan MCMC sampler which is creating some buzz this week), the compiler package included with R to create byte-compiled code and lastly the aforementioned rbenchmark package to do the timings. We also set row and column dimension, and set them a little higher than the original example to actually have something measurable:
library(inline)
library(compiler)
library(rbenchmark)
n <- 500
k <- 100

The first suggestion was the one by Scott himself. We will wrap this one, and all the following ones, in a function so that all approaches are comparable as functions of two dimension arguments:

scott <- function(N, K) {
    mm <- matrix(0, N, K)
    apply(mm, c(1, 2), function(x) sample(c(0, 1), 1))
}
scottComp <- cmpfun(scott)

We also immediately compute a byte-compiled version (just because we now can) to see if this helps at all. As there are no (explicit!) loops, we do not expect a big pickup. Scott's function works, but sweeps the sample() function across all rows and columns, which is probably going to be (relatively) expensive.
Next is the first improvement suggested to Scott which came from Ted Hart.
ted <- function(N, K) {
    matrix(rbinom(N * K, 1, 0.5), ncol = K, nrow = N)
}

This is quite a bit smarter as it vectorises the approach, generating N times K elements at once which are then reshaped into a matrix.
Another suggestion came from David Smith as well as Rafael Maia. We rewrite it slightly to make it a function with two arguments for the desired dimensions:

david <- function(m, n) {
    matrix(sample(0:1, m * n, replace = TRUE), m, n)
}

This is very clever as it uses sample() over zero and one rather than making (expensive) draws from the random number generator.
Next we have a version from Luis Apiolaza:

luis <- function(m, n) {
    round(matrix(runif(m * n), m, n))
}

It draws from a random uniform and rounds to zero and one, rather than deploying the binomial.
Then we have the version using RcppArmadillo hinted at by Scott, but with actual arguments and a correction for row/column dimensions. Thanks to inline we can write the C++ code as an R character string; inline takes care of everything and we end up with a C++-based solution directly callable from R:

arma <- cxxfunction(signature(ns="integer", ks="integer"),
                    plugin = "RcppArmadillo", body='
   int n = Rcpp::as<int>(ns);
   int k = Rcpp::as<int>(ks);
   return wrap(arma::randu(n, k));
')

This works, and is pretty fast. The only problem is that it answers the wrong question as it returns U(0,1) draws and not binomials. We need to truncate or round. So a corrected version is

armaFloor <- cxxfunction(signature(ns="integer", ks="integer"),
                         plugin = "RcppArmadillo", body='
   int n = Rcpp::as<int>(ns);
   int k = Rcpp::as<int>(ks);
   return wrap(arma::floor(arma::randu(n, k) + 0.5));
')

which uses the old rounding approximation of adding 1/2 before truncating.
With Armadillo in the picture, we do wonder how Rcpp sugar would do. Rcpp sugar, described in one of the eight vignettes of the Rcpp package, is using template meta-programming to provide R-like expressiveness (aka "syntactic sugar") at the C++ level. In particular, it gives access to R's RNG functions using the exact same RNGs as R making the results directly substitutable (whereas Armadillo uses its own RNG).
sugar <- cxxfunction(signature(ns="integer", ks="integer"),
                     plugin = "Rcpp", body='
   int n = Rcpp::as<int>(ns);
   int k = Rcpp::as<int>(ks);
   Rcpp::RNGScope tmp;
   Rcpp::NumericVector draws = Rcpp::runif(n*k);
   return Rcpp::NumericMatrix(n, k, draws.begin());
')

Here Rcpp::RNGScope deals with setting/resetting the R RNG state. This draws a vector of N times K uniforms similar to Luis' function -- and just like Luis' R function does so without looping -- and then shapes a matrix of dimension N by K from it.
And it does of course have the same problem as the RcppArmadillo approach earlier and we can use the same solution:
sugarFloor <- cxxfunction(signature(ns="integer", ks="integer"),
                          plugin = "Rcpp", body='
   int n = Rcpp::as<int>(ns);
   int k = Rcpp::as<int>(ks);
   Rcpp::RNGScope tmp;
   Rcpp::NumericVector draws = Rcpp::floor(Rcpp::runif(n*k)+0.5);
   return Rcpp::NumericMatrix(n, k, draws.begin());
')
Now that we have all the pieces in place, we can compare:
res <- benchmark(scott(n, k), scottComp(n, k),
                 ted(n, k), david(n, k), luis(n, k),
                 arma(n, k), sugar(n, k),
                 armaFloor(n, k), sugarFloor(n, k),
                 order="relative", replications=100)
print(res[,1:4])

With all the above code examples in a small R script which we call via littler, we get
edd@max:~/svn/rcpp/pkg$ r /tmp/scott.r
Loading required package: methods
              test replications elapsed   relative
7      sugar(n, k)          100   0.072   1.000000
9 sugarFloor(n, k)          100   0.088   1.222222
6       arma(n, k)          100   0.126   1.750000
4      david(n, k)          100   0.136   1.888889
8  armaFloor(n, k)          100   0.138   1.916667
3        ted(n, k)          100   0.384   5.333333
5       luis(n, k)          100   0.410   5.694444
1      scott(n, k)          100  33.045 458.958333
2  scottComp(n, k)          100  33.767 468.986111

We can see several takeaways. The Rcpp sugar versions come out fastest, both as uniforms and in the corrected binomial variant. The sample() idea by David and Rafael is very, very fast too, which proves once again that well-written R code can be competitive. It also suggests how to speed up the C++ solution further by foregoing (expensive) RNG draws in favour of sampling; using the rbinom() Rcpp sugar function, or implementing sample() at the C++ level, is, as the saying goes, left as an exercise to the reader.
So here is an updated version, where we limit the display to the top twenty packages counted by reverse 'Depends:', and excluding those already shipping with R such as MASS, lattice, survival, Matrix, or nlme.
The mvtnorm package is still out by a wide margin, but we can note that (cough, cough) our Rcpp package for seamless R and C++ is now tied for second with the coda package for MCMC analysis. Also of note is the fact that CRAN keeps growing relentlessly and moved from 3969 packages to 3981 packages in the space of these few days...
Lastly, I have been asked about the code and/or data behind this. It is
really pretty simple as the main data.frame
can be had from CRAN
(where I also found the initial few lines to load it). After that, one only
needs a little bit of subsetting as shown below. I look forward to seeing
other people riff on this data set.
#!/usr/bin/r
##
## Initial db download from http://developer.r-project.org/CRAN/Scripts/depends.R and adapted

require("tools")

## this function is essentially the same as R Core's from the URL
## http://developer.r-project.org/CRAN/Scripts/depends.R
getDB <- function() {
    contrib.url(getOption("repos")["CRAN"], "source")  # trigger chooseCRANmirror() if required
    description <- sprintf("%s/web/packages/packages.rds",
                           getOption("repos")["CRAN"])
    con <- if (substring(description, 1L, 7L) == "file://") {
        file(description, "rb")
    } else {
        url(description, "rb")
    }
    on.exit(close(con))
    db <- readRDS(gzcon(con))
    rownames(db) <- db[,"Package"]
    db
}

db <- getDB()

## count packages
getCounts <- function(db, col) {
    foo <- sapply(db[,col], function(s) {
        if (is.na(s)) NA else length(strsplit(s, ",")[[1]])
    })
}

## build a data.frame with the number of entries for reverse depends, reverse imports,
## reverse linkingto and reverse suggests; also keep Recommended status
ddall <- data.frame(pkg=db[,1],
                    RDepends=getCounts(db, "Reverse depends"),
                    RImports=getCounts(db, "Reverse imports"),
                    RLinkingTo=getCounts(db, "Reverse linking to"),
                    RSuggests=getCounts(db, "Reverse suggests"),
                    Recommended=db[,"Priority"]=="recommended")

## Subset to non-Recommended packages as in David Smith's follow-up post
dd <- subset(ddall, is.na(ddall[,"Recommended"]) | ddall[,"Recommended"] != TRUE)

labeltxt <- paste("Analysis as of", format(Sys.Date(), "%d %b %Y"),
                  "covering", nrow(db), "total CRAN packages")
cutOff <- 20
doPNG <- TRUE

if (doPNG) png("/tmp/CRAN_ReverseDepends.png", width=600, height=600)
z <- dd[head(order(dd[,2], decreasing=TRUE), cutOff), c(1,2)]
dotchart(z[,2], labels=z[,1], cex=1, pch=19,
         main="CRAN Packages sorted by Reverse Depends:",
         sub=paste("Limited to top", cutOff, "packages, excluding 'Recommended' ones shipped with R"),
         xlab=labeltxt)
if (doPNG) dev.off()

if (doPNG) png("/tmp/CRAN_ReverseImports.png", width=600, height=600)
z <- dd[head(order(dd[,3], decreasing=TRUE), cutOff), c(1,3)]
dotchart(z[,2], labels=z[,1], cex=1, pch=19,
         main="CRAN Packages sorted by Reverse Imports:",
         sub=paste("Limited to top", cutOff, "packages, excluding 'Recommended' ones shipped with R"),
         xlab=labeltxt)
if (doPNG) dev.off()

## no cutOff but rather a na.omit
if (doPNG) png("/tmp/CRAN_ReverseLinkingTo.png", width=600, height=600)
z <- na.omit(dd[head(order(dd[,4], decreasing=TRUE), 30), c(1,4)])
dotchart(z[,2], labels=z[,1], pch=19,
         main="CRAN Packages sorted by Reverse LinkingTo:",
         xlab=labeltxt)
if (doPNG) dev.off()
This release adds two new examples subdirectories demonstrating use of RInside with, respectively, RcppArmadillo and RcppEigen. We extended the 'web application' example using the Wt toolkit by adding CSS and XML support, and added another new example motivated by a StackOverflow question. CMake support has been added for Windows as well thanks to Peter Aberline---he also contributed CMake code for the two new example directories but that contribution made it only into SVN and not this release.
All changes since the last release are summarized below in the NEWS file entry:
CRANberries also provides a short report with changes from the previous release. More information is on the RInside page. Questions, comments etc should go to the rcpp-devel mailing list off the Rcpp R-Forge page.

Changes in RInside version 0.2.7 (2012-08-12)
New fifth examples subdirectory 'armadillo' with two new examples showing how to combine RInside with RcppArmadillo
New sixth examples subdirectory 'eigen' with two new examples showing how to combine RInside with RcppEigen
Prettified the Wt example 'web application' with CSS use, also added an XML file with simple headers and description text
New example rinside_sample12 motivated by StackOverflow question on using sample() from C
Added CMake build support on Windows for the examples
The NEWS entry is below: a new example was added illustrating use of the (vectorised) random-number generators for three of the different distributions --- and showing how it perfectly reproduces the values one gets in R.
Thanks to CRANberries, you can also look at a diff to the previous release 0.1.3. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

Changes in RcppExamples version 0.1.4 (2012-08-09)
Added new example for Rcpp sugar and vectorised draws of RNGs
Minor updates to reflect newer CRAN Policy
This release once again contains a number of patches kindly contributed by Murray Stokely, as well as some updates to conform to CRAN Policy changes.
The NEWS file entry follows below:
CRANberries also provides a diff to the previous release 0.2.4. More information is at the RProtoBuf page which has a draft package vignette, a 'quick' overview vignette and a unit test summary vignette. Questions, comments etc should go to the rprotobuf mailing list off the RProtoBuf page at R-Forge.

Changes in version 0.2.5 (2012-08-08)
Applied patches by Murray to
correctly deal with nested Protocol Buffer definitions, and also add new unit test for this
test a protocol buffer for missing required fields before serializing it, also add a unit test
add a small stylistic fix and examples to the 'add.Rd' manual page
Moved inst/doc/ to vignettes/ per newer CRAN Policy
getEndOfMonth
and getEndOfBizWeek
were not working right. These are convenience wrappers around the real
functionality provided as a member function to the reference class built by
Rcpp modules---which works off a reference instance of the class, and these
two convenience functions were not updating the date. This is now fixed.
The complete NEWS entry is below:
Changes in version 0.2.1 (2012-08-08)
Bug fix for getEndOfBizWeek() and getEndOfMonth() which were lacking a call to fromDate(date) to actually pass the date for which the functions are computing the end of business week or month.
Courtesy of CRANberries, there is also a diffstat report for 0.2.1 relative to 0.2.0. As always, feedback is welcome and the rcpp-devel mailing list off the R-Forge page for Rcpp is the best place to start a discussion.
Depends
.
Far from it, and I immediately replied listing lattice and
Matrix as packages with way more other packages depending upon them.
But as the question seemed deserving of a bit more analysis, I spent a few minutes on this and prepared three charts listing package in order of reverse Depends, reverse Imports and reverse LinkingTo.
First off, the reverse Depends:
. This is the standard means of
declaring a dependence of one package upon another.
Unsurprisingly, the MASS package from the classic Venables and Ripley book comes first, with Deepayan Sarkar's powerful lattice package (also covered in a book) coming second. These are both recommended packages which are commonly distributed with R itself. Next are mvtnorm and survival. Our Rcpp is up there in the top-ten, but not a frontrunner.
With the advent of namespaces a few R releases ago, it became possible to
import functions from other packages. So the Imports:
statement
now provides an alternative to the (older) Depends:
. The next
chart displays the same relationship for Imports:
:
Now lattice still leads, but Hadley's plyr package grabbed the second spot just before MASS and Matrix.
It is interesting to see that the sheer number of Imports:
is
still not where the Depends:
are. On the other hand, we see a
number of more recent packages popping up in the second chart. This may
reflect more recent coding practices. It will be interesting to see how this
stacks up over time when we revisit this chart.
Lastly, we can also look at LinkingTo:
, a declaration used to
provide a C/C++
-level dependency at the source code level. We use
this in the
Rcpp family to
provide automatic resolution of the header files needed to compile against
our packages. And unsurprisingly, because packages using
Rcpp
actually use its API (rather than R functions), the package is a little ahead
of others. Among them we find three more packages of the
Rcpp
family, but only a limited number of other packages, as
C/C++
-level dependencies are still somewhat rare in the R universe.
There are also fewer packages overall making use of this mechanism.
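For reference, the three declarations discussed above all live in a package's DESCRIPTION file. A hypothetical minimal example (the package name and version are illustrative; the dependency names are drawn from the packages discussed above):

```
Package: examplePkg
Version: 0.1.0
Depends: R (>= 2.15.0), lattice
Imports: plyr
LinkingTo: Rcpp
```

Depends attaches lattice to the search path, Imports only makes plyr's namespace available to the package's own code, and LinkingTo pulls in Rcpp's header files at compile time.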
One could of course take this one level further and sum up dependencies in a
recursive manner, or visualize the relationship differently. But these
dotchart
graphs provide a first visual description of the
magnitude of Depends
, Imports
and
LinkingTo
among CRAN packages for R.
The main changes are the added ability to also write gzip-ed npy files and to suppress the automatic transposition, as well as the correction of one memory leak on data read.
The NEWS file entry is below:
Changes in version 0.2.0 (2012-07-30)
Support for writing of gzip-ed npy files has been added.
A new option dotranspose has been added to npyLoad() to support data sets that do not need to be transposed to be used in R.
A memory leak in reading files has been corrected.
CRANberries also provides a diffstat report for 0.2.0 relative to 0.1.0. As always, feedback is welcome and the rcpp-devel mailing list off the R-Forge page for Rcpp is the best place to start a discussion.
Courtesy of CRANberries, here is a link to the changes relative to the previous release 0.9.1.
Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.
fastLm
, a faster replacement for
the standard lm
function. The summary methods now provide more
information, and we added a number of new regression tests.
The NEWS file entry follows below:
And courtesy of CRANberries, a summary of the changes to the previous release 0.1.1 is available too.
Changes in version 0.2.0 (2012-07-22)
summary() for fastLm() now displays more information
fastLmPure() now uses same argument order as R's lm.fit()
Added more unit tests for fastLm() and related functions
Export and document S3 methods in NAMESPACE and manual page as such
Vignettes have been moved to the vignettes/ directory
Main vignette renamed to RcppGSL-intro.pdf to use a filename different from the package reference manual
NEWS file converted to .Rd format
inline support function no longer uses assignInNamespace
More information is on the RcppGSL page. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.
The short NEWS entry follows below.
Courtesy of CRANberries, there is also a diffstat report for 0.3.2.4 relative to 0.3.2.3. As always, more detailed information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.
0.3.2.4 2012-07-11
o Upgraded to Armadillo release 3.2.4
  * workaround for a regression (bug) in GCC 4.7.0 and 4.7.1
This post will show a quick example, also summarized in the short pdf vignette describing the package, and provided as a demo within the package.
We first load the new package (as well as the rbenchmark package used for the benchmarking example) into R. We then create a large matrix of 100,000 rows and 50 columns. Not quite big data by any stretch, but large enough for ascii reading to be painfully slow. We also write two npy files and compress the second one.

R> library(RcppCNPy)
Loading required package: Rcpp
R> library(rbenchmark)
R>
R> n <- 1e5
R> k <- 50
R>
R> M <- matrix(seq(1.0, n*k, by=1.0), n, k)
R>
R> txtfile <- tempfile(fileext=".txt")
R> write.table(M, file=txtfile)
R>
R> pyfile <- tempfile(fileext=".py")
R> npySave(pyfile, M)
R>
R> pygzfile <- tempfile(fileext=".py")
R> npySave(pygzfile, M)
R> system(paste("gzip -9", pygzfile))
R> pygzfile <- paste(pygzfile, ".gz", sep="")
R>
Next, we use the benchmark function to time the three approaches:
R> res <- benchmark(read.table(txtfile),
+                  npyLoad(pyfile),
+                  npyLoad(pygzfile),
+                  order="relative",
+                  columns=c("test", "replications", "elapsed", "relative"),
+                  replications=10)
R> print(res)
                 test replications elapsed  relative
2     npyLoad(pyfile)           10   1.241   1.00000
3   npyLoad(pygzfile)           10   3.098   2.49637
1 read.table(txtfile)           10  96.744  77.95649
R>

As shown by this example, loading a numpy file directly beats the pants off reading the data from ascii: it is about 78 times faster. Reading a compressed file is somewhat slower as the data stream has to be passed through the uncompressor provided by the zlib library. So instead of reading a binary blob in one go (once the file header has been parsed) we have to operate piecemeal---which is bound to be slower. It does however save storage space (and users can make this tradeoff between speed and size) and is still orders of magnitude faster than parsing the ascii file. Finally, and not shown here, we unlink the temporary files.
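The gap between token-by-token ascii parsing and a single binary read is easy to reproduce outside R as well; here is an illustrative Python stand-in for the benchmark above (timings will of course vary by machine):

```python
# Miniature text-vs-binary read comparison: the same values are
# written once as ascii tokens and once as one blob of doubles,
# then read back and timed. Illustration only; sizes are small.
import struct, tempfile, time, os

n, k = 10000, 5
data = [float(i) for i in range(n * k)]

# text file: one ascii token per value
txt = tempfile.NamedTemporaryFile(mode="w", suffix=".txt", delete=False)
txt.write("\n".join(repr(x) for x in data))
txt.close()

# binary file: one contiguous blob of doubles
binf = tempfile.NamedTemporaryFile(mode="wb", suffix=".bin", delete=False)
binf.write(struct.pack("%dd" % (n * k), *data))
binf.close()

t0 = time.time()
with open(txt.name) as f:
    parsed = [float(line) for line in f]   # parse every token
t_text = time.time() - t0

t0 = time.time()
with open(binf.name, "rb") as f:
    blob = struct.unpack("%dd" % (n * k), f.read())  # one read
t_bin = time.time() - t0

assert list(blob) == parsed == data
print("text: %.4fs  binary: %.4fs" % (t_text, t_bin))
os.unlink(txt.name); os.unlink(binf.name)
```

The binary read skips all tokenizing and number parsing, which is where the ascii path spends essentially all of its time.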
Summing up, this post demonstrated how the RcppCNPy package can be useful to access data in numpy files (which may even be compressed). Data can also be written from R to be accessed later by numpy.
Bringing such a C++ library to R is done very easily via Rcpp modules.
The resulting package contains a single R file with a single line: loadModule("cnpy", TRUE). And it relies on the following module declarations in a C++ file:
RCPP_MODULE(cnpy){
    using namespace Rcpp;
    function("npyLoad",     // name of the identifier at the R level
             &npyLoad,      // function pointer to helper function defined above
             List::create( Named("filename"),   // function arguments including default value
                           Named("type") = "numeric"),
             "read an npy file into a numeric or integer vector or matrix");
    function("npySave",     // name of the identifier at the R level
             &npySave,      // function pointer to helper function defined above
             List::create( Named("filename"),   // function arguments including default value
                           Named("object"),
                           Named("mode") = "w"),
             "save an R object (vector or matrix of type integer or numeric) to an npy file");
}

which give us at the R prompt these two functions (and their docstrings) defined above. That's all! Well, there are about one hundred more lines dealing with whether we have integer or numeric data, and whether we use a vector or a matrix. But all in all pretty simple...

R> library(RcppCNPy)
Loading required package: Rcpp
R> npyLoad
internal C++ function <0x243af70>
    docstring : read an npy file into a numeric or integer vector or matrix
    signature : Rcpp::RObject npyLoad(std::string, std::string)
R> npySave
internal C++ function <0x23033e0>
    docstring : save an R object (vector or matrix of type integer or numeric) to an npy file
    signature : void npySave(std::string, Rcpp::RObject, std::string)
R>
So version 0.1.0 of this new package RcppCNPy completes the initial release 0.0.1 from earlier in the week by adding support for compressed .npy.gz files.
The NEWS entry for this release (as well as the initial one) follow:
I will follow up with a little usage example later.
News for Package RcppCNPy
Changes in version 0.1.0 (2012-07-07)
Added automatic use of transpose to automagically account for Fortran-vs-C major storage defaults between Python and R.
Support for integer types is dependent on the int64_t type which is available only when the -std=c++0x switch is used at build-time (and CRAN still discourages use of it)
Added support for reading gzip'ed files ending in ".npy.gz"
Added regression tests in directory tests/
Added a vignette describing the package
Added a timing benchmark in demo/timings.R
Changes in version 0.0.1 (2012-07-04)
Initial version, as a straightforward Rcpp modules wrap around the cnpy library by Carl Rogers (on github under an MIT license).
At present, npy files can be read and written for vectors and matrices of either numeric or integer type. Note however that matrices are currently transposed because of the default Fortran ordering done by numpy.
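The transposition note comes down to storage order: numpy defaults to C (row-major) layout, while R stores matrices column-major (Fortran order). A small, pure-Python illustration of the two flattenings of the same matrix:

```python
# Row-major (C, numpy default) vs column-major (Fortran, R) layout:
# the same 2x3 matrix flattens to different linear orders, which is
# why a matrix written naively by numpy appears transposed in R.
def c_order(mat):
    # walk rows first, as numpy does by default
    return [x for row in mat for x in row]

def fortran_order(mat):
    # walk columns first, as R does internally
    rows, cols = len(mat), len(mat[0])
    return [mat[r][c] for c in range(cols) for r in range(rows)]

m = [[1, 2, 3],
     [4, 5, 6]]

print(c_order(m))        # [1, 2, 3, 4, 5, 6]
print(fortran_order(m))  # [1, 4, 2, 5, 3, 6]
```

Reading a C-ordered blob as if it were column-major yields the transpose, hence the automatic transposition in the package.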
CRANberries also provides a diffstat report for 0.1.0 relative to 0.0.1. As always, feedback is welcome and the rcpp-devel mailing list off the R-Forge page for Rcpp is the best place to start a discussion.
RcppBDT uses Rcpp, and in particular the nifty Rcpp modules feature of wrapping C++ code for R just by declaring the (class or function) interfaces. It uses this to bring in some useful functions from Boost Date.Time to R so that one can do things like
R> library(RcppBDT)
R> sapply(2012:2016, function(year)
+        format(getNthDayOfWeek(first, Mon, Sep, year)))
[1] "2012-09-03" "2013-09-02" "2014-09-01" "2015-09-07" "2016-09-05"
R>

to compute the next five Labor Day dates in the US, given the year and the first-Monday-of-September requirement. More examples are e.g. on the earlier blog post announcing version 0.1.0.
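For comparison, the same "first Monday of September" computation can be done with nothing but a standard date library; a Python sketch (the nth_weekday helper below is hypothetical, not part of any package):

```python
# Compute the n-th given weekday of a month, mirroring the
# getNthDayOfWeek(first, Mon, Sep, year) call shown above.
from datetime import date, timedelta

def nth_weekday(n, weekday, month, year):
    """n-th given weekday of a month; weekday: Monday=0 .. Sunday=6."""
    d = date(year, month, 1)
    # advance to the first occurrence of that weekday in the month
    d += timedelta(days=(weekday - d.weekday()) % 7)
    return d + timedelta(weeks=n - 1)

labor_days = [nth_weekday(1, 0, 9, y).isoformat() for y in range(2012, 2017)]
print(labor_days)
# ['2012-09-03', '2013-09-02', '2014-09-01', '2015-09-07', '2016-09-05']
```

The modular step to the first matching weekday is the whole trick; everything else is plain date arithmetic.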
Changes are mostly internal. R 2.15.1 brought a better / easier way to load such 'modules' into R, and Rcpp 0.9.13 allows us to use this. So RcppBDT continues to be useful as an example package for Rcpp modules. I also streamlined the interface a little: identifiers are now directly accessible in the package's NAMESPACE rather than just via an instantiated object.
I also added a NEWS file, using the .Rd format so that we can import the marked-up text:
News for Package RcppBDT
Changes in version 0.2.0 (2012-07-02)
The core module, which wraps what in C++ is boost::gregorian::date, is now exposed as an Rcpp module bdtDate. As all example and demos operated off a (package-global) variable 'bdt', no user visible change was needed outside of the code instantiating at package load.
Updated package instantiation to the new facilities offered by the current versions R 2.15.1 and Rcpp 0.9.13 which make Rcpp module initialization easier and more streamlined.
Changes in version 0.1.0 (2011-01-17)
First CRAN upload (and see ChangeLog for more granular details)
Courtesy of CRANberries, there is also a diffstat report for 0.2.0 relative to 0.1.0. As always, feedback is welcome and the rcpp-devel mailing list off the R-Forge page for Rcpp is the best place to start a discussion.
The short NEWS entry follows below. This version fixes some issues related to g++ 4.7; however the current Debian testing version of g++-4.7.1 required rolling back three header files to the versions from the previous release. Conrad has been in contact with the gcc upstream maintainers and a fix may appear in time for g++-4.7.2.
This release also contains a new introductory pdf vignette based on a paper Conrad and I just submitted. It introduces Armadillo to R programmers and demonstrates with a simple Kalman filtering example how the code can be written in the same concise matrix-oriented style, yet runs orders of magnitude faster.
Courtesy of CRANberries, there is also a diffstat report for 0.3.2.3 relative to 0.3.2.0. As always, more detailed information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.
0.3.2.3 2012-07-01
o Upgraded to Armadillo release 3.2.3
  * minor correction for declaration of fixed size vectors and matrices
o Reverted three header files {Mat,Row,Col}_bones.hpp back to previous release due to compilation failures under g++-4.7
o Added new vignette 'RcppArmadillo-intro' based on a just-submitted introductory paper (by Eddelbuettel and Sanderson) about RcppArmadillo
o Change from release 3.2.2 which we skipped as it did not really affect builds under R:
  * minor fix for compiling without debugging enabled (aka release mode)
  * better detection of ATLAS during installation on Fedora and Red Hat systems
o Small enhancement to fastLm
gzfile()
connections---as well as (compressed)
numpy files.
The numpy data can be read very efficiently into Python. We can do the same in R via save() and load(), of course. But the trouble is that you need to read the data first. And reading hundreds of megabytes from ascii is slow, no matter which language you use. Concerning R, I poked around scan(), played with the colClasses argument and looked at the recent LaF package written just for this purpose. And all these solutions were still orders of magnitude slower than reading numpy. Which is no surprise as it is really hard to beat binary formats when you have to parse countless ascii tokens.
So the obvious next idea was to read the numpy file in Python, and to write a simple binary format. One helpful feature with this data set was that it contained only regular (rectangular) matrices of floats. So we could just store two integers for the dimensions, followed by the total data in either one large binary blob, or a sequence of column vectors.
But one minor trouble was that the Intertubes led to no easy solution to unpack the numpy format. StackOverflow had plenty of questions around this topic concerned with, say, how to serialize in a language-independent way. But no converters. And nobody local knew how to undo the "pickle" format underlying numpy.
But a remote friend did: Laurent, well-known for his Rpy2 package, pointed me towards using the struct module and steered me towards the solution shown below. So a shameless plug: if you need a very experienced Python or R consultant for science work, consider his consulting firm.
Finally, to round out this post, let's show the simple solution we crafted so that the next guy searching the Intertubes will have an easier time. Let us start with a minimal Python program writing numpy data to disk:
#!/usr/bin/env python
#
# simple example for creating numpy data to demonstrate converter

import numpy as np

# simple float array
a = np.arange(15).reshape(3,5) * 1.1

outfile = "/tmp/data.npy"
np.save(outfile, a)
Next, the simple Python converter to create a binary file containing two integers for row and column dimension, followed by row times columns of floats:
#!/usr/bin/python
#
# read a numpy file, and write a simple binary file containing
#   two integers 'n' and 'k' for rows and columns
#   n times k floats with the actual matrix
# which can be read by any application or language that can read binary

import struct
import numpy as np

inputfile = "/tmp/data.npy"
outputfile = "/tmp/data.bin"

# load from the file
mat = np.load(inputfile)

# create a binary file
binfile = file(outputfile, 'wb')

# and write out two integers with the row and column dimension
header = struct.pack('2I', mat.shape[0], mat.shape[1])
binfile.write(header)

# then loop over columns and write each
for i in range(mat.shape[1]):
    data = struct.pack('%id' % mat.shape[0], *mat[:,i])
    binfile.write(data)

binfile.close()
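A matching reader for this little format can also be written in Python as a sanity check before switching to R; the following sketch packs a small matrix exactly as the converter does and reads it back (read_simple_bin is a hypothetical helper, not part of the converter above):

```python
# Round-trip check for the converter's layout: two unsigned ints
# n and k for the dimensions, then k columns of n doubles each.
import struct, tempfile, os

def read_simple_bin(path):
    """Read the layout written by the converter: n, k then k columns."""
    with open(path, "rb") as f:
        n, k = struct.unpack("2I", f.read(8))
        cols = [struct.unpack("%dd" % n, f.read(8 * n)) for _ in range(k)]
    # transpose the column list back into rows
    return [[cols[c][r] for c in range(k)] for r in range(n)]

# pack a 3x2 example exactly like the converter: header, then columns
mat_cols = [(1.0, 2.0, 3.0), (4.0, 5.0, 6.0)]
path = tempfile.NamedTemporaryFile(suffix=".bin", delete=False).name
with open(path, "wb") as f:
    f.write(struct.pack("2I", 3, 2))
    for col in mat_cols:
        f.write(struct.pack("3d", *col))

result = read_simple_bin(path)
print(result)   # [[1.0, 4.0], [2.0, 5.0], [3.0, 6.0]]
os.unlink(path)
```

Since the columns are written in sequence, reading one column at a time and transposing at the end recovers the original matrix.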
Lastly, a quick littler script showing how R can read the data in a handful of lines:
#!/usr/bin/r

infile <- "/tmp/data.bin"
con <- file(infile, "rb")
dim <- readBin(con, "integer", 2)
Mat <- matrix( readBin(con, "numeric", prod(dim)), dim[1], dim[2])
close(con)
print(Mat)

That did the job---and I have already used the converter to read a few weeks worth of data for further analysis in R. This obviously isn't the last word on possible solutions as the additional temporary file can be wasteful (unless it forms a cache for data read multiple times). If someone has nicer solutions, please don't hold back and contact me. Thanks again to Laurent for the winning suggestion concerning struct, and help in getting the examples shown here to work.
The Rcpp::Environment constructor is now properly fixed (using
the global environment as a default value). As well, a new
#define
was created to detect the clang++ compiler before/after
version 3.0.0 as one exceptions header changed location. Unit tests files were
also once more updated.
The complete NEWS entry for 0.9.13 is below; more details are in the ChangeLog file in the package and on the Rcpp Changelog page.
Thanks to CRANberries, you can also look at a diff to the previous release 0.9.12. As always, even fuller details are on the Rcpp Changelog page and the Rcpp page which also leads to the downloads, the browseable doxygen docs and zip files of doxygen output for the standard formats. A local directory has source and documentation too. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.
0.9.13 2012-06-28
o Truly corrected Rcpp::Environment class by having default constructor use the global environment, and removing the default argument of global environment from the SEXP constructor
o Added tests for clang++ version to include bits/exception_defines.h for versions 3.0 or higher (similar to g++ 4.6.0 or later), needed to include one particular exceptions header
o Made more regression tests conditional on the RunAllRcppTests to come closer to the CRAN mandate of running tests in sixty seconds
o Updated unit test wrapper tests/doRUnit.R as well as unitTests/runTests.R
This fixes a minor snafu with the Rcpp::Environment
constructor
following a small change made for 0.9.11. It also reduces the number of unit
tests running by default as CRAN was complaining that it took too long to run
the checks. I am not impressed---tests are for running, not skipping. Lastly, the
cleanup
needed to be extended to another directory.
The complete NEWS entry for 0.9.12 is below; more details are in the ChangeLog file in the package and on the Rcpp Changelog page.
Thanks to CRANberries, you can also look at a diff to the previous release 0.9.11. As always, even fuller details are on the Rcpp Changelog page and the Rcpp page which also leads to the downloads, the browseable doxygen docs and zip files of doxygen output for the standard formats. A local directory has source and documentation too. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.
0.9.12 2012-06-23
o Corrected Rcpp::Environment class by removing (empty) ctor following rev3592 (on May 2) where default argument for ctor was moved
o Unit testing now supports argument --allTests to impose that all tests are executed; otherwise some expensive tests are skipped. This is arguably not the right thing to do, but CRAN maintainers insist on faster tests.
o The cleanup script now also considers inst/unitTests/testRcppClass/src
This is a somewhat incremental release with a few internal improvements and a few new features. One interesting new development has been contributed by John Chambers who is extending 'Rcpp modules' into 'Rcpp classes' which allows R code to modify and extend C++ classes loaded via Rcpp modules; see help(setRcppClass) for more. This also led to some changes in the code for loading modules which, however, require the brand-new R version 2.15.1 released today as well.
The complete NEWS entry for 0.9.11 is below; more details are in the ChangeLog file in the package and on the Rcpp Changelog page.
Thanks to CRANberries, you can also look at a diff to the previous release 0.9.10. As always, even fuller details are on the Rcpp Changelog page and the Rcpp page which also leads to the downloads, the browseable doxygen docs and zip files of doxygen output for the standard formats. A local directory has source and documentation too. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.
0.9.11 2012-06-22
o New member function for vectors (and lists etc) containsElementNamed() which returns a boolean indicating if the given element name is present
o Updated the Rcpp.package.skeleton() support for Rcpp modules by carrying functions already present from the corresponding unit test which was also slightly expanded; and added more comments to the code
o Rcpp modules can now be loaded via loadRcppModules() from .onLoad(), or via loadModule("moduleName") from any R file
o Extended functionality to let R modify C++ classes imported via modules documented in help(setRcppClass)
o Support compilation in Cygwin thanks to a patch by Dario Buttari
o Extensions to the Rcpp-FAQ and the Rcpp-modules vignettes
o The minimum version of R is now 2.15.1 which is required for some of the Rcpp modules support
The short NEWS entry follows below. For those interested in following RcppArmadillo even more closely, we generally track Conrad's Armadillo release candidates as well in SVN on R-Forge but no longer submit these to CRAN (as the CRAN maintainers have enough incoming packages).
Courtesy of CRANberries, there is also a diffstat report for 0.3.2.0 relative to 0.3.0.3. As always, more detailed information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.
0.3.2.0 2012-05-21
o Upgraded to Armadillo release 3.2.0 "Creamfields"
  * faster eigen decomposition via "divide and conquer" algorithm
  * faster transpose of vectors and compound expressions
  * faster handling of diagonal views
  * faster handling of tiny fixed size vectors (≤ 4 elements)
  * added unique(), for finding unique elements of a matrix
This release once again contains a number of patches kindly contributed by Murray Stokely, as well as an added header file needed to build with the g++ 4.7 version which has become the build standard on CRAN.
The NEWS file entry follows below:
CRANberries also provides a diff to the previous release 0.2.3. More information is at the RProtoBuf page which has a draft package vignette, a 'quick' overview vignette and a unit test summary vignette. Questions, comments etc should go to the rprotobuf mailing list off the RProtoBuf page at R-Forge.
Updated to show NEWS rather than ChangeLog
0.2.4 2012-05-15
o Applied several patches kindly supplied by Murray Stokely to
  - properly work with repeated strings
  - correct C++ function naming in a few instances
  - add an example of ascii export/import of messages
o Support g++-4.7 and stricter #include file checking by adding unistd
o Made small improvements to the startup code
g++-4.7, as this version has become the default on Debian's testing variant. This compiler switch once again triggered a set of build failures, mostly from include files now deemed missing. For RcppSMC, it came down to a five-character patch explicitly stating one max() call as std::max().
No other changes were made at this point. The NEWS entry is below:
0.1.1 2012-05-14
o Version 0.1.1
o Minor g++-4.7 build fix of using std::max() explicitly
Courtesy of CRANberries, there is also a diffstat report for 0.1.1 relative to 0.1.0. As always, more detailed information is on the RcppSMC page.
fastLm
and faster linear model fits.
The short NEWS entry follows below.
Courtesy of CRANberries, there is also a diffstat report for 0.3.0.3 relative to 0.3.0.2. As always, more detailed information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.
0.3.0.3 2012-05-03
o Upgraded to Armadillo release 3.0.3
  * fixes for inplace transpose of complex number matrices
  * fixes for complex number version of svd_econ()
  * fixes for potential aliasing issues with submatrix views
o New example script fastLm
But heat makes for dreadful running conditions. I was pretty well trained, but I ended up running 3:43:55, or a tad slower than in Chicago 2007 (also record heat, with the race stopped around the four-hour mark) and Chicago 2009 (hot, though not that hot---but I was somewhat undertrained). Which makes this a personal worst. Oh well, stuff happens. And it was a pretty rough day---reports claimed that 2100 runners, or about 10%, were seeking some form of medical attention during or after the race which is a much higher proportion than under normal conditions.
The chart below is an updated version of a chart which had appeared on my running page before. I regrouped my prior Bostons (2007, 2009 and 2010) on the left, and the Chicago runs 2007 and 2009 on the right---and each with the current Boston marathon.
So compared to the prior Boston races, I did indeed do relatively poorly: starting slower, and slowing down even more, leading to about 1:45 for the first half and 1:58 for the second. Compared to the two poor Chicago races (out of the six times I have run that race), things are not quite as bad as at least I didn't peak at over 10 min/mile. Weather will always remain a challenge. The previous marathon (which was last fall's Fox Valley 2011) was unseasonably cold and wet---making for a time fast enough to get me back to Boston. I am registered again for this one, so we will see what September brings this year.
Besides Conrad's bugfix in Armadillo itself, he also convinced us to unroll a change imposed by R: NDEBUG is now being defined unconditionally when compiling R packages. In Armadillo's case, this suppresses a number of useful things including bounds-checking. So we now undefine this symbol in the initial RcppArmadillo headers. Users can still set it manually, and/or define ARMA_NO_DEBUG.
The short NEWS entry follows below.
Courtesy of CRANberries, there is also a diffstat report for 0.3.0.2 relative to 0.3.0.1. As always, more detailed information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.
0.3.0.2 2012-04-19
o Upgraded to Armadillo release 3.0.2
  * fixes for handling diagonal matrices
o Undefine NDEBUG if it has been set (as R does) as this prevents a number of useful debugging checks. Users can still define it or define ARMA_NO_DEBUG if they want a 'non-development' build
The package still comes with a vignette describing both DieHarder and the RDieHarder package. And because pictures speak louder than a thousand (blogged) words, here is the first chart from the vignette:
ran0 function. The histogram illustrating the distribution of test scores is somewhat uneven. An ideal (and asymptotic) outcome is a uniform distribution of p-values from the test. The empirical cumulative distribution function (ECDF) below indicates a somewhat pronounced departure from the diagonal. Informally speaking, this is what the (Kuiper-)Kolmogorov-Smirnov test quantifies, and we see (in the text in the chart) that the null can be rejected at conventional levels. Based on this example (which had a short run-time with few samples) we would indeed mistrust this (known bad) RNG.
On the right, we have a more recent and trusted RNG, the well-known Mersenne Twister. The ten histogram buckets are all closer to the expected value of one-tenth, the estimated density is closer to flat, the ECDF is closer to the diagonal and the tests don't reject---so no reason to mistrust this RNG based on this test alone.
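The underlying idea is easy to demonstrate in miniature (this is an illustration of the principle only, not RDieHarder's implementation): p-values from repeated tests of a good generator should look uniform, and the Kolmogorov-Smirnov statistic measures the largest gap between their ECDF and the diagonal:

```python
# KS statistic against the uniform: the largest vertical distance
# between the empirical CDF of the p-values and the diagonal.
# A small value means no evidence against uniformity.
import random

def ks_uniform(samples):
    xs = sorted(samples)
    n = len(xs)
    return max(max(abs((i + 1) / n - x), abs(x - i / n))
               for i, x in enumerate(xs))

random.seed(42)                                  # deterministic demo
pvals = [random.random() for _ in range(1000)]   # stand-in "p-values"
d = ks_uniform(pvals)
print("KS statistic: %.4f" % d)
```

For 1000 truly uniform draws the statistic is small (roughly a few hundredths); a badly non-uniform histogram of p-values would push it up and let the test reject.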
RDieHarder lets you run a battery of such tests against a boatload of known RNGs. Here is a second example, comparing the six RNGs built into R itself:
Courtesy of CRANberries, there is also a diffstat report for 0.1.2 relative to the older 0.1.1 release. More detailed information is on the RDieHarder page.
The short NEWS entry follows below.
Courtesy of CRANberries, there is also a diffstat report for 0.3.0.1 relative to 0.3.0. As always, more detailed information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.
0.3.0.1 2012-04-12
o Upgraded to Armadillo release 3.0.1
  * fixes for compilation errors
  * fixes for potential aliasing issues
All that said, here are the NEWS entries for all releases since the last non-test release:
Courtesy of CRANberries, there is also a diffstat report for 0.3.0 relative to 0.2.36. As always, more detailed information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.
0.3.0 2012-04-10
o Upgraded to Armadillo release 3.0.0 "Antarctic Chilli Ranch"
  * added non-contiguous submatrix views
  * added shorthand for inverse: .i()
  * added hist() and histc()
  * faster repmat()
  * faster handling of submatrix views with one row or column
  * faster generation of random numbers
  * faster element access in fixed size matrices
  * better detection of vector expressions by sum(), cumsum(), prod(), min(), max(), mean(), median(), stddev(), var()
  * expressions X=A.i()*B and X=inv(A)*B are automatically converted to X=solve(A,B)
0.2.40 2012-04-04
o Upgraded to Armadillo release 2.99.4 "Antarctic Chilli Ranch (Beta 4)"
  * fixes for handling expressions with fixed size matrices
0.2.39 2012-04-02
o Upgraded to Armadillo release 2.99.3 "Antarctic Chilli Ranch (Beta 3)"
  * faster repmat()
  * workarounds for braindead compilers (eg. Visual Studio)
0.2.38 2012-03-28
o Upgraded to Armadillo release 2.99.2 "Antarctic Chilli Ranch (Beta 2)"
  * added .i()
  * much faster handling of .col() and .row()
  * expressions X=A.i()*B and X=inv(A)*B are automatically converted to X=solve(A,B)
0.2.37 2012-03-19
o Upgraded to Armadillo release 2.99.1 "Antarctic Chilli Ranch (Beta 1)"
  * added non-contiguous submatrix views
  * added hist() and histc()
  * faster handling of submatrix views
  * faster generation of random numbers
  * faster element access in fixed size matrices
  * better detection of vector expressions by sum(), cumsum(), prod(), min(), max(), mean(), median(), stddev(), var()
2012-04-08  Dirk Eddelbuettel
* DESCRIPTION: Release 0.1.1
* tests/compTest.R: With the just-released DEoptim 2.2.0, results are no longer identical so we commented-out the stopifnot() comparison to not break any automated tests (as requested by the CRAN maintainers)
* .Rbuildignore: Added a few more files which R CMD check does not want to see in the tarball
2011-03-07  Dirk Eddelbuettel
* src/evaluate.h: Also reflect '...' argument from R function we pass in, with thanks to Josh Ulrich for the one-line patch
* R/DEoptim.R: No longer pass environment 'env' down
* man/DEoptim.Rd: No longer document now unused 'env'
* src/deoptim.cpp: Minor tweak to RcppArmadillo object creation
But then the good folks at CRAN did not want it. That is at the same time disappointing to us as developers / maintainers: work we do will not get pushed to the largest number of users immediately. On the other hand, we do understand that CRAN is under a lot of strain. Years ago John Fox demonstrated exponential growth of packages at CRAN; the rate was then around 40% per year. Right now the number of packages exceeds 3600, and there are often several dozen daily updates (which you can follow with the CRANberries service for html and rss I had set up years ago). And the three CRAN maintainers all have full-time jobs as academics, and they also happen to be busy R Core members. Plus the new R version 2.15.0 is to be released at the end of the week, and this usually entails a number of package changes, putting extra strain on CRAN right now. For more on new or updated CRAN policies, see this long thread.
Now, as I wrote on the rcpp-devel list,
there is of course always install.packages("RcppArmadillo", repos="http://R-Forge.R-project.org")
for a direct installation from R-Forge. But as I just put 0.2.38 there, it may take until the next run of the batch updates on the site; so if you want it now, follow this link to my local Rcpp archive.
All that said, here are the NEWS entries for these two last releases:
As always, more detailed information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.
0.2.38 2012-03-28
o Upgraded to Armadillo release 2.99.2 "Antarctic Chilli Ranch (Beta 2)"
  * added .i()
  * much faster handling of .col() and .row()
  * expressions X=A.i()*B and X=inv(A)*B are automatically converted to X=solve(A,B)
0.2.37 2012-03-19
o Upgraded to Armadillo release 2.99.1 "Antarctic Chilli Ranch (Beta 1)"
  * added non-contiguous submatrix views
  * added hist() and histc()
  * faster handling of submatrix views
  * faster generation of random numbers
  * faster element access in fixed size matrices
  * better detection of vector expressions by sum(), cumsum(), prod(), min(), max(), mean(), median(), stddev(), var()
Sequential Monte Carlo / particle filter methods are (to quote the Wikipedia page I just linked to) sophisticated model estimation techniques based on simulation. They are related to both Kalman filters and Markov chain Monte Carlo methods.
Adam Johansen has a rather nice set of C++ classes documented in his 2009 paper in the Journal of Statistical Software (JSS). I started to play with these classes and realized that, once again, this would make perfect sense in an R extension built with the Rcpp package by Romain and myself (and in JSS too). So I put a first prototype onto R-Forge and emailed Adam who, to my pleasant surprise, was quite interested. And a couple of emails and commits later, we are happy to present a very first release 0.1.0.
I wrote a few words on a RcppSMC page on my website where you can find a few more details. But in short, we already have example functions demonstrating the backend classes by reproducing examples from Adam's paper: pfLineartBS() for a linear bootstrap example; blockpfGaussianOpt(); and pfNonlinBS().
And to illustrate just why Rcpp is so cool for this, here is a little animation of a callback from the C++ code when doing the filtering on Adam's example 5.1. By passing a simple plotting function, written in R, to the C++ code, we can get a plot updated on every iteration. Here I cheated a little and used our old plot function with fixed ranges; the package now uses a more general function:
The animation is of course due to ImageMagick glueing one hundred files into a single animated gif.
More information about RcppSMC is on its page, and we intend to add more examples and extensions over time.
Now open for registrations:
R / Finance 2012: Applied Finance with R
May 11 and 12, 2012
Chicago, IL, USA
The registration for R/Finance 2012 -- which will take place May 11 and 12 in Chicago -- is NOW OPEN!
Building on the success of the three previous conferences in 2009, 2010, and 2011, we expect more than 250 attendees from around the world. R users from industry, academia, and government will join 40+ presenters covering all areas of finance with R.
This year's conference will start earlier in the day on Friday, to accommodate the tremendous line up of speakers for 2012, as well as to provide more time between talks for networking.
We are very excited about the four keynotes by Paul Gilbert, Blair Hull, Rob McCulloch, and Simon Urbanek. The main agenda includes nineteen full presentations and eighteen shorter "lightning talks". We are also excited to offer six optional pre-conference seminars on Friday morning.
Once again, we are hosting the R/Finance conference dinner on Friday evening, where you can continue conversations while dining and drinking atop a West Loop restaurant overlooking the Chicago skyline.
More details of the agenda are available at:
http://www.RinFinance.com/agenda/
Registration information is available at
http://www.RinFinance.com/register/
and can also be directly accessed by going to
http://www.regonline.com/RFinance2012
On behalf of the committee and sponsors, we look forward to seeing you in Chicago!
Gib Bassett, Peter Carl, Dirk Eddelbuettel, Brian Peterson, Dale Rosenthal, Jeffrey Ryan, Joshua Ulrich
Our 2012 Sponsors:
International Center for Futures and Derivatives at UIC
Revolution Analytics
Sybase
MS-Computational Finance at University of Washington
lemnica
OpenGamma
OneTick
RStudio
Tick Data
See you in Chicago in May!!
Murray Stokely noticed a corner case where an inaccessible file led to a segmentation fault. Thanks to his patch, this is now caught and flagged. He even added a test for it and fixed another one of my idioms. Nice work, so keep the patches coming folks :)
CRANberries provides the usual summary of changes to version 0.5.1.
As usual, our package is available via the R-Forge page leading to svn and tarball access, my digest page and the local directory here.
Courtesy of CRANberries, there is also a diffstat report for 0.2.36 relative to 0.2.35. As always, more detailed information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

0.2.36  2012-03-05

    o   Upgraded to Armadillo release 2.4.4
        * fixes for qr() and syl()
        * more portable wall_clock class
        * faster relational operators on submatrices
Rcpp::Rcout output stream device. Based on a contributed Rcpp patch by Jelper Ypma, the Rcpp::Rcout output stream gets redirected to R's buffered output. In other words, R's own output and that emanating from C++ code using Rcpp::Rcout are now both in sync. This avoids a stern warning from Section 5.6 of the Writing R Extensions manual:

    Using C++ iostreams, as in this example, is best avoided. There is no guarantee that the output will appear in the R console, and indeed it will not on the R for Windows console. Use R code or the C entry points (*note Printing) for all I/O if at all possible.

and does in fact provide exactly what is recommended: the same entry points R itself uses.
Below is a sample program, once again using the wonderful inline package to compile, load and link C++ code into R from a simple text variable submitted to cxxfunction(). Shown are the R code to load the package, the definition of the C++ code assigned to a variable src, and the creation of the dynamically-loaded R function called fun, which contains the code we compile, link and load via a single call to cxxfunction() given src.
We then switch to a temporary directory (as the example code, taken from one of the two examples in Conrad's Armadillo sources, creates a temporary file) and run the new function. To demonstrate how it does in fact now mesh perfectly with R, we create an output 'sink' (which catches all output) and re-run.

library(inline)

src <- '
    Rcpp::Rcout << "Armadillo version: " << arma::arma_version::as_string() << std::endl;

    // directly specify the matrix size (elements are uninitialised)
    arma::mat A(2,3);

    // .n_rows = number of rows    (read only)
    // .n_cols = number of columns (read only)
    Rcpp::Rcout << "A.n_rows = " << A.n_rows << std::endl;
    Rcpp::Rcout << "A.n_cols = " << A.n_cols << std::endl;

    // directly access an element (indexing starts at 0)
    A(1,2) = 456.0;
    A.print("A:");

    // scalars are treated as a 1x1 matrix,
    // hence the code below will set A to have a size of 1x1
    A = 5.0;
    A.print("A:");

    // if you want a matrix with all elements set to a particular value
    // the .fill() member function can be used
    A.set_size(3,3);
    A.fill(5.0);
    A.print("A:");

    arma::mat B;

    // endr indicates "end of row"
    B << 0.555950 << 0.274690 << 0.540605 << 0.798938 << arma::endr
      << 0.108929 << 0.830123 << 0.891726 << 0.895283 << arma::endr
      << 0.948014 << 0.973234 << 0.216504 << 0.883152 << arma::endr
      << 0.023787 << 0.675382 << 0.231751 << 0.450332 << arma::endr;

    // print to the cout stream
    // with an optional string before the contents of the matrix
    B.print("B:");

    // the << operator can also be used to print the matrix
    // to an arbitrary stream (cout in this case)
    Rcpp::Rcout << "B:" << std::endl << B << std::endl;

    // save to disk
    B.save("B.txt", arma::raw_ascii);

    // load from disk
    arma::mat C;
    C.load("B.txt");

    C += 2.0 * B;
    C.print("C:");

    // submatrix types:
    //
    // .submat(first_row, first_column, last_row, last_column)
    // .row(row_number)
    // .col(column_number)
    // .cols(first_column, last_column)
    // .rows(first_row, last_row)

    Rcpp::Rcout << "C.submat(0,0,3,1) =" << std::endl;
    Rcpp::Rcout << C.submat(0,0,3,1) << std::endl;

    // generate the identity matrix
    arma::mat D = arma::eye<arma::mat>(4,4);

    D.submat(0,0,3,1) = C.cols(1,2);
    D.print("D:");

    // transpose
    Rcpp::Rcout << "trans(B) =" << std::endl;
    Rcpp::Rcout << trans(B) << std::endl;

    // maximum from each column (traverse along rows)
    Rcpp::Rcout << "max(B) =" << std::endl;
    Rcpp::Rcout << max(B) << std::endl;

    // maximum from each row (traverse along columns)
    Rcpp::Rcout << "max(B,1) =" << std::endl;
    Rcpp::Rcout << max(B,1) << std::endl;

    // maximum value in B
    Rcpp::Rcout << "max(max(B)) = " << max(max(B)) << std::endl;

    // sum of each column (traverse along rows)
    Rcpp::Rcout << "sum(B) =" << std::endl;
    Rcpp::Rcout << sum(B) << std::endl;

    // sum of each row (traverse along columns)
    Rcpp::Rcout << "sum(B,1) =" << std::endl;
    Rcpp::Rcout << sum(B,1) << std::endl;

    // sum of all elements
    Rcpp::Rcout << "sum(sum(B)) = " << sum(sum(B)) << std::endl;
    Rcpp::Rcout << "accu(B)     = " << accu(B) << std::endl;

    // trace = sum along diagonal
    Rcpp::Rcout << "trace(B)    = " << trace(B) << std::endl;

    Rcpp::Rcout << std::endl;
'

fun <- cxxfunction(signature(), body=src, plugin="RcppArmadillo")

setwd("/tmp")               # adjust on other OSs
fun()                       # output to stdout
sink("rcpparma.log.txt")    # start 'sink' to output to file
fun()                       # no output to screen
sink()                      # stop 'sink'
This simple example demonstrated how we can use the new Rcout output stream from Rcpp to have dynamically-loaded C++ code cooperate more cleanly with the (buffered) R output. It also demonstrated some of the nice features in Armadillo which we bring to R via RcppArmadillo.
Rcpp::Rcout. So the new release 0.2.35 of RcppArmadillo does just that---and input/output from Armadillo, the wonderful linear algebra / math library for C++ written chiefly by Conrad Sanderson, now comes through in a coordinated buffered fashion. As std::cout is no longer used by default, R CMD check no longer flags this.
There are two more bugfixes in the release for issues noticed by attentive RcppArmadillo users. Teo Guo Ci spotted a missing semicolon when C++0x code was activated, and Gershon Bialer suggested a fix for an issue plaguing users of the jurassic g++ 4.2.1 compiler which Apple forces on users of OS X. We thank Martin Renner for testing these fixes. The NEWS entry below summarises the changes.
Courtesy of CRANberries, there is also a diffstat report for 0.2.35 relative to 0.2.34. As always, more detailed information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

0.2.35  2012-02-17

    o   Upgraded to Armadillo release 2.4.3
        * Support for ARMA_DEFAULT_OSTREAM using Rcpp::Rcout added
    o   Minor bug fix release improving corner cases affecting builds:
        * Missing semicolon added in Mat_meat (when in C++0x mode), with thanks to Teo Guo Ci
        * Armadillo version vars now instantiated in RcppArmadillo.cpp which helps older g++ versions, with thanks to Gershon Bialer
        * Thanks also to Martin Renner for testing these changes
        * Unit tests output fallback directory changed per Brian Ripley's request to not ever use /tmp
        * Minor update to version numbers in RcppArmadillo-package.Rd
R CMD check happy, and one or two new features.
The complete NEWS entry for 0.9.10 is below; more details are in the ChangeLog file in the package and on the Rcpp Changelog page.
Thanks to CRANberries, you can also look at a diff to the previous release 0.9.9. As always, even fuller details are on the Rcpp Changelog page and the Rcpp page which also leads to the downloads, the browseable doxygen docs and zip files of doxygen output for the standard formats. A local directory has source and documentation too. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

0.9.10  2012-02-16

    o   Rearrange headers so that Rcpp::Rcout can be used by RcppArmadillo et al
    o   New Rcpp sugar function mapply (for now limited to two or three input vectors)
    o   Added custom version of the Rcpp sugar diff function for numeric vectors, skipping unnecessary checks for NA
    o   Some internal code changes to reflect changes and stricter requirements in R CMD check in the current R-devel versions
    o   Corrected fixed-value initialization for IntegerVector (with thanks to Gregor Kastner for spotting this)
    o   New Rcpp-FAQ entry on a simple way to set a compiler option for cxxfunction
The program was titled Passion World and featured songs from around the world. Quite impressively, Elling actually sang in French, Spanish, Portuguese, German and of course English. The program lacked a little bit of cohesion at times, but it was still a most enjoyable and solid two-hour set. A similar concert will take place in Minneapolis on Saturday, so for those near the Twin Cities I would recommend looking for tickets.
This release has an additional fix for the Windows use case, adds cmake support files for the examples, switches to using the same RNG initialization of time and process id that R uses, and other minor fixes.
All changes since the last release are summarized below:
CRANberries also provides a short report with changes from the previous release. More information is on the RInside page. Questions, comments etc should go to the rcpp-devel mailing list off the Rcpp R-Forge page.

2012-01-11  Dirk Eddelbuettel

    * DESCRIPTION: Release 0.2.6
    * DESCRIPTION: Updated Description: text

2012-01-08  Dirk Eddelbuettel

    * src/RInside.cpp: Correct console writer on Windows to not use Rprintf (with thanks to both James Bates and John Brzustowski)
    * src/RInside.cpp: Update RNG seeding to same scheme now used by R which blends both (millisecond) time and process id
    * src/RInside.cpp: Replace fprintf(stderr,...) with Rf_error(...)
    * src/setenv/setenv.c: Idem
    * inst/examples/*: Added CMake build support for all example directories kindly provided by Peter Aberline; this helps when coding inside of IDEs such as Eclipse, KDevelop or Code::Blocks
    * inst/examples/standard/Makefile.win: Allow for an R_ARCH variable to be set to enforce either i386 or x64 builds