One week ago, I sent the updated announcement below to the r-sig-finance list; this was kindly blogged about by fellow committee member Josh and by our pal Dave @ REvo. By now, I have also updated the R / Finance conference website. So to round things off, a quick post here is in order as well. It may even get a few of the esteemed readers to make a New Year's resolution about submitting a paper :)
Dear R / Finance community,
The preparations for R/Finance 2011 are progressing, and due to favourable responses from the different sponsors we contacted, we are now able to offer
More details are below in the updated Call for Papers. Please feel free to re-circulate this Call for Papers with colleagues, students and other associations.
Cheers, and Season's Greetings,
Dirk (on behalf of the organizing / program committee)
The third annual R/Finance conference for applied finance using R will be held this spring in Chicago, IL, USA on April 29 and 30, 2011. The two-day conference will cover topics including portfolio management, time series analysis, advanced risk tools, high-performance computing, market microstructure and econometrics. All will be discussed within the context of using R as a primary tool for financial risk management, portfolio construction, and trading.
Complete papers or one-page abstracts (in txt or pdf format) are invited to be submitted for consideration. Academic and practitioner proposals related to R are encouraged. We welcome submissions for full talks, abbreviated lightning talks, and for a limited number of pre-conference (longer) seminar sessions.
Presenters are strongly encouraged to provide working R code to accompany the presentation/paper. Data sets should also be made public for the purposes of reproducibility (though we realize this may be limited due to contracts with data vendors). Preference may be given to presenters who have released R packages.
The conference will award two $1000 prizes for best paper: one for best practitioner-oriented paper and one for best academic-oriented paper. Further, to defray costs for graduate students, two travel and expense grants of up to $500 each will be awarded to graduate students whose papers are accepted. To be eligible, a submission must be a full paper; extended abstracts are not eligible.
Please send submissions to: committee at RinFinance.com
The submission deadline is February 15th, 2011. Early submissions may receive early acceptance and scheduling. The graduate student grant winners will be notified by February 23rd, 2011.
Submissions will be evaluated and submitters notified via email on a rolling basis. Determination of whether a presentation will be a long presentation or a lightning talk will be made once the full list of presenters is known.
R/Finance 2009 and 2010 included attendees from around the world and featured keynote presentations from prominent academics and practitioners. The names and presentations of the 2009-2010 presenters are online at the conference website. We anticipate another exciting line-up for 2011---including keynote presentations from John Bollinger, Mebane Faber, Stefano Iacus, and Louis Kates. Additional details will be announced via the conference website as they become available.
For the program committee:
Gib Bassett, Peter Carl, Dirk Eddelbuettel, Brian Peterson,
Dale Rosenthal, Jeffrey Ryan, Joshua Ulrich
The text below went out as a post to the r-packages list a few days ago, but I thought it would make sense to post it on the blog too. So with a little html markup...
Rcpp::IntegerVector,
Rcpp::NumericVector, Rcpp::Function, Rcpp::Environment, ...) that makes it
easier to manipulate R objects of matching types (integer vectors, functions,
environments, etc ...).
Rcpp takes advantage of C++ language features such as the explicit
constructor / destructor lifecycle of objects to manage garbage collection
automatically and transparently. We believe this is a major improvement over
use of PROTECT / UNPROTECT. When an Rcpp object is created, it protects the
underlying SEXP so that the garbage collector does not attempt to reclaim the
memory. This protection is withdrawn when the object goes out of
scope. Moreover, users generally do not need to manage memory directly (via
calls to new / delete or malloc / free) as this is done by the Rcpp classes
or the corresponding STL containers.
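The scope-based protection described above can be illustrated without R at all. What follows is only a minimal sketch in plain C++: the `ProtectionRegistry` and `Guarded` names are hypothetical stand-ins made up for this illustration and are not part of Rcpp or the R API, whose actual implementation differs.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Hypothetical stand-in for the collector's record of protected objects.
struct ProtectionRegistry {
    std::set<const void*> live;
    void protect(const void* p)   { live.insert(p); }
    void unprotect(const void* p) { live.erase(p); }
    std::size_t count() const     { return live.size(); }
};

// Hypothetical wrapper: protects in the constructor, releases in the destructor.
class Guarded {
public:
    Guarded(ProtectionRegistry& reg, const void* obj) : reg_(reg), obj_(obj) {
        assert(obj != nullptr);
        reg_.protect(obj_);
    }
    ~Guarded() { reg_.unprotect(obj_); }
    Guarded(const Guarded&) = delete;             // one guard per protection
    Guarded& operator=(const Guarded&) = delete;
private:
    ProtectionRegistry& reg_;
    const void* obj_;
};

// Returns how many objects are protected while the guard is alive.
std::size_t countInsideScope(ProtectionRegistry& reg, const void* obj) {
    Guarded g(reg, obj);   // protection starts here
    return reg.count();    // evaluated before g is destroyed
}                          // protection ends here, with no explicit call
```

The point of the design is that there is no code path, including an early return or an exception, on which the release can be forgotten.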
A few key points about Rcpp:
    SEXP foo( SEXP xx, SEXP yy){
        NumericVector x(xx), y(yy) ;
        return ifelse( x < y, x*x, -(y*y) ) ;
    }
which deploys the sugar 'ifelse' function modeled after the corresponding R function. Another simple example is
    double square( double x){
        return x*x ;
    }
    SEXP foo( SEXP xx ){
        NumericVector x(xx) ;
        return sapply( x, square ) ;
    }
where we use the sugar function 'sapply' to sweep a simple C++ function which operates elementwise across the supplied vector. The Rcpp-sugar vignette describes sugar in more detail.
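To make the semantics of these two sugar functions concrete, here is a rough plain C++ equivalent using only the standard library. The helper names `ifelseSquares` and `applyEach` are made up for this illustration and are not part of Rcpp. Unlike sugar, this sketch ignores NA values, evaluates eagerly, and assumes both inputs have the same length.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Elementwise analogue of ifelse( x < y, x*x, -(y*y) ).
std::vector<double> ifelseSquares(const std::vector<double>& x,
                                  const std::vector<double>& y) {
    assert(x.size() == y.size());
    std::vector<double> out(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        out[i] = (x[i] < y[i]) ? x[i] * x[i] : -(y[i] * y[i]);
    }
    return out;
}

double square(double x) { return x * x; }

// Analogue of sapply( x, fun ): apply fun to every element of x.
std::vector<double> applyEach(const std::vector<double>& x, double (*fun)(double)) {
    std::vector<double> out(x.size());
    std::transform(x.begin(), x.end(), out.begin(), fun);
    return out;
}
```

Seen this way, the appeal of sugar is mostly notational: the loop and the output allocation disappear from the user's code.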
    std::string hello( const std::string& who ){
        std::string result( "hello " ) ;
        result += who ;
        return result ;    // returned by value so the text outlives this call
    }
    RCPP_MODULE(yada){
        using namespace Rcpp ;
        function( "hello", &hello ) ;
    }
which (after compiling and loading) we can access in R as
    yada <- Module( "yada" )
    yada$hello( "world" )
In a similar way, C++ classes can be exposed very easily. Rcpp modules are also described in more detail in their own vignette.
The RcppGSL package permits easy use of the GNU Scientific Library (GSL), a collection of numerical routines for scientific computing. It is particularly useful for C and C++ programs as it provides a standard C interface to a wide range of mathematical routines such as special functions, permutations, combinations, fast Fourier transforms, eigensystems, random numbers, quadrature, random distributions, quasi-random sequences, Monte Carlo integration, N-tuples, differential equations, simulated annealing, numerical differentiation, interpolation, series acceleration, Chebyshev approximations, root-finding, discrete Hankel transforms, physical constants, basis splines and wavelets. There are over 1000 functions in total with an extensive test suite. The RcppGSL package provides an easy-to-use interface between GSL data structures and R using concepts from Rcpp. The RcppGSL package also contains a vignette with more documentation.
Dirk Eddelbuettel, Romain Francois, Doug Bates and John Chambers
December 2010
RcppExamples contains a few illustrations of how to use Rcpp. It grew out of documentation for the classic API (now in its own package RcppClassic) and we added more functions documenting how to do the same with the new API we have been focusing on for the last year or so. One of the things I added in the last few days was the example below showing how to use Rcpp::List with lookups to replace use of the old and deprecated RcppParams. It also shows how to return values to R rather easily:
#include <Rcpp.h>

RcppExport SEXP newRcppParamsExample(SEXP params) {

    try {                                   // or use BEGIN_RCPP macro

        Rcpp::List rparam(params);          // Get parameters in params.
        std::string method = Rcpp::as<std::string>(rparam["method"]);
        double tolerance   = Rcpp::as<double>(rparam["tolerance"]);
        int    maxIter     = Rcpp::as<int>(rparam["maxIter"]);
        Rcpp::Date startDate = Rcpp::Date(Rcpp::as<int>(rparam["startDate"])); // ctor from int

        Rprintf("\nIn C++, seeing the following value\n");
        Rprintf("Method argument    : %s\n", method.c_str());
        Rprintf("Tolerance argument : %f\n", tolerance);
        Rprintf("MaxIter argument   : %d\n", maxIter);
        Rprintf("Start date argument: %04d-%02d-%02d\n",
                startDate.getYear(), startDate.getMonth(), startDate.getDay());

        return Rcpp::List::create(Rcpp::Named("method", method),
                                  Rcpp::Named("tolerance", tolerance),
                                  Rcpp::Named("maxIter", maxIter),
                                  Rcpp::Named("startDate", startDate),
                                  Rcpp::Named("params", params));  // or use rparam

    } catch( std::exception &ex ) {         // or use END_RCPP macro
        forward_exception_to_r( ex );
    } catch(...) {
        ::Rf_error( "c++ exception (unknown reason)" );
    }
    return R_NilValue;                      // -Wall
}
The package is work-in-progress and needs way more general usage examples for Rcpp and particularly the new API. But it's a start.
A few more details are on the RcppExamples page.
With this release, the older API which we have been referring to as the classic Rcpp API has been split off into its own new package RcppClassic to ensure backwards compatibility. Rcpp will now contain only the new API.
We also fixed a number of minor bugs and applied a few contributed patches which extended functionality or documentation, as detailed below in the NEWS entry:
0.9.0 2010-12-19
o The classic API was factored out into its own package RcppClassic which
is released concurrently with this version.
o If an object is created but not initialized, attempting to use
it now gives a more sensible error message (by forwarding an
Rcpp::not_initialized exception to R).
o SubMatrix fixed, and Matrix types now have a nested ::Sub typedef.
o New unexported function SHLIB() to aid in creating a shared library on
the command-line or in Makefile (similar to CxxFlags() / LdFlags()).
o Module gets a seven-argument ctor thanks to a patch from Tama Ma.
o The (still incomplete) QuickRef vignette has grown thanks to a patch
by Christian Gunning.
o Added a sprintf template intended for logging and error messages.
o Date::getYear() corrected (where addition of 1900 was not called for);
corresponding change in constructor from three ints made as well.
o Date() and Datetime() constructors from string received a missing
conversion to int and double following strptime. The default format
string for the Datetime() strptime call was also corrected.
o A few minor fixes throughout, see ChangeLog.
Thanks to CRANberries, there is also a diff to the previous release 0.8.9.
As always, even fuller details are on the Rcpp Changelog page and the Rcpp page which also leads to the downloads, the browseable doxygen docs and zip files of doxygen output for the standard formats. A local directory has source and documentation too. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page
The user group meetings have a meme of showing how to use R with different editors, UIs, IDEs,... It started with a presentation on Eclipse and its StatET plugin. So a while ago I had offered to present on ESS, the wonderful Emacs mode for R (as well as SAS, Stata, BUGS, JAGS, ...). And now I owe a big thanks to the ESS Core team for keeping all their documentation, talks, papers etc in their SVN archive, and particularly to Stephen Eglen for putting the source code to Tony Rossini's tutorial from useR! 2006 in Vienna there. This allowed me to quickly whip up a few slides, though a good part of the presentation did involve a live demo missing from the slides. Again, big thanks to Tony for the old slides and to Stephen for making them accessible when I mentioned the idea of this talk a while back -- it allowed me to put this together on short notice.
And for those going to useR! 2011 in Warwick next summer, Stephen will present a full three-hour ESS tutorial which will cover ESS in much more detail.
I worked on this for a few evenings and weekends in October and November and then spent a few more evenings writing a paper / vignette (which is finished as a very first draft now) about it. This was an interesting and captivating problem as I had worked on genetic algorithms going back quite some time to the beginning and then again the end of graduate school (and traces of that early work are near the bottom of my presentations page). So what got me started? DEoptim is a really nice package, but it is implemented in old-school C. There is nothing wrong with that per se, but at the same time that I was wrestling with GAs, I also taught myself C++ which, to put it simply, offers a few more choices to the programmer. I like having those choices.
And with all the work that Romain and I have put into Rcpp, I was curious how far I could push this cart if I were to move it along. I made a bet with myself starting from the old saw shorter, easier, faster: pick any two. Would it be possible to achieve all three of these goals?
DEoptim, and I take version 2.0-7 as my reference point here, is pretty efficiently yet verbosely coded. Copying a vector takes a loop with an assignment for each element; copying a matrix does the same using two loops. Replacing that with a single statement in C++ is pretty easy. We also have a few little optimisations behind the scenes here and there in Rcpp: would all that be enough to move the needle in terms of performance? At the same time, DEoptim is also full of uses of the old R API which we often point to in the Rcpp documentation, so fixing readability should be relatively low-hanging fruit.
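As an illustration of the verbosity point, the sketch below contrasts element-by-element copy loops with a single-statement copy. It uses `std::vector` from the standard library rather than Rcpp types so that it stands alone, and the function names are hypothetical; the actual DEoptim and Rcpp code looks different.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// C-style vector copy: one assignment per element.
void copyWithLoop(const double* src, double* dst, std::size_t n) {
    assert(src != nullptr && dst != nullptr);
    for (std::size_t i = 0; i < n; ++i) {
        dst[i] = src[i];
    }
}

// C-style matrix copy: two nested loops over an n x m row-major block.
void copyMatrixWithLoops(const double* src, double* dst,
                         std::size_t n, std::size_t m) {
    assert(src != nullptr && dst != nullptr);
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < m; ++j) {
            dst[i * m + j] = src[i * m + j];
        }
    }
}

// C++ copy: the container's copy constructor does all of the above.
std::vector<double> copyInOneStatement(const std::vector<double>& src) {
    return src;
}
```

All three produce the same result; the difference is only in how much bookkeeping the caller has to read and maintain.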
To cut a long story short, I was able to reduce code size quite easily by using a combination of C++ and Rcpp idioms. I was also able to get to faster: the paper / vignette demonstrates consistent speed improvements on all setups that I tested (three standard functions on three small and three larger parameter vectors). More important speed gains were achieved by allowing use of objective functions that are written in C++ which again is both possible and easy thanks to Rcpp.
That leaves easier to prove: adding compiled objective functions is one indication; further proof could be provided by, say, moving the inner loop to parallel execution thanks to OpenMP which I may attempt over the next few months. So far I'd like to give myself about half a point here. So not quite yet shorter, easier, faster: pick any three, but working on it.
Over the next few days I may try to follow up with a blog post or two contrasting some code examples and maybe showing a chart from the vignette.
Regina Carter was presenting material from her current record 'Reverse Thread'. This was a real nice set of African-themed world music featuring Carter herself on violin, Yacouba Sissoko on kora, Will Holshouser on accordion, Chris Lightcap on bass and Alvester Garnett on drums. Some of the pieces were really, really nicely done and I particularly enjoyed Holshouser on the accordion.
After the break, Esperanza Spalding came on for her 'Chamber Music Society'. Lovely setup with Spalding on acoustic bass and vocals, Leo Genovese on piano/keyboards, Sara Caswell on violin, Lois Martin on viola, Jody Redhage on cello, the always impressive Terri Lyne Carrington on drums and Leala Cyr on backing vocals (and one co-lead in a really nice duet with Spalding). This was clearly more experimental and a chunk of the audience left during the act. But there is room for improvised chamber music, and it was a good modern music act. And Spalding is really quite impressive and I will gladly go and see her again.
This version adds an internal performance enhancement which is obtained by making do with fewer reads. The short NEWS file entry follows:
0.3.8 2010-12-07
o faster cfunction and cxxfunction by loading and resolving the routine
at "compile" time
We have now found some time to finish this work for a first release, together with a nicely detailed eleven page package vignette. As of today, the package is now a CRAN package, and Romain already posted a nice announcement on his blog and on the rcpp-devel list.
So what does RcppGSL do? I gave the package its own webpage here as well and listed these points as key features of RcppGSL:
Also provided is a simple example: an implementation of a column norm (which we could easily compute directly in R, but we are simply re-using an example from Section 8.4.14 of the GSL manual):
#include <RcppGSL.h>
#include <gsl/gsl_matrix.h>
#include <gsl/gsl_blas.h>

extern "C" SEXP colNorm(SEXP sM) {

    try {

        RcppGSL::matrix<double> M = sM;     // create gsl data structures from SEXP
        int k = M.ncol();
        Rcpp::NumericVector n(k);           // to store results

        for (int j = 0; j < k; j++) {
            RcppGSL::vector_view<double> colview = gsl_matrix_column (M, j);
            n[j] = gsl_blas_dnrm2(colview);
        }
        M.free() ;
        return n;                           // return vector

    } catch( std::exception &ex ) {
        forward_exception_to_r( ex );

    } catch(...) {
        ::Rf_error( "c++ exception (unknown reason)" );
    }
    return R_NilValue;                      // -Wall
}
This example function is implemented in an example package contained in the RcppGSL package itself -- so that users have a complete stanza to use in their packages. This will then build a user package on Linux, OS X and Windows provided the GSL is installed (and on Windows you have to do all the extra steps of defining an environment variable pointing to and of course install Rtools to build in the first place---Linux and OS X are so much easier for development).
Another complete example is in the package itself and provides a faster (compiled) alternative to the standard lm() function in R; this example is the continuation of the same example I had in several versions of my Intro to HPC with R tutorials and in the Rcpp package itself as an early example.
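For readers without the GSL at hand, the quantity computed by the column norm example can be sketched in plain C++ as follows. This is an illustration only: the function name `colNormPlain` and the flat column-major `std::vector` layout are assumptions made here, and the naive sum of squares lacks the protection against overflow and underflow that a BLAS `dnrm2` routine typically provides.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Euclidean norm of each column of an nrow x ncol matrix stored in
// column-major order (the layout R uses for its matrices).
std::vector<double> colNormPlain(const std::vector<double>& m,
                                 std::size_t nrow, std::size_t ncol) {
    assert(m.size() == nrow * ncol);
    std::vector<double> norms(ncol, 0.0);
    for (std::size_t j = 0; j < ncol; ++j) {
        double sumsq = 0.0;
        for (std::size_t i = 0; i < nrow; ++i) {
            const double v = m[j * nrow + i];   // element (i, j)
            sumsq += v * v;
        }
        norms[j] = std::sqrt(sumsq);
    }
    return norms;
}
```

For a 2 x 2 matrix with columns (3, 4) and (0, 5), both norms come out as 5.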
We will try to touch base with CRAN package authors using both GSL and Rcpp to see how this can help them. The API in our package may well be incomplete, but we are always happy to try to respond to requests for additional features brought to our attention, preferably via the rcpp-devel list.
More information is on the RcppGSL page. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.
Most of the changes were made two and four weeks ago: first in response to some warnings triggered by R 2.12.0 on the included manual pages which needed a brush-up, and then again in some consolidation of manual pages and some other minor tweaks. The release was then held back at CRAN as we noticed that the manual pages, when collated to a single large document, triggered a segmentation fault in the latex compiler. Oddly enough only in Europe (if the a4paper option was used) and not here (where I use uspaper). Long story short, this turns out to be a bug in the latex toolchain (which we reported as Debian bug report 604754) which is apparently known but has no known fix yet (a sample file was supplied with the bug report if you want to take a look).
With that, special thanks go to Kurt Hornik and Brian Ripley on the R Core team who made a change to how R processes the manual which made it resilient to the latex bug so that normal release of the package could proceed (and the shiny manual is available too).
Thanks to CRANberries, there is also a diff to the previous release 0.3.4. Full changelog details, examples and more details about this package are at my RQuantLib page.
This release comes a few weeks after the preceding 0.8.8 release and continues with a number of enhancements mostly to what we call Rcpp modules, our even-easier C++/R integration which follows some ideas from Boost.Python. Our corresponding Rcpp-modules vignette has been updated too.
The NEWS entry follows below:
0.8.9 2010-11-27
o Many improvements were made to in 'Rcpp modules':
- exposing multiple constructors
- overloaded methods
- self-documentation of classes, methods, constructors, fields and
functions.
- new R function "populate" to facilitate working with modules in
packages.
- formal argument specification of functions.
- updated support for Rcpp.package.skeleton.
- constructors can now take many more arguments.
o The 'Rcpp-modules' vignette was updated as well and describe many
of the new features
o New template class Rcpp::SubMatrix and support syntax in Matrix
to extract a submatrix:
NumericMatrix x = ... ;
// extract the first three columns
SubMatrix y = x( _ , Range(0,2) ) ;
// extract the first three rows
SubMatrix y = x( Range(0,2), _ ) ;
// extract the top 3x3 sub matrix
SubMatrix y = x( Range(0,2), Range(0,2) ) ;
o Reference Classes no longer require a default constructor for
subclasses of C++ classes
o Consistently revert to using backticks rather than shell expansion
to compute library file location when building packages against Rcpp
on the default platforms; this has been applied to internal test
packages as well as CRAN/BioC packages using Rcpp
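The SubMatrix item in the NEWS entry above relies on ranges that include both end points, so Range(0,2) selects three rows or columns. A minimal plain C++ sketch of that indexing convention follows. The `IndexRange`, `PlainMatrix` and `subMatrix` names are hypothetical and not part of Rcpp, and this sketch copies the selected block, which may well differ from how Rcpp::SubMatrix is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical inclusive index range: IndexRange{0, 2} covers 0, 1 and 2.
struct IndexRange {
    std::size_t first;
    std::size_t last;
    std::size_t size() const { return last - first + 1; }
};

// Hypothetical dense matrix in column-major order (the layout R uses).
struct PlainMatrix {
    std::size_t nrow;
    std::size_t ncol;
    std::vector<double> data;

    PlainMatrix(std::size_t r, std::size_t c) : nrow(r), ncol(c), data(r * c, 0.0) {}
    double& at(std::size_t i, std::size_t j)       { return data[j * nrow + i]; }
    double  at(std::size_t i, std::size_t j) const { return data[j * nrow + i]; }
};

// Copy the block selected by two inclusive ranges into a new matrix.
PlainMatrix subMatrix(const PlainMatrix& x, IndexRange rows, IndexRange cols) {
    assert(rows.first <= rows.last && rows.last < x.nrow);
    assert(cols.first <= cols.last && cols.last < x.ncol);
    PlainMatrix out(rows.size(), cols.size());
    for (std::size_t j = 0; j < cols.size(); ++j) {
        for (std::size_t i = 0; i < rows.size(); ++i) {
            out.at(i, j) = x.at(rows.first + i, cols.first + j);
        }
    }
    return out;
}
```

With a 4 x 4 input, `subMatrix(x, IndexRange{0, 2}, IndexRange{0, 2})` yields the top-left 3 x 3 block, matching the third case listed in the NEWS entry.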
Thanks to CRANberries, there is also a diff to the previous release 0.8.8:
As always, even fuller details are on the Rcpp Changelog page and the Rcpp page which also leads to the downloads, the browseable doxygen docs and zip files of doxygen output for the standard formats. A local directory has source and documentation too. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page
summary() function for the fastLm function.
The short NEWS file extract follows, also containing Conrad's entry for 1.0.0:
0.2.10 2010-11-25
o Upgraded to Armadillo 1.0.0 "Antipodean Antileech"
* After 2 1/2 years of collaborative development, we are proud to
release the 1.0 milestone version.
* Many thanks are extended to all contributors and bug reporters.
o R/RcppArmadillo.package.skeleton.R: Updated to no longer rely on GNU
make for builds of packages using RcppArmadillo
o summary() for fastLm() objects now returns r.squared and adj.r.squared
And courtesy of CRANberries, here is the diff to the previous release 0.2.9:
 ChangeLog                                               |   17 ++++++++
 DESCRIPTION                                             |   25 +++++------
 R/RcppArmadillo.package.skeleton.R                      |    4 -
 R/fastLm.R                                              |   21 +++++++++
 inst/NEWS                                               |   13 ++++++
 inst/doc/RcppArmadillo-unitTests.pdf                    |binary
 inst/doc/unitTests-results/RcppArmadillo-unitTests.html |    6 +-
 inst/doc/unitTests-results/RcppArmadillo-unitTests.txt  |   34 ++++++++--------
 inst/include/armadillo_bits/arma_version.hpp            |   15 +++++--
 inst/skeleton/Makevars                                  |    2
 src/Makevars                                            |    2
 src/Makevars.win                                        |    2
 12 files changed, 97 insertions(+), 44 deletions(-)
More information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.
The short NEWS file extract follows, also containing Conrad's entry for 0.9.92:
0.2.9 2010-11-11
o Upgraded to Armadillo 0.9.92 "Wall Street Gangster":
* Fixes for compilation issues under the Intel C++ compiler
* Added matrix norms
More information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.
It fixes a minor bug: when package.skeleton() was called to
convert one or more functions created with this package into a package, the
corner case of just a single submitted function failed. This is now corrected.
Otherwise this release is unchanged from the previous release 0.3.6 from
August.
This release follows on the heels of 0.8.7, but contains fixes for a few small things Romain and I had noticed over the last two weeks since releasing 0.8.7 and contains only a small number of new tweaks. The NEWS entry follows below:
0.8.8 2010-11-01
o New syntactic shortcut to extract rows and columns of a Matrix.
x(i,_) extracts the i-th row and x(_,i) extracts the i-th column.
o Matrix indexing is more efficient. However, faster indexing is
disabled if g++ 4.5.0 or later is used.
o A few new Rcpp operators such as cumsum, operator=(sugar)
o Variety of bug fixes:
- column indexing was incorrect in some cases
- compilation using clang/llvm (thanks to Karl Millar for the patch)
- instantation order of Module corrected
- POSIXct, POSIXt now correctly ordered for R 2.12.0
As always, even fuller details are on the Rcpp Changelog page and the Rcpp page which also leads to the downloads, the browseable doxygen docs and zip files of doxygen output for the standard formats. A local directory has source and documentation too. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page
A video recording of our ninety-minute talk is already available via the YouTube channel for Google Tech Talks. The (large) pdf with slides (which Romain had already posted on slideshare) is also available from my presentations page.
The remainder of the weekend was nice too (with the notable exception of the extremely sucky weather). We got to spend some time at the Google Summer of Code Mentor Summit which is always a fun event and a great way to meet other open source folks in person. And we also took one afternoon off to spend some time with John Chambers discussing further work involving Rcpp and the new ReferenceClasses that appeared in the just-released R version 2.12.0. This should be a nice avenue to further integrate R and C++ in the near future.
This release extends the recent 0.2.0 release of RProtoBuf with a patch for raw bytes serialization which Koert Kuipers kindly contributed. This helps RProtoBuf with RPC communication, where raw bytes are often a preferred form.
As always, there is more information at the RProtoBuf page which has a draft package vignette, a 'quick' overview vignette and a unit test summary vignette. Questions, comments etc should go to the rprotobuf mailing list off the RProtoBuf page at R-Forge.
This version fixes a number of issues that had been compiled in the issue tracker on the project site at Google Code. Tomoaki Nishiyama, who joined our small development group for his package a few weeks ago, was instrumental in a number of these fixes, with assistance from Joe Conway.
The relevant NEWS file entry follows below:
Version 0.1-7 -- 2010-10-17
o Several potential buffer overruns were fixed
o dbWriteTable now writes a data.frame to database through a network
connection rather than a temporary file. Note that row_names may be
changed in future releases. Also, passing in filenames instead of
data.frame is not supported at this time.
o When no host is specified, a connection to the PostgreSQL server
is made via UNIX domain socket (just like psql does)
o Table and column names are case sensitive, and identifiers are escaped
or quoted appropriately, so that any form of table/column names can be
created, searched, or removed, including upper-, lower- and mixed-case.
o nullOk in dbColumnInfo has a return value of NA when the column does
not correspond to a column in the table. The utility of nullOk is
doubtful but not removed at this time.
o Correct Windows getpid() declaration (with thanks to Brian D. Ripley)
o A call of as.POSIXct() with a time format string wrongly passed to TZ
has been corrected; this should help with intra-day timestamps (with
thanks to Steve Eick)
o Usage of tmpdir has been improved on similarly to Linux (with thanks
to Robert McGehee)
More information is on my RPostgreSQL page, and on the project site at Google Code.
The short NEWS file extract follows, also containing Conrad's entry for 0.9.90:
o Upgraded to Armadillo 0.9.90 "Water Dragon":
* Added unsafe_col()
* Speedups and bugfixes in lu()
* Minimisation of pedantic compiler warnings
o Switched NEWS and ChangeLog between inst/ and the top-level directory
so that NEWS (this file) gets installed with the package
More information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.
This Rcpp release depends on R 2.12.0 as two things have changed. First, we play along with a change in R concerning the ordering of inheritance for time classes. But secondly, and more importantly, we support in Rcpp the corresponding change in R itself which brings the new ReferenceClasses. Here is the corresponding bit from R's NEWS file for R 2.12.0:
o A facility for defining reference-based S4 classes (in the OOP
style of Java, C++, etc.) has been added experimentally to
package methods; see ?ReferenceClasses.
[...]
o An experimental new programming model has been added to package
methods for reference (OOP-style) classes and methods. See
?ReferenceClasses.
This was made possible in large part by code committed by John Chambers (whom we had welcomed recently as a co-author to Rcpp) building on the changes he made to R 2.12.0 itself, as well as on the work Romain had done with 'Rcpp Modules'. The R help page for ReferenceClasses carries a reference (bad pun) to Rcpp 0.8.7 so these two releases do go together. This should be a lot of fun over the next little while: S3, S4, and now ReferenceClasses.
We also made a number of internal changes, some of which lead to speed-ups and internal improvements. The NEWS entry follows below:
0.8.7 2010-10-15
o As of this version, Rcpp depends on R 2.12 or greater as it interfaces
the new reference classes (see below) and also reflects the POSIXt class
reordering both of which appeared with R version 2.12.0
o new Rcpp::Reference class, that allows internal manipulation of R 2.12.0
reference classes. The class exposes a constructor that takes the name
of the target reference class and a field(string) method that implements
the proxy pattern to get/set reference fields using callbacks to the
R operators "$" and "$<-" in order to preserve the R-level encapsulation
o the R side of the preceding item allows methods to be written
in R as per ?ReferenceClasses, accessing fields by name and
assigning them using "<<-". Classes extracted from modules
are R reference classes. They can be subclassed in R, and/or R methods
can be defined using the $methods(...) mechanism.
o internal performance improvements for Rcpp sugar as well as an added
'noNA()' wrapper to omit tests for NA values -- see the included
examples in inst/examples/convolveBenchmarks for the speedups
o more internal performance gains with Functions and Environments
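The NEWS entry above mentions that the field(string) method implements the proxy pattern to get and set fields. As a rough, R-free illustration of that pattern, here is a minimal C++ sketch. The `FieldStore` and `FieldProxy` names are hypothetical, and a `std::map` stands in for the callbacks to the R operators "$" and "$<-"; this is not how Rcpp::Reference is implemented.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical object whose fields are reached only through get/set hooks.
class FieldStore {
public:
    // The proxy remembers which field it refers to and forwards reads
    // and writes to the owner's private hooks.
    class FieldProxy {
    public:
        FieldProxy(FieldStore& parent, const std::string& name)
            : parent_(parent), name_(name) {}
        FieldProxy& operator=(double value) {      // write: obj.field("x") = 1.0
            parent_.set(name_, value);
            return *this;
        }
        operator double() const {                  // read: double v = obj.field("x")
            return parent_.get(name_);
        }
    private:
        FieldStore& parent_;
        std::string name_;
    };

    FieldProxy field(const std::string& name) { return FieldProxy(*this, name); }
    int getCalls() const { return getCalls_; }
    int setCalls() const { return setCalls_; }

private:
    double get(const std::string& name) {
        ++getCalls_;
        std::map<std::string, double>::const_iterator it = values_.find(name);
        assert(it != values_.end());               // field must have been set
        return it->second;
    }
    void set(const std::string& name, double value) {
        ++setCalls_;
        values_[name] = value;
    }
    std::map<std::string, double> values_;
    int getCalls_ = 0;
    int setCalls_ = 0;
};
```

Because every access goes through the two hooks, the owner keeps full control over how fields are read and written, which is the encapsulation property the NEWS entry refers to.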
As always, even fuller details are on the Rcpp Changelog page and the Rcpp page which also leads to the downloads, the browseable doxygen docs and zip files of doxygen output for the standard formats. A local directory has source and documentation too. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.
This was the sixth time I ran this race (and my 14th marathon overall). And I still can't run this course all that well: never got a Boston qualification here. As I had mentioned when I blogged about my third Boston Marathon earlier in the year and the recent Chicago Half-Marathon, I have had a recurrent issue with a sore Achilles which limited my running throughout the year. It had gotten better but a quick summary of the miles in my running log showed that I had been running only about 80% of the training miles I had in prior years. And not a single 20-miler. I knew I'd have to pay for that.
Plus, as so often, the weather. Not quite as hot as the record heat of 2007. But close enough: high 60s at the start and high 70s or even low 80s towards the end. But I have to compliment the race organisers. The race was very well organised (following the experience of 2007) with extra water stops, extra sponges handed out at several spots (!!) and very good communication when, during the race, the alert level was raised to yellow given the heat and humidity. The searchable results now show a fair number of non-finishers, but at least nobody seems to have died. But it looked ugly on the course. I think I ran by three or four sets of paramedics assisting runners who were 'down and out'.
So how did I do? Fair, I suppose -- I ran pretty well for sixteen miles, then needed a first short walking break and continued to run well towards and past the 18 mile waterstop where a bunch of friends and fellow Oak Park runners were helping. But not long after that, I crumbled and needed to alternate walking and running for most of the remainder. With that I came in at 3:41:41, or an 8:28 min/mile pace, which is two seconds slower than the previous 'worst' from 2007. But heck, at least it's still more than three minutes faster than Dubya in Houston in 1993 ... I also got beat by a few local running friends as well as by Chicago's own marathon juggler. So there. Maybe I'll train a bit more next time.
There is now a new version 0.2.4 of gcbd on CRAN. I revised the paper ever so slightly based on some more feedback, and focussed the results sections by concentrating on just the log-axes lattice plot and the corresponding lattice plot of raw results---where the y-axis is capped at 30 seconds:
This chart, shown here in levels rather than on logarithmic axes, nicely illustrates just how large the performance difference can be for matrix multiplication and LU decomposition. QR and SVD are closer but accelerated BLAS libraries still win. GPUs can be compelling for some tasks and large sizes.
More discussion is still available in the paper which is also included in the gcbd package for R.
The short NEWS file extract follows, also containing Conrad's entry for 0.9.80:
0.2.7 2010-09-25
o Upgraded to Armadillo 0.9.80 "Chihuahua Muncher":
* Added join_slices(), insert_slices(), shed_slices()
* Added in-place operations on diagonals
* Various speedups due to internal architecture improvements
More information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.
Call for Papers:
R/Finance 2011: Applied Finance with R
April 29 and 30, 2011
Chicago, IL, USA
The third annual R/Finance conference for applied finance using R will be held this spring in Chicago, IL, USA on April 29 and 30, 2011. The two-day conference will cover topics including portfolio management, time series analysis, advanced risk tools, high-performance computing, market microstructure and econometrics. All will be discussed within the context of using R as a primary tool for financial risk management, portfolio construction, and trading.
One-page abstracts or complete papers (in txt or pdf format) are invited to be submitted for consideration. Academic and practitioner proposals related to R are encouraged. We welcome submissions for full talks, abbreviated "lightning talks", and for a limited number of pre-conference (longer) seminar sessions.
Presenters are strongly encouraged to provide working R code to accompany the presentation/paper. Data sets should also be made public for the purposes of reproducibility (though we realize this may be limited due to contracts with data vendors). Preference may be given to presenters who have released R packages.
Please send submissions to: committee at RinFinance.com.
The submission deadline is February 15th, 2011. Early submissions may receive early acceptance and scheduling.
Submissions will be evaluated and submitters notified via email on a rolling basis. Determination of whether a presentation will be a long presentation or a lightning talk will be made once the full list of presenters is known.
R/Finance 2009 and 2010 included attendees from around the world and featured keynote presentations from prominent academics and practitioners. The names and presentations of the 2009 and 2010 presenters are online at the conference website. We anticipate another exciting line-up for 2011 including keynote presentations from John Bollinger, Mebane Faber, Stefano Iacus, and Louis Kates. Additional details will be announced via the conference website as they become available.
For the program committee:
Gib Bassett, Peter Carl, Dirk Eddelbuettel, Brian Peterson,
Dale Rosenthal, Jeffrey Ryan, Joshua Ulrich
So see you in Chicago in April!
As in 2008 and 2009, the R Project has again participated in the Google Summer of Code during 2010.
Based on ideas collected and discussed on the R Wiki, the projects and students listed below (and sorted alphabetically by student) were selected for participation and have been sponsored by Google during the summer 2010.
The finished projects are available via the R / GSoC 2010 repository at Google Code, and in several cases also via their individual repos (see below). Informal updates and final summaries on the work were also provided via the GSoC 2010 R group blog.
Proposal:
radx is a package to compute derivatives (of any order) of native R code for multivariate functions with vector outputs,
f:R^m -> R^n, through Automatic Differentiation (AD). Numerical evaluation of derivatives has widespread uses in many
fields. radx will implement two modes for the computation of derivatives, the Forward and Reverse modes of AD, combining
which we can efficiently compute Jacobians and Hessians. Higher order derivatives will be evaluated through Univariate
Taylor Propagation.
Delivered: Two packages were created: radx (forward automatic differentiation in R) and tada (templated automatic differentiation in C++); see this blog post for details.
Proposal: R puts the latest statistical techniques at one's fingertips through thousands of add-on packages available on the CRAN download servers. The price for all of this power is complexity. Deducer is a cross-platform cross-console graphical user interface built on top of R designed to reduce this complexity. This project proposes to extend the scope of Deducer by creating an innovative yet intuitive system for generating statistical graphics based on the ggplot2 package.
Delivered: All of the major features have been implemented, and are outlined in the video links in this blog post.
Proposal: At present there does not exist a robust geometry engine available to R; the tools that are available tend to be limited in scope and do not easily integrate with existing spatial analysis tools. GEOS is a powerful open source geometry engine written in C++ that implements spatial functions and operators from the OpenGIS Simple Features for SQL specification. rgeos will make these tools available within R and will integrate with existing spatial data tools through the sp package.
Delivered: The rgeos project on R-Forge; see the final update blog post.
Proposal: Social Relations Analyses (SRAs; Kenny, 1994) are a hot topic both in personality and in social psychology. While more and more research groups adopt the methodology, software solutions are lagging far behind - the main software packages for calculating SRAs are two DOS programs from 1995, which have a lot of restrictions. My GSOC project will extend the functionality of these existing programs and bring the power of SRAs into the R Environment for Statistical Computing as a state-of-the-art package.
Delivered: The TripleR package is now on CRAN and hosted on RForge.Net; see this blog post for updates.
Proposal: So-called NoSQL databases are becoming increasingly popular. They generally provide very efficient lookup of key/value pairs. I'll provide several implementations of a NoSQL interface for R. Beyond a sample interface package, I'll try to support a generic interface similar to what the DBI package does for SQL backends.
Status: An initial prototype is available via RTokyoCabinet on Github. No updates were made since June; no communication occurred with anybody related to the GSoC project since June and the project earned a fail.
Last modified: Wed Sep 22 19:39:43 CDT 2010
Another issue that I felt needed addressing was a comparison between the different alternatives available, quite possibly including GPU computing. So a few weeks ago I sat down and wrote a small package to run, collect, analyse and visualize some benchmarks. That package, called gcbd (more about the name below) is now on CRAN as of this morning. The package both facilitates the data collection for the paper it also contains (in the vignette form common among R packages) and provides code to analyse the data---which is also included as a SQLite database. All this is done in the Debian and Ubuntu context by transparently installing and removing suitable packages providing BLAS implementations, so that we can fully automate data collection over several competing implementations via a single script (which is also included). Contributions of benchmark results are encouraged---that is the idea of the package.
The paper itself describes the background and technical details before presenting the results. The benchmark compares the basic reference BLAS, Atlas (both single- and multithreaded), Goto, Intel MKL and a GPU-based approach. This blog post is not the place to recap all results, so please do see the paper for more details. But one summary chart regrouping the main results fits well here:
This chart, in a log/log form, shows how reference BLAS lags everything, how multithreaded newer Atlas improves over the standard Atlas package currently still the default in both distros, how the Intel MKL (available via Ubuntu) is fairly good but how Goto wins almost everything. GPU computing is compelling for really large sizes (at double precision) and too costly at small ones. It also illustrates variability and different computational cost across the methods tested: svd is more expensive than level-3 matrix multiplication, and the different implementations are less spread apart. More details are in the paper; code, data etc are in the package gcbd.
The larger context is to do something like this benchmarking exercise, but across distributions, operating systems and possibly also GPU cards. Mark and I started to talk about this during and after R/Finance earlier this year and have some ideas. Time permitting, that work should be happening in the GPU/CPU Benchmarks (gcb) project, and that's why this got called gcbd as a simpler GPU/CPU Benchmarks on Debian Systems study.
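To give a feel for what such a benchmark measures, here is a minimal sketch in plain C++ of a naive level-3 style matrix product with a wall-clock timer around it. This is not the gcbd code (which drives R and the installed BLAS libraries); the names matmul and time_matmul are made up for this illustration, and an optimised BLAS would be far faster than this triple loop.

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

// Row-major n x n matrix stored in a flat vector (illustrative type alias).
using Matrix = std::vector<double>;

// Naive triple-loop product C = A * B; the i-k-j loop order keeps the
// innermost accesses contiguous in memory. Cost grows as n^3.
Matrix matmul(const Matrix& a, const Matrix& b, std::size_t n) {
    Matrix c(n * n, 0.0);
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t k = 0; k < n; ++k) {
            const double aik = a[i * n + k];
            for (std::size_t j = 0; j < n; ++j) {
                c[i * n + j] += aik * b[k * n + j];
            }
        }
    }
    return c;
}

// Wall-clock seconds for a single n x n product on constant-filled inputs.
double time_matmul(std::size_t n) {
    const Matrix a(n * n, 1.0), b(n * n, 2.0);
    const auto start = std::chrono::steady_clock::now();
    const Matrix c = matmul(a, b, n);
    const auto stop = std::chrono::steady_clock::now();
    // Read one element so the product cannot be optimised away entirely.
    volatile double sink = c.empty() ? 0.0 : c[0];
    (void)sink;
    return std::chrono::duration<double>(stop - start).count();
}
```

Calling time_matmul() over a range of sizes and plotting the results on log/log axes gives the kind of curve discussed above, here for a single and deliberately slow implementation.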
The short NEWS file extract follows:
0.2.6 2010-09-12
o Upgraded to Armadillo 0.9.70 "Subtropical Winter Safari"
o arma::Mat, arma::Row and arma::Col get constructors that take vector
or matrix sugar expressions. See the unit tests "test.armadillo.sugar.ctor"
and "test.armadillo.sugar.matrix.ctor" for examples.
More information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.
Race conditions were fantastic. We had a rainy and gray day yesterday but today is pure bliss. Temperatures around 60 degrees at the 7:00am start, no wind, sunshine and not a cloud in the sky.
The race itself went well. I had a pretty brutal running year suffering most of the time from some Achilles tendon inflammation. It has gotten better in the last few weeks possibly thanks to some heel cups I now put in the shoes. But I had exactly one run longer than ten miles since the Boston Marathon. So I lost a lot of speed, as well as endurance and was a little nervous as to how I'd do. And considering all this, it went pretty well. I finished in 1:41:50 or a 7:47 pace. While this is easily the slowest half in a number of years, at least I got to run it evenly, pain free and with a negative split (== faster second half) and some gas left for a fast last half mile or so. So maybe I don't have to retire from running just yet. We'll see if I get some speed back in 2011.
This is only the second release after 0.1-0 more than six months ago. Given that Rcpp is such a key ingredient for RProtoBuf, and that Rcpp underwent so many exciting changes itself, Romain and I never got around to releasing new versions of RProtoBuf. This version is now much closer to the actual C++ API and fairly feature rich. We summarised a few of these new things in the presentation at useR! 2010.
There is more information at the RProtoBuf page; there is a draft package vignette, a 'quick' overview vignette and a unit test summary vignette. Questions, comments etc should go to the rprotobuf mailing list off the RProtoBuf page at R-Forge.
Turned out I could, and it became a nice evening out. Darren Hanlon started up the evening as the opener for a good half hour, and was quite decent; somewhat charming in a good natured way, not taking himself too too seriously. I'd gladly see him again.
After a longer-than-needed break Billy Bragg came on stage and played for two straight hours, alternating between an electric and acoustic guitar. And also alternating between some newer material and (especially towards the end and the encore) some old crowd-pleasers. I don't know his material all that well but have of course known of his career over these last 25 years and am quite glad I went to see him. Nice way to end the week.
This release adds quite a few things. The main one may be the addition of
density, distribution, quantile and random number functions for a rather large
number of statistical distributions. Usage is pretty much as it would be in R,
yet it is vectorised at the C++ level. A fair number of unit tests were
added too, but some work is left to do there as well.
Support for complex numbers was enhanced both in the expressive 'sugar'
context and via a few binary operators that had been missing. We also started a
new vignette to provide a 'quick reference'; unfortunately this is not quite
complete yet.
The NEWS entry follows below:
0.8.6 2010-09-09
o new macro RCPP_VERSION and Rcpp_Version to allow conditional compiling
based on the version of Rcpp
#if defined(RCPP_VERSION) && RCPP_VERSION >= Rcpp_Version(0,8,6)
...
#endif
o new sugar functions for statistical distributions (d-p-q-r functions)
with distributions : unif, norm, gamma, chisq, lnorm, weibull, logis,
f, pois, binom, t, beta.
o new ctor for Vector taking size and function pointer so that for example
NumericVector( 10, norm_rand )
generates a N(0,1) vector of size 10
o added binary operators for complex numbers, as well as sugar support
o more sugar math functions: sqrt, log, log10, exp, sin, cos, ...
o started new vignette Rcpp-quickref : quick reference guide of Rcpp API
(still work in progress)
o various patches to comply with solaris/suncc stricter standards
o minor enhancements to ConvolutionBenchmark example
o simplified src/Makefile to no longer require GNU make; packages using
Rcpp still do for the compile-time test of library locations
As always, even fuller details are in Rcpp Changelog page and the Rcpp page which also leads to the downloads, the browseable doxygen docs and zip files of doxygen output for the standard formats. A local directory has source and documentation too. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page
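The version macro mentioned in the NEWS entry follows a common C and C++ idiom: pack the three version components into one integer so that the preprocessor can compare releases. A minimal sketch of that idiom is below. The names MYLIB_MAKE_VERSION, MYLIB_VERSION and has_sugar_distributions are hypothetical and chosen for this example only, and the packing scheme shown is an assumption rather than a copy of the Rcpp headers.

```cpp
// Hypothetical macros for illustration; not the actual Rcpp definitions.
// Pack major, minor and revision into a single comparable integer,
// assuming each of minor and revision stays below 256.
#define MYLIB_MAKE_VERSION(maj, min, rev) (((maj) * 65536) + ((min) * 256) + (rev))

// The version this (imaginary) library header would advertise.
#define MYLIB_VERSION MYLIB_MAKE_VERSION(0, 8, 6)

// Client code can now select an implementation at compile time.
#if defined(MYLIB_VERSION) && MYLIB_VERSION >= MYLIB_MAKE_VERSION(0, 8, 6)
// Newer library: the d-p-q-r style helpers are assumed to be available.
inline bool has_sugar_distributions() { return true; }
#else
// Older library: fall back to hand-written code.
inline bool has_sugar_distributions() { return false; }
#endif
```

Because the comparison happens in the preprocessor, the branch that is not selected is never compiled, which is what lets a package support both older and newer releases from a single source file.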
Now, let me prefix this by saying that I really enjoyed Radford's posts. He obviously put a lot of time into finding a number of (all somewhat small in isolation) inefficiencies in R which, when taken together, can make a difference in performance. I already spotted one commit by Duncan in the SVN logs for R so this is being looked at.
Yet Christian, on the other hand, goes a little overboard in bemoaning performance differences somewhere between ten and fifteen percent -- the difference between curly and straight braces (as noticed in Radford's first post). Maybe he spent too much time waiting for his MCMC runs to finish to realize the obvious: compiled code is evidently much faster.
And before everybody goes and moans and groans that that is hard, allow me to just interject and note that it is not. It really doesn't have to be. Here is a quick cleaned up version of Christian's example code, with proper assignment operators and a second variable x. We then get to the meat and potatoes and load our Rcpp package as well as inline to define the same little test function in C++. Throw in rbenchmark, which I am becoming increasingly fond of for these little timing tests, et voila, we have ourselves a horserace:
# Xian's code, using <- for assignments and passing x down
f <- function(n, x=1) for (i in 1:n) x=1/(1+x)
g <- function(n, x=1) for (i in 1:n) x=(1/(1+x))
h <- function(n, x=1) for (i in 1:n) x=(1+x)^(-1)
j <- function(n, x=1) for (i in 1:n) x={1/{1+x}}
k <- function(n, x=1) for (i in 1:n) x=1/{1+x}

# now load some tools
library(Rcpp)
library(inline)

# and define our version in C++
l <- cxxfunction(signature(ns="integer", xs="numeric"),
                 'int n = as<int>(ns); double x=as<double>(xs);
                  for (int i=0; i<n; i++) x=1/(1+x);
                  return wrap(x); ',
                 plugin="Rcpp")

# more tools
library(rbenchmark)

# now run the benchmark
N <- 1e6
benchmark(f(N, 1), g(N, 1), h(N, 1), j(N, 1), k(N, 1), l(N, 1),
          columns=c("test", "replications", "elapsed", "relative"),
          order="relative", replications=10)
And how does it do? Well, glad you asked. On my i7, with the other three cores standing around and watching, we get an eighty-fold increase relative to the best interpreted version:
/tmp$ Rscript xian.R
Loading required package: methods
test replications elapsed relative
6 l(N, 1) 10 0.122 1.000
5 k(N, 1) 10 9.880 80.984
1 f(N, 1) 10 9.978 81.787
4 j(N, 1) 10 11.293 92.566
2 g(N, 1) 10 12.027 98.582
3 h(N, 1) 10 15.372 126.000
/tmp$
So do we really want to spend time arguing about the ten and fifteen percent differences? Moore's law gets you
those gains in a couple of weeks anyway. I'd much rather have a conversation about how we can get people speed increases that are orders of
magnitude, not fractions. Rcpp is one such tool. Let's get more of them.
The film, which is written, directed and produced by Dan Pritzker, is based loosely on the early years of Louis Armstrong in New Orleans. The movie is shot beautifully by Vilmos Zsigmond in a blend of colour and black-and-white which works very well for invoking the early days of film. A key part of the production is of course the score: the live music comes from both a thirteen-piece orchestra featuring Wynton Marsalis and piano solo recitals by Cecile Licad, with an emphasis on pieces by 19th-century composer Louis Moreau Gottschalk. The combination of a silent movie with a strong live band is something to behold -- if you can catch the movie and performance in a city nearby, go!
This follows the 0.3.3 release from last week and has again a number of internal changes. All uses of objects from external namespaces are now explicit as I removed the remaining using namespace QuantLib;. This makes things a little more verbose, but should be much clearer to read, especially for those not yet up to speed on whether a given object comes from any one of the Boost, QuantLib or Rcpp namespaces. We also generalized an older three-dimensional plotting function used for option surfaces -- which had already been used in the demo() code -- and improved the code underlying this: arrays of option prices and analytics given two input vectors are now computed at the C++ level for a nice little gain in efficiency. This also illustrates the possible improvements from working with the new Rcpp API that is now used throughout the package.
Full changelog details, examples and more details about this package are at my RQuantLib page.
This is the first release since March when we released 0.2.2. A few things got added to Rcpp in the meantime, and RInside is taking advantage of some of these as illustrated in several of the included examples.
More details and the changelog are on the RInside page which also leads
This release upgrades the included Armadillo version to Conrad's just-released version 0.9.60. This overcomes some minor issues we had with 'older' compilers such as g++ 4.2.x with x being 1 or 2. No other changes were made from our end.
The short NEWS file extract follows:
0.2.5 2010-08-05
o Upgraded to Armadillo 0.9.60 "Killer Bush Turkey"
More information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.
Many of the changes in this new version are internal. The code was re-written using the new Rcpp API throughout, and the build system was further simplified using the LinkingTo: mechanism. The arithmetic average-price asian option pricer was added. A few other code updates were made as well.
Full changelog details, examples and more details about this package are at my RQuantLib page.
But at least this package now joins RcppArmadillo in using the highly recommended LinkingTo: Rcpp directive in the DESCRIPTION file to let R find the Rcpp headers, making the build process a little more robust.
A few more details are on the RcppExamples page.
This release upgrades the included Armadillo version to 0.9.52 (see here for Conrad's high-level changes). We had to make two minor tweaks. In the fastLm() help page example we switched from inv() to pinv(). The short NEWS file extract follows:
0.2.4 2010-07-27
o Upgraded to Armadillo 0.9.52 'Monkey Wrench'
o src/fastLm.cpp: Switch from inv() to pinv() as inv() now tests for
singular matrices, warns and returns an empty matrix, which stops
the example fastLm() implementation on the manual page -- and while
this is generally reasonable, it makes sense here to continue, which
the Moore-Penrose pseudo-inverse allows us to do
More information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.
This release constitutes a quick follow-up to the last release 0.8.4 which we got out just before CRAN closed for summer vacations. Some fixes were made right after the last release: two harmless warnings from the help file parser of the development version of R are now addressed, and we stopped using shell expansions in the Makefile snippets. We also added some internal speedups we discovered while preparing the talk about RProtoBuf for last week's useR! meeting.
The NEWS entry follows below:
0.8.5 2010-07-25
o speed improvements. Vector::names, RObject::slot have been improved
to take advantage of R API functions instead of callbacks to R
o Some small updates to the Rd-based documentation which now points to
content in the vignettes. Also a small formatting change to suppress
a warning from the development version of R.
o Minor changes to Date() code which may reenable SunStudio builds
As always, even fuller details are in Rcpp Changelog page and the Rcpp page which also leads to the downloads, the browseable doxygen docs and zip files of doxygen output for the standard formats. A local directory has source and documentation too. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page
As at the preceding useR! 2008 in Dortmund and useR! 2009 in Rennes, I presented a three-hour tutorial on high-performance computing with R. This covers scripting/automation, profiling, vectorisation, interfacing compiled code, parallel computing and large-memory approaches. The slides, as well as a condensed 2-up version, are now on my presentations page.
On Wednesday, Romain and I had a chance to talk about recent work on Rcpp, our R and C++ integration. Thursday, we followed up with a presentation on RProtoBuf -- a project integrating Google's Protocol Buffers with R which much to our delight already seems to be in use at Google itself! It was quite fun to do these two talks jointly with Romain. But my other coauthor Khanh had to be at a conference related to his actual PhD work. So on Friday it was just me to give a presentation about RQuantLib which brings QuantLib to R.
Slides from all these talks have now been added to my presentations page. I will also upload them via the conference form so that they can be part of the conference's collection of presentations which should be forthcoming.
This release builds upon release 0.8.3. Highlights include changes to the sugar framework for highly expressive C++ constructs, which gained new vector functions as well as a first set of matrix functions. As well, unit tests have been reorganised in such a way that we end up with a lot fewer compilations (but of several files at once), which reaps significant speed gains. Date calculations now use the same mktime() function R itself uses (and which comes from Arthur Olson's tzone library). The NEWS entry follows below:
0.8.4 2010-07-09
o new sugar vector functions: rep, rep_len, rep_each, rev, head, tail,
diag
o sugar has been extended to matrices: The Matrix class now extends the
Matrix_Base template that implements CRTP. Currently sugar functions
for matrices are: outer, col, row, lower_tri, upper_tri, diag
o The unit tests have been reorganised into fewer files with one call
each to cxxfunction() (covering multiple tests) resulting in a
significant speedup
o The Date class now uses the same mktime() replacement that R uses
(based on original code from the timezone library by Arthur Olson)
permitting wide date ranges on all operating systems
o The FastLM example has been updated, a new benchmark based on the
historical Longley data set has been added
o RcppStringVector now uses std::vector internally
o setting the .Data slot of S4 objects did not work properly
As always, even fuller details are in Rcpp Changelog page and the Rcpp page which also leads to the downloads, the browseable doxygen docs and zip files of doxygen output for the standard formats. A local directory has source and documentation too. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page
It comes about three weeks after the 0.8.2 release. And even though we promised to concentrate on documentation, it contains a raft of new features:
The main thing here is Rcpp sugar for which we also have a new (seventh !!) vignette Rcpp-sugar. As a quick example, consider this simple C++ function that takes two vectors from R and creates a new one conditional on the relative values:
extern "C" SEXP foo( SEXP xs, SEXP ys) {
    Rcpp::NumericVector x(xs), y(ys);
    int n = x.size();
    Rcpp::NumericVector res( n );
    double xd = 0.0, yd = 0.0 ;
    for( int i=0; i<n; i++){
        xd = x[i];
        yd = y[i];
        if( xd < yd ){
            res[i] = xd * xd ;
        } else {
            res[i] = -( yd * yd);
        }
    }
    return res ;
}
Now, if you use R, you really want to write this more compactly. And now you can, thanks to Rcpp sugar:
extern "C" SEXP foo( SEXP xs, SEXP ys){
    Rcpp::NumericVector x(xs), y(ys);
    return ifelse( x < y, x*x, -(y*y));
}
Same great taste, but much less filling! More details are in the Rcpp-sugar vignette. Doug Bates is already a fan of this and is employing it in the lme4a development version of the well-known lme4 package.
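To make explicit what the ifelse() expression computes element by element, here is a minimal sketch of the same rule in standard C++ with std::transform, with no Rcpp involved. The function name signed_square is made up for this illustration, and the length check is an assumption about sensible behaviour for mismatched inputs rather than a statement about what Rcpp sugar does in that case.

```cpp
#include <algorithm>
#include <stdexcept>
#include <vector>

// For each pair (x[i], y[i]): return x[i]^2 if x[i] < y[i], else -(y[i]^2).
std::vector<double> signed_square(const std::vector<double>& x,
                                  const std::vector<double>& y) {
    if (x.size() != y.size()) {
        throw std::invalid_argument("x and y must have the same length");
    }
    std::vector<double> res(x.size());
    // Binary transform walks both inputs in lockstep and writes into res.
    std::transform(x.begin(), x.end(), y.begin(), res.begin(),
                   [](double xd, double yd) {
                       return xd < yd ? xd * xd : -(yd * yd);
                   });
    return res;
}
```

With x = (1, 5) and y = (2, 3) this returns (1, -9): the first pair satisfies x < y and yields 1 squared, the second does not and yields minus 3 squared.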
The full NEWS entry for this release follows below:
0.8.3 2010-06-27
o This release adds Rcpp sugar which brings (a subset of) the R syntax
into C++. This supports :
- binary operators : <,>,<=,>=,==,!= between R vectors
- arithmetic operators: +,-,*,/ between compatible R vectors
- several functions that are similar to the R function of the same name:
abs, all, any, ceiling, diff, exp, ifelse, is_na, lapply, pmin, pmax,
pow, sapply, seq_along, seq_len, sign
Simple examples :
// two numeric vector of the same size
NumericVector x ;
NumericVector y ;
NumericVector res = ifelse( x < y, x*x, -(y*y) ) ;
// sapply'ing a C++ function
double square( double x ){ return x*x ; }
NumericVector res = sapply( x, square ) ;
Rcpp sugar uses the technique of expression templates, pioneered by the
Blitz++ library and used in many libraries (Boost::uBlas, Armadillo).
Expression templates allow lazy evaluation of expressions, which
coupled with inlining generates very efficient code, very closely
approaching the performance of hand written loop code, and often
much more efficient than the equivalent (vectorized) R code.
Rcpp sugar is currently limited to vectors; future releases will
include support for matrices with sugar functions such as outer, etc ...
Rcpp sugar is documented in the Rcpp-sugar vignette, which contains
implementation details.
o New helper function so that "Rcpp?something" brings up Rcpp help
o Rcpp Modules can now expose public data members
o New classes Date, Datetime, DateVector and DatetimeVector with proper
'new' API integration such as as(), wrap(), iterators, ...
o The so-called classic API headers have been moved to a subdirectory
classic/ This should not affect client-code as only Rcpp.h was ever
included.
o RcppDate now has a constructor from SEXP as well
o RcppDateVector and RcppDatetimeVector get constructors from int
and both const / non-const operator(int i) functions
o New API class Rcpp::InternalFunction that can expose C++ functions
to R without modules. The function is exposed as an S4 object of
class C++Function
As always, even fuller details are in Rcpp Changelog page and the Rcpp page which also leads to the downloads, the browseable doxygen docs and zip files of doxygen output for the standard formats. A local directory has source and documentation too. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page
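The NEWS entry above mentions expression templates and lazy evaluation. For readers who have not met the technique, here is a deliberately tiny sketch of the idea in standard C++. It is not how Rcpp sugar is implemented; the class names Expr, Vec, Sum and Prod are invented for this example, and only + and * on doubles are covered.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// CRTP base: any expression can report its size and compute element i.
template <typename E>
struct Expr {
    const E& self() const { return static_cast<const E&>(*this); }
    double operator[](std::size_t i) const { return self()[i]; }
    std::size_t size() const { return self().size(); }
};

// A concrete vector that owns its storage.
class Vec : public Expr<Vec> {
    std::vector<double> data_;
public:
    explicit Vec(std::vector<double> d) : data_(std::move(d)) {}
    // Evaluating an expression: one loop, no intermediate vectors.
    template <typename E>
    Vec(const Expr<E>& e) : data_(e.size()) {
        for (std::size_t i = 0; i < data_.size(); ++i) data_[i] = e[i];
    }
    double operator[](std::size_t i) const { return data_[i]; }
    std::size_t size() const { return data_.size(); }
};

// Lazy node for l + r; nothing is computed until operator[] is called.
template <typename L, typename R>
class Sum : public Expr<Sum<L, R>> {
    const L& l_;
    const R& r_;
public:
    Sum(const L& l, const R& r) : l_(l), r_(r) {}
    double operator[](std::size_t i) const { return l_[i] + r_[i]; }
    std::size_t size() const { return l_.size(); }
};

// Lazy node for the element-wise product l * r.
template <typename L, typename R>
class Prod : public Expr<Prod<L, R>> {
    const L& l_;
    const R& r_;
public:
    Prod(const L& l, const R& r) : l_(l), r_(r) {}
    double operator[](std::size_t i) const { return l_[i] * r_[i]; }
    std::size_t size() const { return l_.size(); }
};

template <typename L, typename R>
Sum<L, R> operator+(const Expr<L>& l, const Expr<R>& r) {
    return Sum<L, R>(l.self(), r.self());
}

template <typename L, typename R>
Prod<L, R> operator*(const Expr<L>& l, const Expr<R>& r) {
    return Prod<L, R>(l.self(), r.self());
}
```

Writing Vec r(a + b * c) builds a small tree of Sum and Prod nodes at compile time and then evaluates it in the single loop inside the Vec constructor. One caveat of this simple version: the nodes hold references, so an expression should be evaluated within the statement that creates it rather than stored with auto for later use.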
It adds a tiny bit of configuration to permit Sun Studio / suncc to successfully build the package. There is no code change, and no configuration change for the other platforms. Thanks to Brian Ripley for additional testing, and of course for running those build instances (and everything else he does) for the R project, and to Conrad Sanderson as upstream author of the Armadillo C++ library for linear algebra.
As usual, more information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.
The new version is now in Debian, at CPAN and on my beancounter page here. Enjoy!
The full NEWS entry for this release follows below:
0.8.2 2010-06-09
o Bug-fix release for suncc compiler with thanks to Brian Ripley for
additional testing.
As always, even fuller details are in Rcpp Changelog page and the Rcpp page which also leads to the downloads, the browseable doxygen docs and zip files of doxygen output for the standard formats. A local directory has source and documentation too. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page
This release works well with the most recent inline release 0.3.5. One can now employ inlined R code as we generalized how/which headers are included and how library / linking information is added thanks to a plugin mechanism. This is the first RcppArmadillo version to provide such a plugin. We also updated the included Armadillo headers to its most recent release 0.9.10, added some more operators and provide a utility function RcppArmadillo:::CxxFlags() to provide include directory information on the fly.
An example of the direct inline approach for the fastLm function:
library(inline)
library(RcppArmadillo)

src <- '
    Rcpp::NumericVector yr(ys);                     // creates Rcpp vector from SEXP
    Rcpp::NumericMatrix Xr(Xs);                     // creates Rcpp matrix from SEXP
    int n = Xr.nrow(), k = Xr.ncol();

    arma::mat X(Xr.begin(), n, k, false);           // reuses memory and avoids extra copy
    arma::colvec y(yr.begin(), yr.size(), false);

    arma::colvec coef = arma::solve(X, y);          // fit model y ~ X
    arma::colvec res = y - X*coef;                  // residuals

    double s2 = std::inner_product(res.begin(), res.end(), res.begin(), double())/(n - k);
                                                    // std.errors of coefficients
    arma::colvec std_err = arma::sqrt(s2 * arma::diagvec( arma::inv(arma::trans(X)*X) ));

    return Rcpp::List::create(Rcpp::Named("coefficients") = coef,
                              Rcpp::Named("stderr")       = std_err,
                              Rcpp::Named("df")           = n - k );
'

fun <- cxxfunction(signature(ys="numeric", Xs="numeric"), src, plugin="RcppArmadillo")
This creates a compiled function fun which, by using Armadillo, regresses a vector ys on a matrix Xs (just as the fastLmPure() function in the package does) --- yet is constructed on the fly using cxxfunction from inline.
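For reference, the quantities computed in the snippet correspond to the textbook ordinary least squares formulas, written here under the usual assumption that X is an n by k matrix of full column rank (in which case the least-squares solution returned by solve() coincides with the closed form):

```latex
\hat{\beta} = (X^\top X)^{-1} X^\top y, \qquad
e = y - X\hat{\beta}, \qquad
s^2 = \frac{e^\top e}{n - k}, \qquad
\widehat{\mathrm{se}}(\hat{\beta}_j) = \sqrt{s^2 \left[(X^\top X)^{-1}\right]_{jj}}
```

In the code, coef plays the role of the coefficient estimate, res is the residual vector e, s2 is the residual variance with n - k degrees of freedom, and std_err collects the standard errors for j = 1, ..., k.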
More information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.
There are a few fairly visible new things in this release. As we want to focus the next few minor releases on completing the documentation, we started by adding a total of four (!!) new vignettes:
The most interesting new feature is what we call Rcpp modules and is modeled after Boost::Python. This makes it pretty easy to expose C++ functions and classes to R -- without having to write glue code. This is pretty new and may change a tad over the coming releases, but it is also quite exciting.
Other changes concern more improvements for use of inline which should now allow packages like our RcppArmadillo to be used with it, and some bug fixes. The full NEWS entry for this release follows below:
0.8.1 2010-06-08
o This release adds Rcpp modules. An Rcpp module is a collection of
internal (C++) functions and classes that are exposed to R. This
functionality has been inspired by Boost.Python.
Modules are created internally using the RCPP_MODULE macro and
retrieved in the R side with the Module function. This is a preview
release of the module functionality, which will keep improving until
the Rcpp 0.9.0 release.
The new vignette "Rcpp-modules" documents the current feature set of
Rcpp modules.
o The new vignette "Rcpp-package" details the steps involved in making a
package that uses Rcpp.
o The new vignette "Rcpp-FAQ" collects a number of frequently asked
questions and answers about Rcpp.
o The new vignette "Rcpp-extending" documents how to extend Rcpp
with user defined types or types from third party libraries. Based on
our experience with RcppArmadillo
o Rcpp.package.skeleton has been improved to generate a package using
an Rcpp module, controlled by the "module" argument
o Evaluating a call inside an environment did not work properly
o cppfunction has been withdrawn since the introduction of the more
flexible cxxfunction in the inline package (0.3.5). Rcpp no longer
depends on inline since many uses of Rcpp do not require inline at
all. We still use inline for unit tests but this is now handled
locally in the unit tests loader runTests.R.
Users of the now-withdrawn function cppfunction can redefine it as:
cppfunction <- function(...) cxxfunction( ..., plugin = "Rcpp" )
o Support for std::complex was incomplete and has been enhanced.
o The methods XPtr::getTag and XPtr::getProtected are deprecated,
and will be removed in Rcpp 0.8.2. The methods tag() and prot() should
be used instead. tag() and prot() support both LHS and RHS use.
o END_RCPP now returns the R Nil values; new macro VOID_END_RCPP
replicates prior behaviour
As always, even fuller details are in the Rcpp Changelog page and the Rcpp page which also leads to the downloads, the browseable doxygen docs and zip files of doxygen output for the standard formats. A local directory has source and documentation too. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.
This is in some ways a continuation of the 0.3.4 release I had made in December. That release had opened the door for the wide use of inline in our Rcpp package. And just as Rcpp has grown, we now have needs beyond the initial change. See the post on Romain's blog for details, but in a nutshell we are now gaining
Last but not least, our thanks to Oleg Sklyar for letting us extend his amazing inline package for use by Rcpp.
This time we all got chip-timing via a small (rfid ?) strip attached to the back of the bib number. Which is handy as I managed to not stop my time by hand correctly. Given that I am still nursing a sore Achilles tendon and don't train well or much, the time of 23:51 (or 6:48 min/mile) was ok compared to the seven previous times I have run this.
On Friday, I also gave an informal lecture / tutorial / workshop to some of the Stats and Finance Ph.D. students, drawing largely from the section on parallel computing of the most recent Introduction to High-Performance Computing with R tutorial.
My sincere thanks to Kurt Hornik and Stefan Theussl for the invite -- it was a great trip, notwithstanding the mostly unseasonally cold and wet weather.
This new release offers a number of key improvements:
fastLm() function is now generic and provides a default and formula
interface just like lm() along with standard methods
print, summary and predict. The
documentation is enhanced as well and now contains an example of a
rank-deficient model matrix where the non-pivoting scheme of
fastLm() fails.
While we had tested this quite rigorously, the combination of some last minute changes that were meant to be stylistic-only, some troubles with the tests and builds at CRAN that were not apparent in all our tests (hint: do not yet use dynamic help features referencing other packages even if you have a Depends: on them) and an upcoming travel deadline meant that we missed a gotcha on Windows---so release 0.2.1 had to follow a few hours after the short-lived 0.2.0.
More information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.
This release brings a number of changes that are detailed below. Of particular interest may be the much more robust treatment of exceptions, the new classes for data frames and formulae, and the availability of the new helper function cppfunction for use with inline. Also of note is the new support for the 'LinkingTo' directive with which packages using Rcpp will get automatic access to the header files.
An announcement email went to the r-packages list (ETH Zuerich, Gmane); Romain also blogged about the release.
The full NEWS entry for this release follows below:
0.8.0 2010-05-17
o All Rcpp headers have been moved to the inst/include directory,
allowing use of 'LinkingTo: Rcpp'. But the Makevars and Makevars.win
are still needed to link against the user library.
o Automatic exception forwarding has been withdrawn because of
portability issues (as it did not work on the Windows platform).
Exception forwarding is still possible but is now based on explicit
code of the form:
try {
// user code
} catch( std::exception& __ex__){
forward_exception_to_r( __ex__ ) ;
}
Alternatively, the macros BEGIN_RCPP and END_RCPP can be used to enclose
code so that it captures exceptions and forwards them to R.
BEGIN_RCPP
// user code
END_RCPP
o new __experimental__ macros
The macros RCPP_FUNCTION_0, ..., RCPP_FUNCTION_65 to help creating C++
functions hiding some code repetition:
RCPP_FUNCTION_2( int, foobar, int x, int y){
return x + y ;
}
The first argument is the output type, the second argument is the
name of the function, and the other arguments are arguments of the C++
function. Behind the scenes, the RCPP_FUNCTION_2 macro creates
an intermediate function compatible with the .Call interface and handles
exceptions
Similarly, the macros RCPP_FUNCTION_VOID_0, ..., RCPP_FUNCTION_VOID_65
can be used when the C++ function to create returns void. The generated
R function will return R_NilValue in this case.
RCPP_FUNCTION_VOID_1( foobar, std::string foo ){
// do something with foo
}
The macro RCPP_XP_FIELD_GET generates a .Call compatible function that
can be used to access the value of a field of a class handled by an
external pointer. For example with a class like this:
class Foo{
public:
int bar ;
} ;
RCPP_XP_FIELD_GET( Foo_bar_get, Foo, bar ) ;
RCPP_XP_FIELD_GET will generate the .Call compatible function called
Foo_bar_get that can be used to retrieve the value of bar.
The macro RCPP_XP_FIELD_SET generates a .Call compatible function that
can be used to set the value of a field. For example:
RCPP_XP_FIELD_SET( Foo_bar_set, Foo, bar ) ;
generates the .Call compatible function called "Foo_bar_set" that
can be used to set the value of bar
The macro RCPP_XP_FIELD generates both getter and setter. For example
RCPP_XP_FIELD( Foo_bar, Foo, bar )
generates the .Call compatible Foo_bar_get and Foo_bar_set using the
macros RCPP_XP_FIELD_GET and RCPP_XP_FIELD_SET previously described
The macros RCPP_XP_METHOD_0, ..., RCPP_XP_METHOD_65 facilitate
calling a method of an object that is stored in an external pointer. For
example:
RCPP_XP_METHOD_0( foobar, std::vector<int> , size )
creates the .Call compatible function called foobar that calls the
size method of the std::vector<int> class. This uses the Rcpp::XPtr<
std::vector<int> > class.
The macros RCPP_XP_METHOD_CAST_0, ... are similar but the result of
the method called is first passed to another function before being
wrapped to a SEXP. For example, if one wanted the result as a double
RCPP_XP_METHOD_CAST_0( foobar, std::vector<int> , size, double )
The macros RCPP_XP_METHOD_VOID_0, ... are used when calling the
method is only used for its side effect.
RCPP_XP_METHOD_VOID_1( foobar, std::vector<int>, push_back )
Assuming xp is an external pointer to a std::vector<int>, this could
be called like this :
.Call( "foobar", xp, 2L )
o Rcpp now depends on inline (>= 0.3.4)
o A new R function "cppfunction" was added which invokes cfunction from
inline with focus on Rcpp usage (enforcing .Call, adding the Rcpp
namespace, set up exception forwarding). cppfunction uses BEGIN_RCPP
and END_RCPP macros to enclose the user code
o new class Rcpp::Formula to help building formulae in C++
o new class Rcpp::DataFrame to help building data frames in C++
o Rcpp.package.skeleton gains an argument "example_code" and can now be
used with an empty list, so that only the skeleton is generated. It
has also been reworked to show how to use LinkingTo: Rcpp
o wrap now supports containers of the following types: long, long double,
unsigned long, short and unsigned short which are silently converted
to the most acceptable R type.
o Revert to not double-quote protecting the path on Windows as this
breaks backticks expansion used in Makevars.win etc
o Exceptions classes have been moved out of Rcpp classes,
e.g. Rcpp::RObject::not_a_matrix is now Rcpp::not_a_matrix
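
The BEGIN_RCPP / END_RCPP pair described in this NEWS entry encloses user code in a try block and turns a C++ exception into something the caller can handle. The following is only a sketch of that pattern in plain C++: the macros BEGIN_GUARD and END_GUARD and the function checked_divide are hypothetical names made up for illustration, they are not the Rcpp definitions, and instead of forwarding to R the catch block simply returns the exception message as a string.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical macros, loosely modelled on the BEGIN_RCPP / END_RCPP idea.
// They are NOT the Rcpp definitions: here the catch block returns the
// exception message to the caller rather than forwarding it to R.
#define BEGIN_GUARD try {
#define END_GUARD                                   \
    }                                               \
    catch (const std::exception& ex) {              \
        return std::string("error: ") + ex.what();  \
    }

// Example function whose body is enclosed by the two macros.
std::string checked_divide(int a, int b) {
    BEGIN_GUARD
    if (b == 0) throw std::invalid_argument("division by zero");
    return std::to_string(a / b);
    END_GUARD
}
```

Calling checked_divide(6, 3) yields the string "2", whereas checked_divide(1, 0) yields "error: division by zero" without the exception escaping the function.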
As always, even fuller details are in the Rcpp Changelog page and the Rcpp page which also leads to the downloads, the browseable doxygen docs and zip files of doxygen output for the standard formats. A local directory has source and documentation too. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.
Date: Mon, 26 Apr 2010 15:27:29 -0500
To: R Development List
CC: gsoc-r
Subject: R and the Google Summer of Code 2010 -- Please welcome our new students!
From: Dirk Eddelbuettel

Earlier today Google finalised student / mentor pairings and allocations for the Google Summer of Code 2010 (GSoC 2010). The R Project is happy to announce that the following students have been accepted:

Colin Rundel, "rgeos - an R wrapper for GEOS", mentored by Roger Bivand of the Norges Handelshoyskole, Norway

Ian Fellows, "A GUI for Graphics using ggplot2 and Deducer", mentored by Hadley Wickham of Rice University, USA

Chidambaram Annamalai, "rdx - Automatic Differentiation in R", mentored by John Nash of University of Ottawa, Canada

Yasuhisa Yoshida, "NoSQL interface for R", mentored by Dirk Eddelbuettel, Chicago, USA

Felix Schoenbrodt, "Social Relations Analyses in R", mentored by Stefan Schmukle, Universitaet Muenster, Germany

Details about all proposals are on the R Wiki page for the GSoC 2010 at
http://rwiki.sciviews.org/doku.php?id=developers:projects:gsoc2010

The R Project is honoured to have received its highest number of student allocations yet, and looks forward to an exciting Summer of Code. Please join me in welcoming our new students.

At this time, I would also like to thank all the other students who have applied for working with R in this Summer of Code. With a limited number of available slots, not all proposals can be accepted -- but I hope that those not lucky enough to have been granted a slot will continue to work with R and towards making contributions within the R world.

I would also like to express my thanks to all other mentors who provided for a record number of proposals. Without mentors and their project ideas we would not have a Summer of Code -- so hopefully we will see you again next year.

Regards,
Dirk (acting as R/GSoC 2010 admin)
Which seems to have worked. My pace was more even, and I conserved some energy and made it past the last of the hills around mile 21 without walking a single step while staying a few seconds under an 8 min/mile average pace. But then around mile 23 and 24 I had two sharp short cramps which forced me to walk. Interestingly enough, Bob Richards writes about cramps as a main theme for many runners in this year's race. Maybe the wind and temperature combined with the hills to get us after all! Anyway, I ended up with 3:29:14 which, at a 7:59 pace, is right between the 2007 and 2009 results and quite decent given the circumstances.
And of course the weekend as a whole was again a hoot even if I had only a short stay of around 30 hours in Boston given our R/Finance conference on Friday and Saturday. We'll see if I will manage to qualify once more for next year.
As a co-organizer, it was a great pleasure to see so many users of R in Finance---from both industry and academia---come to Chicago to discuss and share recent work. There is a lot going on, and it is always good to exchange ideas with others sharing the same infrastructure. Participants appeared to enjoy the conference. My thanks to everybody who helped to put it together, from the local committee to the helping hands at UIC and of course the sponsors.
I just put my slides from the Extending and Embedding R with C++ tutorial preceding the conference, as well as the RQuantLib: Interfacing QuantLib from R presentation (with Khanh Nguyen), up onto my presentations page. I do have a usb-drive with all conference presentations and will provide them via the R / Finance site in a few days.
The only truly sour note is the fact that several presenters from Europe had their travel schedules turned upside down by the disruption to international air travel caused by the Icelandic volcano eruption and the resulting ash clouds. While we are glad to have had them for a little longer in Chicago, we understand that they are getting eager to return home. I hope this extended stay in the Windy City does not take away from the overall usefulness of the trip.
This is another bug-fix version related solely to a build failure on Windows. Trying to protect paths with spaces has the side-effect of breaking backticks use, which unfortunately is already in use by a number of packages that have since broken during CRAN autobuilds. No other changes were made.
As always, full details are in the Rcpp Changelog page and the Rcpp page which also leads to the downloads, the browseable doxygen docs and zip files of doxygen output for the standard formats. A local directory has source and documentation too. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.
Thanks also to David Smith (at the REvolutions blog) and Drew Conway (at his blog) for spreading the word about the presentation video and slides -- quite a few folks have come to my presentations page to get them.
The talks centered around R and C++ integration using both Rcpp and RInside and summarise where both projects stand after all the recent work Romain and I put in over the last few months. The presentations went fairly well; I received some favourable comments.
Szilard and the R User Group had also suggested a group discussion about CRAN, its growth and how to maximise its usefulness. Given my CRANberries feed, my work on the CRAN Task Views for Empirical Finance and High-Performance Computing with R as well as our cran2deb binary package generator, I had some views and ideas that helped frame the discussion which turned out to be very useful and informed. So maybe we should do this User Group thing in Chicago too!
Special thanks to Jan de Leeuw and Szilard Pafka for organising the meeting, talks and discussion.
Anyway, a new version 0.24 of Finance::YahooQuote which addresses the issue that required upload 0.23 yesterday is now in the Debian queue and on CPAN and my local yahooquote page. This time it may even work. A big thanks to the CPAN Testers for getting me reports on this one too.
This version fixes a somewhat serious bug uncovered by Doug Bates when working with vectors of strings. We also added a few new accessor functions as well as a new convenience function create that is particularly useful for creating (possibly named) list objects that are returned to R.
Here is the full NEWS entry for this release:
0.7.11 2010-03-26
o Vector<> gains a set of templated factory methods "create" which
takes up to 20 arguments and can create named or unnamed vectors.
This greatly facilitates creating objects that are returned to R.
o Matrix now has a diag() method to create diagonal matrices, and
a new constructor using a single int to create square matrices
o Vector now has a new fill() method to propagate a single value
o Named is no more a class but a templated function. Both interfaces
Named(.,.) and Named(.)=. are preserved, and extended to work also on
simple vectors (through Vector<>::create)
o Applied patch by Alistair Gee to make ColDatum more robust
o Fixed a bug in Vector that caused random behavior due to the lack of
copy constructor in the Vector template
As always, even fuller details are in the Rcpp Changelog page and the Rcpp page which also leads to the downloads, the browseable doxygen docs and zip files of doxygen output for the standard formats. A local directory has source and documentation too. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.
Which led the automated Perl test scripts to remind me for a few days now that the full company name for symbol IBM no longer corresponded to what I had encoded. Not really a bug, but a failure in tests anyway.
So without further ado: a new version 0.23 of Finance::YahooQuote which addressed this issue is now in the Debian queue and on CPAN and my local yahooquote page.
RInside is a set of convenience classes to facilitate embedding of R inside of C++ applications. It works particularly well with Rcpp and now depends on it.
This is the first release since version 0.2.1 in early January. Romain and I made numerous changes to Rcpp in the meantime. With this release, RInside is starting to catch up by taking advantage of many new automatic (templated) type converters. We have updated the existing examples, and added several new ones. These are all visible directly via the Doxygen-generated documentation under the Files heading. Two examples are also shown directly on the RInside page.
Also added are new examples showing how to use RInside to embed R inside C++ applications using MPI for parallel computing. This was contributed via two example files by Jianping Hua, and we reworked the examples slightly (and added two variants that use MPI's C++ API).
As it is so short, here is the basic 'Hello, World' example now showing the simpler Rcpp-based variable assignment:
// -*- mode: C++; c-indent-level: 4; c-basic-offset: 4; tab-width: 8; -*-
//
// Simple example showing how to do the standard 'hello, world' using embedded R
//
// Copyright (C) 2009 Dirk Eddelbuettel
// Copyright (C) 2010 Dirk Eddelbuettel and Romain Francois
//
// GPL'ed

#include <RInside.h>                    // for the embedded R via RInside

int main(int argc, char *argv[]) {

    RInside R(argc, argv);              // create an embedded R instance

    R["txt"] = "Hello, world!\n";       // assign a char* (string) to 'txt'

    R.parseEvalQ("cat(txt)");           // eval the init string, ignoring any returns

    exit(0);
}
One minor setback is that the examples currently segfault on Windows. That may be an issue with linking and class instantiation or something related. Romain and I focus much more on Linux and OS X, so this has not gotten a lot of attention. Debugging help would be appreciated.
As for the race conditions, we had fantastic weather all week with temperatures up to the sixties and then all of a sudden a forecast of rain, snow and even sleet for the weekend. Luckily, and while yesterday was sucky, today was all right or better. A little chilly and damp, but neither rain nor snow --- or even wind. So the conditions were good, with the course challenging as usual.
The race itself went fine. I ran more or less steadily, never had to stop but was not particularly fast at 1:39:38 or a pace of 7:36.3. I had aimed for beating 1:40, had missed that target by miles 4 to 6 and was about 10 or 15 seconds behind but managed to get a negative split on the second half of the course to reach that goal. Which is nice, but the time is still the slowest I've ever run that race, and my slowest half-marathon since 2004.
Training had been sluggish all winter. Oddly enough, already in last year's post I stated pretty much the same and feared that Boston may become tough --- which it did. But this year may well be a lot worse as I had no spring in my step all winter long. No fire in the belly for training will make for a long race. We'll see how it goes. Four weeks to go.
An R Wiki page had been created and serves as the central
point of reference for the R Project
and the GSoC 2010. It contains a list of project ideas, currently counting
eleven and spanning everything from research-oriented topics (such as spatial
statistics or automatic differentiation) to R community-support (regarding
CRAN statistics and the CRANtastic site) to extensions (NoSQL, RPy2 data interfaces, Rserve browser integration) and more. I also just created a
mailing list gsoc-r@googlegroups.com where prospective students and mentors can exchange ideas and discuss. As for other
details, the Google
Summer of Code 2010 site has most of the answers, and we will try to keep
R-related information on the aforementioned
R Wiki page.
shQuote() function instead helped. Our
thanks to the tireless R-on-Windows maintainer Uwe Ligges for an earlier
heads-up about the problem. So another quick bug-fix release 0.7.10 is now in
Debian and should be on CRAN some time tomorrow.
We also put two small improvements in, see the full NEWS entry for this release:
0.7.10 2010-03-15
o new class Rcpp::S4 whose constructor checks if the object is an S4 object
o maximum number of templated arguments to the pairlist function,
the DottedPair constructor, the Language constructor and the
Pairlist constructor has been updated to 20 (was 5) and a script has been
added to the source tree should we want to change it again
o use shQuote() to protect Windows path names (which may contain spaces)
As always, even fuller details are in the Rcpp Changelog page and the Rcpp page which also leads to the downloads, the browseable doxygen docs and zip files of doxygen output for the standard formats. A local directory has source and documentation too. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.
So a quick bug-fix release 0.7.9 is now in Debian and should be on CRAN shortly.
The full NEWS entry for this release follows:
0.7.9 2010-03-12
o Another small improvement to Windows build flags
o bugfix on 64 bit platforms. The traits classes (wrap_type_traits, etc)
used size_t when they needed to actually use unsigned int
o fixed pre gcc 4.3 compatibility. The trait class that was used to
identify if a type is convertible to another had too many false positives
on pre gcc 4.3 (no tr1 or c++0x features). fixed by implementing the
section 2.7 of "Modern C++ Design" book.
As always, even fuller details are in the Rcpp Changelog page and the Rcpp page which also leads to the downloads, the browseable doxygen docs and zip files of doxygen output for the standard formats. A local directory has source and documentation too. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.
Update: First version number corrected to 0.7.8.
Romain and I already had an example of a simple but fast linear model fit using the (very clever) Armadillo C++ library by Conrad Sanderson. In fact, I had used this as a motivational example of why Rcpp rocks in a recent talk to the ACM chapter at U of Chicago which, thanks to David Smith at REvo, got some further exposure.
Now this example is more refined as further glue got added. Given that both Armadillo and Rcpp make use of C++ templates, the actual amount of code in RcppArmadillo is not that large: just over 200 lines in a header file, and a little less for some testing accessor and example functions in a source file. And this makes for some really nice example code: the 'fast regression' example becomes this (where I simply removed two blocks that were conditional on the Armadillo version):
#include <RcppArmadillo.h>

extern "C" SEXP fastLm(SEXP ys, SEXP Xs) {

    Rcpp::NumericVector yr(ys);                 // creates Rcpp vector from SEXP
    Rcpp::NumericMatrix Xr(Xs);                 // creates Rcpp matrix from SEXP
    int n = Xr.nrow(), k = Xr.ncol();

    arma::mat X(Xr.begin(), n, k, false);       // reuses memory and avoids extra copy
    arma::colvec y(yr.begin(), yr.size(), false);

    arma::colvec coef = solve(X, y);            // fit model y ~ X

    arma::colvec resid = y - X*coef;            // residuals

    double sig2 = arma::as_scalar( trans(resid)*resid/(n-k) );
                                                // std.error of estimate
    arma::colvec stderrest = sqrt( sig2 * diagvec( arma::inv(arma::trans(X)*X)) );

    Rcpp::Pairlist res(Rcpp::Named( "coefficients", coef),
                       Rcpp::Named( "stderr", stderrest));
    return res;
}
No extra copies! Armadillo instantiates directly from the underlying R objects for the vector and matrix, solves the regression equations, computes the standard error of the estimates and returns the two vectors. Leaving us to write about eleven lines of code. Moreover, as Armadillo is well designed and uses template meta-programming to avoid extra copies (see these lecture notes for details), it is about as efficient as it can be (and will use Atlas or other BLAS where available).
And, this is just one example. Rcpp should be suitable for other C++ libraries, and provides an easy to use seamless interface between C++ and R.
However, we should note that (at about the last minute) we found out about some unit test failures in OS X as well as some issues in a Debian chroot -- cran2deb ran into some build issues on i386 and amd64 in the testing chroot even though this all works swimmingly on our Debian, Ubuntu and Fedora build environments. A follow-up with fixes for Rcpp and/or RcppArmadillo appears likely.
Update: The build issues seem to be with 64-bit systems and everything appears cool in 32-bit.
As mentioned in the post about release 0.7.8 of Rcpp, Romain and I carved this out of Rcpp itself to provide a cleaner separation of code that implements our R / C++ interfaces (which remain in Rcpp) and code that illustrates how to use it --- which is now in RcppExamples. This also provides an easier template for people wanting to use Rcpp in their packages as it will be easier to wrap one's head around the much smaller RcppExamples package.
A simple example (using the newer API) may illustrate this:
#include <Rcpp.h>

RcppExport SEXP newRcppVectorExample(SEXP vector) {

    Rcpp::NumericVector orig(vector);           // keep a copy (as the classic version does)
    Rcpp::NumericVector vec(orig.size());       // create a target vector of the same size

    // we could query size via
    //   int n = vec.size();
    // and loop over the vector, but using the STL is so much nicer
    // so we use a STL transform() algorithm on each element
    std::transform(orig.begin(), orig.end(), vec.begin(), sqrt);

    Rcpp::Pairlist res(Rcpp::Named( "result", vec),
                       Rcpp::Named( "original", orig));

    return res;
}
With essentially five lines of code, we provide a function that takes any numeric vector and returns both the original vector and a transformed version---here by applying a square root operation. Even the looping along the vector is implicit thanks to the generic programming idioms of the Standard Template Library.
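The implicit loop is a property of std::transform itself and does not depend on Rcpp. As a rough standalone sketch of the same idiom, using std::vector in place of Rcpp::NumericVector and a hypothetical function name sqrt_all (not part of RcppExamples):

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Hypothetical stand-in for the Rcpp example: returns the element-wise
// square roots of the input and leaves the input itself untouched.
std::vector<double> sqrt_all(const std::vector<double>& orig) {
    std::vector<double> vec(orig.size());  // target vector of the same size

    // No explicit loop: transform() applies the function to each element.
    // A lambda is used because the name std::sqrt is overloaded.
    std::transform(orig.begin(), orig.end(), vec.begin(),
                   [](double x) { return std::sqrt(x); });
    return vec;
}
```

Passing in the values 1, 4 and 9 gives back 1, 2 and 3.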
Nicer still, even on misuse, exceptions get caught cleanly and we get returned to the R prompt without any explicit coding on the part of the user:
R> library(RcppExamples)
Loading required package: Rcpp
R> print(RcppVectorExample( 1:5, "new" )) # select new API
$result
[1] 1.000 1.414 1.732 2.000 2.236

$original
[1] 1 2 3 4 5

R> RcppVectorExample( c("foo", "bar"), "new" )
Error in RcppVectorExample(c("foo", "bar"), "new") :
  not compatible with INTSXP
R>
There is also analogous code for the older API in the package, but it is about three times as long, has to loop over the vector and needs to set up the exception handling explicitly.
As of right now, RcppExamples does not document every class but it should already provide a fairly decent start for using Rcpp. And many more actual usage examples are ... in the over two-hundred unit tests in Rcpp.
Update: Now actually showing new rather than classic API.
This is a minor feature release based on over three weeks of changes that are summarised below in the extract from the NEWS file. Some noteworthy highlights are
The full NEWS entry for this release follows:
0.7.8 2010-03-09
o All vector classes are now generated from the same template class
Rcpp::Vector<RTYPE> where RTYPE is one of LGLSXP, RAWSXP, STRSXP,
INTSXP, REALSXP, CPLXSXP, VECSXP and EXPRSXP. typedefs are still
available : IntegerVector, ... All vector classes gain methods
inspired from the std::vector template : push_back, push_front,
erase, insert
o New template class Rcpp::Matrix<RTYPE> deriving from
Rcpp::Vector<RTYPE>. These classes have the same functionality
as Vector but have a different set of constructors which checks
that the input SEXP is a matrix. Matrix<> however does/can not
guarantee that the object will always be a matrix. typedefs
are defined for convenience: Matrix<INTSXP> is IntegerMatrix, etc...
o New class Rcpp::Row<RTYPE> that represents a row of a matrix
of the same type. Row contains a reference to the underlying
Vector and exposes a nested iterator type that allows use of
STL algorithms on each element of a matrix row. The Vector class
gains a row(int) method that returns a Row instance. Usage
examples are available in the runit.Row.R unit test file
o New class Rcpp::Column<RTYPE> that represents a column of a
matrix. (similar to Rcpp::Row<RTYPE>). Usage examples are
available in the runit.Column.R unit test file
o The Rcpp::as template function has been reworked to be more
generic. It now handles more STL containers, such as deque and
list, and the genericity can be used to implement as for more
types. The package RcppArmadillo has examples of this
o new template class Rcpp::fixed_call that can be used in STL algorithms
such as std::generate.
o RcppExample et al have been moved to a new package RcppExamples;
src/Makevars and src/Makevars.win simplified accordingly
o New class Rcpp::StringTransformer and helper function
Rcpp::make_string_transformer that can be used to create a function
that transforms a string character by character. For example
Rcpp::make_string_transformer(tolower) transforms each character
using tolower. The RcppExamples package has an example of this.
o Improved src/Makevars.win thanks to Brian Ripley
o New examples for 'fast lm' using compiled code:
- using GNU GSL and a C interface
- using Armadillo (http://arma.sf.net) and a C++ interface
Armadillo is seen as faster for lack of extra copying
o A new package RcppArmadillo (to be released shortly) now serves
as a concrete example on how to extend Rcpp to work with a modern
C++ library such as the heavily-templated Armadillo library
o Added a new vignette 'Rcpp-introduction' based on a just-submitted
overview article on Rcpp
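
The string transformer mentioned in this NEWS entry turns a per-character function such as tolower into a function that works on whole strings. The sketch below shows that idea in plain C++ only. CharTransformer, make_char_transformer and to_lower are hypothetical names chosen for illustration, and nothing here is taken from the actual Rcpp::StringTransformer implementation.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical sketch, not the Rcpp::StringTransformer implementation:
// wraps a per-character function and applies it to every character.
template <typename UnaryOp>
class CharTransformer {
public:
    explicit CharTransformer(UnaryOp op) : op_(op) {}

    // Takes the string by value, transforms the copy in place, returns it.
    std::string operator()(std::string s) const {
        std::transform(s.begin(), s.end(), s.begin(),
                       [this](unsigned char c) {
                           return static_cast<char>(op_(c));
                       });
        return s;
    }

private:
    UnaryOp op_;
};

// Helper that deduces the type of the wrapped function.
template <typename UnaryOp>
CharTransformer<UnaryOp> make_char_transformer(UnaryOp op) {
    return CharTransformer<UnaryOp>(op);
}

// Lower-casing transformer built from the C library's tolower; the lambda
// sidesteps the overload ambiguity of passing std::tolower directly.
inline std::string to_lower(const std::string& s) {
    return make_char_transformer([](int c) { return std::tolower(c); })(s);
}
```

With this in place, to_lower("Hello, World") gives "hello, world".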
As always, even fuller details are in the ChangeLog on the Rcpp page which also leads to the downloads, the browseable doxygen docs and zip files of doxygen output for the standard formats. A local directory has source and documentation too. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page
Update: Two links corrected.
But what everybody seems to be forgetting is that R has had a Sudoku solver for years, thanks to the sudoku package by David Brahm and Greg Snow which was first posted four years ago. What comes around, goes around.
With that, and about one minute of Emacs editing to get the Le Monde puzzle into the required ascii-art form, all we need to do is this:
R> library(sudoku)
R> s <- readSudoku("/tmp/sudoku.txt")
R> s
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,]    8    0    0    0    0    1    2    0    0
[2,]    0    7    5    0    0    0    0    0    0
[3,]    0    0    0    0    5    0    0    6    4
[4,]    0    0    7    0    0    0    0    0    6
[5,]    9    0    0    7    0    0    0    0    0
[6,]    5    2    0    0    0    9    0    4    7
[7,]    2    3    1    0    0    0    0    0    0
[8,]    0    0    6    0    2    0    1    0    9
[9,]    0    0    0    0    0    0    0    0    0
R> system.time(solveSudoku(s))
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,]    8    4    9    6    7    1    2    5    3
[2,]    6    7    5    2    4    3    9    1    8
[3,]    3    1    2    9    5    8    7    6    4
[4,]    1    8    7    4    3    2    5    9    6
[5,]    9    6    4    7    8    5    3    2    1
[6,]    5    2    3    1    6    9    8    4    7
[7,]    2    3    1    8    9    4    6    7    5
[8,]    4    5    6    3    2    7    1    8    9
[9,]    7    9    8    5    1    6    4    3    2
   user  system elapsed
  5.288   0.004   5.951
R>
That took all of five seconds while my computer was also compiling a particularly resource-hungry C++ package....
Just in case we needed another illustration that it is hard to navigate the riches and wonders that is CRAN...
Language class had a real bug leading to this new release
just two days after
0.7.6.
0.7.7 2010-02-14
o new template classes Rcpp::unary_call and Rcpp::binary_call
that facilitate using R language calls together
with STL algorithms.
o fixed a bug in Language constructors taking a string as their
first argument. The created call was wrong.
As always, even fuller details are in the ChangeLog on the Rcpp page which also leads to the downloads, the browseable doxygen docs and zip files of doxygen output for the standard formats. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page
Makefile.win which is now fixed.
A few other things sneaked in while we were at it; see the snippet from the NEWS file below or look at Romain's blog where he highlights name-based indexing in vectors and the addition of iterator as well as begin() and end() members that now allow the use of STL algorithms on our R objects, which is nifty.
The changes are summarised below in the NEWS file snippet; more details are in the ChangeLog as well.
0.7.6 2010-02-12
o SEXP_Vector (and ExpressionVector and GenericVector, a.k.a List) now
have methods push_front, push_back and insert that are templated
o SEXP_Vector now has int- and range-valued erase() members
o Environment class has a default constructor (for RInside)
o SEXP_Vector_Base factored out of SEXP_Vector (Effect. C++ #44)
o SEXP_Vector_Base::iterator added as well as begin() and end()
so that STL algorithms can be applied to Rcpp objects
o CharacterVector gains a random access iterator, begin() and end() to
support STL algorithms; iterator dereferences to a StringProxy
o Restore Windows build; successfully tested on 32 and 64 bit
o Small fixes to inst/skeleton files for bootstrapping a package
o RObject::asFoo deprecated in favour of Rcpp::as<Foo>
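Why begin() and end() are enough to unlock STL algorithms can be shown with a small self-contained sketch. NumericView is a hypothetical stand-in for a vector class that wraps memory owned elsewhere (much as an Rcpp vector wraps an R object); it is not Rcpp code, and the helper names are made up for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical view over a buffer of doubles that it does not own.
class NumericView {
public:
    typedef double* iterator;

    NumericView(double* data, std::size_t n) : data_(data), n_(n) {
        assert(data != nullptr || n == 0);
    }
    iterator begin() { return data_; }       // first element
    iterator end() { return data_ + n_; }    // one past the last element
    std::size_t size() const { return n_; }
private:
    double* data_;
    std::size_t n_;
};

// With begin()/end() in place, standard algorithms work unchanged.
inline double sum(NumericView v) {
    return std::accumulate(v.begin(), v.end(), 0.0);
}

inline void sort_in_place(NumericView v) {
    std::sort(v.begin(), v.end());
}
```

Because the view only hands out pointers into the underlying buffer, sort_in_place reorders the original data rather than a copy.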
As always, even fuller details are in the ChangeLog on the Rcpp page which also leads to the downloads, the browseable doxygen docs and zip files of doxygen output for the standard formats. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page
The changes are summarised below in the NEWS file snippet; more details are in the ChangeLog as well.
0.7.5 2010-02-08
o wrap has been much improved. wrappable types now are :
- primitive types : int, double, Rbyte, Rcomplex, float, bool
- std::string
- STL containers which have iterators over wrappable types:
(e.g. std::vector, std::deque, std::list, etc ...).
- STL maps keyed by std::string, e.g. std::map<std::string,T>
- classes that have implicit conversion to SEXP
- classes for which the wrap template is fully or partly specialized
This allows composition, so for example this class is wrappable:
std::vector< std::map<std::string,T> > (if T is wrappable)
o The range based version of wrap is now exposed at the Rcpp::
level with the following interface :
Rcpp::wrap( InputIterator first, InputIterator last )
This is dispatched internally to the most appropriate implementation
using traits
o a new namespace Rcpp::traits has been added to host the various
type traits used by wrap
o The doxygen documentation now shows the examples
o A new file inst/THANKS acknowledges the kind help we got from others
o The RcppSexp has been removed from the library.
o The methods RObject::asFoo are deprecated and will be removed
in the next version. The alternative is to use as<Foo>.
o The method RObject::slot can now be used to get or set the
associated slot. This is one more example of the proxy pattern
o Rcpp::VectorBase gains a names() method that allows getting/setting
the names of a vector. This is yet another example of the
proxy pattern.
o Rcpp::DottedPair gains templated operator<< and operator>> that
allow wrap and push_back or wrap and push_front of an object
o Rcpp::DottedPair, Rcpp::Language, Rcpp::Pairlist are less
dependent on C++0x features. They gain constructors with up
to 5 templated arguments. 5 was chosen arbitrarily and might
be updated upon request.
o function calls by the Rcpp::Function class are less dependent
on C++0x. It is now possible to call a function with up to
5 templated arguments (candidate for implicit wrap)
o added support for 64-bit Windows (thanks to Brian Ripley and Uwe Ligges)
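The proxy pattern mentioned for names() and slot() can be sketched in plain C++. The classes below are hypothetical stand-ins, not the Rcpp implementation: names() returns a small proxy object, assignment to the proxy acts as the setter, and conversion of the proxy acts as the getter.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical vector with optional element names.
class NamedVector {
private:
    std::vector<double> values_;
    std::vector<std::string> names_;

public:
    explicit NamedVector(std::size_t n) : values_(n, 0.0), names_(n) {}

    // Proxy returned by names(): it only remembers which vector it refers to.
    class NamesProxy {
    public:
        explicit NamesProxy(NamedVector& parent) : parent_(parent) {}

        // Setter: v.names() = some_names;
        NamesProxy& operator=(const std::vector<std::string>& rhs) {
            assert(rhs.size() == parent_.values_.size());
            parent_.names_ = rhs;
            return *this;
        }

        // Getter: std::vector<std::string> n = v.names();
        operator std::vector<std::string>() const { return parent_.names_; }

    private:
        NamedVector& parent_;
    };

    NamesProxy names() { return NamesProxy(*this); }
    std::size_t size() const { return values_.size(); }
};
```

A single member function thus serves both directions, which is the convenience the NEWS entries above describe.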
As always, even fuller details are in the ChangeLog on the Rcpp page which also leads to the downloads, the browseable doxygen docs and zip files of doxygen output for the standard formats. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page
Now open for registrations:
R / Finance 2010: Applied Finance with R
April 16 and 17, 2010
Chicago, IL, USA
The second annual R / Finance conference for applied finance using R, the premier free software system for statistical computation and graphics, will be held this spring in Chicago, IL, USA on Friday April 16 and Saturday April 17.
Building on the success of the inaugural R / Finance 2009 event, this two-day conference will cover topics as diverse as portfolio theory, time-series analysis, advanced risk tools, high-performance computing, and econometrics. All will be discussed within the context of using R as a primary tool for financial risk management and trading.
Invited keynote presentations by Bernhard Pfaff, Ralph Vince, Marc Wildi and Achim Zeileis are complemented by over twenty talks (both full-length and 'lightning') selected from the submissions. Four optional tutorials are also offered on Friday April 16.
R / Finance 2010 is organized by a local group of R package authors and community contributors, and hosted by the International Center for Futures and Derivatives (ICFD) at the University of Illinois at Chicago.
Conference registration is now open. Special advanced registration pricing is available, as well as discounted pricing for academic and student registrations.
More details and registration information can be found at the website at
http://www.RinFinance.com
For the program committee:
Gib Bassett, Peter Carl, Dirk Eddelbuettel, John Miller,
Brian Peterson, Dale Rosenthal, Jeffrey Ryan
See you in Chicago in April!
RProtoBuf had a funny start. I had blogged about the 12-hour passage from proof of concept to R-Forge project following the ORD session hackfest in October. What happened next was just as good. Romain emailed within hours of the blog post and reminded me of a similar project that is part of Saptarshi Guha's RHIPE R/Hadoop implementation. So the three of us (Romain, Saptarshi and I) started emailing, and before long it became clear that Romain was both rather intrigued by this (whereas Saptarshi has slightly different needs for the inner workings of his Hadoop bindings) and able to devote some time to it. So the code kept growing and growing at a fairly rapid clip. That stopped when we switched to working feverishly on Rcpp, both to support the needs of this project and to implement ideas we had while working on it. That has now led to the point where Rcpp is maturing in terms of features, so we will probably have time to come back to more work on RProtoBuf and take advantage of the nice templated autoconversions we now have in Rcpp. Oddly enough, the initial blog post seemed to anticipate changes in Rcpp.
Anyway --
RProtoBuf
is finally here and it already does a fair amount of magic based on code reflection
using the proto files. The Google documentation has a simple
example of a 'person' entry in an 'addressbook' which, when translated to R,
goes like this:
R> library( RProtoBuf )                      ## load the package
R> readProtoFiles( "addressbook.proto" )     ## acquire protobuf information
R> bob <- new( tutorial.Person,              ## create new object
+    email = "bob@example.com",
+    name = "Bob",
+    id = 123 )
R> writeLines( bob$toString() )              ## serialize to stdout
name: "Bob"
id: 123
email: "bob@example.com"
R> bob$email                                 ## access and/or override
[1] "bob@example.com"
R> bob$id <- 5
R> bob$id
[1] 5
R> serialize( bob, "person.pb" )             ## serialize to compact binary format
There is more information at the RProtoBuf page, and we already have a draft package vignette, a 'quick' overview vignette and a unit test summary vignette.
More changes should be forthcoming as Romain and I find time to code them up. Feedback is as always welcome.
The release once again combines a number of necessary fixes with numerous new features:
Lastly, we had a remaining Windows build issue. Also, Brian Ripley and Uwe Ligges kindly sent us a small patch supporting the new Windows 64-bit builds using the new MinGW 64-bit compiler for Windows -- so release 0.7.5 may follow in due course.
The NEWS file entry for release 0.7.4 is as follows:
0.7.4 2010-01-30
o matrix-like indexing using operator() for all vector
types : IntegerVector, NumericVector, RawVector, CharacterVector
LogicalVector, GenericVector and ExpressionVector.
o new class Rcpp::Dimension to support creation of vectors with
dimensions. All vector classes gain a constructor taking a
Dimension reference.
o an intermediate template class "SimpleVector" has been added. All
simple vector classes are now generated from the SimpleVector
template : IntegerVector, NumericVector, RawVector, CharacterVector
LogicalVector.
o an intermediate template class "SEXP_Vector" has been added to
generate GenericVector and ExpressionVector.
o the clone template function was introduced to explicitly
clone an RObject by duplicating the SEXP it encapsulates.
o even smarter wrap programming using traits and template
meta-programming using a private header to be included only by
RcppCommon.h
o the as template is now smarter. The template now attempts to
build an object of the requested template parameter T by using the
constructor for the type taking a SEXP. This allows third party code
to create a class Foo with a constructor Foo(SEXP) to have
as<Foo> for free.
o wrap becomes a template. For an object of type T, wrap uses
implicit conversion to SEXP to first convert the object to a SEXP
and then uses the wrap(SEXP) function. This allows third party
code creating a class Bar with an operator SEXP() to have
wrap for free.
o all specializations of wrap : wrap, wrap< vector >
use coercion to deal with missing values (NA) appropriately.
o configure has been withdrawn. C++0x features can now be activated
by setting the RCPP_CXX0X environment variable to "yes".
o new template r_cast to facilitate conversion of one SEXP
type to another. This is mostly intended for internal use and
is used on all vector classes
o Environment now takes advantage of the augmented smartness
of as and wrap templates. If as<Foo> makes sense, one can
directly extract a Foo from the environment. If wrap<Bar> makes
sense then one can insert a Bar directly into the environment.
Foo foo = env["x"] ; /* as<Foo> is used */
Bar bar ;
env["y"] = bar ; /* wrap<Bar> is used */
o Environment::assign becomes a template and also uses wrap to
create a suitable SEXP
o Many more unit tests for the new features; also added unit tests
for older API
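The "as for free" and "wrap for free" entries above rely on two ordinary C++ mechanisms: a converting constructor and a conversion operator. The sketch below shows the mechanics only. FakeSexp is a hypothetical stand-in for R's SEXP, and the as and wrap templates here are simplified illustrations, not the Rcpp implementations.

```cpp
#include <cassert>

// Hypothetical stand-in for SEXP: just carries one number.
struct FakeSexp {
    double payload;
};

// Simplified as<T>: builds a T through its constructor taking a FakeSexp.
template <typename T>
T as(FakeSexp x) {
    return T(x);
}

// Simplified wrap: relies on T's implicit conversion to FakeSexp.
template <typename T>
FakeSexp wrap(const T& obj) {
    return obj;
}

// Third-party class that gets as<Foo> "for free" via Foo(FakeSexp).
class Foo {
public:
    explicit Foo(FakeSexp x) : value_(x.payload) {}
    double value() const { return value_; }
private:
    double value_;
};

// Third-party class that gets wrap "for free" via operator FakeSexp().
class Bar {
public:
    explicit Bar(double v) : value_(v) {}
    operator FakeSexp() const {
        FakeSexp s = {value_};
        return s;
    }
private:
    double value_;
};
```

Neither Foo nor Bar mentions as or wrap; supplying the constructor or the conversion operator is all that is needed for the generic templates to pick them up.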
As always, even fuller details are in the ChangeLog on the Rcpp page which also leads to the downloads, the browseable doxygen docs and zip files of doxygen output for the standard formats. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page
This release combines a number of under-the-hood fixes and enhancements with one bug fix:
The Rcpp:::LdFlags() helper function to dynamically provide linker options for packages using Rcpp now defaults to static linking on OS X as well. For installation from source dynamic linking always worked, but not for binary installation (as e.g. from CRAN). As on the other platforms, this default can be overridden. Thanks to the phylobase team for patient help in tracking this down.
[] should now be faster due to some enhancements in the internal representations.
configure now has a command-line option (as well as an environment variable) to select support for the draft of the upcoming C++0x standard.
Rcpp.package.skeleton(), modelled after package.skeleton() in R itself, helps to set up a new package with support for using Rcpp.
As always, full details are in the ChangeLog on the Rcpp page which also leads to the downloads, the browseable doxygen docs and zip files of doxygen output for the standard formats. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page
This version brings a few cleanups due to minor
Rcpp changes (in
essence: we now define the macro R_NO_REMAP before including R's
headers and this separates non-namespaced functions like error()
or length() out into prefixed versions Rf_error()
and Rf_length() which is a good thing).
It also adds a number of calendaring and holiday utilities that Khanh just added: tests for weekend, holiday, endOfMonth as well as dayCount, date advancement and year fraction functions commonly used in fixed income.
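For readers unfamiliar with these utilities, here is a rough self-contained sketch of what a weekend test and a year fraction compute. This is not RQuantLib or QuantLib code: the Date struct and all function names are hypothetical, and the two conventions shown (Actual/360 and a simplified 30/360 bond basis) are picked purely as common examples.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical plain calendar date used only in this sketch.
struct Date {
    int year;
    unsigned month;  // 1..12
    unsigned day;    // 1..31
};

// Days since 1970-01-01 in the proleptic Gregorian calendar
// (well-known civil-date arithmetic with years starting in March).
inline long days_from_civil(Date dt) {
    assert(dt.month >= 1 && dt.month <= 12);
    const int y = dt.year - (dt.month <= 2 ? 1 : 0);
    const long era = (y >= 0 ? y : y - 399) / 400;
    const unsigned yoe = static_cast<unsigned>(y - era * 400);       // year of era
    const unsigned mp = dt.month > 2 ? dt.month - 3 : dt.month + 9;  // March = 0
    const unsigned doy = (153 * mp + 2) / 5 + dt.day - 1;            // day of year
    const unsigned doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;      // day of era
    return era * 146097 + static_cast<long>(doe) - 719468;
}

// True for Saturday and Sunday; 1970-01-01 was a Thursday.
inline bool is_weekend(Date dt) {
    const long wd = ((days_from_civil(dt) + 4) % 7 + 7) % 7;  // 0 = Sunday
    return wd == 0 || wd == 6;
}

// Actual/360: actual calendar days divided by 360.
inline double year_fraction_act_360(Date start, Date end) {
    return (days_from_civil(end) - days_from_civil(start)) / 360.0;
}

// Simplified 30/360 bond basis: every month counts as 30 days.
inline double year_fraction_30_360(Date start, Date end) {
    const int d1 = std::min(static_cast<int>(start.day), 30);
    int d2 = static_cast<int>(end.day);
    if (d1 == 30) d2 = std::min(d2, 30);
    const int days = 360 * (end.year - start.year)
                   + 30 * (static_cast<int>(end.month) - static_cast<int>(start.month))
                   + (d2 - d1);
    return days / 360.0;
}
```

For the half-year from 15 January 2010 to 15 July 2010 the two conventions differ slightly: Actual/360 counts 181 actual days, while 30/360 counts exactly 180 and so yields 0.5.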
Full changelog details, examples and more details about this package are at my RQuantLib page.
A lot of the momentum for the new API is continuing, thanks in large part to Romain. A number of new classes have been added, and existing ones have been enhanced. There are more unit tests than ever, and more documentation. We have better build support (with g++ version detection so that we can add some C++0x support where available) and a new examples sub-directory.
We did take one toy away, though. The Doxygen-generated docs were getting so big that we decided to keep them out of the source tarball. (And arguably, they are also too volatile.) We still have the browseable html docs as well as the pdf version (now at over 300 pages!). And we added zip archives of the docs in html, latex, and man format for download.
As always, full details are in the ChangeLog on the Rcpp page. Questions, comments etc: bring them to the rcpp-devel mailing list off the R-Forge page
This is a maintenance release building on the recent
0.2.0 release
which added Windows support (provided you use the Rtools toolchain for
Windows). In this release, we changed the startup initialization so that
interactive() comes out FALSE (just as we had done for
littler just yesterday)
and with that no longer call Rf_KillAllDevices() from the destructor as we may not have
had devices in the first place. A few minor things were tweaked around the
code organisation and build process, see the ChangeLog for details.
The new release should hit CRAN mirrors tomorrow, and is (as always) available from my machine too.
littler provides r
(pronounced littler), a shebang / scripting / quick eval / pipelining
front-end to the R language and system.
This version adds a few minor behind-the-scenes improvements:
interactive() now evaluates to false as you'd expect in a
non-interactive scripting front-end. To restore the previous behaviour,
new switches -i or --interactive have been added.
install.r and update.r
received an update based on lessons learned from the R 2.10.0 roll-out and
package rebuilding.
As usual, our code is available via our svn archive or from tarballs off my littler page and the local directory here. A fresh package is in Debian's incoming queue and will hit mirrors shortly.
A lot has changed under the hood since 0.7.0, and this is the first release that really reflects many of Romain's additions. Some of the changes are
Rcpp::RObject that replaces RcppSexp (which is still provided for compatibility); it provides basic R object handling and other new classes derive from it.
Rcpp::RObject has real simple wrappers for object creation and a SEXP operator for transfer back to R that make simple interfaces even easier.
Rcpp::Evaluator and Rcpp::Environment for expression evaluation and R environment access, respectively.
Rcpp::XPtr for external pointer access and management.
RUnit package, and several new examples.
inline (>= 0.3.4) as our patch is now part of the current inline package as mentioned here.