Fri, 31 Dec 2010

R / Finance 2011 Call for Papers: Updated and expanded

One week ago, I sent the updated announcement below to the r-sig-finance list; this was kindly blogged about by fellow committee member Josh and by our pal Dave @ REvo. By now, I have also updated the R / Finance conference website. So to round things off, a quick post here is in order as well. It may even get a few of the esteemed readers to make a New Year's resolution about submitting a paper :)

Dear R / Finance community,

The preparations for R/Finance 2011 are progressing, and due to favourable responses from the different sponsors we contacted, we are now able to offer

  1. a competition for best paper, which given the focus of the conference will award both an 'academic' paper and an 'industry' paper
  2. travel grants for up to two graduate students, provided suitable papers are accepted for presentation

More details are below in the updated Call for Papers. Please feel free to re-circulate this Call for Papers with colleagues, students and other associations.

Cheers, and Season's Greetings,

Dirk (on behalf of the organizing / program committee)

Call for Papers:

R/Finance 2011: Applied Finance with R

April 29 and 30, 2011
Chicago, IL, USA

The third annual R/Finance conference for applied finance using R will be held this spring in Chicago, IL, USA on April 29 and 30, 2011. The two-day conference will cover topics including portfolio management, time series analysis, advanced risk tools, high-performance computing, market microstructure and econometrics. All will be discussed within the context of using R as a primary tool for financial risk management, portfolio construction, and trading.

Complete papers or one-page abstracts (in txt or pdf format) are invited to be submitted for consideration. Academic and practitioner proposals related to R are encouraged. We welcome submissions for full talks, abbreviated lightning talks, and for a limited number of pre-conference (longer) seminar sessions.

Presenters are strongly encouraged to provide working R code to accompany the presentation/paper. Data sets should also be made public for the purposes of reproducibility (though we realize this may be limited due to contracts with data vendors). Preference may be given to presenters who have released R packages.

The conference will award two $1000 prizes for best paper: one for best practitioner-oriented paper and one for best academic-oriented paper. Further, to defray costs for graduate students, two travel and expense grants of up to $500 each will be awarded to graduate students whose papers are accepted. To be eligible, a submission must be a full paper; extended abstracts are not eligible.

Please send submissions to: committee at RinFinance.com

The submission deadline is February 15th, 2011. Early submissions may receive early acceptance and scheduling. The graduate student grant winners will be notified by February 23rd, 2011.

Submissions will be evaluated and submitters notified via email on a rolling basis. Determination of whether a presentation will be a long presentation or a lightning talk will be made once the full list of presenters is known.

R/Finance 2009 and 2010 included attendees from around the world and featured keynote presentations from prominent academics and practitioners. The names and presentations of the 2009-2010 presenters are online at the conference website. We anticipate another exciting line-up for 2011---including keynote presentations from John Bollinger, Mebane Faber, Stefano Iacus, and Louis Kates. Additional details will be announced via the conference website as they become available.

For the program committee:

Gib Bassett, Peter Carl, Dirk Eddelbuettel, Brian Peterson,
Dale Rosenthal, Jeffrey Ryan, Joshua Ulrich

/computers/R | permanent link

Sat, 25 Dec 2010

Rcpp 0.9.0 announcement

The text below went out as a post to the r-packages list a few days ago, but I thought it would make sense to post it on the blog too. So with a little html markup...

Summary

Version 0.9.0 of the Rcpp package is now on CRAN and its mirrors. This release marks another step in the development of the package, and a few key points are highlighted below. More details are in the NEWS and ChangeLog files included in the package.

Overview

Rcpp is an R package and associated C++ library that facilitates integration of C++ code in R packages. The package features a complete set of C++ classes (Rcpp::IntegerVector, Rcpp::NumericVector, Rcpp::Function, Rcpp::Environment, ...) that makes it easier to manipulate R objects of matching types (integer vectors, functions, environments, etc ...). Rcpp takes advantage of C++ language features such as the explicit constructor / destructor lifecycle of objects to manage garbage collection automatically and transparently. We believe this is a major improvement over use of PROTECT / UNPROTECT. When an Rcpp object is created, it protects the underlying SEXP so that the garbage collector does not attempt to reclaim the memory. This protection is withdrawn when the object goes out of scope. Moreover, users generally do not need to manage memory directly (via calls to new / delete or malloc / free) as this is done by the Rcpp classes or the corresponding STL containers. A few key points about Rcpp:
  • a rich API covering all core R data types including vectors, matrices, functions, environments, ... (with the exception of factors which are less useful in C++)
  • seamless (bi-directional) data interchange between R and C++
  • possibility of inline use permitting definition, compilation, linking and loading of C++ functions directly from R
  • extensive documentation now covering eight vignettes
  • exception handling and error propagation back to R
  • extensive test suite using RUnit covering over 700 tests
  • extension packages RcppArmadillo and RcppGSL provide easy-to-use integration with the Armadillo (linear algebra) and GNU GSL libraries
  • increasing adoption among R users and package developers with now twenty packages from CRAN or BioConductor depending on Rcpp
  • support for the legacy 'classic' Rcpp is now provided by the RcppClassic package which is being released concurrently with Rcpp 0.9.0
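
The scope-based protection described above can be illustrated without R at all. The following is a minimal sketch in plain C++, not Rcpp's actual implementation: `protect_count` and `ScopedProtect` are hypothetical names standing in for R's protection bookkeeping and for an Rcpp object, respectively.

```cpp
#include <cstddef>

// Hypothetical stand-in for R's protection bookkeeping (not the real R API).
std::size_t protect_count = 0;

// Hypothetical RAII guard: "protects" on construction, releases on destruction.
class ScopedProtect {
public:
    ScopedProtect() { ++protect_count; }
    ~ScopedProtect() { --protect_count; }
    // Copying is disabled so that each guard releases exactly once.
    ScopedProtect(const ScopedProtect&) = delete;
    ScopedProtect& operator=(const ScopedProtect&) = delete;
};

// Returns the number of live protections seen inside the scope.
std::size_t protected_inside_scope() {
    ScopedProtect a;
    ScopedProtect b;
    return protect_count;   // both guards are alive here
}                           // a and b are destroyed; the count drops back
```

Because the release happens in the destructor, it also runs when a scope is left early through a return or an exception, which is the part that is easy to get wrong with manually paired calls.
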
Several key features were added during the 0.8.* cycles and are described below.

Rcpp sugar

Rcpp now provides syntactic sugar: vectorised expressions at the C++ level which are motivated by the corresponding R expressions. This covers operators (binary arithmetic, binary logical, unary) and functions (producing single logical results, mathematical functions and d/p/q/r statistical functions). Examples range from ifelse() to pmin()/pmax(). A really simple example is a function
    SEXP foo( SEXP xx, SEXP yy){
        NumericVector x(xx), y(yy) ;
        return ifelse( x < y, x*x, -(y*y) ) ;
    }
which deploys the sugar 'ifelse' function modeled after the corresponding R function. Another simple example is
    double square( double x){
        return x*x ;
    }

    SEXP foo( SEXP xx ){
        NumericVector x(xx) ;
        return sapply( x, square ) ;
    }
where we use the sugar function 'sapply' to sweep a simple C++ function which operates elementwise across the supplied vector. The Rcpp-sugar vignette describes sugar in more detail.
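
For readers without an R toolchain at hand, here is a rough sketch of what these two sugar expressions compute, written against std::vector rather than Rcpp types. The function names `ifelse_squares` and `sapply_square` are made up for this illustration, and the first one assumes both inputs have the same length.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Elementwise analogue of ifelse(x < y, x*x, -(y*y)) for equal-length vectors.
std::vector<double> ifelse_squares(const std::vector<double>& x,
                                   const std::vector<double>& y) {
    std::vector<double> out(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        out[i] = (x[i] < y[i]) ? x[i] * x[i] : -(y[i] * y[i]);
    }
    return out;
}

double square(double x) { return x * x; }

// Analogue of sapply(x, square): apply a scalar function to every element.
std::vector<double> sapply_square(const std::vector<double>& x) {
    std::vector<double> out(x.size());
    std::transform(x.begin(), x.end(), out.begin(), square);
    return out;
}
```

The sugar versions express the same loops in a single line each, which is the point of the feature.
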

Rcpp modules

Rcpp modules are inspired by Boost.Python and make exposing C++ functions or classes to R even easier. A first illustration is provided by this simple C++ code snippet
    std::string hello( const std::string& who ){
        std::string result( "hello " ) ;
        result += who ;
        return result ;   // return by value: a c_str() pointer to a local would dangle
    }

    RCPP_MODULE(yada){
        using namespace Rcpp ;
        function( "hello", &hello ) ;
    }
which (after compiling and loading) we can access in R as
    yada <- Module( "yada" )
    yada$hello( "world" )
In a similar way, C++ classes can be exposed very easily. Rcpp modules are also described in more detail in their own vignette.
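
The idea of looking up a compiled function by name can be sketched in plain C++ as a small registry. This is only a conceptual analogue, not how Rcpp modules are implemented; the `Registry` class and its `add` and `call` members are hypothetical names for this illustration.

```cpp
#include <functional>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical registry mapping exported names to callables.
class Registry {
public:
    using Fn = std::function<std::string(const std::string&)>;

    void add(const std::string& name, Fn fn) { table_[name] = std::move(fn); }

    // Look the function up by name and call it; unknown names throw.
    std::string call(const std::string& name, const std::string& arg) const {
        auto it = table_.find(name);
        if (it == table_.end()) throw std::out_of_range("no such function: " + name);
        return it->second(arg);
    }

private:
    std::map<std::string, Fn> table_;
};

// Returns by value, so the caller never sees a pointer into a dead local.
std::string hello(const std::string& who) { return "hello " + who; }
```

Registering with `add("hello", hello)` and then invoking `call("hello", "world")` loosely mirrors the `yada$hello( "world" )` call on the R side.
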

Reference Classes

R release 2.12.0 introduced Reference Classes. These are formal S4 classes with the corresponding dispatch method, but passed by reference and easy to use. Reference Classes can also be exposed to R by using Rcpp modules.

Extension packages

The RcppArmadillo package permits use of the advanced C++ library Armadillo, a C++ linear algebra library aiming towards a good balance between speed and ease of use, providing integer, floating point and complex matrices and vectors with LAPACK / BLAS support via R. Armadillo uses templates to implement a delayed evaluation approach (at compile time) that combines several operations into one and reduces (or eliminates) the need for temporaries. Armadillo is useful if C++ has been decided as the language of choice, rather than another language like Matlab ® or Octave, and aims to be as expressive as the former. Via Rcpp and RcppArmadillo, R users now have easy access to this functionality. Examples are provided in the RcppArmadillo package.
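
The delayed evaluation idea is usually implemented with so-called expression templates. Below is a deliberately tiny sketch of the technique, assuming nothing about Armadillo's internals: `Vec` and `Sum` are hypothetical types, and only addition is supported. An expression such as a + b + c builds a lightweight object describing the sum, and the actual arithmetic happens in one loop when the result is constructed.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical node representing "lhs + rhs" without computing it yet.
template <typename L, typename R>
struct Sum {
    const L& lhs;
    const R& rhs;
    double operator[](std::size_t i) const { return lhs[i] + rhs[i]; }
    std::size_t size() const { return lhs.size(); }
};

// Hypothetical vector type; expressions are evaluated only on construction.
struct Vec {
    std::vector<double> data;
    explicit Vec(std::vector<double> d) : data(std::move(d)) {}

    // Evaluate a whole expression in a single loop, with no temporary vectors.
    template <typename L, typename R>
    Vec(const Sum<L, R>& e) : data(e.size()) {
        for (std::size_t i = 0; i < data.size(); ++i) data[i] = e[i];
    }

    double operator[](std::size_t i) const { return data[i]; }
    std::size_t size() const { return data.size(); }
};

// Adding two vectors only records the operands.
Sum<Vec, Vec> operator+(const Vec& a, const Vec& b) { return {a, b}; }

// Adding a vector to an existing expression nests the description.
template <typename L, typename R>
Sum<Sum<L, R>, Vec> operator+(const Sum<L, R>& a, const Vec& b) { return {a, b}; }
```

Note that the expression objects hold references, so in this sketch they must be consumed within the same statement that creates them, as in `Vec r = a + b + c;`.
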

The RcppGSL package permits easy use of the GNU Scientific Library (GSL), a collection of numerical routines for scientific computing. It is particularly useful for C and C++ programs as it provides a standard C interface to a wide range of mathematical routines such as special functions, permutations, combinations, fast Fourier transforms, eigensystems, random numbers, quadrature, random distributions, quasi-random sequences, Monte Carlo integration, N-tuples, differential equations, simulated annealing, numerical differentiation, interpolation, series acceleration, Chebyshev approximations, root-finding, discrete Hankel transforms, physical constants, basis splines and wavelets. There are over 1000 functions in total with an extensive test suite. The RcppGSL package provides an easy-to-use interface between GSL data structures and R using concepts from Rcpp. The RcppGSL package also contains a vignette with more documentation.

Legacy 'classic' API

Packages still using code interfacing the initial 'classic' Rcpp API are encouraged to migrate to the new API. Should a code transition not be possible, backwards compatibility is provided by the RcppClassic package released alongside Rcpp 0.9.0. By including RcppClassic.h and building against the RcppClassic package and library, vintage code can remain operational using the classic API. The short vignette in the RcppClassic package has more details.

Documentation

The package contains a total of eight vignettes, the first of which provides a short and succinct introduction to the Rcpp package along with several motivating examples.

Links

Support

Questions about Rcpp should be directed to the Rcpp-devel mailing list https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel

Dirk Eddelbuettel, Romain Francois, Doug Bates and John Chambers
December 2010

/code/rcpp | permanent link

Wed, 22 Dec 2010

RcppExamples 0.1.2

A new version of our RcppExamples package is now on CRAN.

RcppExamples contains a few illustrations of how to use Rcpp. It grew out of documentation for the classic API (now in its own package RcppClassic) and we added more functions documenting how to do the same with the new API we have been focusing on for the last year or so. One of the things I added in the last few days was the example below showing how to use Rcpp::List with lookups to replace use of the old and deprecated RcppParams. It also shows how to return values to R rather easily:

#include <Rcpp.h>

RcppExport SEXP newRcppParamsExample(SEXP params) {

    try {                                       // or use BEGIN_RCPP macro

        Rcpp::List rparam(params);              // Get parameters in params.
        std::string method   = Rcpp::as<std::string>(rparam["method"]);
        double tolerance     = Rcpp::as<double>(rparam["tolerance"]);
        int    maxIter       = Rcpp::as<int>(rparam["maxIter"]);
        Rcpp::Date startDate = Rcpp::Date(Rcpp::as<int>(rparam["startDate"])); // ctor from int
        
        Rprintf("\nIn C++, seeing the following value\n");
        Rprintf("Method argument    : %s\n", method.c_str());
        Rprintf("Tolerance argument : %f\n", tolerance);
        Rprintf("MaxIter argument   : %d\n", maxIter);
        Rprintf("Start date argument: %04d-%02d-%02d\n", 
                startDate.getYear(), startDate.getMonth(), startDate.getDay());

        return Rcpp::List::create(Rcpp::Named("method", method),
                                  Rcpp::Named("tolerance", tolerance),
                                  Rcpp::Named("maxIter", maxIter),
                                  Rcpp::Named("startDate", startDate),
                                  Rcpp::Named("params", params));  // or use rparam

    } catch( std::exception &ex ) {             // or use END_RCPP macro
        forward_exception_to_r( ex );
    } catch(...) { 
        ::Rf_error( "c++ exception (unknown reason)" ); 
    }
    return R_NilValue; // -Wall
}

The package is work-in-progress and needs way more general usage examples for Rcpp and particularly the new API. But it's a start.

A few more details are on the RcppExamples page.

/code/rcpp | permanent link

Mon, 20 Dec 2010

Rcpp 0.9.0 and RcppClassic 0.9.0

A new release 0.9.0 of Rcpp is now available at CRAN and has just been uploaded to Debian. As always, sources are also available from my local directory here.

With this release, the older API which we have been referring to as the classic Rcpp API has been split off into its own new package RcppClassic to ensure backwards compatibility. Rcpp will now contain only the new API.

We also fixed a number of minor bugs and applied a few contributed patches which extended functionality or documentation as detailed below in the NEWS entry:

0.9.0   2010-12-19

    o   The classic API was factored out into its own package RcppClassic which
        is released concurrently with this version.

    o   If an object is created but not initialized, attempting to use
        it now gives a more sensible error message (by forwarding an
        Rcpp::not_initialized exception to R).

    o   SubMatrix fixed, and Matrix types now have a nested ::Sub typedef.
  
    o   New unexported function SHLIB() to aid in creating a shared library on
        the command-line or in Makefile (similar to CxxFlags() / LdFlags()).

    o   Module gets a seven-argument ctor thanks to a patch from Tama Ma.

    o   The (still incomplete) QuickRef vignette has grown thanks to a patch
        by Christian Gunning.

    o   Added a sprintf template intended for logging and error messages.

    o   Date::getYear() corrected (where addition of 1900 was not called for); 
        corresponding change in constructor from three ints made as well.

    o   Date() and Datetime() constructors from string received a missing
        conversion to int and double following strptime. The default format
        string for the Datetime() strptime call was also corrected.

    o   A few minor fixes throughout, see ChangeLog.
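
The Date::getYear() entry above mentions an addition of 1900. The NEWS entry does not say where the offset came from, but it is the convention used by the C library's struct tm, so the following sketch assumes that is the relevant background. It converts a day count since 1970-01-01 (the way R stores Date values) into year, month and day; `CivilDate` and `civil_from_days` are hypothetical names and not part of Rcpp.

```cpp
#include <ctime>

// Hypothetical plain-data calendar date (not Rcpp::Date).
struct CivilDate { int year, month, day; };

// Convert days since 1970-01-01 to a calendar date in UTC.
CivilDate civil_from_days(long days) {
    std::time_t t = static_cast<std::time_t>(days) * 86400;
    std::tm* parts = std::gmtime(&t);
    if (!parts) return CivilDate{0, 0, 0};   // conversion failed
    CivilDate d;
    d.year  = parts->tm_year + 1900;  // tm_year counts years since 1900
    d.month = parts->tm_mon + 1;      // tm_mon runs from 0 to 11
    d.day   = parts->tm_mday;         // tm_mday is already 1-based
    return d;
}
```

Adding 1900 exactly once is the whole trick: code that stores the adjusted year and then adjusts again on access ends up with a year that is off by 1900.
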

Thanks to CRANberries, there is also a diff to the previous release 0.8.9.

As always, even fuller details are on the Rcpp Changelog page and the Rcpp page which also leads to the downloads, the browseable doxygen docs and zip files of doxygen output for the standard formats. A local directory has source and documentation too. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

/code/rcpp | permanent link

Fri, 17 Dec 2010

Introduction to ESS: talk and slides

We had another meeting of the Chicago R User Group last evening. This was scheduled somewhat belatedly once we learned that Drew Conway would be in town. Drew gave a very nice talk about his brand new infochimps package (for accessing the eponymous infochimps data service and marketplace). Slides are available on Drew's blog. In fact, he had already blogged about his talk before I had even started to write my slides...

The user group meetings have a meme of showing how to use R with different editors, UIs, IDEs,... It started with a presentation on Eclipse and its StatET plugin. So a while ago I had offered to present on ESS, the wonderful Emacs mode for R (as well as SAS, Stata, BUGS, JAGS, ...). And now I owe a big thanks to the ESS Core team for keeping all their documentation, talks, papers etc in their SVN archive, and particularly to Stephen Eglen for putting the source code to Tony Rossini's tutorial from useR! 2006 in Vienna there. This allowed me to quickly whip up a few slides though a good part of the presentation did involve a live demo missing from the slides. Again, big thanks to Tony for the old slides and to Stephen for making them accessible when I mentioned the idea of this talk a while back -- it allowed me to put this together on short notice.

And for those going to useR! 2011 in Warwick next summer, Stephen will present a full three-hour ESS tutorial which will cover ESS in much more detail.

/computers/R | permanent link

Mon, 13 Dec 2010

RcppDE 0.1.0

A new package RcppDE has been uploaded in a first version 0.1.0 to CRAN. It provides differential evolution optimisation---a variant of stochastic optimisation that is similar to genetic algorithms but particularly suitable for the floating-point representations common in numerical optimisation. It builds on the nice DEoptim package by Ardia et al, but reimplements the algorithm in C++ (rather than C) using a large serving of Rcpp and RcppArmadillo.
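
For readers unfamiliar with differential evolution, here is a compact sketch of the textbook DE/rand/1/bin scheme in plain C++. It is not the RcppDE or DEoptim code: the function name `de_minimise`, its defaults and the in-place population update are choices made for this illustration, bounds are used only for initialisation, and the population size is assumed to be at least four.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <random>
#include <vector>

using Point = std::vector<double>;

// Minimal DE/rand/1/bin minimiser; returns the best point found.
Point de_minimise(const std::function<double(const Point&)>& f,
                  std::size_t dim, double lower, double upper,
                  std::size_t np = 40, std::size_t generations = 300,
                  double F = 0.6, double CR = 0.9, unsigned seed = 42) {
    std::mt19937 rng(seed);
    std::uniform_real_distribution<double> unif(0.0, 1.0);
    std::uniform_int_distribution<std::size_t> pick(0, np - 1);
    std::uniform_int_distribution<std::size_t> pick_dim(0, dim - 1);

    // Random initial population within the box [lower, upper]^dim.
    std::vector<Point> pop(np, Point(dim));
    std::vector<double> cost(np);
    for (std::size_t i = 0; i < np; ++i) {
        for (double& v : pop[i]) v = lower + (upper - lower) * unif(rng);
        cost[i] = f(pop[i]);
    }

    for (std::size_t g = 0; g < generations; ++g) {
        for (std::size_t i = 0; i < np; ++i) {
            // Three distinct members, all different from i.
            std::size_t a, b, c;
            do { a = pick(rng); } while (a == i);
            do { b = pick(rng); } while (b == i || b == a);
            do { c = pick(rng); } while (c == i || c == a || c == b);

            // Mutation plus binomial crossover; jrand forces one mutated gene.
            Point trial = pop[i];
            const std::size_t jrand = pick_dim(rng);
            for (std::size_t j = 0; j < dim; ++j) {
                if (j == jrand || unif(rng) < CR) {
                    trial[j] = pop[a][j] + F * (pop[b][j] - pop[c][j]);
                }
            }

            // Greedy selection: keep the trial only if it is no worse.
            const double trial_cost = f(trial);
            if (trial_cost <= cost[i]) {
                pop[i] = trial;
                cost[i] = trial_cost;
            }
        }
    }
    const std::size_t best = static_cast<std::size_t>(
        std::min_element(cost.begin(), cost.end()) - cost.begin());
    return pop[best];
}

// Simple test objective: sum of squares, minimised at the origin.
double sphere(const Point& x) {
    double s = 0.0;
    for (double v : x) s += v * v;
    return s;
}
```

A call such as `de_minimise(sphere, 3, -5.0, 5.0)` should return a point close to the origin. The objective is passed as a std::function, which hints at why compiled objective functions are attractive: the inner loop calls it once per trial vector.
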

I worked on this for a few evenings and weekends in October and November and then spent a few more evenings writing a paper / vignette (which is finished as a very first draft now) about it. This was an interesting and captivating problem as I had worked on genetic algorithms going back quite some time to the beginning and then again the end of graduate school (and traces of that early work are near the bottom of my presentations page). So what got me started? DEoptim is a really nice package, but it is implemented in old-school C. There is nothing wrong with that per se, but at the same time that I was wrestling with GAs, I also taught myself C++ which, to put it simply, offers a few more choices to the programmer. I like having those choices.

And with all the work that Romain and I have put into Rcpp, I was curious how far I could push this cart if I were to move it along. I made a bet with myself starting from the old saw shorter, easier, faster: pick any two. Would it be possible to achieve all three of these goals?

DEoptim, and I take version 2.0-7 as my reference point here, is pretty efficiently yet verbosely coded. Copying a vector takes a loop with an assignment for each element, copying a matrix does the same using two loops. Replacing that with a single statement in C++ is pretty easy. We also have a few little optimisations behind the scenes here and there in Rcpp: would all that be enough to move the needle in terms of performance? At the same time, DEoptim is also full of the uses of the old R API which we often point to in the Rcpp documentation, so fixing readability should be relatively low-hanging fruit.

To cut a long story short, I was able to reduce code size quite easily by using a combination of C++ and Rcpp idioms. I was also able to get to faster: the paper / vignette demonstrates consistent speed improvements on all setups that I tested (three standard functions on three small and three larger parameter vectors). More important speed gains were achieved by allowing use of objective functions that are written in C++ which again is both possible and easy thanks to Rcpp.

That leaves easier to prove: adding compiled objective functions is one indication; further proof could be provided by, say, moving the inner loop to parallel execution thanks to OpenMP which I may attempt over the next few months. So far I'd like to give myself about half a point here. So not quite yet shorter, easier, faster: pick any three, but working on it.

Over the next few days I may try to follow up with a blog post or two contrasting some code examples and maybe showing a chart from the vignette.

/code/rcpp | permanent link

Sat, 11 Dec 2010

Regina Carter / Esperanza Spalding

Went to the Chicago Symphony yesterday to see Regina Carter as well as Jazz wunderkind Esperanza Spalding perform one set each with their respective bands.

Regina Carter was presenting material from her current record 'Reverse Thread'. This was a real nice set of African-themed world music featuring Carter herself on violin, Yacouba Sissoko on kora, Will Holshouser on accordion, Chris Lightcap on bass and Alvester Garnett on drums. Some of the pieces were really, really nicely done and I particularly enjoyed Holshouser on the accordion.

After the break, Esperanza Spalding came on for her 'Chamber Music Society'. Lovely setup with Spalding on acoustic bass and vocals, Leo Genovese on piano/keyboards, Sara Caswell on violin, Lois Martin on viola, Jody Redhage on cello, the always impressive Terri Lyne Carrington on drums and Leala Cyr on backing vocals (and one co-lead in a really nice duet with Spalding). This was clearly more experimental and a chunk of the audience left during the act. But there is room for improvised chamber music, and it was a good modern music act. And Spalding is really quite impressive and I will gladly go and see her again.

/music/jazz/live | permanent link

Tue, 07 Dec 2010

inline 0.3.8

Romain pushed version 0.3.8 of inline to CRAN earlier today, and I just updated the Debian package.

This version adds an internal performance enhancement which is obtained by making do with fewer reads. The short NEWS file entry follows:

0.3.8   2010-12-07

    o   faster cfunction and cxxfunction by loading and resolving the routine
        at "compile" time

/code/inline | permanent link

Wed, 01 Dec 2010

RcppGSL 0.1.0

Earlier in the year, Romain and I did a bunch of initial work on a wrapper from R to the GNU GSL by way of our Rcpp package for seamless R and C++ integration. But other work kept us busy and this fell a little to the side.

We have now found some time to finish this work for a first release, together with a nicely detailed eleven page package vignette. As of today, the package is now a CRAN package, and Romain already posted a nice announcement on his blog and on the rcpp-devel list.

So what does RcppGSL do? I gave the package its own webpage here as well and listed these points as key features of RcppGSL:

  • templated vector and matrix classes: these are similar to Rcpp's own vector and matrix classes, but really are just smart pointers around the C structure expected by the library
  • this means you can transfer data from R to your GSL-using programs in pretty much the same way you would in other C++ programs using Rcpp---by relying on the Rcpp::as() and Rcpp::wrap() converters
  • at the C++ level you can use these GSL vectors in a more C++-alike way (using eg foo[i] to access an element at index i)
  • yet at the same time you can just pass these vector and matrix objects to the GSL functions expecting its C objects: thanks to some cleverness in these classes they pass the right object on (see the example below)
  • we also provide the lightweight views for vectors and matrices as the GSL API uses these in many places.
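
The combination described in these points, a C++ object that can be indexed like a container and yet be handed directly to a C function, can be sketched with an implicit conversion operator. The code below is a self-contained illustration only: `c_vector` and its helper functions are hypothetical stand-ins for a C library's vector type (they are not the GSL definitions), and `VectorHandle` is not the RcppGSL class. Allocation failures are not handled in this sketch.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical C-style vector struct and routines standing in for a C library.
struct c_vector { std::size_t size; double* data; };

c_vector* c_vector_alloc(std::size_t n) {
    c_vector* v = static_cast<c_vector*>(std::malloc(sizeof(c_vector)));
    v->size = n;
    v->data = static_cast<double*>(std::calloc(n, sizeof(double)));
    return v;
}

void c_vector_free(c_vector* v) { std::free(v->data); std::free(v); }

// A C routine that only knows about the raw struct.
double c_vector_sum(const c_vector* v) {
    double s = 0.0;
    for (std::size_t i = 0; i < v->size; ++i) s += v->data[i];
    return s;
}

// Hypothetical thin C++ wrapper: indexable like a C++ container, yet
// convertible to the raw pointer the C routines expect.
class VectorHandle {
public:
    explicit VectorHandle(std::size_t n) : ptr_(c_vector_alloc(n)) {}
    double& operator[](std::size_t i) { return ptr_->data[i]; }
    operator c_vector*() const { return ptr_; }            // pass straight to C code
    void free() { c_vector_free(ptr_); ptr_ = nullptr; }   // explicit release
private:
    c_vector* ptr_;
};

// Fill the vector with 1..n via operator[] and sum it via the C routine.
double fill_and_sum(std::size_t n) {
    VectorHandle v(n);
    for (std::size_t i = 0; i < n; ++i) v[i] = static_cast<double>(i + 1);
    double total = c_vector_sum(v);   // implicit conversion to c_vector*
    v.free();
    return total;
}
```

The explicit free() call in the sketch mirrors the style of an explicit release rather than relying on a destructor.
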

Also provided is a simple example implementing a column norm (which we could easily compute directly in R, but we are simply re-using an example from Section 8.4.14 of the GSL manual):

#include <RcppGSL.h>
#include <gsl/gsl_matrix.h>
#include <gsl/gsl_blas.h>

extern "C" SEXP colNorm(SEXP sM) {

  try {

        RcppGSL::matrix<double> M = sM;     // create gsl data structures from SEXP
        int k = M.ncol();
        Rcpp::NumericVector n(k);           // to store results

        for (int j = 0; j < k; j++) {
            RcppGSL::vector_view<double> colview = gsl_matrix_column (M, j);
            n[j] = gsl_blas_dnrm2(colview);
        }
        M.free() ;
        return n;                           // return vector

  } catch( std::exception &ex ) {
        forward_exception_to_r( ex );

  } catch(...) {
        ::Rf_error( "c++ exception (unknown reason)" );
  }
  return R_NilValue; // -Wall
}

This example function is implemented in an example package contained in the RcppGSL package itself -- so that users have a complete stanza to use in their packages. This will then build a user package on Linux, OS X and Windows provided the GSL is installed (and on Windows you have to do all the extra steps of defining an environment variable pointing to and of course install Rtools to build in the first place---Linux and OS X are so much easier for development).
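
To make clear what the column norm example computes, here is a dependency-free sketch of the same calculation on a column-major array, which is the layout R uses for matrices. The function name `col_norms` is made up for this illustration, and unlike a BLAS dnrm2 routine it makes no attempt to guard against overflow for very large entries.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Euclidean norm of each column of an nrow x ncol matrix stored column-major.
std::vector<double> col_norms(const std::vector<double>& m,
                              std::size_t nrow, std::size_t ncol) {
    std::vector<double> norms(ncol, 0.0);
    for (std::size_t j = 0; j < ncol; ++j) {
        double ss = 0.0;
        for (std::size_t i = 0; i < nrow; ++i) {
            const double v = m[j * nrow + i];   // element (i, j)
            ss += v * v;
        }
        norms[j] = std::sqrt(ss);
    }
    return norms;
}
```
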

Another complete example is in the package itself and provides a faster (compiled) alternative to the standard lm() function in R; this example is the continuation of the same example I had in several versions of my Intro to HPC with R tutorials and in the Rcpp package itself as an early example.

We will try to touch base with CRAN package authors using both GSL and Rcpp to see how this can help them. The API in our package may well be incomplete, but we are always happy to try to respond to requests for additional features brought to our attention, preferably via the rcpp-devel list.

More information is on the RcppGSL page. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

/code/rcpp | permanent link

Tue, 30 Nov 2010

RQuantLib 0.3.5

The new RQuantLib release 0.3.5 is now on CRAN and in Debian. RQuantLib combines (some of) the quantitative analytics of QuantLib with the R statistical computing environment and language.

Most of the changes were made two and four weeks ago: first in response to some warnings triggered by R 2.12.0 on the included manual pages which needed a brush-up, and then again in some consolidation of manual pages and some other minor tweaks. The release was then held back at CRAN as we noticed that manual pages, when collated to a single large document, triggered a segmentation fault in the latex compiler. Oddly enough only in Europe (if the a4paper option was used) and not here (where I use uspaper). Long story short, this turns out to be a bug in the latex toolchain (which we reported as Debian bug report 604754) which apparently is known but has no known fix yet (a sample file was supplied with the bug report if you want to take a look).

With that, special thanks go to Kurt Hornik and Brian Ripley on the R Core team who made a change to how R processes the manual which made it resilient to the latex bug so that normal release of the package could proceed (and the shiny manual is available too).

Thanks to CRANberries, there is also a diff to the previous release 0.3.4. Full changelog details, examples and more details about this package are at my RQuantLib page.

/code/rquantlib | permanent link

Sun, 28 Nov 2010

Rcpp 0.8.9

A new release 0.8.9 of Rcpp is now available at CRAN and has just been uploaded to Debian. As always, sources are also available from my local directory here.

This release comes a few weeks after the preceding 0.8.8 release and continues with a number of enhancements mostly to what we call Rcpp modules, our even-easier C++/R integration which follows some ideas from Boost.Python. Our corresponding Rcpp-modules vignette has been updated too.

The NEWS entry follows below:

0.8.9   2010-11-27

    o   Many improvements were made to 'Rcpp modules':

        - exposing multiple constructors

        - overloaded methods

        - self-documentation of classes, methods, constructors, fields and 
          functions.

        - new R function "populate" to facilitate working with modules in 
          packages. 

        - formal argument specification of functions.

        - updated support for Rcpp.package.skeleton.

        - constructors can now take many more arguments.
        
    o   The 'Rcpp-modules' vignette was updated as well and describes many
        of the new features

    o   New template class Rcpp::SubMatrix and support syntax in Matrix
        to extract a submatrix: 
        
           NumericMatrix x = ... ;
        
           // extract the first three columns
           SubMatrix y = x( _ , Range(0,2) ) ; 
        
           // extract the first three rows
           SubMatrix y = x( Range(0,2), _ ) ; 
        
           // extract the top 3x3 sub matrix
           SubMatrix y = x( Range(0,2), Range(0,2) ) ; 

    o   Reference Classes no longer require a default constructor for
        subclasses of C++ classes    

    o   Consistently revert to using backticks rather than shell expansion
        to compute library file location when building packages against Rcpp
        on the default platforms; this has been applied to internal test
        packages as well as CRAN/BioC packages using Rcpp

Thanks to CRANberries, there is also a diff to the previous release 0.8.8.

As always, even fuller details are on the Rcpp Changelog page and the Rcpp page which also leads to the downloads, the browseable doxygen docs and zip files of doxygen output for the standard formats. A local directory has source and documentation too. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

/code/rcpp | permanent link

Fri, 26 Nov 2010

RcppArmadillo 0.2.10

Conrad Sanderson released version 1.0.0 of Armadillo, his templated C++ library for linear algebra, earlier this week. So congratulations to Conrad on reaching 1.0.0! I folded his version 1.0.0 into a new release 0.2.10 of RcppArmadillo, our Rcpp-based integration into R. Other small changes comprise a small fix to the build process, and additional output in the summary() function for the fastLm function.

The short NEWS file extract follows, also containing Conrad's entry for 1.0.0:

0.2.10  2010-11-25

    o   Upgraded to Armadillo 1.0.0 "Antipodean Antileech"

         * After 2 1/2 years of collaborative development, we are proud to
           release the 1.0 milestone version. 
         * Many thanks are extended to all contributors and bug reporters.

    o   R/RcppArmadillo.package.skeleton.R: Updated to no longer rely on GNU
        make for builds of packages using RcppArmadillo
 
    o   summary() for fastLm() objects now returns r.squared and adj.r.squared

And courtesy of CRANberries, here is the diff to the previous release 0.2.9:
 ChangeLog                                               |   17 ++++++++
 DESCRIPTION                                             |   25 +++++------
 R/RcppArmadillo.package.skeleton.R                      |    4 -
 R/fastLm.R                                              |   21 +++++++++
 inst/NEWS                                               |   13 ++++++
 inst/doc/RcppArmadillo-unitTests.pdf                    |binary
 inst/doc/unitTests-results/RcppArmadillo-unitTests.html |    6 +-
 inst/doc/unitTests-results/RcppArmadillo-unitTests.txt  |   34 ++++++++--------
 inst/include/armadillo_bits/arma_version.hpp            |   15 +++++--
 inst/skeleton/Makevars                                  |    2 
 src/Makevars                                            |    2 
 src/Makevars.win                                        |    2 
 12 files changed, 97 insertions(+), 44 deletions(-)

More information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

/code/rcpp | permanent link

Thu, 11 Nov 2010

RcppArmadillo 0.2.9

The new version 0.2.9 of RcppArmadillo has been uploaded to CRAN. The only change is an update of the included Armadillo template library to version 0.9.92 which Conrad released this week. RcppArmadillo makes it easy to write highly efficient and highly readable C++ code for linear algebra (based on Armadillo) in R extensions (using Rcpp for the interface).

The short NEWS file extract follows, also containing Conrad's entry for 0.9.92:

0.2.9   2010-11-11

    o   Upgraded to Armadillo 0.9.92 "Wall Street Gangster":

         * Fixes for compilation issues under the Intel C++ compiler
         * Added matrix norms

More information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

/code/rcpp | permanent link

Wed, 03 Nov 2010

inline 0.3.7

A bug-fix release 0.3.7 of inline is now on CRAN and at Debian.

It fixes a minor bug: when package.skeleton() was called to convert one or more functions created with this package into a package, the corner case of just a single submitted function failed. This is now corrected. Otherwise this release is unchanged from the previous release 0.3.6 from August.

/code/inline | permanent link

Tue, 02 Nov 2010

Rcpp 0.8.8

A bug-fix release 0.8.8 of Rcpp is now available. It is awaiting processing at CRAN, and will be uploaded to Debian once processed at CRAN. In the meantime, sources are available from my local directory here.

This release follows on the heels of 0.8.7. It fixes a few small things Romain and I had noticed in the two weeks since releasing 0.8.7, and adds only a small number of new tweaks. The NEWS entry follows below:

0.8.8   2010-11-01

    o   New syntactic shortcut to extract rows and columns of a Matrix. 
        x(i,_) extracts the i-th row and x(_,i) extracts the i-th column. 
    
    o   Matrix indexing is more efficient. However, faster indexing is
        disabled if g++ 4.5.0 or later is used.

    o   A few new Rcpp operators such as cumsum, operator=(sugar)

    o   Variety of bug fixes:

        - column indexing was incorrect in some cases

        - compilation using clang/llvm (thanks to Karl Millar for the patch)

        - instantiation order of Module corrected

        - POSIXct, POSIXt now correctly ordered for R 2.12.0 
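
For readers wondering how a shortcut like x(i,_) can work at all, here is a minimal standalone sketch of the idea: an empty tag type with a single instance named _, plus overloads of operator() that take the tag in place of an index. Everything in it (the sketch namespace, Placeholder, this Matrix class) is a hypothetical illustration using only the standard library. It is not the Rcpp implementation, which I am not reproducing here.

```cpp
#include <cstddef>
#include <vector>

namespace sketch {

// Empty tag type; its single instance "_" stands for "everything along this axis".
struct Placeholder {};
const Placeholder _ = Placeholder();

// Hypothetical column-major matrix (the layout R uses); NOT Rcpp's Matrix class.
class Matrix {
public:
    Matrix(std::size_t nrow, std::size_t ncol)
        : nrow_(nrow), ncol_(ncol), data_(nrow * ncol, 0.0) {}

    // x(i, j): access to a single element
    double& operator()(std::size_t i, std::size_t j) {
        return data_[i + j * nrow_];
    }

    // x(i, _): copy of the i-th row (elements are nrow_ apart in memory)
    std::vector<double> operator()(std::size_t i, Placeholder) const {
        std::vector<double> row(ncol_);
        for (std::size_t j = 0; j < ncol_; ++j) row[j] = data_[i + j * nrow_];
        return row;
    }

    // x(_, j): copy of the j-th column (contiguous in memory)
    std::vector<double> operator()(Placeholder, std::size_t j) const {
        return std::vector<double>(data_.begin() + j * nrow_,
                                   data_.begin() + (j + 1) * nrow_);
    }

private:
    std::size_t nrow_, ncol_;
    std::vector<double> data_;
};

} // namespace sketch
```

The overload is picked at compile time from the argument types, so the placeholder costs nothing at run time. This sketch returns copies for simplicity.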

As always, even fuller details are on the Rcpp Changelog page and the Rcpp page which also leads to the downloads, the browseable doxygen docs and zip files of doxygen output for the standard formats. A local directory has source and documentation too. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

/code/rcpp | permanent link

Wed, 27 Oct 2010

Google Tech Talk on Integrating R and C++: video and slides

Last Friday, Romain and I were guests of the R intergrouplet (what an adorable name!) at Google's headquarters in Mountain View. This arose out of discussions following useR! 2010 where we met Google's Murray Stokely. There appears to be ever increasing use of R at Google, and so it was a great opportunity to give a Google Tech Talk about R and C++ integration --- centered around our Rcpp, RInside and RProtoBuf packages which facilitate interoperability between R and C++.

A video recording of our ninety-minute talk is already available via the YouTube channel for Google Tech Talks. The (large) pdf with slides (which Romain had already posted on slideshare) is also available from my presentations page.

The remainder of the weekend was nice too (with the notable exception of the extremely sucky weather). We got to spend some time at the Google Summer of Code Mentor Summit which is always a fun event and a great way to meet other open source folks in person. And we also took one afternoon off to spend some time with John Chambers discussing further work involving Rcpp and the new ReferenceClasses that appeared in the just-released R version 2.12.0. This should be a nice avenue to further integrate R and C++ in the near future.

/computers/R | permanent link

Mon, 18 Oct 2010

RProtoBuf 0.2.1

A fresh minor release of RProtoBuf, now at version 0.2.1, appeared earlier today on CRAN. RProtoBuf provides GNU R bindings for the Google Protobuf data encoding library used and released by Google.

This release extends the recent 0.2.0 release of RProtoBuf with a patch for raw bytes serialization which Koert Kuipers kindly contributed. This helps RProtoBuf for RPC communication where raw bytes are often a preferred form.

As always, there is more information at the RProtoBuf page which has a draft package vignette, a 'quick' overview vignette and a unit test summary vignette. Questions, comments etc should go to the rprotobuf mailing list off the RProtoBuf page at R-Forge.

/code/rprotobuf | permanent link

Sun, 17 Oct 2010

RPostgreSQL 0.1-7

After a somewhat long hiatus, RPostgreSQL version 0.1-7 has now been released to CRAN. RPostgreSQL connects R to PostgreSQL database systems using the standard DBI interface.

This version fixes a number of issues that had been compiled in the issue tracker on the project site at Google Code. Tomoaki Nishiyama, who joined our small development group for this package a few weeks ago, was instrumental in a number of these fixes, with assistance from Joe Conway.

The relevant NEWS file entry follows below:

Version 0.1-7 -- 2010-10-17

    o   Several potential buffer overruns were fixed

    o   dbWriteTable now writes a data.frame to database through a network
        connection rather than a temporary file. Note that row_names may be
        changed in future releases.  Also, passing in filenames instead of
        data.frame is not supported at this time. 

    o   When no host is specified, a connection to the PostgreSQL server 
        is made via UNIX domain socket (just like psql does)

    o   Table and column names are case sensitive, and identifiers are escaped
        or quoted appropriately, so that any form of table/column names can be
        created, searched, or removed, including upper-, lower- and mixed-case.

    o   nullOk in dbColumnInfo has a return value of NA when the column does
        not correspond to a column in the table. The utility of nullOk is
        doubtful but not removed at this time.

    o   Correct Windows getpid() declaration (with thanks to Brian D. Ripley)

    o   A call of as.POSIXct() with a time format string wrongly passed to TZ
        has been corrected; this should help with intra-day timestamps (with
        thanks to Steve Eick)

    o   Usage of tmpdir has been improved on similarly to Linux (with thanks
        to Robert McGehee)

More information is on my RPostgreSQL page, and on the project site at Google Code.

/code/rpostgresql | permanent link

Sat, 16 Oct 2010

RcppArmadillo 0.2.8

Version 0.2.8 of RcppArmadillo is now on CRAN. This updates the included Armadillo template library to version 0.9.90 which Conrad released a few days ago. RcppArmadillo makes it easy to write highly efficient and highly readable C++ code for linear algebra (based on Armadillo) in R extensions (using Rcpp for the interface).

The short NEWS file extract follows, also containing Conrad's entry for 0.9.90:


    o   Upgraded to Armadillo 0.9.90 "Water Dragon":

         * Added unsafe_col()
         * Speedups and bugfixes in lu()
         * Minimisation of pedantic compiler warnings

    o   Switched NEWS and ChangeLog between inst/ and the top-level directory
        so that NEWS (this file) gets installed with the package

More information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

/code/rcpp | permanent link

Fri, 15 Oct 2010

Rcpp 0.8.7

With the scheduled R release of version 2.12.0 this morning, we have just uploaded version 0.8.7 of Rcpp to CRAN; Debian will follow shortly once the autobuilders have processed R 2.12.0.

This Rcpp release depends on R 2.12.0 as two things have changed. First, we play along with the change in R concerning the ordering of inheritance for time classes. But secondly, and more importantly, we support in Rcpp the corresponding change in R itself which brings the new ReferenceClasses. Here is the corresponding bit from R's NEWS file for R 2.12.0:

    o A facility for defining reference-based S4 classes (in the OOP
      style of Java, C++, etc.) has been added experimentally to
      package methods; see ?ReferenceClasses.

[...]

    o An experimental new programming model has been added to package
      methods for reference (OOP-style) classes and methods.  See
      ?ReferenceClasses.

This was made possible in large part by code committed by John Chambers (whom we had welcomed recently as a co-author to Rcpp) building on the changes he made to R 2.12.0 itself, as well as on the work Romain had done with 'Rcpp Modules'. The R help page for ReferenceClasses carries a reference (bad pun) to Rcpp 0.8.7 so these two releases do go together. This should be a lot of fun over the next little while: S3, S4, and now ReferenceClasses.

We also made a number of internal changes, some of which lead to speed-ups and internal improvements. The NEWS entry follows below:

0.8.7   2010-10-15

    o   As of this version, Rcpp depends on R 2.12 or greater as it interfaces 
        the new reference classes (see below) and also reflects the POSIXt class
        reordering both of which appeared with R version 2.12.0

    o   new Rcpp::Reference class, that allows internal manipulation of R 2.12.0
        reference classes. The class exposes a constructor that takes the name
        of the target reference class and a field(string) method that implements
        the proxy pattern to get/set reference fields using callbacks to the 
        R operators "$" and "$<-" in order to preserve the R-level encapsulation

    o   the R side of the preceding item allows methods to be written
        in R as per ?ReferenceClasses, accessing fields by name and
        assigning them using "<<-".  Classes extracted from modules
        are R reference classes.  They can be subclassed in R, and/or R methods
        can be defined using the $methods(...) mechanism.

    o   internal performance improvements for Rcpp sugar as well as an added
        'noNA()' wrapper to omit tests for NA values -- see the included
        examples in inst/examples/convolveBenchmarks for the speedups

    o   more internal performance gains with Functions and Environments
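
The "proxy pattern" mentioned for the field(string) method may deserve a small illustration. Below is a minimal standalone sketch, assuming a plain std::map as the backing store: field() returns a small proxy object, assigning to the proxy calls a setter, and reading it as a double calls a getter. FieldStore, FieldProxy and writes() are hypothetical names made up for this sketch. The real Rcpp::Reference routes these calls to R's "$" and "$<-" as the NEWS entry says, which this sketch does not attempt.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for an object with named fields; NOT Rcpp::Reference.
class FieldStore {
public:
    // Proxy returned by field(): forwards reads and writes to the owner.
    class FieldProxy {
    public:
        FieldProxy(FieldStore& owner, const std::string& name)
            : owner_(owner), name_(name) {}

        // obj.field("x") = 2.5;  -> goes through the owner's setter
        FieldProxy& operator=(double value) {
            owner_.set(name_, value);
            return *this;
        }

        // double x = obj.field("x");  -> goes through the owner's getter
        operator double() const { return owner_.get(name_); }

    private:
        FieldStore& owner_;
        std::string name_;
    };

    FieldProxy field(const std::string& name) { return FieldProxy(*this, name); }

    // Number of writes so far, to show that every assignment hits the setter.
    int writes() const { return writes_; }

private:
    // Unknown fields read as 0.0 in this sketch.
    double get(const std::string& name) const {
        std::map<std::string, double>::const_iterator it = values_.find(name);
        return it == values_.end() ? 0.0 : it->second;
    }

    void set(const std::string& name, double value) {
        values_[name] = value;
        ++writes_;
    }

    std::map<std::string, double> values_;
    int writes_ = 0;
};
```

The point of the indirection is that the owner keeps control of every read and write, which is what preserves encapsulation when the getter and setter are callbacks rather than direct memory access.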

As always, even fuller details are on the Rcpp Changelog page and the Rcpp page which also leads to the downloads, the browseable doxygen docs and zip files of doxygen output for the standard formats. A local directory has source and documentation too. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

/code/rcpp | permanent link

Mon, 11 Oct 2010

Chicago Marathon 2010

It's the Monday of the Columbus Day weekend here, so I must have been running a Chicago Marathon yesterday. Indeed -- the 34th annual Chicago Marathon took place yesterday but everything was about its 10/10/10 date. The symmetric set of numbers was in all advertisements, posters, street signs, on our bibs, the medals, the (nice, for once) race shirt. Everywhere, and yes, I will admit that it was also part of the reason I finally registered earlier in the year shortly before the race sold out.

This was the sixth time I ran this race (and my 14th marathon overall). And I still can't run this course all that well: never got a Boston qualification here. As I had mentioned when I blogged about my third Boston Marathon earlier in the year and the recent Chicago Half-Marathon, I have had a recurrent issue with a sore Achilles which limited my running throughout the year. It had gotten better but a quick summary of the miles in my running log showed that I had been running only about 80% of the training miles I had in prior years. And not a single 20-miler. I knew I'd have to pay for that.

Plus, as so often, the weather. Not quite as hot as the record-heat of 2007. But close enough: high 60s at the start and high 70s or even low 80s towards the end. But I have to compliment the race organisers. The race was very well organised (following the experience of 2007) with extra water stops, extra sponges handed out at several spots (!!) and very good communication when during the race the alert level was raised to yellow given the heat and humidity. The searchable results now show a fair number of non-finishers, but at least nobody seems to have died. But it looked ugly on the course. I think I ran by three or four sets of paramedics assisting runners who were 'down and out'.

So how did I do? Fair, I suppose -- I ran pretty well for sixteen miles, then needed a first short walking break and continued to run well towards and past the 18 mile waterstop where a bunch of friends and fellow Oak Park runners were helping. But not long after that, I crumbled and needed to alternate walking and running for most of the remainder. With that I came in at 3:41:41, or an 8:28 min/mile pace, which is two seconds slower than the previous 'worst' from 2007. But heck, at least it's still more than three minutes faster than Dubya in Houston in 1993 ... I also got beat by a few local running friends as well as by Chicago's own marathon juggler. So there. Maybe I'll train a bit more next time.

/sports/running | permanent link

Sun, 03 Oct 2010

More BLAS, BLASter, BLAStest: Updates on gcbd

Following up on my initial post announcing gcbd, here is a brief note on a new version. The initial post announced version 0.2.2 which was the first CRAN version of gcbd. I updated to 0.2.3 when I made the aforementioned first blog post about gcbd with the lattice plot of the BLAS and GPU benchmark results across six different implementations (from reference BLAS to two Atlas versions, Goto, MKL and a GPU-based one).

There is now a new version 0.2.4 of gcbd on CRAN. I revised the paper ever so slightly based on some more feedback, and focussed the results sections by concentrating on just the log-axes lattice plot and the corresponding lattice plot of raw results---where the y-axis is capped at 30 seconds:

GPU/CPU Benchmark Results in levels

This chart, shown here in levels rather than with logarithmic axes, nicely illustrates just how large the performance difference can be for matrix multiplication and LU decomposition. QR and SVD are closer but accelerated BLAS libraries still win. GPUs can be compelling for some tasks and large sizes.

More discussion is still available in the paper which is also included in the gcbd package for R.

/code/gcbd | permanent link

Sun, 26 Sep 2010

RcppArmadillo 0.2.7

Version 0.2.7 of RcppArmadillo is now on CRAN. This updates the included Armadillo template library to version 0.9.80 which Conrad released a few days ago. RcppArmadillo makes it easy to write highly efficient and highly readable C++ code for linear algebra (based on Armadillo) in R extensions (using Rcpp for the interface).

The short NEWS file extract follows, also containing Conrad's entry for 0.9.80:

0.2.7   2010-09-25

    o   Upgraded to Armadillo 0.9.80 "Chihuahua Muncher":

         * Added join_slices(), insert_slices(), shed_slices()
         * Added in-place operations on diagonals
         * Various speedups due to internal architecture improvements

More information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

/code/rcpp | permanent link

Fri, 24 Sep 2010

R / Finance 2011 Call for Papers

Brian announced it on r-help and r-sig-finance and I have since updated the R/Finance website and Call for Papers page. And as David Smith already outblogged me about it, without further ado our Call for Papers for next spring's R/Finance conference:

Call for Papers:

R/Finance 2011: Applied Finance with R
April 29 and 30, 2011
Chicago, IL, USA

The third annual R/Finance conference for applied finance using R will be held this spring in Chicago, IL, USA on April 29 and 30, 2011. The two-day conference will cover topics including portfolio management, time series analysis, advanced risk tools, high-performance computing, market microstructure and econometrics. All will be discussed within the context of using R as a primary tool for financial risk management, portfolio construction, and trading.

One-page abstracts or complete papers (in txt or pdf format) are invited to be submitted for consideration. Academic and practitioner proposals related to R are encouraged. We welcome submissions for full talks, abbreviated "lightning talks", and for a limited number of pre-conference (longer) seminar sessions.

Presenters are strongly encouraged to provide working R code to accompany the presentation/paper. Data sets should also be made public for the purposes of reproducibility (though we realize this may be limited due to contracts with data vendors). Preference may be given to presenters who have released R packages.

Please send submissions to: committee at RinFinance.com.

The submission deadline is February 15th, 2011. Early submissions may receive early acceptance and scheduling.

Submissions will be evaluated and submitters notified via email on a rolling basis. Determination of whether a presentation will be a long presentation or a lightning talk will be made once the full list of presenters is known.

R/Finance 2009 and 2010 included attendees from around the world and featured keynote presentations from prominent academics and practitioners. 2009-2010 presenters' names and presentations are online at the conference website. We anticipate another exciting line-up for 2011 including keynote presentations from John Bollinger, Mebane Faber, Stefano Iacus, and Louis Kates. Additional details will be announced via the conference website as they become available.

For the program committee:

Gib Bassett, Peter Carl, Dirk Eddelbuettel, Brian Peterson,
Dale Rosenthal, Jeffrey Ryan, Joshua Ulrich

So see you in Chicago in April!

/computers/R | permanent link

Thu, 23 Sep 2010

R Project and Google Summer of Code: Wrapping up

As this year's admin, I wrote up the following summary which has now been posted at the R site in the appropriate slot. My thanks to this year's students, fellow mentors and everybody else who helped to make it happen.

GSoC 2010 logo

Projects 2010

As in 2008 and 2009, the R Project has again participated in the Google Summer of Code during 2010.

Based on ideas collected and discussed on the R Wiki, the projects and students listed below (and sorted alphabetically by student) were selected for participation and have been sponsored by Google during the summer 2010.

The finished projects are available via the R / GSoC 2010 repository at Google Code, and in several cases also via their individual repos (see below). Informal updates and final summaries on the work were also provided via the GSoC 2010 R group blog.


rdx - Automatic Differentiation in R

by Chidambaram Annamalai, mentored by John Nash.

Proposal: radx is a package to compute derivatives (of any order) of native R code for multivariate functions with vector outputs, f:R^m -> R^n, through Automatic Differentiation (AD). Numerical evaluation of derivatives has widespread uses in many fields. rdx will implement two modes for the computation of derivatives, the Forward and Reverse modes of AD, combining which we can efficiently compute Jacobians and Hessians. Higher order derivatives will be evaluated through Univariate Taylor Propagation.

Delivered: Two packages, radx (forward automatic differentiation in R) and tada (templated automatic differentiation in C++), were created; see this blog post for details.


A GUI for Graphics using ggplot and Deducer

by Ian Fellows, mentored by Hadley Wickham.

Proposal: R puts the latest statistical techniques at one's fingertips through thousands of add-on packages available on the CRAN download servers. The price for all of this power is complexity. Deducer is a cross-platform cross-console graphical user interface built on top of R designed to reduce this complexity. This project proposes to extend the scope of Deducer by creating an innovative yet intuitive system for generating statistical graphics based on the ggplot2 package.

Delivered: All of the major features have been implemented, and are outlined in the video links in this blog post.


rgeos - an R wrapper for GEOS

by Colin Rundel, mentored by Roger Bivand.

Proposal: At present there does not exist a robust geometry engine available to R; the tools that are available tend to be limited in scope and do not easily integrate with existing spatial analysis tools. GEOS is a powerful open source geometry engine written in C++ that implements spatial functions and operators from the OpenGIS Simple Features for SQL specification. rgeos will make these tools available within R and will integrate with existing spatial data tools through the sp package.

Delivered: The rgeos project on R-Forge; see the final update blog post.


Social Relations Analyses in R

by Felix Schoenbrodt, mentored by Stefan Schmukle.

Proposal: Social Relations Analyses (SRAs; Kenny, 1994) are a hot topic both in personality and in social psychology. While more and more research groups adopt the methodology, software solutions are lagging far behind - the main software for calculating SRAs are two DOS programs from 1995, which have a lot of restrictions. My GSOC project will extend the functionality of these existing programs and bring the power of SRAs into the R Environment for Statistical Computing as a state-of-the-art package.

Delivered: The TripleR package is now on CRAN and hosted on RForge.Net; see this blog post for updates.


NoSQL Interface for R

by Yasuhisa Yoshida, mentored by Dirk Eddelbuettel.

Proposal: So-called NoSQL databases are becoming increasingly popular. They generally provide very efficient lookup of key/value pairs. I'll provide several implementations of a NoSQL interface for R. Beyond a sample interface package, I'll try to support a generic interface similar to what the DBI package does for SQL backends.

Status: An initial prototype is available via RTokyoCabinet on Github. No updates were made since June; no communication occurred with anybody related to the GSoC project since June and the project earned a fail.


Last modified: Wed Sep 22 19:39:43 CDT 2010

/computers/misc | permanent link

Wed, 15 Sep 2010

BLAS, BLASter, BLAStest: Some benchmark results, and a benchmarking framework

Usage of accelerated BLAS libraries seems to be shrouded in some mystery, judging from somewhat regularly recurring requests for help on lists such as r-sig-hpc (gmane version), the R list dedicated to High-Performance Computing. Yet it doesn't have to be; installation can be really simple (on appropriate systems).

Another issue that I felt needed addressing was a comparison between the different alternatives available, quite possibly including GPU computing. So a few weeks ago I sat down and wrote a small package to run, collect, analyse and visualize some benchmarks. That package, called gcbd (more about the name below) is now on CRAN as of this morning. The package both facilitates the data collection for the paper it also contains (in the vignette form common among R packages) and provides code to analyse the data---which is also included as a SQLite database. All this is done in the Debian and Ubuntu context by transparently installing and removing suitable packages providing BLAS implementations, so that we can fully automate data collection over several competing implementations via a single script (which is also included). Contributions of benchmark results are encouraged---that is the idea of the package.

The paper itself describes the background and technical details before presenting the results. The benchmark compares the basic reference BLAS, Atlas (both single- and multithreaded), Goto, Intel MKL and a GPU-based approach. This blog post is not the place to recap all results, so please do see the paper for more details. But one summary chart regrouping the main results fits well here:

GPU/CPU Benchmark Results

This chart, in a log/log form, shows how reference BLAS lags everything, how multithreaded newer Atlas improves over the standard Atlas package currently still the default in both distros, how the Intel MKL (available via Ubuntu) is fairly good but how Goto wins almost everything. GPU computing is compelling for really large sizes (at double precision) and too costly at small ones. It also illustrates variability and different computational cost across the methods tested: svd is more expensive than level-3 matrix multiplication, and the different implementations are less spread apart. More details are in the paper; code, data etc are in the package gcbd.

The larger context is to do something like this benchmarking exercise, but across distributions, operating systems and possibly also GPU cards. Mark and I started to talk about this during and after R/Finance earlier this year and have some ideas. Time permitting, that work should be happening in the GPU/CPU Benchmarks (gcb) project, and that's why this got called gcbd as a simpler GPU/CPU Benchmarks on Debian Systems study.

/code/gcbd | permanent link

Mon, 13 Sep 2010

RcppArmadillo 0.2.6

Now that Rcpp got updated to 0.8.6, we have an updated RcppArmadillo release containing mostly updates to Conrad's Armadillo version 0.9.70 as well as some more templated sugar magic. RcppArmadillo makes it easy to write highly efficient and highly readable C++ code for linear algebra (based on Armadillo) in R extensions (using Rcpp for the interface).

The short NEWS file extract follows:

0.2.6   2010-09-12

    o   Upgraded to Armadillo 0.9.70 "Subtropical Winter Safari"
    
    o   arma::Mat, arma::Row and arma::Col get constructors that take vector
        or matrix sugar expressions. See the unit tests "test.armadillo.sugar.ctor"
        and "test.armadillo.sugar.matrix.ctor" for examples.

More information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

/code/rcpp | permanent link

Sun, 12 Sep 2010

Chicago Half Marathon 2010

Second Sunday in September -- time for the annual Chicago Half Marathon now in its fourteenth edition (and I have run it in 2003, 2004, 2005, 2006, 2007, 2008 and 2009, making this and the JPM Chase Corporate Challenge the races I've run most often). And the course was altered this year, along with an earlier start as the Bears have their season home opener today. So we started north towards 57th, then running down to 67th, turning east towards the lake at around three miles --- and having the remaining ten miles along the lakefront running up to 31st (the usual turn) and back down to 63rd. I like this course better; let's hope it sticks.

Race conditions were fantastic. We had a rainy and gray day yesterday but today is pure bliss. Temperatures around 60 degrees at the 7:00am start, no wind, sunshine and not a cloud in the sky.

The race itself went well. I had a pretty brutal running year suffering most of the time from some Achilles tendon inflammation. It has gotten better in the last few weeks possibly thanks to some heel cups I now put in the shoes. But I had exactly one run longer than ten miles since the Boston Marathon. So I lost a lot of speed, as well as endurance and was a little nervous as to how I'd do. And considering all this, it went pretty well. I finished in 1:41:50 or a 7:47 pace. While this is easily the slowest half in a number of years, at least I got to run it evenly, pain free and with a negative split (== faster second half) and some gas left for a fast last half mile or so. So maybe I don't have to retire from running just yet. We'll see if I get some speed back in 2011.

/sports/running | permanent link

Sat, 11 Sep 2010

RProtoBuf 0.2.0

A brand new and shiny release of RProtoBuf, now at version 0.2.0, arrived on CRAN earlier today. RProtoBuf provides GNU R bindings for the Google Protobuf data encoding library used and released by Google and others.

This is only the second release after 0.1-0 more than six months ago. Given that Rcpp is such a key ingredient for RProtoBuf, and that Rcpp underwent so many exciting changes itself, Romain and I never got around to releasing new versions of RProtoBuf. This version is now much closer to the actual C++ API and fairly feature rich. We summarised a few of these new things in the presentation at useR! 2010.

There is more information at the RProtoBuf page; there is a draft package vignette, a 'quick' overview vignette and a unit test summary vignette. Questions, comments etc should go to the rprotobuf mailing list off the RProtoBuf page at R-Forge.

/code/rprotobuf | permanent link

Fri, 10 Sep 2010

Billy Bragg

On the spur of the moment, I cycled over to Dominican, one of the two small colleges in town, to see if I could snag a remaining ticket to see Billy Bragg perform.

Turned out I could, and it became a nice evening out. Darren Hanlon started up the evening as the opener for a good half hour, and was quite decent; somewhat charming in a good natured way, not taking himself too too seriously. I'd gladly see him again.

After a longer-than-needed break Billy Bragg came on stage and played for two straight hours, alternating between an electric and acoustic guitar. And also alternating between some newer material and (especially towards the end and the encore) some old crowd-pleasers. I don't know his material all that well but have of course known of his career over these last 25 years and am quite glad I went to see him. Nice way to end the week.

/music/rock | permanent link

Rcpp 0.8.6

After a somewhat longer than usual break, we now have a new release of Rcpp on CRAN and in Debian.

This release adds quite a few things. The main one may be the addition of density, distribution, quantile and random number functions for a rather large number of statistical distributions. Usage is pretty much as it would be in R, yet it is vectorised at the C++ level. A fair number of unit tests were added too, but some work is left to do there too.

Support for complex numbers was enhanced both in the expressive 'sugar' context and via a few binary operators that had been missing. We also started a new vignette to provide a 'quick reference'; unfortunately this is not quite complete yet.

The NEWS entry follows below:

0.8.6   2010-09-09

    o   new macro RCPP_VERSION and Rcpp_Version to allow conditional compiling
        based on the version of Rcpp

           #if defined(RCPP_VERSION) && RCPP_VERSION >= Rcpp_Version(0,8,6)
           ...
           #endif

    o   new sugar functions for statistical distributions (d-p-q-r functions)
        with distributions : unif, norm, gamma, chisq, lnorm, weibull, logis,
        f, pois, binom, t, beta.

    o   new ctor for Vector taking size and function pointer so that for example

           NumericVector( 10, norm_rand )

        generates a N(0,1) vector of size 10

    o   added binary operators for complex numbers, as well as sugar support

    o   more sugar math functions: sqrt, log, log10, exp, sin, cos, ...

    o   started new vignette Rcpp-quickref : quick reference guide of Rcpp API
        (still work in progress)

    o   various patches to comply with solaris/suncc stricter standards

    o   minor enhancements to ConvolutionBenchmark example

    o   simplified src/Makefile to no longer require GNU make; packages using
        Rcpp still do for the compile-time test of library locations
    

As always, even fuller details are on the Rcpp Changelog page and the Rcpp page which also leads to the downloads, the browseable doxygen docs and zip files of doxygen output for the standard formats. A local directory has source and documentation too. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

/code/rcpp | permanent link

Tue, 07 Sep 2010

Straight, curly, or compiled?

Christian Robert, whose blog I commented on here once before, had followed up on a recent set of posts by Radford Neal which had appeared both on Radford's blog and on the r-devel mailing list.

Now, let me prefix this by saying that I really enjoyed Radford's posts. He obviously put a lot of time into finding a number of (all somewhat small in isolation) inefficiencies in R which, when taken together, can make a difference in performance. I already spotted one commit by Duncan in the SVN logs for R so this is being looked at.

Yet Christian, on the other hand, goes a little overboard in bemoaning performance differences somewhere between ten and fifteen percent -- the difference between curly and straight braces (as noticed in Radford's first post). Maybe he spent too much time waiting for his MCMC runs to finish to realize the obvious: compiled code is evidently much faster.

And before everybody goes and moans and groans that that is hard, allow me to just interject and note that it is not. It really doesn't have to be. Here is a quick cleaned-up version of Christian's example code, with proper assignment operators and a second variable x. We then get to the meat and potatoes and load our Rcpp package as well as inline to define the same little test function in C++. Throw in rbenchmark, which I am becoming increasingly fond of for these little timing tests, et voila, we have ourselves a horserace:

# Xian's code, using <- for assignments and passing x down
f <- function(n, x=1) for (i in 1:n) x=1/(1+x)
g <- function(n, x=1) for (i in 1:n) x=(1/(1+x))
h <- function(n, x=1) for (i in 1:n) x=(1+x)^(-1)
j <- function(n, x=1) for (i in 1:n) x={1/{1+x}}
k <- function(n, x=1) for (i in 1:n) x=1/{1+x}

# now load some tools
library(Rcpp)
library(inline)

# and define our version in C++
l <- cxxfunction(signature(ns="integer", xs="numeric"),
                 'int n = as<int>(ns); double x=as<double>(xs);
                  for (int i=0; i<n; i++) x=1/(1+x);
                  return wrap(x); ',
                 plugin="Rcpp")

# more tools
library(rbenchmark)

# now run the benchmark
N <- 1e6
benchmark(f(N, 1), g(N, 1), h(N, 1), j(N, 1), k(N, 1), l(N, 1),
          columns=c("test", "replications", "elapsed", "relative"),
          order="relative", replications=10)

And how does it do? Well, glad you asked. On my i7, with the other three cores standing around and watching, we get an eighty-fold increase relative to the best interpreted version:

/tmp$ Rscript xian.R
Loading required package: methods
     test replications elapsed relative
6 l(N, 1)           10   0.122    1.000
5 k(N, 1)           10   9.880   80.984
1 f(N, 1)           10   9.978   81.787
4 j(N, 1)           10  11.293   92.566
2 g(N, 1)           10  12.027   98.582
3 h(N, 1)           10  15.372  126.000
/tmp$ 

So do we really want to spend time arguing about the ten and fifteen percent differences? Moore's law gets you those gains in a couple of weeks anyway. I'd much rather have a conversation about how we can get people speed increases that are orders of magnitude, not fractions. Rcpp is one such tool. Let's get more of them.
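
For anyone who wants to look at the loop itself without Rcpp or inline, here is a sketch in plain standard C++. The function names are made up for this illustration. As a side note, the recurrence x = 1/(1+x) has the positive fixed point (sqrt(5)-1)/2, roughly 0.618, so for large n the result settles there.

```cpp
#include <cmath>

// Same recurrence as in the benchmarked functions: x = 1/(1+x), n times.
// The name iterate_reciprocal is hypothetical and used only in this sketch.
double iterate_reciprocal(int n, double x = 1.0) {
    for (int i = 0; i < n; i++) {
        x = 1.0 / (1.0 + x);
    }
    return x;
}

// Positive solution of x = 1/(1+x), i.e. of x^2 + x - 1 = 0.
double fixed_point() {
    return (std::sqrt(5.0) - 1.0) / 2.0;
}
```

Calling iterate_reciprocal(1000000) corresponds to the l(N, 1) call in the benchmark, minus the conversion to and from R objects.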

/computers/R | permanent link

Thu, 26 Aug 2010

Louis: A Silent Film with Live Music

The Chicago Symphony hosted the world premiere of Louis last evening, and I had snatched the (literally) last available ticket.

The film, which is written, directed and produced by Dan Pritzker, is based loosely on the early years of Louis Armstrong in New Orleans. The movie is shot beautifully by Vilmos Zsigmond in a blend of colour and black-and-white which works very well for invoking the early days of film. A key part of the production is of course the score, performed live by a thirteen-piece orchestra featuring Wynton Marsalis as well as by Cecile Licad in solo piano recitals with an emphasis on pieces by 19th-century composer Louis Moreau Gottschalk. The combination of a silent movie with a strong live band is something to behold -- if you can catch the movie and performance in a city nearby, go!

/music/jazz/live | permanent link

Mon, 09 Aug 2010

RQuantLib 0.3.4

A fresh release of RQuantLib is now on CRAN and in Debian. RQuantLib combines (some of) the quantitative analytics of QuantLib with the R statistical computing environment and language.

This follows the 0.3.3 release from last week and has again a number of internal changes. All uses of objects from external namespaces are now explicit as I removed the remaining using namespace QuantLib;. This makes things a little more verbose, but should be much clearer to read, especially for those not yet up to speed on whether a given object comes from any one of the Boost, QuantLib or Rcpp namespaces. We also generalized an older three-dimensional plotting function used for option surfaces -- which had already been used in the demo() code -- and improved the code underlying this: arrays of option prices and analytics given two input vectors are now computed at the C++ level for a nice little gain in efficiency. This also illustrates the possible improvements from working with the new Rcpp API that is now used throughout the package.

Full changelog details, examples and more details about this package are at my RQuantLib page.

/code/rquantlib | permanent link

Fri, 06 Aug 2010

RInside release 0.2.3

A new 0.2.3 release of RInside is now on CRAN. RInside is a set of convenience classes which facilitate embedding of R inside of C++ applications and programs. RInside works particularly well with Rcpp and now depends on it.

This is the first release since March when we released 0.2.2. A few things got added to Rcpp in the meantime, and RInside is taking advantage of some of these as illustrated in several of the included examples.

More details and the changelog are on the RInside page.

/code/rinside | permanent link

Thu, 05 Aug 2010

RcppArmadillo 0.2.5

Hot on the heels of RcppArmadillo release 0.2.4 a few days ago comes a new release 0.2.5 which is now on CRAN. RcppArmadillo makes it easy to write highly efficient and highly readable C++ code for linear algebra (based on Armadillo) in R extensions (using Rcpp for the interface).

This release upgrades the included Armadillo version to Conrad's just-released version 0.9.60. This overcomes some minor issues we had with 'older' compilers such as g++ 4.2.x with x being 1 or 2. No other changes were made from our end.

The short NEWS file extract follows:

0.2.5   2010-08-05

    o   Upgraded to Armadillo 0.9.60 "Killer Bush Turkey"

More information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

/code/rcpp | permanent link

Wed, 04 Aug 2010

RQuantLib 0.3.3

A new release (now at version 0.3.3) of RQuantLib is now on CRAN and in Debian. RQuantLib combines (some of) the quantitative analytics of QuantLib with the R statistical computing environment and language.

Many of the changes in this new version are internal. The code was re-written using the new Rcpp API throughout, and the build system was further simplified using the LinkingTo: mechanism. The arithmetic average-price asian option pricer was added. A few other code updates were made as well.

Full changelog details, examples and more details about this package are at my RQuantLib page.

/code/rquantlib | permanent link

Mon, 02 Aug 2010

RcppExamples 0.1.1

On Friday, I quickly provided a new release of RcppExamples, our example package for using Rcpp, to CRAN. Very little content has been updated --- and a new TODO file even notes that we need to add more documentation for the new Rcpp API, and to update existing documentation as lots has changed since the initial release in March.

But at least this package now joins RcppArmadillo in using the highly recommended LinkingTo: Rcpp directive in the DESCRIPTION file to let R find the Rcpp headers, making the build process a little more robust.

A few more details are on the RcppExamples page.

/code/rcpp | permanent link

inline 0.3.6

A couple of days ago, Romain released inline release 0.3.6 to CRAN. This is a maintenance release with no user-visible changes. However, as it captures compiler errors more directly, it should help us debug Rcpp on recalcitrant platforms such as Solaris with suncc where we have no shell access and no build robot (though that may be changing with the rumoured bin-builder). More details on the release at Romain's blog.

/code/inline | permanent link

Thu, 29 Jul 2010

RcppArmadillo 0.2.4

A new release of RcppArmadillo is now on CRAN. RcppArmadillo makes it easy to write highly efficient and highly readable C++ code for linear algebra (based on Armadillo) in R extensions (using Rcpp for the interface).

This release upgrades the included Armadillo version to 0.9.52 (see here for Conrad's high-level changes). We had to make two minor tweaks. In the fastLm() help page example we switched from inv() to pinv(). The short NEWS file extract follows:

0.2.4   2010-07-27

    o   Upgraded to Armadillo 0.9.52 'Monkey Wrench'

    o   src/fastLm.cpp: Switch from inv() to pinv() as inv() now tests for
        singular matrices and warns and returns an empty matrix which stops
        the example fastLm() implementation on the manual page -- and while
        this is generally reasonable, it makes sense here to continue, which
        the Moore-Penrose pseudo-inverse allows us to do

More information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

/code/rcpp | permanent link

Mon, 26 Jul 2010

Rcpp 0.8.5

A new release (now at version 0.8.5) of Rcpp is now on CRAN and in Debian.

This release constitutes a quick follow-up to the last release 0.8.4 which we got out just before CRAN closed for summer vacations. Some fixes were made right after the last release: two harmless warnings from the help file parser of the development version of R are now addressed, and we stopped using shell expansions in the Makefile snippets. We also added some internal speedups we discovered while preparing the talk about RProtoBuf for last week's useR! meeting.

The NEWS entry follows below:

0.8.5   2010-07-25

    o   speed improvements. Vector::names, RObject::slot have been improved
        to take advantage of R API functions instead of callbacks to R

    o   Some small updates to the Rd-based documentation which now points to 
        content in the vignettes.  Also a small formatting change to suppress
        a warning from the development version of R.

    o   Minor changes to Date() code which may reenable SunStudio builds

As always, even fuller details are on the Rcpp Changelog page and the Rcpp page, which also leads to the downloads, the browseable doxygen docs and zip files of doxygen output for the standard formats. A local directory has source and documentation too. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

/code/rcpp | permanent link

Sat, 24 Jul 2010

useR 2010 at NIST in Gaithersburg

This past week, the annual R user conference useR! 2010 took place at the National Institute of Standards and Technology (NIST) in Gaithersburg, MD (which is a tad northwest of Washington, DC). Kate Mullen and her team of local organizers did a truly tremendous job in putting together a very smooth conference attended by almost 500 people. It is always nice to meet so many other R contributors and users in person. And needless to say it's also just plain fun to hang out with these folks.

As at the preceding useR! 2008 in Dortmund and useR! 2009 in Rennes, I presented a three-hour tutorial on high-performance computing with R. This covers scripting/automation, profiling, vectorisation, interfacing compiled code, parallel computing and large-memory approaches. The slides, as well as a condensed 2-up version, are now on my presentations page.

On Wednesday, Romain and I had a chance to talk about recent work on Rcpp, our R and C++ integration. Thursday, we followed up with a presentation on RProtoBuf -- a project integrating Google's Protocol Buffers with R which much to our delight already seems to be in use at Google itself! It was quite fun to do these two talks jointly with Romain. But my other coauthor Khanh had to be at a conference related to his actual PhD work. So on Friday it was just me to give a presentation about RQuantLib which brings QuantLib to R.

Slides from all these talks have now been added to my presentations page. I will also upload them via the conference form so that they can be part of the conference's collection of presentations which should be forthcoming.

/computers/R | permanent link

Thu, 15 Jul 2010

Rcpp 0.8.4

Romain and I wrapped up release 0.8.4 of Rcpp last Friday. However, given the time of year, it only appeared on CRAN this morning, and then only after some prodding as CRAN processing is more or less closed this week and probably next.

This release builds upon release 0.8.3. Highlights include changes to the sugar framework for highly expressive C++ constructs, which gained new vector functions as well as a first set of matrix functions. As well, unit tests have been reorganised in such a way that we end up with a lot fewer compilations (but of several files at once) which reaps significant speed gains. Date calculations now use the same mktime() function R itself uses (and which comes from Arthur Olson's timezone library). The NEWS entry follows below:

0.8.4   2010-07-09

    o   new sugar vector functions: rep, rep_len, rep_each, rev, head, tail,
        diag
        
    o   sugar has been extended to matrices: The Matrix class now extends the 
        Matrix_Base template that implements CRTP. Currently sugar functions 
        for matrices are: outer, col, row, lower_tri, upper_tri, diag

    o   The unit tests have been reorganised into fewer files with one call
        each to cxxfunction() (covering multiple tests) resulting in a
        significant speedup

    o   The Date class now uses the same mktime() replacement that R uses
        (based on original code from the timezone library by Arthur Olson)
        permitting wide date ranges on all operating systems

    o   The FastLM example has been updated, a new benchmark based on the
        historical Longley data set has been added

    o   RcppStringVector now uses std::vector internally
    
    o   setting the .Data slot of S4 objects did not work properly
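
To make the difference between two of the new vector functions concrete, here is a sketch in plain standard C++. It assumes that sugar's rep and rep_each mirror R's rep(x, times = n) and rep(x, each = n) respectively; the function names below are hypothetical and are not the sugar API.

```cpp
#include <cstddef>
#include <vector>

// Whole-vector repetition, as in R's rep(x, times = n): 1,2 -> 1,2,1,2.
// Both function names are hypothetical; counts are assumed non-negative.
std::vector<int> rep_sketch(const std::vector<int>& x, int times) {
    std::vector<int> out;
    out.reserve(x.size() * static_cast<std::size_t>(times));
    for (int t = 0; t < times; t++)
        out.insert(out.end(), x.begin(), x.end());
    return out;
}

// Element-wise repetition, as in R's rep(x, each = n): 1,2 -> 1,1,2,2.
std::vector<int> rep_each_sketch(const std::vector<int>& x, int each) {
    std::vector<int> out;
    out.reserve(x.size() * static_cast<std::size_t>(each));
    for (std::size_t i = 0; i < x.size(); i++)
        out.insert(out.end(), static_cast<std::size_t>(each), x[i]);
    return out;
}
```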

As always, even fuller details are on the Rcpp Changelog page and the Rcpp page, which also leads to the downloads, the browseable doxygen docs and zip files of doxygen output for the standard formats. A local directory has source and documentation too. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

/code/rcpp | permanent link

Tue, 29 Jun 2010

Rcpp 0.8.3

A new version 0.8.3 of Rcpp is now on CRAN and in Debian.

It comes about three weeks after the 0.8.2 release. And even though we promised to concentrate on documentation, it contains a raft of new features:

  • The addition of what we dub Rcpp sugar: some syntactic sugar based on clever use of expression templates that lets us write C++ expressions as neatly and compactly as vectorised R expressions (while getting C++ speed!!). More on that below.
  • New classes Date and Datetime with internal representations just like R's Date and POSIXct classes, plus vector versions DateVector and DatetimeVector which behave just like STL vectors. With that, the new API is now feature-complete compared to the older 'classic' API.
  • Rcpp Modules (our 'not unlike Boost::Python' feature introduced in version 0.8.1) can now expose public data members
  • A new API class InternalFunction which can expose C++ functions even without R modules.

The main thing here is Rcpp sugar for which we also have a new (seventh !!) vignette Rcpp-sugar. As a quick example, consider this simple C++ function that takes two vectors from R and creates a new one conditional on the relative values:

extern "C" SEXP foo( SEXP xs, SEXP ys) {
    Rcpp::NumericVector x(xs), y(ys);
    int n = x.size();
    Rcpp::NumericVector res( n );
    double xd = 0.0, yd = 0.0 ;
    for( int i=0; i<n; i++){
        xd = x[i];
        yd = y[i];
        if( xd < yd ){
            res[i] = xd * xd ;
        } else {
            res[i] = -( yd * yd);
        }
    }
    return res ;
}

Now, if you use R, you really want to write this more compactly. And now you can, thanks to Rcpp sugar:

extern "C" SEXP foo( SEXP xs, SEXP ys){
    Rcpp::NumericVector x(xs), y(ys);
    return ifelse( x < y, x*x, -(y*y));
}

Same great taste, but much less filling! More details are in the Rcpp-sugar vignette. Doug Bates is already a fan of this and is employing it in the lme4a development version of the well-known lme4 package.
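
To spell out what the one-liner computes, here is the same element-wise rule written against the standard library only. This is a sketch and not Rcpp code: std::vector stands in for Rcpp::NumericVector, the name signed_square is hypothetical, NA handling is left out, and both inputs are assumed to have the same length.

```cpp
#include <algorithm>
#include <vector>

// Element-wise equivalent of ifelse(x < y, x*x, -(y*y)).
// signed_square is a hypothetical name; y must be at least as long as x.
std::vector<double> signed_square(const std::vector<double>& x,
                                  const std::vector<double>& y) {
    std::vector<double> res(x.size());
    std::transform(x.begin(), x.end(), y.begin(), res.begin(),
                   [](double xd, double yd) {
                       return xd < yd ? xd * xd : -(yd * yd);
                   });
    return res;
}
```

The lambda plays the role of the if/else body in the explicit loop; the sugar version goes one step further and hides the iteration entirely.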

The full NEWS entry for this release follows below:

0.8.3   2010-06-27

    o	This release adds Rcpp sugar which brings (a subset of) the R syntax
        into C++. This supports : 
         - binary operators : <,>,<=,>=,==,!= between R vectors
         - arithmetic operators: +,-,*,/ between compatible R vectors
         - several functions that are similar to the R function of the same name:
        abs, all, any, ceiling, diff, exp, ifelse, is_na, lapply, pmin, pmax, 
        pow, sapply, seq_along, seq_len, sign
        
        Simple examples :
        
          // two numeric vectors of the same size
          NumericVector x ;
          NumericVector y ;
          NumericVector res = ifelse( x < y, x*x, -(y*y) ) ;
        
          // sapply'ing a C++ function
          double square( double x ){ return x*x ; }
          NumericVector res = sapply( x, square ) ;
        
        Rcpp sugar uses the technique of expression templates, pioneered by the 
        Blitz++ library and used in many libraries (Boost::uBlas, Armadillo). 
        Expression templates allow lazy evaluation of expressions, which 
        coupled with inlining generates very efficient code, very closely 
        approaching the performance of hand written loop code, and often
        much more efficient than the equivalent (vectorized) R code.
        
        Rcpp sugar is currently limited to vectors, future releases will 
        include support for matrices with sugar functions such as outer, etc ...
        
        Rcpp sugar is documented in the Rcpp-sugar vignette, which contains
        implementation details.

    o   New helper function so that "Rcpp?something" brings up Rcpp help

    o   Rcpp Modules can now expose public data members

    o   New classes Date, Datetime, DateVector and DatetimeVector with proper
        'new' API integration such as as(), wrap(), iterators, ...

    o   The so-called classic API headers have been moved to a subdirectory
        classic/ This should not affect client-code as only Rcpp.h was ever
        included.

    o   RcppDate now has a constructor from SEXP as well

    o   RcppDateVector and RcppDatetimeVector get constructors from int
        and both const / non-const operator(int i) functions
        
    o   New API class Rcpp::InternalFunction that can expose C++ functions
    	to R without modules. The function is exposed as an S4 object of 
    	class C++Function
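
The expression-template technique mentioned in the NEWS entry above can be shown in miniature. The sketch below is not how Rcpp sugar is implemented; every name in it (Expr, Vec, Sum) is hypothetical. It only demonstrates the idea that a + b + c builds a lightweight expression object, and that the sums are computed element by element in a single loop when the result is assigned.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// CRTP base: lets operator+ accept any expression type E without virtual calls.
template <typename E>
struct Expr {
    const E& self() const { return static_cast<const E&>(*this); }
};

// Concrete vector; constructing it from an expression runs the one and only loop.
struct Vec : Expr<Vec> {
    std::vector<double> data;
    explicit Vec(std::vector<double> d) : data(std::move(d)) {}
    template <typename E>
    Vec(const Expr<E>& e) : data(e.self().size()) {
        for (std::size_t i = 0; i < data.size(); i++) data[i] = e.self()[i];
    }
    double operator[](std::size_t i) const { return data[i]; }
    std::size_t size() const { return data.size(); }
};

// Node for lhs + rhs; it holds references only, so it must be consumed
// within the same statement that created it.
template <typename L, typename R>
struct Sum : Expr<Sum<L, R> > {
    const L& lhs;
    const R& rhs;
    Sum(const L& l, const R& r) : lhs(l), rhs(r) {}
    double operator[](std::size_t i) const { return lhs[i] + rhs[i]; }
    std::size_t size() const { return lhs.size(); }
};

// Building the expression costs no arithmetic; evaluation is deferred.
template <typename L, typename R>
Sum<L, R> operator+(const Expr<L>& l, const Expr<R>& r) {
    return Sum<L, R>(l.self(), r.self());
}
```

With three Vec objects a, b and c, the statement Vec r = a + b + c; creates no intermediate vectors: each r[i] is computed as (a[i] + b[i]) + c[i] inside the constructor loop.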

As always, even fuller details are on the Rcpp Changelog page and the Rcpp page, which also leads to the downloads, the browseable doxygen docs and zip files of doxygen output for the standard formats. A local directory has source and documentation too. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

/code/rcpp | permanent link

Tue, 15 Jun 2010

RcppArmadillo 0.2.3

The minor bug-fix release 0.2.3 of RcppArmadillo went to CRAN this morning.

It adds a tiny bit of configuration to permit Sun Studio / suncc to successfully build the package. There is no code change, and no configuration change for the other platforms. Thanks to Brian Ripley for additional testing, and of course for running those build instances (and everything else he does) for the R project, and to Conrad Sanderson as upstream author of the Armadillo C++ library for linear algebra.

As usual, more information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

/code/rcpp | permanent link

Sun, 13 Jun 2010

Beancounter minor bug fix release 0.8.10

A few months after the 0.8.9 release, we have another small update to beancounter. This version fixes a minor infelicity in the manual page (thanks to an Ubuntu and then Debian bug report) as well as a small coding bug where 'USD' was hard-coded when the user-defined default home currency should have been used.

The new version is now in Debian, at CPAN and on my beancounter page here. Enjoy!

/code/beancounter | permanent link

Thu, 10 Jun 2010

Rcpp 0.8.2

A bug fix release Rcpp version 0.8.2 is now on CRAN and Debian. It contains some fixes for the Sun compiler, but no user-visible changes, and complements the Rcpp 0.8.1 release made Tuesday. Our thanks to Brian Ripley for help with the portability tests for that platform. It will be good to get an autobuilder for pre-emptive testing for that platform too as we make quite some use of the win-builder service in Dortmund.

The full NEWS entry for this release follows below:

0.8.2   2010-06-09

    o   Bug-fix release for suncc compiler with thanks to Brian Ripley for
        additional testing.

As always, even fuller details are on the Rcpp Changelog page and the Rcpp page, which also leads to the downloads, the browseable doxygen docs and zip files of doxygen output for the standard formats. A local directory has source and documentation too. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

/code/rcpp | permanent link

Wed, 09 Jun 2010

RcppArmadillo 0.2.2

Following the Rcpp 0.8.1 release we made yesterday, we released RcppArmadillo release 0.2.2 this morning. RcppArmadillo uses Rcpp (and some 'glue' code) to provide a transparent interface from R to Conrad Sanderson's impressive Armadillo library for linear algebra.

This release works well with the most recent inline release 0.3.5. One can now employ inlined C++ code from R as we generalized how/which headers are included and how library / linking information is added thanks to a plugin mechanism. This is the first RcppArmadillo version to provide such a plugin. We also updated the included Armadillo headers to its most recent release 0.9.10, added some more operators and provide a utility function RcppArmadillo:::CxxFlags() to provide include directory information on the fly.

An example of the direct inline approach for the fastLm function:

library(inline)
library(RcppArmadillo)

src <- '
	Rcpp::NumericVector yr(ys);			// creates Rcpp vector from SEXP
	Rcpp::NumericMatrix Xr(Xs);			// creates Rcpp matrix from SEXP
	int n = Xr.nrow(), k = Xr.ncol();

	arma::mat X(Xr.begin(), n, k, false);   	// reuses memory and avoids extra copy
	arma::colvec y(yr.begin(), yr.size(), false);

	arma::colvec coef = arma::solve(X, y);      	// fit model y ~ X
	arma::colvec res = y - X*coef;			// residuals

	double s2 = std::inner_product(res.begin(), res.end(), res.begin(), double())/(n - k);
							// std.errors of coefficients
	arma::colvec std_err = arma::sqrt(s2 * arma::diagvec( arma::inv(arma::trans(X)*X) ));

	return Rcpp::List::create(Rcpp::Named("coefficients") = coef,
				  Rcpp::Named("stderr")       = std_err,
				  Rcpp::Named("df")           = n - k
				  );
'

fun <- cxxfunction(signature(ys="numeric", Xs="numeric"), src, plugin="RcppArmadillo")

This creates a compiled function fun which, by using Armadillo, regresses a vector ys on a matrix Xs (just how the fastLmPure() function in the package does) --- yet is constructed on the fly using cxxfunction from inline.
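
The quantities returned by the inline function (coefficients, standard errors, degrees of freedom) can be checked by hand in the simplest case of one regressor plus an intercept. The sketch below uses only the standard library and closed-form formulas rather than Armadillo; SimpleFit and simple_ols are hypothetical names introduced for this illustration.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Result of fitting y = a + b*x by ordinary least squares.
// SimpleFit and simple_ols are hypothetical names for this sketch only.
struct SimpleFit {
    double intercept, slope;        // coefficients
    double se_intercept, se_slope;  // standard errors of the coefficients
    int df;                         // residual degrees of freedom, n - k
};

SimpleFit simple_ols(const std::vector<double>& x, const std::vector<double>& y) {
    const std::size_t n = x.size();
    const int k = 2;  // two coefficients: intercept and slope

    double sx = 0.0, sy = 0.0;
    for (std::size_t i = 0; i < n; i++) { sx += x[i]; sy += y[i]; }
    const double mx = sx / n, my = sy / n;

    double sxx = 0.0, sxy = 0.0;  // centered sums of squares and cross products
    for (std::size_t i = 0; i < n; i++) {
        sxx += (x[i] - mx) * (x[i] - mx);
        sxy += (x[i] - mx) * (y[i] - my);
    }

    SimpleFit f;
    f.slope = sxy / sxx;
    f.intercept = my - f.slope * mx;
    f.df = static_cast<int>(n) - k;

    double rss = 0.0;  // residual sum of squares
    for (std::size_t i = 0; i < n; i++) {
        const double r = y[i] - f.intercept - f.slope * x[i];
        rss += r * r;
    }
    const double s2 = rss / f.df;  // same role as s2 in the Armadillo code

    // Square roots of the diagonal of s2 * (X'X)^{-1} for X = [1, x].
    f.se_slope = std::sqrt(s2 / sxx);
    f.se_intercept = std::sqrt(s2 * (1.0 / n + mx * mx / sxx));
    return f;
}
```

For more than one regressor the closed forms no longer apply, which is where a linear algebra library such as Armadillo earns its keep.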

More information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

/code/rcpp | permanent link

Tue, 08 Jun 2010

Rcpp 0.8.1

Early this morning I sent Rcpp version 0.8.1 off to CRAN and Debian. In the meantime, Romain has already provided a very nice blog post about it.

There are a few fairly visible new things in this release. As we want to focus the next few minor releases on completing the documentation, we started by adding a total of four (!!) new vignettes:

  • Rcpp-package showing how to write your own package using Rcpp,
  • Rcpp-FAQ addressing several frequently asked questions,
  • Rcpp-modules discussing how to expose C++ functions and modules with ease using an idea borrowed from Boost::Python, and
  • Rcpp-extending detailing the steps needed to extend Rcpp with user-provided or third-party classes.

The most interesting new feature is what we call Rcpp modules and is modeled after Boost::Python. This makes it pretty easy to expose C++ functions and classes to R -- without having to write glue code. This is pretty new and may change a tad over the coming releases, but it is also quite exciting.

Other changes concern more improvements for use of inline which should now allow packages like our RcppArmadillo to be used with it, and some bug fixes. The full NEWS entry for this release follows below:

0.8.1   2010-06-08

    o   This release adds Rcpp modules. An Rcpp module is a collection of
        internal (C++) functions and classes that are exposed to R. This
        functionality has been inspired by Boost.Python.
        
        Modules are created internally using the RCPP_MODULE macro and
        retrieved in the R side with the Module function. This is a preview 
        release of the module functionality, which will keep improving until
        the Rcpp 0.9.0 release. 

        The new vignette "Rcpp-modules" documents the current feature set of
        Rcpp modules.
        
    o   The new vignette "Rcpp-package" details the steps involved in making a
        package that uses Rcpp.

    o   The new vignette "Rcpp-FAQ" collects a number of frequently asked
        questions and answers about Rcpp.

    o   The new vignette "Rcpp-extending" documents how to extend Rcpp
        with user defined types or types from third party libraries. Based on
        our experience with RcppArmadillo
        
    o   Rcpp.package.skeleton has been improved to generate a package using 
        an Rcpp module, controlled by the "module" argument

    o   Evaluating a call inside an environment did not work properly
        
    o   cppfunction has been withdrawn since the introduction of the more
        flexible cxxfunction in the inline package (0.3.5). Rcpp no longer
        depends on inline since many uses of Rcpp do not require inline at
        all. We still use inline for unit tests but this is now handled
        locally in the unit tests loader runTests.R. 

        Users of the now-withdrawn function cppfunction can redefine it as:
        
           cppfunction <- function(...) cxxfunction( ..., plugin = "Rcpp" )

    o   Support for std::complex was incomplete and has been enhanced.

    o   The methods XPtr::getTag and XPtr::getProtected are deprecated, 
        and will be removed in Rcpp 0.8.2. The methods tag() and prot() should
        be used instead. tag() and prot() support both LHS and RHS use. 

    o   END_RCPP now returns the R Nil values; new macro VOID_END_RCPP
        replicates prior behaviour
        

As always, even fuller details are on the Rcpp Changelog page and the Rcpp page, which also leads to the downloads, the browseable doxygen docs and zip files of doxygen output for the standard formats. A local directory has source and documentation too. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

/code/rcpp | permanent link

Thu, 03 Jun 2010

inline 0.3.5

Yesterday morning, Romain pushed inline release 0.3.5 to CRAN.

This is in some ways a continuation of the 0.3.4 release I had made in December. That release had opened the door for the wide use of inline in our Rcpp package. And just as Rcpp has grown, we now have needs beyond the initial change. See the post on Romain's blog for details, but in a nutshell we are now gaining

  • cxxfunction which extends cfunction further for C++ use and, among other things, adds a plugin system we can use from RcppArmadillo to permit use of inline
  • package.skeleton which makes it easy to carry a function that one has prototyped with inline over into its own package (how to do that was a question at my most recent Rcpp talk in Vienna), and
  • getDynLib which Romain will use to great effect in the next version of Rcpp to provide something not unlike Boost::Python. Stay tuned!

Last but not least, our thanks to Oleg Sklyar for letting us extend his amazing inline package for use by Rcpp.

/code/inline | permanent link

Mon, 31 May 2010

Bike The Drive 2010

Memorial Day weekend is also time for the annual Bike The Drive in Chicago. This time only half the household got up bright and early and enjoyed Lakeshore Drive free of cars. A highly recommended event.

/sports/cycling | permanent link

JPM Chase Corporate Challenge 2010

It's Memorial Day weekend so it was time for Chicago's JP Morgan Chase Corporate Challenge on Thursday. The weather was glorious, the usual 20-some thousand runners participated and a good time was had. Work had arranged for a nice tent, food, music --- and a bunch of people showed up and enjoyed it. Nice one.

This time we all got chip-timing via a small (RFID?) strip attached to the back of the bib number. Which is handy as I managed to not stop my time by hand correctly. Given that I am still nursing a sore Achilles tendon and don't train well or much, the time of 23:51 (or 6:48 min/mile) was ok compared to the seven previous times I have run this.

/sports/running | permanent link

Thu, 27 May 2010

WU Wien presentations

Last week I had the opportunity to spend a few days at the Institute for Statistics and Mathematics of the WU Vienna / Wirtschaftsuniversitaet Wien. On Thursday, I gave a seminar on Rcpp and RInside introducing all the recent work with Romain on making R and C++ integration easier. Both (compact) handout and (full) presentation slides are now posted alongside the other presentations.

On Friday, I also gave an informal lecture / tutorial / workshop to some of the Stats and Finance Ph.D. students, drawing largely from the section on parallel computing of the most recent Introduction to High-Performance Computing with R tutorial.

My sincere thanks to Kurt Hornik and Stefan Theussl for the invite -- it was a great trip, notwithstanding the mostly unseasonably cold and wet weather.

/computers/R | permanent link

Thu, 20 May 2010

RcppArmadillo 0.2.0 (and 0.2.1)

With the Rcpp 0.8.0 release on Monday, Romain, Doug and I were able to follow-up with a new RcppArmadillo release. RcppArmadillo uses Rcpp (and a few dozen lines of 'glue') to provide a transparent interface from R to Conrad Sanderson's impressive Armadillo library for linear algebra.

This new release offers a number of key improvements:

  • Headers-only: given that Armadillo is a C++ template library, we now ship its headers in the package. In the previous release, we required Armadillo to be built as a library. As this Armadillo library mostly provides things we get from R for free (such as access to BLAS, where available), we can do without it and stick to templates-only. The upside is that the usage requirements for RcppArmadillo have become much simpler: R, a C++ compiler and Rcpp. In practice, this also means that Windows users will now get pre-built binaries via CRAN.
  • Update to Armadillo 0.9.8: We added the headers from Conrad's most recent release.
  • The fastLm() function is now generic and provides a default and formula interface just like lm() along with standard methods print, summary and predict. The documentation is enhanced as well and now contains an example of a rank-deficient model matrix where the non-pivoting scheme of fastLm() fails.
  • Doug Bates joined Romain and myself as an author of RcppArmadillo

While we had tested this quite rigorously, the combination of some last minute changes that were meant to be stylistic-only, some troubles with the tests and builds at CRAN that were not apparent in all our tests (hint: do not yet use dynamic help features referencing other packages even if you have a Depends: on them) and an upcoming travel deadline meant that we missed a gotcha on Windows---so release 0.2.1 had to follow a few hours after the short-lived 0.2.0.

More information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

/code/rcpp | permanent link

Mon, 17 May 2010

Rcpp 0.8.0

Romain and I are happy to announce the release of Rcpp version 0.8.0. It has been uploaded to CRAN. A Debian upload is delayed until the now-required inline package is accepted into Debian. The source package is also available from here.

This release brings a number of changes that are detailed below. Of particular interest may be the much more robust treatment of exceptions, the new classes for data frames and formulae, and the availability of the new helper function cppfunction for use with inline. Also of note is the new support for the 'LinkingTo' directive with which packages using Rcpp will get automatic access to the header files.

An announcement email went to the r-packages list (ETH Zuerich, Gmane); Romain also blogged about the release.

The full NEWS entry for this release follows below:

0.8.0   2010-05-17

    o   All Rcpp headers have been moved to the inst/include directory,
	allowing use of 'LinkingTo: Rcpp'. But the Makevars and Makevars.win
	are still needed to link against the user library.

    o   Automatic exception forwarding has been withdrawn because of
	portability issues (as it did not work on the Windows platform).
	Exception forwarding is still possible but is now based on explicit
	code of the form:
	
	  try {
	    // user code
	  } catch( std::exception& __ex__){ 
	    forward_exception_to_r( __ex__ ) ;
	  }
	
	Alternatively, the macros BEGIN_RCPP and END_RCPP can be used to enclose
	code so that it captures exceptions and forwards them to R. 
	
	  BEGIN_RCPP
	  // user code
	  END_RCPP
	
    o   new __experimental__ macros 
	
	The macros RCPP_FUNCTION_0, ..., RCPP_FUNCTION_65 help create C++
	functions while hiding some code repetition: 
	
	  RCPP_FUNCTION_2( int, foobar, int x, int y){
	    return x + y ;
	  }

	The first argument is the output type, the second argument is the 
	name of the function, and the other arguments are arguments of the C++ 
	function. Behind the scenes, the RCPP_FUNCTION_2 macro creates 
	an intermediate function compatible with the .Call interface and handles 
	exceptions.
	
	Similarly, the macros RCPP_FUNCTION_VOID_0, ..., RCPP_FUNCTION_VOID_65 
	can be used when the C++ function to create returns void. The generated
	R function will return R_NilValue in this case.
	
	  RCPP_FUNCTION_VOID_1( foobar, std::string foo ){
	    // do something with foo
	  }

	  
	The macro RCPP_XP_FIELD_GET generates a .Call compatible function that
	can be used to access the value of a field of a class handled by an 
	external pointer. For example with a class like this: 
	
	  class Foo{
		public:
			int bar ;
	  } ;
	
	  RCPP_XP_FIELD_GET( Foo_bar_get, Foo, bar ) ;
	  
	RCPP_XP_FIELD_GET will generate the .Call compatible function called
	Foo_bar_get that can be used to retrieve the value of bar.
	
	
	The macro RCPP_XP_FIELD_SET generates a .Call compatible function that 
	can be used to set the value of a field. For example:
	
	  RCPP_XP_FIELD_SET( Foo_bar_set, Foo, bar ) ;
	
	generates the .Call compatible function called "Foo_bar_set" that 
	can be used to set the value of bar
	
	
	The macro RCPP_XP_FIELD generates both getter and setter. For example
	
	  RCPP_XP_FIELD( Foo_bar, Foo, bar )
	  
	generates the .Call compatible Foo_bar_get and Foo_bar_set using the 
	macros RCPP_XP_FIELD_GET and RCPP_XP_FIELD_SET previously described
	
	  
	The macros RCPP_XP_METHOD_0, ..., RCPP_XP_METHOD_65 facilitate 
	calling a method of an object that is stored in an external pointer. For 
	example: 
	
	  RCPP_XP_METHOD_0( foobar, std::vector<int> , size )
	
	creates the .Call compatible function called foobar that calls the
	size method of the std::vector<int> class. This uses the Rcpp::XPtr<
	std::vector<int> > class.
	
	The macros RCPP_XP_METHOD_CAST_0, ... are similar but the result of
	the method called is first passed to another function before being
	wrapped to a SEXP.  For example, if one wanted the result as a double
	
	   RCPP_XP_METHOD_CAST_0( foobar, std::vector<int> , size, double )
	
	The macros RCPP_XP_METHOD_VOID_0, ... are used when calling the
	method is only used for its side effect.
	
	  RCPP_XP_METHOD_VOID_1( foobar, std::vector<int>, push_back ) 
	
	Assuming xp is an external pointer to a std::vector<int>, this could
	be called like this :
	
	  .Call( "foobar", xp, 2L )
	
    o	Rcpp now depends on inline (>= 0.3.4)
	
    o   A new R function "cppfunction" was added which invokes cfunction from
	inline with focus on Rcpp usage (enforcing .Call, adding the Rcpp
	namespace, set up exception forwarding). cppfunction uses BEGIN_RCPP
	and END_RCPP macros to enclose the user code

    o	new class Rcpp::Formula to help building formulae in C++
    
    o   new class Rcpp::DataFrame to help building data frames in C++
	
    o   Rcpp.package.skeleton gains an argument "example_code" and can now be
	used with an empty list, so that only the skeleton is generated. It
	has also been reworked to show how to use LinkingTo: Rcpp
	
    o   wrap now supports containers of the following types: long, long double,
	unsigned long, short and unsigned short which are silently converted
	to the most acceptable R type.
	
    o	Revert to not double-quote protecting the path on Windows as this
	breaks backticks expansion used in Makevars.win etc
        
    o   Exception classes have been moved out of Rcpp classes,
	e.g. Rcpp::RObject::not_a_matrix is now Rcpp::not_a_matrix
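
The explicit exception forwarding described in the NEWS entry above boils down to a pair of macros that open a try block and close it with a catch clause. The following stand-alone sketch shows that pattern in plain C++. It is not the Rcpp implementation: the names BEGIN_GUARD, END_GUARD, forward_exception_to_host and last_forwarded_error are hypothetical stand-ins, and the "host" here is just a string variable rather than R.

```cpp
#include <exception>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the host side: remembers the last forwarded
// message. (In Rcpp the corresponding step hands the error over to R.)
static std::string last_forwarded_error;

inline void forward_exception_to_host(const std::exception& ex) {
    last_forwarded_error = ex.what();
}

// Hypothetical macro pair mirroring the BEGIN_RCPP / END_RCPP idea:
// the first opens a try block, the second closes it and forwards
// any std::exception that escaped the enclosed code.
#define BEGIN_GUARD try {
#define END_GUARD                                   \
    } catch (const std::exception& ex) {            \
        forward_exception_to_host(ex);              \
    }

// Example entry point: returns 100 / denominator, or -1 after an error.
int guarded_divide(int denominator) {
    BEGIN_GUARD
    if (denominator == 0) {
        throw std::invalid_argument("division by zero");
    }
    return 100 / denominator;
    END_GUARD
    return -1;  // reached only after an exception was forwarded
}
```

Calling guarded_divide(0) does not terminate the program: the exception is caught inside the function, its message is stored, and the caller gets the error value back.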

As always, even fuller details are on the Rcpp Changelog page and the Rcpp page which also leads to the downloads, the browseable doxygen docs and zip files of doxygen output for the standard formats. A local directory has source and documentation too. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

/code/rcpp | permanent link

Mon, 26 Apr 2010

R Project and Google Summer of Code: Welcome to our students!

A few hours ago, I sent the following to both the R development list and the informal R / GSoC list:
Date: Mon, 26 Apr 2010 15:27:29 -0500
To: R Development List 
CC: gsoc-r 
Subject: R and the Google Summer of Code 2010 -- Please welcome our new students!
From: Dirk Eddelbuettel 

Earlier today Google finalised student / mentor pairings and allocations for
the Google Summer of Code 2010 (GSoC 2010).  The R Project is happy to
announce that the following students have been accepted:

   Colin Rundel, "rgeos - an R wrapper for GEOS", mentored by Roger Bivand of
      the Norges Handelshoyskole, Norway

   Ian Fellows, "A GUI for Graphics using ggplot2 and Deducer", mentored by
      Hadley Wickham of Rice University, USA

   Chidambaram Annamalai, "rdx - Automatic Differentiation in R", mentored by
      John Nash of University of Ottawa, Canada

   Yasuhisa Yoshida, "NoSQL interface for R", mentored by Dirk Eddelbuettel,
      Chicago, USA

   Felix Schoenbrodt, "Social Relations Analyses in R", mentored by Stefan
      Schmukle, Universitaet Muenster, Germany

   Details about all proposals are on the R Wiki page for the GSoC 2010 at
   http://rwiki.sciviews.org/doku.php?id=developers:projects:gsoc2010

The R Project is honoured to have received its highest number of student
allocations yet, and looks forward to an exciting Summer of Code.  Please
join me in welcoming our new students.

At this time, I would also like to thank all the other students who have
applied for working with R in this Summer of Code. With a limited number of
available slots, not all proposals can be accepted -- but I hope that those
not lucky enough to have been granted a slot will continue to work with R and
towards making contributions within the R world. 

I would also like to express my thanks to all other mentors who provided for
a record number of proposals.  Without mentors and their project ideas we
would not have a Summer of Code -- so hopefully we will see you again next
year. 

Regards,  

Dirk (acting as R/GSoC 2010 admin)

/computers/misc | permanent link

Wed, 21 Apr 2010

Boston Marathon 2010

This Monday saw the 114th running of the Boston Marathon. Under near-ideal conditions with some sunny skies, some clouds and moderate temperatures in the high 40s to low 50s, a course record as well as a US men's record were set. Having run this race in 2007 and in 2009 (with 2008 in London), I knew the difficult course and was aiming to run it more evenly. So no breakdown on those famous hills! But without too much fire in my belly over the winter, training was somewhat minimal: one 20 miler, two 18 milers. Friends and co-runners were joking that I had turned my trusted 'run less, run faster' program by the FIRST lab into my own 'run less, run lesser' program. And then there were a nagging Achilles tendon and a little strain from the underwhelming recent half-marathon, so I was really trying to hold back and not go out too fast on the initial portion which is mostly downhill.

Which seems to have worked. My pace was more even, and I conserved some energy and made it past the last of the hills around mile 21 without walking a single step while staying a few seconds under an 8 min/mile average pace. But then around mile 23 and 24 I had two sharp short cramps which forced me to walk. Interestingly enough, Bob Richards writes about cramps as a main theme for many runners in this year's race. Maybe the wind and temperature combined with the hills to get us after all! Anyway, I ended up with 3:29:14 which, at a 7:59 pace, is right between the 2007 and 2009 results and quite decent given the circumstances.

And of course the weekend as a whole was again a hoot even if I had only a short stay of around 30 hours in Boston given our R/Finance conference on Friday and Saturday. We'll see if I will manage to qualify once more for next year.

/sports/running | permanent link

Tue, 20 Apr 2010

R / Finance 2010 presentations

Last Friday and Saturday the second R / Finance conference took place in Chicago on the UIC campus.

As a co-organizer, it was a great pleasure to see so many users of R in Finance---from both industry and academia---come to Chicago to discuss and share recent work. There is a lot going on, and it is always good to exchange ideas with others sharing the same infrastructure. Participants appeared to enjoy the conference. My thanks to everybody who helped to put it together, from the local committee to the helping hands at UIC and of course the sponsors.

I just put my slides from the Extending and Embedding R with C++ tutorial preceding the conference, as well as the RQuantLib: Interfacing QuantLib from R presentation (with Khanh Nguyen), up onto my presentations page. I do have a usb-drive with all conference presentations and will provide them via the R / Finance site in a few days.

The only truly sour note is the fact that several presenters from Europe had their travel schedules turned upside down by the disruption to international air travel caused by the Icelandic volcano eruption and the resulting ash clouds. While we are glad to have had them for a little longer in Chicago, we understand that they are getting eager to return home. I hope this extended stay in the Windy City does not take away from the overall usefulness of the trip.

/computers/R | permanent link

Fri, 16 Apr 2010

Rcpp 0.7.12

A new bug-fix version 0.7.12 of Rcpp is awaiting inclusion into CRAN and Debian. It is also available from here.

This is another bug-fix version related solely to a build failure on Windows. Trying to protect paths with spaces has the side-effect of breaking backticks use, which unfortunately is already in use by a number of packages that have since broken during CRAN autobuilds. No other changes were made.

As always, full details are on the Rcpp Changelog page and the Rcpp page which also leads to the downloads, the browseable doxygen docs and zip files of doxygen output for the standard formats. A local directory has source and documentation too. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

/code/rcpp | permanent link

Wed, 07 Apr 2010

Video of UCLA / LA RUG talk on R and C++ integration

Thanks to the efforts of the tireless R User Group organizers Szilard Pafka (in Los Angeles, recording the talk) and Drew Conway (in New York, converting and organising hosting), there is now a video and slide combo of my recent talk about Rcpp and RInside at UCLA and the Los Angeles R Users Group.

Thanks also to David Smith (at the REvolutions blog) and Drew Conway (at his blog) for spreading the word about the presentation video and slides -- quite a few folks have come to my presentations page to get them.

/computers/R | permanent link

Sun, 04 Apr 2010

UCLA and LA RUG talks on R and C++ integration

We spent last week in the LA area and had a generally good time out west. I was able to sneak in two talks and a group discussion, thanks to the help by Jan de Leeuw (and everybody at UCLA's Stats department) as well as by Szilard Pafka representing the LA R User's Group. Pdf files for the slides for the talks are now on my presentations page in both a compact handout and presentation slide version (where the content is identical; if in doubt use the first file).

The talks centered around R and C++ integration using both Rcpp and RInside and summarise where both projects stand after all the recent work Romain and I put in over the last few months. The presentations went fairly well; I received some favourable comments.

Szilard and the R User Group had also suggested a group discussion about CRAN, its growth and how to maximise its usefulness. Given my CRANberries feed, my work on the CRAN Task Views for Empirical Finance and High-Performance Computing with R as well as our cran2deb binary package generator, I had some views and ideas that helped frame the discussion which turned out to be very useful and informed. So maybe we should do this User Group thing in Chicago too!

Special thanks to Jan de Leeuw and Szilard Pafka for organising the meeting, talks and discussion.

/computers/R | permanent link

Fri, 26 Mar 2010

Finance::YahooQuote 0.24

Having espoused rule number one in regression testing in the post about yesterday's bug fix upload 0.23, we can now add rule number zero: Do not introduce a new error by omitting the trailing semicolon. I guess it shows that I don't really program in Perl anymore.

Anyway, a new version 0.24 of Finance::YahooQuote which addresses the issue that required upload 0.23 yesterday is now in the Debian queue and on CPAN and my local yahooquote page. This time it may even work. A big thanks to the CPAN Testers for getting me reports on this one too.

/computers/linux/debian/packages | permanent link

Rcpp 0.7.11

A new version 0.7.11 of Rcpp is awaiting inclusion into CRAN and Debian. It is also available from here.

This version fixes a somewhat serious bug uncovered by Doug Bates when working with vectors of strings. We also added a few new accessor functions as well as a new convenience function create that is particularly useful for creating (possibly named) list objects that are returned to R.

Here is the full NEWS entry for this release:

0.7.11	2010-03-26

    o	Vector<> gains a set of templated factory methods "create" which
        takes up to 20 arguments and can create named or unnamed vectors.
        This greatly facilitates creating objects that are returned to R. 

    o	Matrix now has a diag() method to create diagonal matrices, and
        a new constructor using a single int to create square matrices

    o	Vector now has a new fill() method to propagate a single value
	
    o	Named is no more a class but a templated function. Both interfaces
        Named(.,.) and Named(.)=. are preserved, and extended to work also on 
        simple vectors (through Vector<>::create) 
	
    o	Applied patch by Alistair Gee to make ColDatum more robust
    
    o	Fixed a bug in Vector that caused random behavior due to the lack of
        copy constructor in the Vector template
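
The last NEWS item blames a missing copy constructor. As a generic illustration of that class of bug (this is not Rcpp's Vector, and the class name Buffer is invented for the example), here is a small resource-owning class. Without the user-defined copy constructor, the compiler-generated one would copy only the pointer, two objects would share and later free the same memory, and the behaviour would be just as random as described above.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>

// Hypothetical resource-owning class used only for this illustration.
class Buffer {
public:
    explicit Buffer(std::size_t n) : size_(n), data_(new double[n]()) {}

    // Copy constructor: allocate fresh storage and copy the elements,
    // so that each object owns its own memory.
    Buffer(const Buffer& other)
        : size_(other.size_), data_(new double[other.size_]) {
        std::copy(other.data_, other.data_ + other.size_, data_);
    }

    // Copy assignment via copy-and-swap, reusing the copy constructor.
    Buffer& operator=(Buffer other) {
        swap(other);
        return *this;
    }

    ~Buffer() { delete[] data_; }

    void swap(Buffer& other) {
        std::swap(size_, other.size_);
        std::swap(data_, other.data_);
    }

    double& operator[](std::size_t i) { return data_[i]; }
    const double& operator[](std::size_t i) const { return data_[i]; }
    std::size_t size() const { return size_; }

private:
    std::size_t size_;
    double* data_;
};
```

After copying, writing to the copy leaves the original untouched, and each object releases only its own storage when it goes out of scope.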

As always, even fuller details are on the Rcpp Changelog page and the Rcpp page which also leads to the downloads, the browseable doxygen docs and zip files of doxygen output for the standard formats. A local directory has source and documentation too. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

/code/rcpp | permanent link

Thu, 25 Mar 2010

Finance::YahooQuote 0.23

Rule number one in regression testing is to not depend on volatile data. Which I seem to have violated in file t/02simple.t in the Perl package Finance::YahooQuote.

Which led the automated Perl test scripts to remind me for a few days now that the full company name for symbol IBM no longer corresponded to what I had encoded. Not really a bug, but a failure in tests anyway.

So without further ado: a new version 0.23 of Finance::YahooQuote which addresses this issue is now in the Debian queue and on CPAN and my local yahooquote page.

/computers/linux/debian/packages | permanent link

Mon, 22 Mar 2010

RInside release 0.2.2

The shiny new 0.2.2 release of RInside has just been uploaded to CRAN; it should hit mirrors tomorrow. Sources are also at my RInside page.

RInside is a set of convenience classes to facilitate embedding of R inside of C++ applications. It works particularly well with Rcpp and now depends on it.

This is the first release since version 0.2.1 in early January. Romain and I made numerous changes to Rcpp in the meantime. With this release, RInside is starting to catch up by taking advantage of many new automatic (templated) type converters. We have updated the existing examples, and added several new ones. These are all visible directly via the Doxygen-generated documentation under the Files heading. Two examples are also shown directly on the RInside page.

Also added are new examples showing how to use RInside to embed R inside C++ applications using MPI for parallel computing. This was contributed via two example files by Jianping Hua, and we reworked the examples slightly (and added two variants that use MPI's C++ API).

As it is so short, here is the basic 'Hello, World' example now showing the simpler Rcpp-based variable assignment:

// -*- mode: C++; c-indent-level: 4; c-basic-offset: 4;  tab-width: 8; -*-
//
// Simple example showing how to do the standard 'hello, world' using embedded R
//
// Copyright (C) 2009 Dirk Eddelbuettel
// Copyright (C) 2010 Dirk Eddelbuettel and Romain Francois
//
// GPL'ed

#include <RInside.h>                    // for the embedded R via RInside

int main(int argc, char *argv[]) {

    RInside R(argc, argv);              // create an embedded R instance

    R["txt"] = "Hello, world!\n";	// assign a char* (string) to 'txt'

    R.parseEvalQ("cat(txt)");           // eval the init string, ignoring any returns

    exit(0);
}

One minor setback is that the examples currently segfault on Windows. That may be an issue with linking and class instantiation or something related. Romain and I focus much more on Linux and OS X, so this has not gotten a lot of attention. Debugging help would be appreciated.

/code/rinside | permanent link

Sun, 21 Mar 2010

2010 March Madness Half Marathon in Cary

The annual March Madness Half Marathon in Cary took place this morning. This is both one of Chicagoland's 'early races' to start the season as well as the classic Boston preparation due to the hilly course. I have now run this consecutively for six years (see 2005, 2006, 2007, 2008, and 2009).

As for the race conditions, we had fantastic weather all week with temperatures up to the sixties and then all of a sudden a forecast of rain, snow and even sleet for the weekend. Luckily, and while yesterday was sucky, today was all right or better. A little chilly and damp, but neither rain nor snow --- or even wind. So the conditions were good, with the course challenging as usual.

The race itself went fine. I ran more or less steadily, never had to stop but was not particularly fast at 1:39:38 or a pace of 7:36.3. I had aimed for beating 1:40, had missed that target by miles 4 to 6 and was about 10 or 15 seconds behind but managed to get a negative split on the second half of the course to reach that goal. Which is nice, but the time is still the slowest I've ever run that race, and my slowest half-marathon since 2004.

Training had been sluggish all winter. Oddly enough, already in last year's post I stated pretty much the same and feared that Boston may become tough --- which it did. But this year may well be a lot worse as I had no spring in my step all winter long. No fire in the belly for training will make for a long race. We'll see how it goes. Four weeks to go.

/sports/running | permanent link

Thu, 18 Mar 2010

R Project selected for the Google Summer of Code 2010

Earlier today, Google announced the list of accepted mentor organizations for the Google Summer of Code 2010 (GSoC 2010). And we are happy to report that the R Project is once again a participating organization (and now for the third straight year) joining a rather august group of open source projects from around the globe.

An R Wiki page had been created and serves as the central point of reference for the R Project and the GSoC 2010. It contains a list of project ideas, currently counting eleven and spanning everything from research-oriented topics (such as spatial statistics or automatic differentiation) to R community-support (regarding CRAN statistics and the CRANtastic site) to extensions (NoSQL, RPy2 data interfaces, Rserve browser integration) and more. I also just created a mailing list gsoc-r@googlegroups.com where prospective students and mentors can exchange ideas and discuss. As for other details, the Google Summer of Code 2010 site has most of the answers, and we will try to keep R-related information on the aforementioned R Wiki page.

/computers/misc | permanent link

Mon, 15 Mar 2010

Rcpp 0.7.10

Versions 0.7.7 to 0.7.9 of Rcpp contained a bug: protecting paths with quotes was supposed to help with Windows builds, but did the opposite at least in 'backticks mode' for getting path and/or library information. Using the shQuote() function instead helped. Our thanks to the tireless R-on-Windows maintainer Uwe Ligges for an earlier heads-up about the problem. So another quick bug-fix release 0.7.10 is now in Debian and should be on CRAN some time tomorrow.

We also put two small improvements in, see the full NEWS entry for this release:

0.7.10  2010-03-15

    o	new class Rcpp::S4 whose constructor checks if the object is an S4 object
	
    o	maximum number of templated arguments to the pairlist function, 
	the DottedPair constructor, the Language constructor and the 
	Pairlist constructor has been updated to 20 (was 5) and a script has been
	added to the source tree should we want to change it again

    o   use shQuote() to protect Windows path names (which may contain spaces)
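
For readers wondering what 'protecting' a path amounts to, here is a small stand-alone C++ sketch of Bourne-shell style quoting: wrap the string in single quotes and replace every embedded single quote by the four-character sequence '\''. The function name sh_quote is made up for this example. R's shQuote() supports several quoting types, and this sketch only mirrors the Bourne-shell rule, not whatever default R picks on Windows.

```cpp
#include <string>

// Hypothetical helper: quote a string for a Bourne-style shell so that
// spaces and other special characters are passed through literally.
std::string sh_quote(const std::string& s) {
    std::string out = "'";
    for (std::string::size_type i = 0; i < s.size(); ++i) {
        if (s[i] == '\'') {
            out += "'\\''";  // close quote, escaped quote, reopen quote
        } else {
            out += s[i];
        }
    }
    out += "'";
    return out;
}
```

A path such as C:/Program Files/R comes back as a single shell word, 'C:/Program Files/R', so the embedded space no longer splits it into two arguments.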

As always, even fuller details are on the Rcpp Changelog page and the Rcpp page which also leads to the downloads, the browseable doxygen docs and zip files of doxygen output for the standard formats. A local directory has source and documentation too. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

/code/rcpp | permanent link

Fri, 12 Mar 2010

Rcpp 0.7.9

Version 0.7.8 of Rcpp, released just a few days ago, contained a nasty bug or two which we noticed when trying to build the initial release of RcppArmadillo on 64-bit platforms.

So a quick bug-fix release 0.7.9 is now in Debian and should be on CRAN shortly.

The full NEWS entry for this release follows:

0.7.9   2010-03-12

    o	Another small improvement to Windows build flags

    o	bugfix on 64 bit platforms. The traits classes (wrap_type_traits, etc)
	used size_t when they needed to actually use unsigned int

    o	fixed pre gcc 4.3 compatibility. The trait class that was used to 
	identify if a type is convertible to another had too many false positives
	on pre gcc 4.3 (no tr1 or c++0x features). fixed by implementing the 
	section 2.7 of "Modern C++ Design" book. 
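
The convertibility test from section 2.7 of "Modern C++ Design" relies only on overload resolution and sizeof, which is why it works without tr1 or C++0x. A minimal sketch of the idea follows. The names Conversion, test and make_from are chosen for illustration, and this is not the code that went into Rcpp.

```cpp
// Compile-time check whether a value of type From converts implicitly to To.
// Two overloads of test() return types of different size; overload
// resolution picks the first one only if the conversion exists.
template <typename From, typename To>
class Conversion {
    typedef char Small;               // sizeof(Small) == 1
    struct Big { char dummy[2]; };    // sizeof(Big)   >= 2

    static Small test(const To&);     // chosen if From converts to To
    static Big   test(...);           // fallback for everything else
    static From  make_from();         // never defined, only used in sizeof

public:
    enum { exists = sizeof(test(make_from())) == sizeof(Small) };
};
```

Nothing is ever called at run time: sizeof only inspects the return type of the overload the compiler would select. So Conversion<int, double>::exists is true, while Conversion<int, char*>::exists is false.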

As always, even fuller details are on the Rcpp Changelog page and the Rcpp page which also leads to the downloads, the browseable doxygen docs and zip files of doxygen output for the standard formats. A local directory has source and documentation too. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

Update: First version number corrected to 0.7.8.

/code/rcpp | permanent link

Thu, 11 Mar 2010

RcppArmadillo 0.1.0

Besides the new RcppExamples, another new package RcppArmadillo got spun out of Rcpp with the recent release 0.7.8 of Rcpp.

Romain and I already had an example of a simple but fast linear model fit using the (very clever) Armadillo C++ library by Conrad Sanderson. In fact, I had used this as a motivational example of why Rcpp rocks in a recent talk to the ACM chapter at U of Chicago which, thanks to David Smith at REvo, got some further exposure.

Now this example is more refined as further glue got added. Given that both Armadillo and Rcpp make use of C++ templates, the actual amount of code in RcppArmadillo is not that large: just over 200 lines in a header file, and a little less for some testing accessor and example functions in a source file. And this makes for some really nice example code: the 'fast regression' example becomes this (where I simply removed two blocks conditional on the Armadillo version):

#include <RcppArmadillo.h>

extern "C" SEXP fastLm(SEXP ys, SEXP Xs) {

    Rcpp::NumericVector yr(ys);			// creates Rcpp vector from SEXP
    Rcpp::NumericMatrix Xr(Xs);			// creates Rcpp matrix from SEXP
    int n = Xr.nrow(), k = Xr.ncol();

    arma::mat X(Xr.begin(), n, k, false);   	// reuses memory and avoids extra copy
    arma::colvec y(yr.begin(), yr.size(), false);

    arma::colvec coef = solve(X, y);            // fit model y ~ X
    arma::colvec resid = y - X*coef; 		// residuals
    double sig2 = arma::as_scalar( trans(resid)*resid/(n-k) );
    						// std.error of estimate
    arma::colvec stderrest = sqrt( sig2 * diagvec( arma::inv(arma::trans(X)*X)) );

    Rcpp::Pairlist res(Rcpp::Named( "coefficients", coef),
                       Rcpp::Named( "stderr", stderrest));
    return res;
}

No extra copies! Armadillo instantiates directly from the underlying R objects for the vector and matrix, solves the regression equations, computes the standard error of the estimates and returns the two vectors. Leaving us to write about eleven lines of code. Moreover, as Armadillo is well designed and uses template meta-programming to avoid extra copies (see these lecture notes for details), it is about as efficient as it can be (and will use Atlas or other BLAS where available).
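
For reference, the quantities computed in the function above are the textbook least-squares formulas. Assuming X is the n x k model matrix with full column rank and y is the response vector, the code corresponds to the following (the notation is mine, not taken from the package):

```latex
\begin{align*}
\hat{\beta} &= \arg\min_{b} \lVert y - X b \rVert^{2}
             = (X^{\top} X)^{-1} X^{\top} y
             && \text{(coef)} \\
e &= y - X \hat{\beta}
             && \text{(resid)} \\
\hat{\sigma}^{2} &= \frac{e^{\top} e}{n - k}
             && \text{(sig2)} \\
\operatorname{se}(\hat{\beta}_{j}) &=
   \sqrt{\hat{\sigma}^{2} \, \bigl[ (X^{\top} X)^{-1} \bigr]_{jj}},
   \quad j = 1, \dots, k
             && \text{(stderrest)}
\end{align*}
```

The labels on the right name the matching variables in the code.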

And, this is just one example. Rcpp should be suitable for other C++ libraries, and provides an easy to use seamless interface between C++ and R.

However, we should note that (at about the last minute) we found out about some unit test failures on OS X as well as some issues in a Debian chroot -- cran2deb ran into some build issues on i386 and amd64 in the testing chroot even though it all works swimmingly on our Debian, Ubuntu and Fedora build environments. A follow-up with fixes for Rcpp and/or RcppArmadillo appears likely.

Update: The build issues seem to be with 64-bit systems and everything appears cool in 32-bit.

/code/rcpp | permanent link

Wed, 10 Mar 2010

RcppExamples 0.1.0

Version 0.1.0 of RcppExamples, a simple demo package for Rcpp should appear on CRAN some time tomorrow.

As mentioned in the post about release 0.7.8 of Rcpp, Romain and I carved this out of Rcpp itself to provide a cleaner separation of code that implements our R / C++ interfaces (which remain in Rcpp) and code that illustrates how to use it --- which is now in RcppExamples. This also provides an easier template for people wanting to use Rcpp in their packages as it will be easier to wrap one's head around the much smaller RcppExamples package.

A simple example (using the newer API) may illustrate this:

#include <Rcpp.h>

RcppExport SEXP newRcppVectorExample(SEXP vector) {

    Rcpp::NumericVector orig(vector);			// keep a copy (as the classic version does)
    Rcpp::NumericVector vec(orig.size());		// create a target vector of the same size

    // we could query size via
    //   int n = vec.size();
    // and loop over the vector, but using the STL is so much nicer
    // so we use a STL transform() algorithm on each element
    std::transform(orig.begin(), orig.end(), vec.begin(), sqrt);

    Rcpp::Pairlist res(Rcpp::Named( "result", vec),
                       Rcpp::Named( "original", orig));

    return res;
}

With essentially five lines of code, we provide a function that takes any numeric vector and returns both the original vector and a transformed version---here by applying a square root operation. Even the looping along the vector is implicit thanks to the generic programming idioms of the Standard Template Library.

Nicer still, even on misuse, exceptions get caught cleanly and we get returned to the R prompt without any explicit coding on the part of the user:

R> library(RcppExamples)
Loading required package: Rcpp
R> print(RcppVectorExample( 1:5, "new" )) # select new API
$result
[1] 1.000 1.414 1.732 2.000 2.236

$original
[1] 1 2 3 4 5

R> RcppVectorExample( c("foo", "bar"), "new" )
Error in RcppVectorExample(c("foo", "bar"), "new") :
  not compatible with INTSXP
R>

There is also analogous code for the older API in the package, but it is about three times as long, has to loop over the vector and needs to set up the exception handling explicitly.

As of right now, RcppExamples does not document every class but it should already provide a fairly decent start for using Rcpp. And many more actual usage examples are ... in the over two-hundred unit tests in Rcpp.

Update: Now actually showing new rather than classic API.

/code/rcpp | permanent link

Tue, 09 Mar 2010

Rcpp 0.7.8

Version 0.7.8 of the Rcpp R / C++ interface classes is now on CRAN and in Debian. As of right now, Debian has already built packages for eight more architectures; and CRAN has built the Windows binary. Oh, and cran2deb had Debian packages for 'testing' before I was done with the blog entry.

This is a minor feature release based on over three weeks of changes that are summarised below in the extract from the NEWS file. Some noteworthy highlights are

  • something that isn't there: we have split most of the example code and their manual pages off into a new package RcppExamples which can now be released given that 0.7.8 is out
  • another new package RcppArmadillo will also be forthcoming shortly: it shows how to use Rcpp with Conrad Sanderson's excellent Armadillo C++ library for linear algebra; this required some internal code changes to seamlessly pass data from R via Rcpp to Armadillo and back;
  • there is a new example fastLm using Armadillo for faster (than lm() or lm.fit()) linear model fits
  • yet more internal improvements to the class hierarchy as detailed below; more support for STL iterators and algorithms;
  • more build fixes; paths with spaces in the name should now be tolerated
  • and last but not least a new introduction / overview vignette based on a just-submitted paper on Rcpp.
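
As a taste of the 'STL iterators and algorithms' theme, the NEWS entry that follows mentions a helper that turns a per-character function such as tolower into a whole-string transformer. Here is a stand-alone sketch of that idea in plain C++. The names CharTransformer, make_char_transformer and to_lower_char are hypothetical and chosen so as not to suggest that this is the Rcpp class itself.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical function object: applies a per-character operation to
// every character of a string by way of std::transform.
template <typename UnaryOp>
class CharTransformer {
public:
    explicit CharTransformer(UnaryOp op) : op_(op) {}

    // Takes the input by value, transforms it in place, and returns it.
    std::string operator()(std::string input) const {
        std::transform(input.begin(), input.end(), input.begin(), op_);
        return input;
    }

private:
    UnaryOp op_;
};

// Helper that deduces the operation type, in the spirit of std::make_pair.
template <typename UnaryOp>
CharTransformer<UnaryOp> make_char_transformer(UnaryOp op) {
    return CharTransformer<UnaryOp>(op);
}

// Thin wrapper around std::tolower that goes through unsigned char,
// which keeps the call well defined for negative char values.
inline char to_lower_char(char c) {
    return static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
}
```

Usage is then a one-liner: make_char_transformer(to_lower_char)("Hello, World") yields the lower-cased string, with the loop over characters hidden inside std::transform.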

The full NEWS entry for this release follows:

0.7.8   2010-03-09

    o	All vector classes are now generated from the same template class
    	Rcpp::Vector<RTYPE> where RTYPE is one of LGLSXP, RAWSXP, STRSXP,
    	INTSXP, REALSXP, CPLXSXP, VECSXP and EXPRSXP. typedef are still 
    	available : IntegerVector, ... All vector classes gain methods 
    	inspired from the std::vector template : push_back, push_front, 
    	erase, insert
    	
    o	New template class Rcpp::Matrix<RTYPE> deriving from 
    	Rcpp::Vector<RTYPE>. These classes have the same functionality
    	as Vector but have a different set of constructors which checks
    	that the input SEXP is a matrix. Matrix<> however does/can not
    	guarantee that the object will allways be a matrix. typedef 
    	are defined for convenience: Matrix is IntegerMatrix, etc...
    	
    o	New class Rcpp::Row<RTYPE> that represents a row of a matrix
    	of the same type. Row contains a reference to the underlying 
    	Vector and exposes a nested iterator type that allows use of 
    	STL algorithms on each element of a matrix row. The Vector class
    	gains a row(int) method that returns a Row instance. Usage 
    	examples are available in the runit.Row.R unit test file
    	
    o	New class Rcpp::Column<RTYPE> that represents a column of a 
    	matrix. (similar to Rcpp::Row<RTYPE>). Usage examples are 
    	available in the runit.Column.R unit test file

    o	The Rcpp::as template function has been reworked to be more 
    	generic. It now handles more STL containers, such as deque and 
    	list, and the genericity can be used to implement as for more
    	types. The package RcppArmadillo has examples of this

    o   new template class Rcpp::fixed_call that can be used in STL algorithms
	such as std::generate.

    o	RcppExample et al have been moved to a new package RcppExamples;
        src/Makevars and src/Makevars.win simplified accordingly

    o	New class Rcpp::StringTransformer and helper function 
    	Rcpp::make_string_transformer that can be used to create a function
    	that transforms a string character by character. For example
    	Rcpp::make_string_transformer(tolower) transforms each character
    	using tolower. The RcppExamples package has an example of this.
        
    o	Improved src/Makevars.win thanks to Brian Ripley

    o	New examples for 'fast lm' using compiled code: 
        - using GNU GSL and a C interface
        - using Armadillo (http://arma.sf.net) and a C++ interface
        Armadillo is seen as faster for lack of extra copying

    o	A new package RcppArmadillo (to be released shortly) now serves 
        as a concrete example on how to extend Rcpp to work with a modern 
	C++ library such as the heavily-templated Armadillo library

    o	Added a new vignette 'Rcpp-introduction' based on a just-submitted 
        overview article on Rcpp
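
The StringTransformer entry above describes a function object that transforms a string character by character. A minimal standalone sketch of that idea in plain C++ follows; it uses no Rcpp headers, and the names CharTransformer, make_char_transformer and to_lower_char are hypothetical stand-ins rather than the Rcpp classes themselves.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical stand-in for the idea behind a string transformer:
// it stores a per-character function and applies it to every character.
template <typename UnaryOp>
class CharTransformer {
public:
    explicit CharTransformer(UnaryOp op) : op_(op) {}

    // Takes the input by value and returns the transformed copy,
    // so the caller's string is left untouched.
    std::string operator()(std::string input) const {
        std::transform(input.begin(), input.end(), input.begin(), op_);
        return input;
    }

private:
    UnaryOp op_;
};

// Helper that deduces the template argument (name is an assumption).
template <typename UnaryOp>
CharTransformer<UnaryOp> make_char_transformer(UnaryOp op) {
    return CharTransformer<UnaryOp>(op);
}

// Cast through unsigned char first: passing a negative char value
// to std::tolower is undefined behaviour.
inline char to_lower_char(char c) {
    return static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
}
```

With these definitions, make_char_transformer(to_lower_char) yields an object that can be called on any std::string, or handed to an STL algorithm that runs over a container of strings.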

As always, even fuller details are in the ChangeLog on the Rcpp page which also leads to the downloads, the browseable doxygen docs and zip files of doxygen output for the standard formats. A local directory has source and documentation too. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page

Update: Two links corrected.

/code/rcpp | permanent link

Thu, 25 Feb 2010

R and Sudoku solvers: Plus ca change...

Christian Robert blogged about a particularly heavy-handed solution to last Sunday's Sudoku puzzle in Le Monde. That had my sympathy as I like evolutionary computing methods, and his chart is rather pretty. From there, this spread to the REvolutions blog where David Smith riffed on it and showed the actual puzzle. That didn't stop things, as Christian blogged once more about it, this time welcoming his post-doc Robin Ryder who posts a heavy analysis of all this that is a little much for me at this time of day.

But what everybody seems to be forgetting is that R has had a Sudoku solver for years, thanks to the sudoku package by David Brahm and Greg Snow which was first posted four years ago. What comes around, goes around.

With that, and about one minute of Emacs editing to get the Le Monde puzzle into the required ascii-art form, all we need to do is this:

R> library(sudoku)
R> s <- readSudoku("/tmp/sudoku.txt")
R> s
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
 [1,]    8    0    0    0    0    1    2    0    0
 [2,]    0    7    5    0    0    0    0    0    0
 [3,]    0    0    0    0    5    0    0    6    4
 [4,]    0    0    7    0    0    0    0    0    6
 [5,]    9    0    0    7    0    0    0    0    0
 [6,]    5    2    0    0    0    9    0    4    7
 [7,]    2    3    1    0    0    0    0    0    0
 [8,]    0    0    6    0    2    0    1    0    9
 [9,]    0    0    0    0    0    0    0    0    0
R> system.time(solveSudoku(s))
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
 [1,]    8    4    9    6    7    1    2    5    3
 [2,]    6    7    5    2    4    3    9    1    8
 [3,]    3    1    2    9    5    8    7    6    4
 [4,]    1    8    7    4    3    2    5    9    6
 [5,]    9    6    4    7    8    5    3    2    1
 [6,]    5    2    3    1    6    9    8    4    7
 [7,]    2    3    1    8    9    4    6    7    5
 [8,]    4    5    6    3    2    7    1    8    9
 [9,]    7    9    8    5    1    6    4    3    2
   user  system elapsed
  5.288   0.004   5.951
R>

That took all of five seconds while my computer was also compiling a particularly resource-hungry C++ package....
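
For readers curious what a solver does under the hood, here is a minimal backtracking sketch in C++. It is not the algorithm used by the sudoku package (which I have not checked); it is simply the textbook depth-first approach, and the names Grid, can_place, solve and le_monde_puzzle are made up for this illustration.

```cpp
#include <array>
#include <cassert>

using Grid = std::array<std::array<int, 9>, 9>;  // 0 marks an empty cell

// True if digit d can go at (row, col) without clashing with the
// same row, the same column, or the enclosing 3x3 box.
bool can_place(const Grid& g, int row, int col, int d) {
    assert(row >= 0 && row < 9 && col >= 0 && col < 9);
    for (int i = 0; i < 9; ++i) {
        if (g[row][i] == d || g[i][col] == d) return false;
    }
    const int r0 = 3 * (row / 3), c0 = 3 * (col / 3);
    for (int r = r0; r < r0 + 3; ++r) {
        for (int c = c0; c < c0 + 3; ++c) {
            if (g[r][c] == d) return false;
        }
    }
    return true;
}

// Depth-first search with backtracking over cells 0..80 in row-major
// order; fills g in place and returns true once every cell is set.
bool solve(Grid& g, int cell = 0) {
    if (cell == 81) return true;
    const int row = cell / 9, col = cell % 9;
    if (g[row][col] != 0) return solve(g, cell + 1);
    for (int d = 1; d <= 9; ++d) {
        if (can_place(g, row, col, d)) {
            g[row][col] = d;
            if (solve(g, cell + 1)) return true;
            g[row][col] = 0;  // undo and try the next digit
        }
    }
    return false;
}

// The Le Monde puzzle exactly as printed in the R session above.
Grid le_monde_puzzle() {
    return {{
        {{8, 0, 0, 0, 0, 1, 2, 0, 0}},
        {{0, 7, 5, 0, 0, 0, 0, 0, 0}},
        {{0, 0, 0, 0, 5, 0, 0, 6, 4}},
        {{0, 0, 7, 0, 0, 0, 0, 0, 6}},
        {{9, 0, 0, 7, 0, 0, 0, 0, 0}},
        {{5, 2, 0, 0, 0, 9, 0, 4, 7}},
        {{2, 3, 1, 0, 0, 0, 0, 0, 0}},
        {{0, 0, 6, 0, 2, 0, 1, 0, 9}},
        {{0, 0, 0, 0, 0, 0, 0, 0, 0}},
    }};
}
```

Calling solve() on the grid returned by le_monde_puzzle() fills it in place; if the puzzle has a single solution, as published puzzles normally do, the result is the grid shown in the session above.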

Just in case we needed another illustration that it is hard to navigate the riches and wonders that is CRAN...

/computers/R | permanent link

Thu, 18 Feb 2010

U of C ACM talk

Fellow GSoC mentor and local ACM mastermind Borja Sotomayor had invited me a few months ago to give a talk at the ACM chapter at the University of Chicago. Today was the day, and the slides from the 50-minute talk on R and extending R with Rcpp are now on my presentations page.

/computers/R | permanent link

Sun, 14 Feb 2010

Rcpp 0.7.7

Version 0.7.7, a shiny new bug fix release of Rcpp, our set of R / C++ interface classes, just arrived on CRAN and in Debian. The Language class had a real bug leading to this new release just two days after 0.7.6.

0.7.7	2010-02-14

    o	new template classes Rcpp::unary_call and Rcpp::binary_call
    	that facilitate using R language calls together 
    	with STL algorithms.
    	
    o	fixed a bug in Language constructors taking a string as their
    	first argument. The created call was wrong.
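
To illustrate the first item: the point of a call wrapper like this is that a stored call becomes a function object which STL algorithms can invoke once per element. The sketch below shows that pattern in plain C++ without Rcpp; UnaryCall, Callable and apply_call are hypothetical names, and a std::function stands in for the R call.

```cpp
#include <algorithm>
#include <functional>
#include <utility>
#include <vector>

// Stand-in for "an R function": any callable taking and returning a double.
using Callable = std::function<double(double)>;

// Hypothetical analogue of the unary-call idea: a function object that
// stores a call and evaluates it on each element an algorithm passes in.
class UnaryCall {
public:
    explicit UnaryCall(Callable fn) : fn_(std::move(fn)) {}

    double operator()(double x) const { return fn_(x); }

private:
    Callable fn_;
};

// Applies the stored call to every element via std::transform.
std::vector<double> apply_call(const std::vector<double>& in,
                               const UnaryCall& call) {
    std::vector<double> out(in.size());
    std::transform(in.begin(), in.end(), out.begin(), call);
    return out;
}
```

A binary variant would follow the same shape with a two-argument operator() and the two-range overload of std::transform.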

As always, even fuller details are in the ChangeLog on the Rcpp page which also leads to the downloads, the browseable doxygen docs and zip files of doxygen output for the standard formats. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page

/code/rcpp | permanent link

Sat, 13 Feb 2010

Rcpp 0.7.6

The new 0.7.6 release of Rcpp, our set of R / C++ interface classes, is now at CRAN and Debian. This comes just a few days after 0.7.5 as we had made a mistake in Makefile.win which is now fixed. A few other things sneaked in while we were at it; see the snippet from the NEWS file below or look at Romain's blog where he highlights name-based indexing in vectors and the addition of iterator as well as begin() and end() members that now allow the use of STL algorithms on our R objects, which is nifty.

The changes are summarised below in the NEWS file snippet, more details are in the ChangeLog as well.

0.7.6   2010-02-12

    o   SEXP_Vector (and ExpressionVector and GenericVector, a.k.a List) now
	have methods push_front, push_back and insert that are templated

    o   SEXP_Vector now has int- and range-valued erase() members

    o   Environment class has a default constructor (for RInside)

    o   SEXP_Vector_Base factored out of SEXP_Vector (Effect. C++ #44)

    o   SEXP_Vector_Base::iterator added as well as begin() and end()
        so that STL algorithms can be applied to Rcpp objects

    o   CharacterVector gains a random access iterator, begin() and end() to
	support STL algorithms; iterator dereferences to a StringProxy

    o   Restore Windows build; successfully tested on 32 and 64 bit;

    o   Small fixes to inst/skeleton files for bootstrapping a package

    o   RObject::asFoo deprecated in favour of Rcpp::as

As always, even fuller details are in the ChangeLog on the Rcpp page which also leads to the downloads, the browseable doxygen docs and zip files of doxygen output for the standard formats. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page

/code/rcpp | permanent link

Tue, 09 Feb 2010

Rcpp 0.7.5

A new release of our Rcpp R / C++ interface classes is now out; the version number is 0.7.5. It comes on the heels of release 0.7.4 and keeps our semi-frantic schedule of releases every ten or so days going. The package is now on CRAN and Debian, and mirrors are starting to get the new version. As before, my local page provides more details and Romain's blog is always worth watching too.

The changes are summarised below in the NEWS file snippet, more details are in the ChangeLog as well.

0.7.5	2010-02-08

    o 	wrap has been much improved. wrappable types now are :
    	- primitive types : int, double, Rbyte, Rcomplex, float, bool
    	- std::string
    	- STL containers which have iterators over wrappable types:
    	  (e.g. std::vector<T>, std::deque<T>, std::list<T>, etc ...). 
    	- STL maps keyed by std::string, e.g std::map<std::string,T>
    	- classes that have implicit conversion to SEXP
    	- classes for which the wrap template is fully or partly specialized
    	This allows composition, so for example this class is wrappable: 
    	std::vector< std::map<std::string,T> > (if T is wrappable)
    	
    o 	The range based version of wrap is now exposed at the Rcpp::
    	level with the following interface : 
    	Rcpp::wrap( InputIterator first, InputIterator last )
    	This is dispatched internally to the most appropriate implementation
    	using traits

    o	a new namespace Rcpp::traits has been added to host the various
    	type traits used by wrap

    o 	The doxygen documentation now shows the examples

    o 	A new file inst/THANKS acknowledges the kind help we got from others

    o	The RcppSexp has been removed from the library.
    
    o 	The methods RObject::asFoo are deprecated and will be removed
    	in the next version. The alternative is to use as<Foo>.

    o	The method RObject::slot can now be used to get or set the 
    	associated slot. This is one more example of the proxy pattern
    	
    o	Rcpp::VectorBase gains a names() method that allows getting/setting
    	the names of a vector. This is yet another example of the 
    	proxy pattern.
    	
    o	Rcpp::DottedPair gains templated operator<< and operator>> that 
    	allow wrap and push_back or wrap and push_front of an object
    	
    o	Rcpp::DottedPair, Rcpp::Language, Rcpp::Pairlist are less
    	dependent on C++0x features. They gain constructors with up
    	to 5 templated arguments. 5 was chosen arbitrarily and might 
    	be updated upon request.
    	
    o	function calls by the Rcpp::Function class are less dependent
    	on C++0x. It is now possible to call a function with up to 
    	5 templated arguments (candidate for implicit wrap)
    	
    o	added support for 64-bit Windows (thanks to Brian Ripley and Uwe Ligges)
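
The NEWS entry mentions the proxy pattern twice (for slot and for names()). As a rough standalone illustration, assuming nothing about how Rcpp implements it, a getter can return a small proxy object that reads on conversion and writes on assignment. NamedVector and NamesProxy below are hypothetical toy classes, not Rcpp types.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Toy vector with a "names" attribute, standing in for an R vector.
class NamedVector {
public:
    // Proxy returned by names(): converting it reads the stored names,
    // assigning to it writes them back into the owning vector.
    class NamesProxy {
    public:
        explicit NamesProxy(NamedVector& owner) : owner_(owner) {}

        NamesProxy& operator=(const std::vector<std::string>& rhs) {
            owner_.names_ = rhs;
            return *this;
        }

        operator std::vector<std::string>() const { return owner_.names_; }

    private:
        NamedVector& owner_;
    };

    explicit NamedVector(std::vector<double> values)
        : values_(std::move(values)) {}

    std::size_t size() const { return values_.size(); }

    // One method serves as both getter and setter thanks to the proxy.
    NamesProxy names() { return NamesProxy(*this); }

private:
    std::vector<double> values_;
    std::vector<std::string> names_;
};
```

The same names() call then works on either side of an assignment: `v.names() = ...` sets the names, while `std::vector<std::string> n = v.names();` reads them.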

As always, even fuller details are in the ChangeLog on the Rcpp page which also leads to the downloads, the browseable doxygen docs and zip files of doxygen output for the standard formats. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page

/code/rcpp | permanent link

Fri, 05 Feb 2010

R / Finance 2010 Open for Registration

The announcement below went out to R-SIG-Finance earlier today. More information is, as usual, on the R / Finance 2010 page:

Now open for registrations:

R / Finance 2010: Applied Finance with R
April 16 and 17, 2010
Chicago, IL, USA

The second annual R / Finance conference for applied finance using R, the premier free software system for statistical computation and graphics, will be held this spring in Chicago, IL, USA on Friday April 16 and Saturday April 17.

Building on the success of the inaugural R / Finance 2009 event, this two-day conference will cover topics as diverse as portfolio theory, time-series analysis, as well as advanced risk tools, high-performance computing, and econometrics. All will be discussed within the context of using R as a primary tool for financial risk management and trading.

Invited keynote presentations by Bernhard Pfaff, Ralph Vince, Mark Wildi and Achim Zeileis are complemented by over twenty talks (both full-length and 'lightning') selected from the submissions. Four optional tutorials are also offered on Friday April 16.

R / Finance 2010 is organized by a local group of R package authors and community contributors, and hosted by the International Center for Futures and Derivatives (ICFD) at the University of Illinois at Chicago.

Conference registration is now open. Special advanced registration pricing is available, as well as discounted pricing for academic and student registrations.

More details and registration information can be found at the website at
http://www.RinFinance.com

For the program committee:

Gib Bassett, Peter Carl, Dirk Eddelbuettel, John Miller,
Brian Peterson, Dale Rosenthal, Jeffrey Ryan

See you in Chicago in April!

/computers/R | permanent link

Wed, 03 Feb 2010

RProtoBuf 0.1-0

Romain uploaded our first release of RProtoBuf to CRAN yesterday. RProtoBuf provides bindings for GNU R to the Google Protobuf implementation. Google Protobuf is (and I quote) a way of encoding structured data in an efficient yet extensible format that is used for almost all internal RPC protocols and file formats at Google.

RProtoBuf had a funny start. I had blogged about the 12-hour passage from proof of concept to R-Forge project following the ORD session hackfest in October. What happened next was just as good. Romain emailed within hours of the blog post and reminded me of a similar project that is part of Saptarshi Guha's RHIPE R/Hadoop implementation. So the three of us -- Romain, Saptarshi and I -- started emailing, and before long it became clear that Romain was both rather intrigued by this (whereas Saptarshi has slightly different needs for the inner workings of his Hadoop bindings) and able to devote some time to it. So the code kept growing and growing at a fairly rapid clip, until that stopped as we switched to working feverishly on Rcpp, both to support the needs of this project and to implement ideas we had while working on it. That has now led to the point where Rcpp is maturing in terms of features, so we will probably have time to come back to more work on RProtoBuf and take advantage of the nice templated autoconversions we now have in Rcpp. Oddly enough, the initial blog post seemed to anticipate changes in Rcpp.

Anyway -- RProtoBuf is finally here and it already does a fair amount of magic based on code reflection using the proto files. The Google documentation has a simple example of a 'person' entry in an 'addressbook' which, when translated to R, goes like this:

R> library( RProtoBuf )                      ## load the package
R> readProtoFiles( "addressbook.proto" )     ## acquire protobuf information
R> bob <- new( tutorial.Person,              ## create new object
+   email = "bob@example.com",
+   name = "Bob",
+   id = 123 )
R> writeLines( bob$toString() )              ## serialize to stdout
name: "Bob"
id: 123
email: "bob@example.com"

R> bob$email                                 ## access and/or override
[1] "bob@example.com"
R> bob$id <- 5
R> bob$id
[1] 5

R> serialize( bob, "person.pb" )             ## serialize to compact binary format

There is more information at the RProtoBuf page, and we already have a draft package vignette, a 'quick' overview vignette and a unit test summary vignette.

More changes should be forthcoming as Romain and I find time to code them up. Feedback is as always welcome.

/code/rprotobuf | permanent link

Sun, 31 Jan 2010

Rcpp 0.7.4

Yesterday, and about nine days after release 0.7.3 of Rcpp (a set of R / C++ interface classes), Romain and I released version 0.7.4. It has been uploaded to CRAN and Debian, and mirrors should already have new versions. As before, my local page is also available for downloads and some more details.

The release once again combines a number of necessary fixes with numerous new features:

  • Building on OS X did not support multi-arch, and we are grateful to Simon who once again came to the rescue. Things should be fine now. The big take-away is that under no circumstances should a package include either a configure file or src/Makefile if you want multi-arch builds for free. As Rcpp is effectively a library to be used by other packages, this mattered.
  • We added a file NEWS from which I include the relevant section below.
  • Much more code re-organisation and enhancement making passage of various C++ types even easier -- see the NEWS entry below.
  • More unit tests, now including ones for the 'old Rcpp API'.

Post-release, I also reworked the doxygen setup slightly so that all examples are now browseable, and the whole documentation is now searchable as well.

Lastly, we had a remaining Windows build issue. Also, Brian Ripley and Uwe Ligges kindly sent us a small patch supporting the new Windows 64-bit builds using the new MinGW 64-bit compiler for Windows -- so release 0.7.5 may follow in due course.

The NEWS file entry for release 0.7.4 is as follows:

0.7.4	2010-01-30

    o	matrix-like indexing using operator() for all vector 
    	types : IntegerVector, NumericVector, RawVector, CharacterVector
    	LogicalVector, GenericVector and ExpressionVector. 

    o	new class Rcpp::Dimension to support creation of vectors with 
    	dimensions. All vector classes gain a constructor taking a 
    	Dimension reference.

    o	an intermediate template class "SimpleVector" has been added. All
    	simple vector classes are now generated from the SimpleVector 
    	template : IntegerVector, NumericVector, RawVector, CharacterVector
    	LogicalVector.

    o	an intermediate template class "SEXP_Vector" has been added to 
    	generate GenericVector and ExpressionVector.

    o	the clone template function was introduced to explicitly
    	clone an RObject by duplicating the SEXP it encapsulates.

    o	even smarter wrap programming using traits and template
        meta-programming using a private header to be included only by
        RcppCommon.h

    o 	the as template is now smarter. The template now attempts to 
    	build an object of the requested template parameter T by using the
    	constructor for the type taking a SEXP. This allows third party code
    	to create a class Foo with a constructor Foo(SEXP) to have 
    	as for free.

    o	wrap becomes a template. For an object of type T, wrap uses
    	implicit conversion to SEXP to first convert the object to a SEXP
    	and then uses the wrap(SEXP) function. This allows third party 
    	code creating a class Bar with an operator SEXP() to have 
    	wrap for free.

    o	all specializations of wrap :  wrap, wrap< vector >
    	use coercion to deal with missing values (NA) appropriately.

    o	configure has been withdrawn. C++0x features can now be activated
    	by setting the RCPP_CXX0X environment variable to "yes".

    o	new template r_cast to facilitate conversion of one SEXP
    	type to another. This is mostly intended for internal use and 
    	is used on all vector classes

    o	Environment now takes advantage of the augmented smartness
    	of as and wrap templates. If as makes sense, one can 
    	directly extract a Foo from the environment. If wrap makes
    	sense then one can insert a Bar directly into the environment. 
    	Foo foo = env["x"] ;  /* as is used */
	Bar bar ;
	env["y"] = bar ;      /* wrap is used */    	

    o	Environment::assign becomes a template and also uses wrap to 
    	create a suitable SEXP

    o	Many more unit tests for the new features; also added unit tests
        for older API
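
The two conversion rules in the as and wrap entries above can be sketched without Rcpp or R headers. In the sketch below, Sexp is a toy stand-in for R's SEXP, as and wrap are simplified look-alikes of the templates described in the NEWS text, and Celsius is a made-up third-party class; none of this is the actual Rcpp implementation.

```cpp
// Toy stand-in for R's SEXP (the real type is an opaque pointer).
struct Sexp {
    double value;
};

// as<T>: build a T by calling its constructor that takes a Sexp.
template <typename T>
T as(const Sexp& s) {
    return T(s);
}

// wrap: rely on T's implicit conversion to Sexp.
template <typename T>
Sexp wrap(const T& obj) {
    return obj;
}

// A "third-party" class gets both conversions for free simply by
// providing a Sexp constructor and a conversion operator.
class Celsius {
public:
    explicit Celsius(const Sexp& s) : degrees_(s.value) {}

    operator Sexp() const { return Sexp{degrees_}; }

    double degrees() const { return degrees_; }

private:
    double degrees_;
};
```

With this in place, `as<Celsius>(some_sexp)` and `wrap(some_celsius)` both compile without any specialisation written for Celsius, which is the convenience the Environment example in the NEWS entry relies on.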

As always, even fuller details are in the ChangeLog on the Rcpp page which also leads to the downloads, the browseable doxygen docs and zip files of doxygen output for the standard formats. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page

/code/rcpp | permanent link

Thu, 21 Jan 2010

Rcpp 0.7.3

A quick nine days after release 0.7.2 of Rcpp, our R / C++ interface classes, Romain and I are happy to roll out a new version 0.7.3. It has been uploaded to CRAN and Debian, and mirrors should have the new versions shortly. As before, my local page is also available for downloads and some more details.

This release combines a number of under-the-hood fixes and enhancements with one bug fix:

  • The Rcpp:::LdFlags() helper function to dynamically provide linker options for packages using Rcpp now defaults to static linking on OS X as well. For installation from source dynamic linking always worked, but not for binary installation (as e.g. from CRAN). As on the other platforms, this default can be overridden. Thanks to the phylobase team for patient help in tracking this down.
  • Accessing various types via [] should now be faster due to some enhancements in the internal representations.
  • configure now has a command-line option (as well as an environment variable) to select support for the draft of the upcoming C++0x standard.
  • A new function Rcpp.package.skeleton(), modelled after package.skeleton() in R itself, helps to set up a new package with support for using Rcpp.
  • A number of other minor tweaks and improvements...

As always, full details are in the ChangeLog on the Rcpp page which also leads to the downloads, the browseable doxygen docs and zip files of doxygen output for the standard formats. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page

/computers/linux/debian/packages | permanent link

Thu, 14 Jan 2010

RQuantLib 0.3.2 released

A new version of RQuantLib (a package combining the quantitative analytics of QuantLib with the R statistical computing environment and language) is now out at CRAN and in Debian (where it depends on the 1.0.0 beta of QuantLib that is currently in the NEW queue with its new library version). This RQuantLib release works with both the current release 0.9.9 and the just-released first beta of QuantLib 1.0.0.

This version brings a few cleanups due to minor Rcpp changes (in essence: we now define the macro R_NO_REMAP before including R's headers, and this separates non-namespaced functions like error() or length() out into prefixed versions Rf_error() and Rf_length(), which is a good thing).

It also adds a number of calendaring and holiday utilities that Khanh just added: tests for weekend, holiday, endOfMonth as well as dayCount, date advancement and year fraction functions commonly used in fixed income.

Full changelog details, examples and more details about this package are at my RQuantLib page.

/computers/linux/debian/packages | permanent link

Tue, 12 Jan 2010

Rcpp 0.7.2

Not even two weeks after the Rcpp 0.7.1 release, Romain and I have a new one to present: Rcpp 0.7.2. It has been uploaded to CRAN and Debian, and the respective package management systems should carry it around in the next few hours. As always, the local page is available for download too.

A lot of the momentum for the new API is continuing, thanks in large part to Romain. A number of new classes have been added, and existing ones have been enhanced. There are more unit tests than ever, and more documentation. We have better build support (with g++ version detection so that we can add some C++0x support where available) and a new examples sub-directory.

We did take one toy away, though. The Doxygen-generated docs were getting so big that we decided to keep them out of the source tarball. (And arguably, they are also too volatile.) We still have the browseable html docs as well as the pdf version (now at over 300 pages!). And we added zip archives of the docs in html, latex, and man format for download.

As always, full details are in the ChangeLog on the Rcpp page. Questions, comments etc: bring them to the rcpp-devel mailing list off the R-Forge page

/computers/linux/debian/packages | permanent link

Thu, 07 Jan 2010

Review of 'Computational Statistics: An Introduction to R' in JSS

Somehow missed during the end-of-year switchover was the fact that my review of Guenther Sawitzki's Computational Statistics: An Introduction to R (CRC / Chapman & Hall, 2009) is now up on the Journal of Statistical Software website.

/computers/R | permanent link

Wed, 06 Jan 2010

RInside release 0.2.1

The shiny new 0.2.1 release of RInside, a set of convenience classes to facilitate embedding of R inside of C++ applications, just went out to CRAN; sources are also at my RInside page

This is a maintenance release building on the recent 0.2.0 release which added Windows support (provided you use the Rtools toolchain for Windows). In this release, we changed the startup initialization so that interactive() comes out FALSE (just as we had done for littler just yesterday) and with that no longer call Rf_KillAllDevices() from the destructor as we may not have had devices in the first place. A few minor things were tweaked around the code organisation and build process, see the ChangeLog for details.

The new release should hit CRAN mirrors tomorrow, and is (as always) available from my machine too.

/computers/linux/debian/packages | permanent link

Tue, 05 Jan 2010

littler 0.1.3

A new littler release (now at 0.1.3) just went out of the door this evening.

littler provides r (pronounced littler), a shebang / scripting / quick eval / pipelining front-end to the R language and system.

This version adds a few minor behind-the-scenes improvements:

  • interactive() now evaluates to false as you'd expect in a non-interactive scripting front-end. To restore the previous behaviour, new switches -i or --interactive have been added.
  • Some of the 'cleanup' functionality described in Section 8.1.2 on 'Setting R callbacks' from the Writing R Extensions manual has been adopted.
  • Example scripts install.r and update.r received an update based on lessons learned from the R 2.10.0 roll-out and package rebuilding.
  • A few build issues were improved, a minor manual page formatting bug was fixed.

As usual, our code is available via our svn archive or from tarballs off my littler page and the local directory here. A fresh package is in Debian's incoming queue and will hit mirrors shortly.

/computers/linux/debian/packages | permanent link

Sat, 02 Jan 2010

Rcpp 0.7.1

Two weeks after the Rcpp 0.7.0 release, Romain and I are happy to announce release 0.7.1 of Rcpp. It is currently in the incoming section of CRAN and has been accepted into Debian. Mirrors will catch up over the next few days; in the meantime, the local page is available for download too.

A lot has changed under the hood since 0.7.0, and this is the first release that really reflects many of Romain's additions. Some of the changes are

  • A new base class Rcpp::RObject that replaces RcppSexp (which is still provided for compatibility); it provides basic R object handling and other new classes derive from it.
  • Rcpp::RObject has real simple wrappers for object creation and a SEXP operator for transfer back to R that make simple interfaces even easier.
  • New classes Rcpp::Evaluator and Rcpp::Environment for expression evaluation and R environment access, respectively.
  • A new class Rcpp::XPtr for external pointer access and management.
  • Enhanced exception handling: exceptions can be trapped at the R level even outside of try/catch blocks, see Romain's blog post for more.
  • Namespace support with the addition of a Rcpp namespace; we will be incremental in phasing this in keeping compatibility with the old interface
  • Unit tests for almost all of the above via use of the RUnit package, and several new examples.
  • Inline support has been removed and replaced with a Depends: on inline (>= 0.3.4) as our patch is now part of the current inline package as mentioned here.

As before, fuller details are in the ChangeLog on the Rcpp page.

/computers/linux/debian/packages | permanent link