Dirk Eddelbuettel Thinking inside the box
 
Mon, 13 May 2013

RcppArmadillo 0.3.820
Conrad rolled up a new Armadillo release 3.820 (following two minor fix releases in the 0.3.810 series, of which we packaged the one that was relevant for us). This new version is now out in a release 0.3.820 of RcppArmadillo which is already on CRAN and in Debian.

The summary of the main changes follows:

Changes in RcppArmadillo version 0.3.820 (2013-05-12)

  • Upgraded to Armadillo release Version 3.820 (Mt Cootha)

    • faster as_scalar() for compound expressions

    • faster transpose of small vectors

    • faster matrix-vector product for small vectors

    • faster multiplication of small fixed size matrices

Courtesy of CRANberries, there is also a diffstat report for the most recent release. As always, more detailed information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

/code/rcpp | permanent link

Sun, 12 May 2013

Recent Rcpp talks at U of C and MCW
A couple of days ago, I had an opportunity to give a guest lecture on our Rcpp package for R and C++ integration. This was in CMSC 12300 Computer Science with Applications-3 in the Department of Computer Science at the University of Chicago. The course is the final part of a three term sequence introducing students to data-centric work in R, Python, Java and C++. I tried to keep it brief and engaging in order to motivate the why of R/C++ integration while providing plenty of useful examples.

And yesterday I got to spend a day giving an invited day-long workshop at the Medical College of Wisconsin as part of a two-day R workshop sponsored by the Milwaukee Chapter of the American Statistical Association as well as the CTSI and PCOR centers at the Medical College of Wisconsin. In the workshop, I followed the previously-used setup of four parts on introduction, Rcpp details, advanced topics and last-but-not-least applications, but also updated and extended it to cover more recent topics.

Pdf slides from both events are now on my presentations page.

/code/rcpp | permanent link

Sat, 20 Apr 2013

RcppArmadillo 0.3.810.0
A new Armadillo release 3.810.0 by Conrad appeared yesterday, and was wrapped up in a new release 0.3.810.0 of RcppArmadillo. Upstream changes bring FFT support as well as more sparse matrix constructors, and we have an improvement to the sample() function contributed by Christian Gunning.

As RcppArmadillo is used by an increasing number of packages---on CRAN alone, we find 34 direct dependencies---I also added the package to Debian and uploaded it there in parallel.

The summary of the main changes follows:

Changes in RcppArmadillo version 0.3.810.0 (2013-04-19)

  • Upgraded to Armadillo release Version 3.810.0 (Newell Highway)

    • added fast Fourier transform: fft()

    • added handling of .imbue() and .transform() by submatrices and subcubes

    • added batch insertion constructors for sparse matrices

    • minor fix for multiplication of complex sparse matrices

  • Updated sample() function and test again contributed by Christian Gunning

Courtesy of CRANberries, there is also a diffstat report for the most recent release. As always, more detailed information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

/code/rcpp | permanent link

Fri, 29 Mar 2013

R / Finance 2013 Open for Registration
The announcement below just went to the R-SIG-Finance list. More information is as usual at the R / Finance page:

Now open for registrations:

R / Finance 2013: Applied Finance with R
May 17 and 18, 2013
Chicago, IL, USA

The registration for R/Finance 2013 -- which will take place May 17 and 18 in Chicago -- is NOW OPEN!

Building on the success of the previous conferences in 2009, 2010, 2011 and 2012, we expect more than 250 attendees from around the world. R users from industry, academia, and government will be joining 30+ presenters covering all areas of finance with R.

We are very excited about the four keynotes by Sanjiv Das, Attilio Meucci, Ryan Sheftel and Ruey Tsay. The main agenda (currently) includes seventeen full presentations and fifteen shorter "lightning talks". We are also excited to offer five optional pre-conference seminars on Friday morning.

To celebrate the fifth year of the conference in style, the dinner will be held at The Terrace of the Trump Hotel. Overlooking the Chicago river and skyline, it is a perfect venue to continue conversations while dining and drinking.

More details of the agenda are available at:

http://www.RinFinance.com/agenda/

Registration information is available at

http://www.RinFinance.com/register/
and can also be directly accessed by going to
http://www.regonline.com/RFinance2013
We would like to thank our 2013 Sponsors for the continued support enabling us to host such an exciting conference:
International Center for Futures and Derivatives at UIC

Revolution Analytics
MS-Computational Finance at University of Washington

Google
lemnica
OpenGamma
OneMarketData
RStudio

On behalf of the committee and sponsors, we look forward to seeing you in Chicago!

Gib Bassett, Peter Carl, Dirk Eddelbuettel, Brian Peterson,
Dale Rosenthal, Jeffrey Ryan, Joshua Ulrich

See you in Chicago in May!!

/computers/R | permanent link

Sat, 23 Mar 2013

Rcpp 0.10.3
A new release 0.10.3 of Rcpp is now on CRAN and in Debian.

This is the fourth release in the 0.10.* series, and further extends and solidifies the excellent Rcpp attributes. A few other bugs were fixed as well, and support for wide character strings has been added.

We once again tested this fairly rigorously by checking against 86 of the 100 CRAN packages depending on Rcpp. All of these passed. So we do not expect any issues with dependent packages, but one never knows.

The complete NEWS entry for 0.10.3 is below; more details are in the ChangeLog file in the package and on the Rcpp Changelog page.

Changes in Rcpp version 0.10.3 (2013-03-23)

  • Changes in R code:

    • Prevent build failures on Windows when Rcpp is installed in a library path with spaces (transform paths in the same manner that R does before passing them to the build system).

  • Changes in Rcpp attributes:

    • Rcpp modules can now be used with sourceCpp

    • Standalone roxygen chunks (e.g. to document a class) are now transposed into RcppExports.R

    • Added Rcpp::plugins attribute for binding directly to inline plugins. Plugins can be registered using the new registerPlugin function.

    • Added built-in cpp11 plugin for specifying the use of C++11 in a translation unit

    • Merge existing values of build related environment variables for sourceCpp

    • Add global package include file to RcppExports.cpp if it exists

    • Stop with an error if the file name passed to sourceCpp has spaces in it

    • Return invisibly from void functions

    • Ensure that line comments invalidate block comments when parsing for attributes

    • Eliminated spurious empty hello world function definition in Rcpp.package.skeleton

  • Changes in Rcpp API:

    • The very central use of R API R_PreserveObject and R_ReleaseObject has been replaced by a new system based on the functions Rcpp_PreserveObject, Rcpp_ReleaseObject and Rcpp_ReplaceObject which shows better performance and is implemented using a generic vector treated as a stack instead of a pairlist in the R implementation. However, as this preserve / release code is still a little rough at the edges, a new #define is used (in config.h) to disable it for now.

    • Platform-dependent code in Timer.cpp now recognises a few more BSD variants thanks to contributed defined() test suggestions

    • Support for wide character strings has been added throughout the API. In particular String, CharacterVector, wrap and as are aware of wide character strings
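
To make the preserve / release item above a little more concrete, here is a minimal sketch of the general idea of keeping protected objects in a vector that is used like a stack. This is not Rcpp's actual code: the class name PreserveStack, the Handle alias (a stand-in for an R object pointer) and the search-from-the-top strategy are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical illustration only: Handle stands in for an R object pointer,
// and this class is not taken from the Rcpp sources.
using Handle = const void*;

class PreserveStack {
public:
    // Protect an object: an amortised O(1) push onto the end of the vector.
    void preserve(Handle h) { items_.push_back(h); }

    // Release an object: search from the top of the stack, on the assumption
    // that recently preserved objects are usually released first.
    // Returns false if the handle was not being tracked.
    bool release(Handle h) {
        auto it = std::find(items_.rbegin(), items_.rend(), h);
        if (it == items_.rend()) return false;
        items_.erase(std::next(it).base());  // convert to a forward iterator
        return true;
    }

    // Swap one protected object for another in a single call.
    void replace(Handle old_h, Handle new_h) {
        if (old_h == new_h) return;
        release(old_h);
        preserve(new_h);
    }

    std::size_t size() const { return items_.size(); }

private:
    std::vector<Handle> items_;
};
```

Compared to a linked pairlist, a contiguous vector keeps the common push-then-pop pattern cheap, which is plausibly where the performance gain mentioned above comes from.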

Thanks to CRANberries, you can also look at a diff to the previous release 0.10.2. As always, even fuller details are on the Rcpp Changelog page and the Rcpp page which also leads to the downloads, the browseable doxygen docs and zip files of doxygen output for the standard formats. A local directory has source and documentation too. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

/code/rcpp | permanent link

Tue, 12 Mar 2013

Rcpp master class in New York last weekend
On Saturday I had the opportunity to teach another one-day master class on Rcpp. The class had been organized by Jared Lander, and organized very well I might add.

The weekend started with a slight disappointment. I had taken Friday off, and hoped to reach NY by early afternoon to join JJ there, and to spend the afternoon with the RStan team. However, the tail end of last week's snowstorm made it such that we both got to Columbia's stats department closer to 6pm rather than 1pm, and half the team had left. Dang. Very frustrating travel experience. We salvaged the evening by gabbing over a cold beverage or two, before sharing some sacred New York pizza with Wes McKinney and Jared.

The class itself on Saturday went quite well. With JJ on deck, we were able to have every participant log into an EC2-hosted instance of RStudio Server, which worked very well for usage examples of Rcpp. It has been almost a year since I last taught the class, and many exciting things--such as Rcpp attributes, added by JJ himself--have appeared, which made it extra fun. Participants were rather kind with praise. Either they really liked it, or they really are hard-nosed New Yorkers who manage to lie to my face without me noticing.

We ended the day with some hard-earned cold beverages, followed by some dinner at Sylvia's (as tweeted by Jared) followed by more drinks. Ended up a little past my usual bedtime, but I managed to get out and enjoy a lovely 6.5 mile run across Central Park the next morning before leaving town.

All in all, a very nice weekend, the travel horror of Friday notwithstanding. And who knows, maybe we'll just do it again another time...

/code/rcpp | permanent link

RcppArmadillo 0.3.800.1
Conrad released a first bug-fix release 3.800.1 of Armadillo earlier today. This has been wrapped up in release 0.3.800.1 of RcppArmadillo as usual. This release also contains a very nice function sample() (contributed by Christian Gunning) which provides sampling (with or without replacement) at the C++ level modeled after what we are used to in R itself. We also refactored the unit tests into just two compilation units to speed testing up a little.

The summary of the main changes follows:

Changes in RcppArmadillo version 0.3.800.1 (2013-03-12)

  • Upgraded to Armadillo release Version 3.800.1 (Miami Beach)

    • workaround for a bug in ATLAS 3.8 on 64 bit systems

    • faster matrix-vector multiply for small matrices

  • Added new sample() function and tests contributed by Christian Gunning

  • Refactored unit testing code for faster unit test performance

Courtesy of CRANberries, there is also a diffstat report for the most recent release. As always, more detailed information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

/code/rcpp | permanent link

Sat, 02 Mar 2013

RcppArmadillo 0.3.800.0
A new Armadillo version 3.800.0 is now out. Conrad picked a new numbering scheme to coincide with the relicensing from LGPL to MPL 2.0. The new version 0.3.800.0 of the corresponding RcppArmadillo package (which still uses GPL 2 or later) is now on CRAN. It also contains the updated version of our paper as the package vignette.

The summary of the main changes follows:

Changes in RcppArmadillo version 0.3.800.0 (2013-03-01)

  • Upgraded to Armadillo release Version 3.800.0 (Miami Beach)

    • Armadillo is now licensed using the Mozilla Public License 2.0

    • added .imbue() for filling a matrix/cube with values provided by a functor or lambda expression

    • added .swap() for swapping contents with another matrix

    • added .transform() for transforming a matrix/cube using a functor or lambda expression

    • added round() for rounding matrix elements towards nearest integer

    • faster find()

    • fixes for handling non-square matrices by qr() and qr_econ()

    • minor fixes for handling empty matrices

    • reduction of pedantic compiler warnings

  • Updated vignette to paper now in press at CSDA

  • Added CITATION file with reference to CSDA paper
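
The .imbue() and .transform() entries above both take a functor or lambda. As a rough analogue in plain standard C++ (deliberately not using Armadillo, so the sketch stays self-contained), the same two ideas can be expressed with std::generate and std::transform. The helper names fill_from, map_in_place and first_squares are hypothetical and exist only for this illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Fill every element from a generator (the idea behind .imbue()).
template <typename Gen>
void fill_from(std::vector<double>& v, Gen gen) {
    std::generate(v.begin(), v.end(), gen);
}

// Rewrite every element in place (the idea behind .transform()).
template <typename Fn>
void map_in_place(std::vector<double>& v, Fn fn) {
    std::transform(v.begin(), v.end(), v.begin(), fn);
}

// Usage: fill with 0, 1, 2, ... and then square each element.
std::vector<double> first_squares(std::size_t n) {
    std::vector<double> v(n);
    double next = 0.0;
    fill_from(v, [&next]() { return next++; });
    map_in_place(v, [](double x) { return x * x; });
    return v;
}
```

The appeal of the lambda form is that the filling or transforming logic sits right at the call site instead of in a separately named function.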

Courtesy of CRANberries, there is also a diffstat report for the most recent release. As always, more detailed information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

/code/rcpp | permanent link

Wed, 27 Feb 2013

inline 0.3.11
A maintenance release of inline is now on CRAN, and is being uploaded to Debian. The release fixes two minor bugs kindly reported by users. As the two previous releases appear to not have been announced here, their NEWS entries are included as well.

Changes in inline version 0.3.11 (2013-02-26)

  • Fix bug in cfunction for .C convention with raw vectors.

  • Correct cfunction to use .Platform$dynlib.ext as the file extension for the library file (unless on Windows).

  • Allow rcpp wrapper to pass another plugin (e.g. RcppArmadillo)

Changes in inline version 0.3.10 (2012-10-03)

  • getDynLib() error message corrected as suggested by Yasir Suhail

  • Added rcpp() wrapper for cxxfunction() which sets plugin="Rcpp"

  • Converted NEWS to NEWS.Rd

  • New maintainer, after having coordinated releases (along with Romain) since 0.3.5 in June 2010

Changes in inline version 0.3.9 (2012-10-02)

  • Uncoordinated hijacking of package by CRAN maintainers with a single word change in cfunction.R to prevent an error under an unreleased version of R

Courtesy of CRANberries, there is also a diffstat report for the most recent release. A few more details are available at the R-Forge page.

/code/inline | permanent link

Sat, 23 Feb 2013

Two papers about RcppEigen and RcppArmadillo published
Two papers got published recently. The first one is Bates and Eddelbuettel (2013). It is titled Fast and Elegant Numerical Linear Algebra Using the RcppEigen Package, and provides a pretty thorough introduction to our RcppEigen package which uses Rcpp to provide access to the Eigen C++ template library from GNU R. The paper is out as Volume 50, Issue 5 at the (all electronic, open, and generally awesome) Journal of Statistical Software. A bibtex entry is available.

The second paper is Eddelbuettel and Sanderson (2013). This one is titled RcppArmadillo: Accelerating R with high-performance C++ linear algebra and introduces the RcppArmadillo package which brings Conrad Sanderson's Armadillo C++ template library to GNU R by deploying Rcpp. The paper is currently "in press" at Computational Statistics & Data Analysis but the DOI 10.1016/j.csda.2013.02.005 will remain once a volume and issue are assigned by CSDA.

Preprints of both papers are available via my papers page, and as vignettes in the corresponding packages.

The upcoming Rcpp class in New York will feature Rcpp, RcppArmadillo and RcppEigen. Space is still available.

/code/rcpp | permanent link

Wed, 20 Feb 2013

RcppArmadillo 0.3.6.3
A new Armadillo version 3.6.3 came out this morning, and the corresponding RcppArmadillo version is now on CRAN. Changes are incremental:

Changes in RcppArmadillo version 0.3.6.3 (2013-02-20)

  • Upgraded to Armadillo release Version 3.6.3

    • faster find()

    • minor fix for non-contiguous submatrix views to handle empty vectors of indices

    • reduction of pedantic compiler warnings

Courtesy of CRANberries, there is also a diffstat report for the most recent release. As always, more detailed information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

/code/rcpp | permanent link

Mon, 18 Feb 2013

New Rcpp master class scheduled for New York
A new Rcpp master class is scheduled for March 9 in New York. The format will be an updated version of the one-day workshops I have given at the University of Rochester in 2010, in San Francisco in 2011 (organised by Revolution Analytics) and at the UseR! conference in 2012.

The style will be hands-on, with numerous concrete examples and solid coverage of most aspects of Rcpp and related packages. As before, about six hours of instruction, split into four sessions of around ninety minutes focussing (loosely) on motivation/intro, core parts, extensions and applications. This should leave ample time for informal discussions and Q+A---as well as for lunch and coffee breaks---for a total of eight hours in the classroom.

This is being put together in New York with the help of Jared Lander, and we will have some technical assistance from RStudio in order to use their EC2 farm for exercises with Rcpp.

Registration details are available here; information about other Rcpp events is also available.

Feel free to contact me or Jared at our usual email addresses with questions.

/code/rcpp | permanent link

RQuantLib 0.3.10
A new minor release RQuantLib 0.3.10 is now on CRAN and in Debian. RQuantLib combines (some of) the quantitative analytics of QuantLib with the R statistical computing environment and language.

The discount curve building code in QuantLib has shown some overly large numerical instabilities. We have used the same example parameters (taken from the Swap example in QuantLib) for years; it currently fails to solve for a rate at some term further out the curve. So I made the decision to disable this just in the examples in order to not upset the CRAN testing framework. The examples now use a flat curve instead. I also updated one function to silence some new warnings from R-devel about symbols from another package's namespace (in this case rgl, and it is just for surface plots, a purely cosmetic function).

Thanks to CRANberries, there is also a diff to the previous release 0.3.9. Full changelog details, examples and more details about this package are at my RQuantLib page.

/code/rquantlib | permanent link

Sat, 16 Feb 2013

digest 0.6.3
digest version 0.6.3 is now on CRAN, and I'll upload the Debian package in a minute.

This is a minor bug release regarding just the recently-added sha512 support. Turns out the wrong initial buffer size was used on the R side. Hannes fixed that within hours after we got the bug report; but I was a little swamped with multiple deadlines and failed to upload this right away.

CRANberries provides the usual summary of changes to version 0.6.2. Our package is available via the R-Forge page leading to svn and tarball access, my digest page, the local directory here as well as via Debian and its mirrors.

/code/digest | permanent link

Tue, 05 Feb 2013

A book about Rcpp
Some little birds had already been whispering about it, but I didn't want to jinx it and told myself I would wait with an announcement until the booksellers have (at least) placeholder pages. And as I learned from Duncan Murdoch via email earlier today, at least Chapters/Indigo had a page up, presumably scraped from the publisher page, so here it goes:

I have in fact handed a complete draft of my book about Seamless R and C++ Integration with Rcpp to Springer a few weeks ago. With a bit of luck on the production side, we could be seeing physical copies by May of a new title in their popular UseR! series.

And to the slowly growing new Rcpp site, I have added a formal page about the Rcpp book where one can find information about it, including a link to the Springer page, links to a few booksellers' pages --- as well as a few wonderfully flattering endorsements. Eventually, errata and other support material should be available via this page too. Can't wait til I hold a physical copy in hand...

/code/rcpp | permanent link

New Rcpp page on upcoming events -- including Master Class in New York
Lots of exciting things are happening with and around Rcpp. I just added a new page about Upcoming Events to the recently-created Rcpp site. This events page has lots to cover: an upcoming talk at Columbia on March 8 (details still TBD), a day-long workshop in New York on March 9, a possible participation at a CERN / ROOT conference in Switzerland on May 11-14, an upcoming talk in May in Milwaukee, and last but not least the tutorial by Romain and Hadley at UseR! 2013 in Spain. Phew!

With that, a few quick words about the upcoming master class in New York. It will be a full day, covering an introduction and motivation, details about the core data types, tools for working with and extending Rcpp and of course applications galore, including RcppArmadillo and RInside. I have done the same one day class format a few times before, most recently (with Revolution Analytics) in San Francisco in late 2011, and also as a two-part seminar at UseR! 2012. This time, we plan on providing cloud-hosted RStudio instances for participants. Better still, RStudio's own JJ Allaire will be on deck as well for RStudio --- and Rcpp Attributes --- questions.

Details and registration information for the New York class are at this page.

/code/rcpp | permanent link

Sun, 03 Feb 2013

The Rcpp Gallery and my Seinfeld Streak
A good three weeks ago, we introduced the Rcpp Gallery. While this is a joint effort by several of us on the Rcpp team, the backend was conceived and implemented entirely by JJ who also bootstrapped it with some first content, drawing on posts by Hadley, Romain and myself. As the How to contribute page makes plain, this is all backed by GitHub and all logs are public anyway.

So after it was up and working, JJ and I refined the look and feel, and I started to add more content so that we would have something by the time the initial announcement came around. A few years ago I read about an (attributed) secret to Seinfeld's productivity: "Don't break the chain". Just keep writing, and write every day.

I made my goal of a post every day for just over a month, and created this sequence: (20 Dec) simulating-pi, (21 Dec) vector-minimum, (22 Dec) gsl-colnorm-example, (23 Dec) fibonacci-sequence, (24 Dec) random-number-generation, (25 Dec) armadillo-sparse-matrix, (26 Dec) timing-rngs, (27 Dec) stl-inner-product, (28 Dec) stl-transform, (29 Dec) stl-transform-for-subsetting, (30 Dec) stl-random-shuffle, (31 Dec) stl-random-sample, (01 Jan) stl-for-each, (02 Jan) armadillo-subsetting, (03 Jan) accessing-environments, (04 Jan) armadillo-eigenvalues, (05 Jan) r-function-from-c++, (06 Jan) using-the-rcpp-timer, (07 Jan) sugar-function-clamp, (08 Jan) using-rcout, (09 Jan) first-steps-with-C++11, (10 Jan) simple-lambda-func-c++11, (11 Jan) eigen-eigenvalues, (12 Jan) getting-attributes-for-xts-example, (13 Jan) intro-to-exceptions, (14 Jan) a-first-boost-example, (15 Jan) a-second-boost-example, (16 Jan) timing-normal-rngs, (17 Jan) creating-xts-from-c++, (18 Jan) gsl-for-eigenvalues, (19 Jan) accessing-xts-api, (20 Jan) custom-as-and-wrap-example, (21 Jan) passing-cpp-function-pointers.

The Rcpp Gallery continues to grow; we now have 58 posts from 7 different authors. And it is open for business: new contributions are always welcome.

/code/rcpp | permanent link

Fri, 01 Feb 2013

RcppExamples 0.1.6
A pure maintenance release 0.1.6 of RcppExamples was made two weeks ago, and never announced. We merely moved the NEWS.Rd file into the proper location in the inst/ directory, and, while we were at it, mentioned the new Rcpp Gallery in the DESCRIPTION file.

Thanks to CRANberries, there is the standard diff to the previous release 0.1.5. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

/code/rcpp | permanent link

digest 0.6.2
digest version 0.6.2 came out a few days ago as an almost immediate follow-up to release 0.6.1. We used paste0() in a few places, and this is only available with newer versions of R. To not introduce a somewhat unnecessary dependency, we reverted this to plain old paste(). CRANberries provides the usual summary of changes to version 0.6.1.

As usual, our package is available via the R-Forge page leading to svn and tarball access, my digest page, the local directory here as well as via Debian and its mirrors.

/code/digest | permanent link

Thu, 31 Jan 2013

Introducing the BH package
Earlier today a new package BH arrived on CRAN. Over the years, Jay Emerson, Michael Kane and I had numerous discussions about a basic Boost infrastructure package providing Boost headers for other CRAN packages (and yes, we are talking packages using C++ here). JJ and Romain chipped in as well, and Jay finally took the lead by first creating a repo on R-Forge. And now the package is out, so I just put together a quick demo post over at the Rcpp Gallery.

As that post notes, BH is still pretty new and rough, and we probably missed some other useful Boost packages. If so, let one of us know.

/code/snippets | permanent link

Wed, 30 Jan 2013

RcppArmadillo 0.3.6.2
A new Armadillo version 3.6.2 came out yesterday, and the corresponding RcppArmadillo version is now on CRAN. Changes are mostly incremental:

Changes in RcppArmadillo version 0.3.6.2 (2013-01-29)

  • Upgraded to Armadillo release Version 3.6.2

    • faster determinant for matrices marked as diagonal or triangular

    • more fine-grained handling of 64 bit integers

  • Added a new example of a Kalman filter implementation in R, and C++ using Armadillo via RcppArmadillo, complete with timing comparison

Courtesy of CRANberries, there is also a diffstat report for the most recent release. As always, more detailed information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

/code/rcpp | permanent link

Wed, 23 Jan 2013

Rcpp reaches 100 dependents on CRAN
With the arrival earlier today of the stochvol package onto the CRAN network for R, our Rcpp project reached a new milestone: 100 packages have either a Depends:, Imports: or LinkingTo: statement on it.

The full list will always be at the bottom of the CRAN page for Rcpp; I also manually edit a list on my Rcpp page. But for the record as of today, here is the current list as produced by a little helper script I keep:

 acer                apcluster           auteur             
 bcp                 bfa                 bfp                
 bifactorial         blockcluster        ccaPP              
 cda                 classify            clusteval          
 ConConPiWiFun       EpiContactTrace     fastGHQuad         
 fdaMixed            forecast            fugeR              
 GeneticTools        gMWT                gof                
 gRbase              gRim                growcurves         
 GUTS                jaatha              KernSmoothIRT      
 LaF                 maxent              mets               
 minqa               mirt                mRMRe              
 multmod             mvabund             MVB                
 NetworkAnalysis     ngspatial           oem                
 openair             orQA                parser             
 pbdBASE             pbdDMAT             phom               
 phylobase           planar              psgp               
 quadrupen           Rchemcpp            Rclusterpp         
 RcppArmadillo       RcppBDT             rcppbugs           
 RcppClassic         RcppClassicExamples RcppCNPy           
 RcppDE              RcppEigen           RcppExamples       
 RcppGSL             RcppOctave          RcppRoll           
 RcppSMC             RcppXts             rforensicbatwing   
 rgam                RInside             Rmalschains        
 Rmixmod             robustgam           robustHD           
 rococo              RProtoBuf           RQuantLib          
 RSNNS               RSofia              rugarch            
 RVowpalWabbit       SBSA                sdcMicro           
 sdcTable            simFrame            spacodiR           
 sparseHessianFD     sparseLTSEigen      SpatialTools       
 stochvol            surveillance        survSNP            
 termstrc            tmg                 transmission       
 trustOptim          unmarked            VIM                
 waffect             WideLM              wordcloud          
 zic                

And not to be forgotten is BioConductor which has another 10:

 ddgraph            GeneNetworkBuilder GOSemSim          
 GRENITS            mosaics            mzR               
 pcaMethods         Rdisop             Risa              
 rTANDEM  

As developers of Rcpp, we are both proud and also a little humbled. The packages using Rcpp span everything from bringing new libraries to R, to implementing faster ways of doing things we have done before, to doing completely new things. It is an exciting time to be using R, and to be connecting R to C++, especially with so many exciting things happening in C++ right now. Follow the Rcpp links for more, and come join us on the Rcpp-devel mailing list to discuss and learn.

/code/rcpp | permanent link

Mon, 21 Jan 2013

digest 0.6.1
digest version 0.6.1 is now on CRAN, and I will push the corresponding version into Debian shortly.

Duncan Murdoch added AES support, and helped me fix two issues which (annoyingly) made the Rout.save output differ on another platform.

CRANberries provides the usual summary of changes to version 0.6.0.

As usual, our package is available via the R-Forge page leading to svn and tarball access, my digest page and the local directory here.

/code/digest | permanent link

Fri, 18 Jan 2013

Sing The Truth at Symphony Center
Just back in from an amazing concert at the Chicago Symphony Center. Vocalists Angelique Kidjo, Dianne Reeves and Lizz Wright were supported by an all-star band of Geri Allen on piano, Romero Lubambo on electric and acoustic guitar, James Genus on electric bass guitar, Munyungo Jackson on percussion and Terri Lyne Carrington on drums (and musical director). The concert alternated between solos and joint pieces with more joy and soul than I have heard in a long time. Great evening.

/music/jazz/live | permanent link

Tue, 08 Jan 2013

Announcing the Rcpp Gallery
Earlier this morning, JJ announced what we had been working on for the last few weeks: the Rcpp Gallery.

Now, as our luck will have it, the Rcpp-devel list received his message but did not transmit it for an apparent mail system outage at WU Vienna: no sign at the Gmane archive of rcpp-devel or in the personal mailboxen of myself or anybody I spoke to. Hence, so far, and preceding this blog announcement, the only way word got out was via this earlier tweet of mine from about 12 hours ago.

The Rcpp Gallery is really the brainchild of JJ. It builds on what he contributed over the last few months in not one but two implementations: Rcpp Attributes. These are described in a vignette of their own. They provide very powerful new functions like sourceCpp which allow the easiest-yet way to get compiled code into R---see for example these posts from my blog about simulating pi in essentially five lines of R or five lines of C++, or this post about using the GSL with ease from R. The Rcpp Gallery also builds on Yihui's excellent knitr package which gained the ability to process C++ code just like R code, as well as some Ruby / Jekyll magic to build a website on the github infrastructure. I helped a little on the side by (at long last) learning how to do prettier websites thanks to Bootstrap and its theming extensions.
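
To give a flavour of the pi simulation mentioned above, here is a minimal sketch in plain standard C++. It is not the gallery post itself: the function name estimate_pi, the explicit seed argument and the use of std::mt19937 are assumptions made so the example stays self-contained. In an Rcpp setting one would presumably mark such a function with the // [[Rcpp::export]] attribute and load the file via sourceCpp().

```cpp
#include <cstddef>
#include <random>

// Monte Carlo estimate of pi: draw n points uniformly in the unit square
// and count the share that falls inside the quarter circle of radius 1.
// Hypothetical helper for illustration; n must be positive.
double estimate_pi(std::size_t n, unsigned seed) {
    std::mt19937 rng(seed);
    std::uniform_real_distribution<double> unif(0.0, 1.0);
    std::size_t inside = 0;
    for (std::size_t i = 0; i < n; ++i) {
        const double x = unif(rng);
        const double y = unif(rng);
        if (x * x + y * y <= 1.0) ++inside;
    }
    // The quarter circle has area pi/4, so scale the hit ratio by 4.
    return 4.0 * static_cast<double>(inside) / static_cast<double>(n);
}
```

With a million draws the estimate is typically good to two or three decimal places; the error shrinks only with the square root of the number of draws.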

So what does it do, and what is it for? Have a look around the Rcpp Gallery site. Each post is based on a single C++ (or Markdown) file which gets digested by knitr and Rcpp, with the actual output shown alongside the marked up code and explanatory text. Raw sources are available; just pass them into the sourceCpp() function from a current Rcpp release and you should have the same output.

Our idea is to have this as a repository for useful code: from simple and introductory to fancy and featureful. We already seeded it with several dozen posts covering anything from lesser known but powerful STL idioms, to Rcpp sugar, to tying in Armadillo or GSL, random number generation and of course benchmarking---as we do love performance.
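
As one small example of the kind of STL idiom meant here, a dot product can be written without an explicit loop via std::inner_product. This is a sketch in plain standard C++ rather than a copy of any gallery post; the function name dot and the length check are assumptions for illustration.

```cpp
#include <numeric>
#include <stdexcept>
#include <vector>

// Dot product of two equally long vectors using std::inner_product.
// Hypothetical helper for illustration.
double dot(const std::vector<double>& x, const std::vector<double>& y) {
    if (x.size() != y.size())
        throw std::invalid_argument("vectors must have the same length");
    // The initial value 0.0 also fixes the accumulator type to double.
    return std::inner_product(x.begin(), x.end(), y.begin(), 0.0);
}
```

Passing 0.0 rather than 0 as the starting value matters: an integer start would make the accumulation happen in int and silently truncate.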

The entire content is in this github repository, and our page on how to contribute details how you can get involved.

We are looking forward to what is to come. In many ways, we are only just getting started.

/code/rcpp | permanent link

Mon, 31 Dec 2012

Ragnar Relay Chicago 2012
One thing I never quite got around to during 2012 was to blog about the awesome relay we ran in early June. This was the race formerly known as MC200 (for Madison, WI, to Chicago, IL, by way of Milwaukee, WI, for about 200 miles) and is now part of the Ragnar Relay series: Ragnar Chicago.

We ran as a so-called ultra team of six runners, as opposed to a regular team of twelve. The course is cut into 36 segments; on regular teams you get 3, we each had twice that. My first leg was a combined 17 miles in what turned out to be pretty blistering heat in mid-to-late afternoon. My fellow team members were awesome in getting me lots of water and ice, and I managed to hold onto a pace of just over 8 min/mile. One of the harder runs I've had. Next was a wonderful run pretty much exactly at midnight under starry skies---about seven or so miles followed by ten more miles the next morning.

We ended up coming third (yay!) beating the next team by about six or seven seconds (!!) over a total time of 25 or 26 hours.

It was hard. It was fun. It was exhilarating. It may also have broken me as I haven't really run much since. So good intentions for 2013: get back into the groove.

/sports/running | permanent link

Sun, 30 Dec 2012

RcppClassicExamples 0.1.1
Yesterday's initial upload of RcppClassicExamples was lacking a versioned Depends: to prevent builds on older versions of R. This has been added in a new upload 0.1.1. We also added a NEWS file (see below); no code changes were made.

Changes in version 0.1.1 (2012-12-30)

  • Added versioned Depends: in DESCRIPTION to not build under older versions of Rcpp and RcppClassic

Changes in version 0.1.0 (2012-12-27)

Thanks to CRANberries, you can also look at a diff to the previous release 0.1.0. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

/code/rcpp | permanent link

Sat, 29 Dec 2012

RcppExamples 0.1.5 and RcppClassicExamples 0.1.0
The recent releases of Rcpp 0.10.2 and RcppClassic 0.9.3 had one more repercussion. On that dreaded OS, the linker no longer wanted to instantiate a symbol present in both packages; seems to me that the linker in the other two OSs is a little smarter. Anyway -- I didn't fight this but at long last moved all remnants of the long-deprecated older Rcpp API (which is still maintained by package RcppClassic) out of package RcppExamples and into a new package RcppClassicExamples.

And the updated version 0.1.5 of the RcppExamples package appeared on CRAN and has now been joined by the initial version 0.1.0 of the new package RcppClassicExamples. No code changes were made; manual pages and descriptions were brushed up and that is about it.

Thanks to CRANberries, you can also look at a diff to the previous release 0.1.4 of RcppExamples. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

/code/rcpp | permanent link

Sat, 22 Dec 2012

RcppClassic 0.9.3
Yesterday's release of Rcpp 0.10.2 required a small change to RcppClassic, the package supporting the deprecated older classic Rcpp API defined in the earlier 2005 to 2006 releases. So version 0.9.3 of RcppClassic is now on CRAN. There is no new user-facing code.

Courtesy of CRANberries, there is the set of changes relative to the previous release 0.9.2.

Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

/code/rcpp | permanent link

Kurt Elling Quintet at the Green Mill
Kurt Elling is back in Chicago for two gigs at the Green Mill (where I last saw him in a wonderful big-band setting), along with his long-time collaborator Laurence Hobgood on piano, Clark Sommers on bass, Chicago's own John McLean on guitar, and the new kid Bryan Carter on drums.

And they were, of course, awesome. Great mix of standards as well as new stuff from the Grammy-nominated new recording. If you can see Kurt Elling live, go. Now. They return for another gig tonight and I will have to see if I can swing that.

/music/jazz/live | permanent link

Patricia Barber at Unity Temple
Forgot to blog about this (though I posted on Google+ about it), but we went to a neat little concert in the neighborhood on October 6. Chicago's own Patricia Barber came to the wonderful Unity Temple by Frank Lloyd Wright (more on the temple at Wikipedia and the foundation).

This closed the 2012 series of the Unity Temple concerts and the setting was a little, well, weird. Looked like mostly concert subscribers, with the commensurate age brackets, and not too many jazz folks. Not sure how many Patricia Barber, along with Larry Kohut on bass, won over, but I enjoyed it. I had secured front row (!!) seats in what is already a wonderfully intimate setting for such a duet concert.

See the Google+ link for a YouTube recording by her.

/music/jazz/live | permanent link

Fri, 21 Dec 2012

Rcpp 0.10.2
Release 0.10.2 of Rcpp provides the second update to the 0.10.* series, and has arrived on CRAN and in Debian.

It brings another great set of enhancements and extensions, building on the recent 0.10.0 and 0.10.1 releases. The new Rcpp attributes were rewritten to not require Rcpp modules (as we encountered an issue with exceptions on Windows when built this way), code was reorganized to significantly accelerate compilation and a couple of new things such as more Rcpp sugar goodies, a new timer class, and a new string class were added. See below for full details.

We also tested this fairly rigorously by checking about two thirds of the over 90 CRAN packages depending on Rcpp (and the remainder required even more package installs which we did not do as this was already taking about 12 total cpu hours to test). We are quite confident that no changes are required (besides one in our own RcppClassic package which we will update).

The complete NEWS entry for 0.10.2 is below; more details are in the ChangeLog file in the package and on the Rcpp Changelog page.

Changes in Rcpp version 0.10.2 (2012-12-21)

  • Changes in Rcpp API:

    • Source and header files were reorganized and consolidated so that compile times are now significantly lower

    • Added additional check in Rstreambuf deletion

    • Added support for clang++ when using libc++, and for icpc in std=c++11 mode, thanks to a patch by Yan Zhou

    • New class Rcpp::String to facilitate working with a single element of a character vector

    • New utility class sugar::IndexHash inspired from Simon Urbanek's fastmatch package

    • Implementation of the equality operator between two Rcomplex

    • RNGScope now has an internal counter that enables it to be safely used multiple times in the same stack frame.

    • New class Rcpp::Timer for benchmarking

  • Changes in Rcpp sugar:

    • More efficient version of match based on IndexHash

    • More efficient version of unique based on IndexHash

    • More efficient version of in based on IndexHash

    • More efficient version of duplicated based on IndexHash

    • More efficient version of self_match based on IndexHash

    • New function collapse that implements paste(., collapse= "" )

  • Changes in Rcpp attributes:

    • Use code generation rather than modules to implement sourceCpp and compileAttributes (eliminates problem with exceptions not being able to cross shared library boundaries on Windows)

    • Exported functions now automatically establish an RNGScope

    • Functions exported by sourceCpp now directly reference the external function pointer rather than rely on dynlib lookup

    • On Windows, Rtools is automatically added to the PATH during sourceCpp compilations

    • Diagnostics are printed to the console if sourceCpp fails and C++ development tools are not installed

    • A warning is printed when compileAttributes detects Rcpp::depends attributes in source files that are not matched by Depends/LinkingTo entries in the package DESCRIPTION
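The RNGScope counter listed under the API changes above can be pictured with a small standard-C++ sketch. This is only one way such a counter could work and is not the Rcpp source: the class name ScopedRng and the functions read_rng_state() and write_rng_state() are hypothetical stand-ins (R's C API offers GetRNGstate() and PutRNGstate() for the real reading and writing of the state).

```cpp
#include <cassert>

// Hypothetical stand-ins for the calls that read and write back the RNG state
// (R's C API has GetRNGstate() and PutRNGstate() for this purpose).
static int state_reads = 0;
static int state_writes = 0;
void read_rng_state()  { ++state_reads; }
void write_rng_state() { ++state_writes; }

// Scope guard with a shared nesting counter: only the outermost guard
// reads the state on entry and writes it back on exit.
class ScopedRng {
public:
    ScopedRng()  { if (depth_++ == 0) read_rng_state(); }
    ~ScopedRng() {
        assert(depth_ > 0);
        if (--depth_ == 0) write_rng_state();
    }
    static int depth() { return depth_; }
private:
    ScopedRng(const ScopedRng &);             // non-copyable
    ScopedRng &operator=(const ScopedRng &);
    static int depth_;
};
int ScopedRng::depth_ = 0;

// Two guards in the same stack frame, plus one in a nested call.
void inner() { ScopedRng guard; }
void outer() {
    ScopedRng first;
    ScopedRng second;
    inner();
}
```

Calling outer() creates three guards in total, yet the state is read once and written back once, which is the property that makes repeated use in one stack frame harmless.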

Thanks to CRANberries, you can also look at a diff to the previous release 0.10.1. As always, even fuller details are on the Rcpp Changelog page and the Rcpp page which also leads to the downloads, the browseable doxygen docs and zip files of doxygen output for the standard formats. A local directory has source and documentation too. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

/code/rcpp | permanent link

Mon, 17 Dec 2012

R / Finance 2013 Call for Papers
The text below just went out to r-sig-finance along with updates to the R/Finance website and its Call for Papers page.

Call for Papers:

R/Finance 2013: Applied Finance with R
May 17 and 18, 2013
University of Illinois, Chicago, IL, USA

The fifth annual R/Finance conference for applied finance using R will be held on May 17 and 18, 2013 in Chicago, IL, USA at the University of Illinois at Chicago. The conference is expected to cover topics including portfolio management, time series analysis, advanced risk tools, high-performance computing, market microstructure, and econometrics. All will be discussed within the context of using R as a primary tool for financial risk management, portfolio construction, and trading.

Over the past four years, R/Finance has included attendees from around the world. It featured presentations from prominent academics and practitioners, and we anticipate another exciting line-up for 2013.

We invite you to submit complete papers in pdf format for consideration. We will also consider one-page abstracts (in txt or pdf format) although more complete papers are preferred. We welcome submissions for full talks and abbreviated "lightning talks". Both academic and practitioner proposals related to R are encouraged.

Presenters are strongly encouraged to provide working R code to accompany the presentation/paper. Data sets should also be made public for the purposes of reproducibility (though we realize this may be limited due to contracts with data vendors). Preference may be given to presenters who have released R packages.

The conference will award two (or more) $1000 prizes for best papers. A submission must be a full paper to be eligible for a best paper award. Extended abstracts, even if a full paper is provided by conference time, are not eligible for a best paper award. Financial assistance for travel and accommodation may be available to presenters at the discretion of the conference committee. Requests for assistance should be made at the time of submission.

Please send submissions to: committee at RinFinance.com. The submission deadline is February 15, 2013. Submitters will be notified of acceptance via email by February 28, 2013. Notification of whether a presentation will be a long presentation or a lightning talk will also be made at that time.

Additional details will be announced at this website as they become available. Information on previous years' presenters and their presentations is also at the conference website.

For the program committee:

Gib Bassett, Peter Carl, Dirk Eddelbuettel, Brian Peterson,
Dale Rosenthal, Jeffrey Ryan, Joshua Ulrich

So see you in Chicago in May!

/computers/R | permanent link

RcppArmadillo 0.3.6.1
A first minor bug-fix update by Conrad to the 3.6 series of Armadillo arrived today as version 3.6.1, and we prepared a corresponding version 0.3.6.1 of RcppArmadillo, our wrapper for R and Armadillo. This is now on CRAN, and the changes are summarized below.

Changes in RcppArmadillo version 0.3.6.1 (2012-12-17)

  • Upgraded to Armadillo release Version 3.6.1

    • faster trace()

    • fix for handling sparse matrices by dot()

    • fixes for interactions between sparse and dense matrices

  • Now throws compiler error if Rcpp.h is included before RcppArmadillo.h (as the former is included automatically by the latter anyway, but template logic prefers this ordering).

Courtesy of CRANberries, there is also a diffstat report for the most recent release. As always, more detailed information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

/code/rcpp | permanent link

Sat, 08 Dec 2012

Rcpp attributes: Even easier integration of GSL code into R
Following the Rcpp 0.10.0 release, I had written about simulating pi easily by using the wonderful new Rcpp Attributes feature. Now with Rcpp 0.10.1 released a good week ago, it is time to look at how Rcpp Attributes can help with external libraries. As this post aims to show, it is a breeze!

One key aspect is the use of the plugins for the inline package. They provide something akin to a callback mechanism so that compilation and linking steps can be informed about header and library locations and names. We are going to illustrate this with an example from the GNU Scientific Library (GSL). The example I picked uses B-Spline estimation from the GSL. This is a little redundant as R has its own spline routines and package, but serves well as a simple illustration---and, by reproducing an existing example, follows an established path. So we will look at Section 39.7 of the GSL manual which has a complete example as a standalone C program, generating both the data and the fit via cubic B-splines.

We can decompose this into two parts: data generation, and fitting. We will provide one function each, and then use both from R. These two functions will follow the aforementioned example from Section 39.7 somewhat closely.

We start with the first function to generate the data.

// [[Rcpp::depends(RcppGSL)]]
#include <RcppGSL.h>

#include <gsl/gsl_bspline.h>
#include <gsl/gsl_multifit.h>
#include <gsl/gsl_rng.h>
#include <gsl/gsl_randist.h>
#include <gsl/gsl_statistics.h>

const int N = 200;                              // number of data points to fit
const int NCOEFFS = 12;                         // number of fit coefficients
const int NBREAK = (NCOEFFS - 2);               // nbreak = ncoeffs + 2 - k = ncoeffs - 2 since k = 4

// [[Rcpp::export]]
Rcpp::List genData() {

    const size_t n = N;
    size_t i;
    double dy;
    gsl_rng *r;
    RcppGSL::vector<double> w(n), x(n), y(n);

    gsl_rng_env_setup();
    r = gsl_rng_alloc(gsl_rng_default);

    //printf("#m=0,S=0\n");
    /* this is the data to be fitted */

    for (i = 0; i < n; ++i) {
        double sigma;
        double xi = (15.0 / (N - 1)) * i;
        double yi = cos(xi) * exp(-0.1 * xi);

        sigma = 0.1 * yi;
        dy = gsl_ran_gaussian(r, sigma);
        yi += dy;

        gsl_vector_set(x, i, xi);
        gsl_vector_set(y, i, yi);
        gsl_vector_set(w, i, 1.0 / (sigma * sigma));
                
        //printf("%f %f\n", xi, yi);
    }

    Rcpp::DataFrame res = Rcpp::DataFrame::create(Rcpp::Named("x") = x,
                                                  Rcpp::Named("y") = y,
                                                  Rcpp::Named("w") = w);

    x.free();
    y.free();
    w.free();
    gsl_rng_free(r);

    return(res);
}
We include a few header files, define (in what is common for C programs) a few constants and then define a single function genData() which returns an Rcpp::List as a list object to R. Of primary importance here are the two attributes: one to declare a dependence on the RcppGSL package, and one to declare the export of the data generator function. That is all it takes! The plugin of the RcppGSL package will provide information about the headers and library, and Rcpp Attributes will do the rest.

The core of the function is fairly self-explanatory, and closely follows the original example. Space gets allocated, the RNG is set up and a simple functional form generates some data plus noise (see below). In the original, the data is written to the standard output; here we return it to R as three columns in a data.frame object familiar to R users. We then free the GSL vectors; this manual step is needed as they are implemented as C vectors which do not have a destructor.

Next, we can turn to the fitting function.
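As an aside on that manual step: in plain C++ one can hand the free call to a smart pointer with a custom deleter. The following is a minimal C++11 sketch under stated assumptions and is not part of RcppGSL or the GSL; c_vec, c_vec_alloc(), c_vec_free() and sum_of_indices() are hypothetical stand-ins for a C-style vector API in the spirit of gsl_vector.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <memory>

// Hypothetical stand-in for a C-style vector type (think gsl_vector):
// a plain struct, allocated and released by free functions, no destructor.
struct c_vec {
    std::size_t size;
    double *data;
};

static int live_vectors = 0;             // bookkeeping for illustration only

c_vec *c_vec_alloc(std::size_t n) {
    c_vec *v = static_cast<c_vec *>(std::malloc(sizeof(c_vec)));
    assert(v != nullptr);
    v->size = n;
    v->data = static_cast<double *>(std::calloc(n, sizeof(double)));
    assert(n == 0 || v->data != nullptr);
    ++live_vectors;
    return v;
}

void c_vec_free(c_vec *v) {
    std::free(v->data);
    std::free(v);
    --live_vectors;
}

// Deleter so that std::unique_ptr calls the C free function for us.
struct c_vec_deleter {
    void operator()(c_vec *v) const { c_vec_free(v); }
};
typedef std::unique_ptr<c_vec, c_vec_deleter> c_vec_ptr;

// Sum 0..n-1 through a temporary vector; no explicit free on any path.
double sum_of_indices(std::size_t n) {
    c_vec_ptr v(c_vec_alloc(n));
    double total = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        v->data[i] = static_cast<double>(i);
        total += v->data[i];
    }
    return total;                        // v is released here automatically
}
```

The counter live_vectors drops back to zero once sum_of_indices() returns, including on early returns or exceptions, which is the convenience the explicit free() calls in the example above cannot offer.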

// [[Rcpp::export]]
Rcpp::List fitData(Rcpp::DataFrame ds) {

    const size_t ncoeffs = NCOEFFS;
    const size_t nbreak = NBREAK;

    const size_t n = N;
    size_t i, j;

    Rcpp::DataFrame D(ds);              // construct the data.frame object
    RcppGSL::vector<double> y = D["y"]; // access columns by name, 
    RcppGSL::vector<double> x = D["x"]; // assigning to GSL vectors
    RcppGSL::vector<double> w = D["w"];

    gsl_bspline_workspace *bw;
    gsl_vector *B;
    gsl_vector *c; 
    gsl_matrix *X, *cov;
    gsl_multifit_linear_workspace *mw;
    double chisq, Rsq, dof, tss;

    bw = gsl_bspline_alloc(4, nbreak);      // allocate a cubic bspline workspace (k = 4)
    B = gsl_vector_alloc(ncoeffs);

    X = gsl_matrix_alloc(n, ncoeffs);
    c = gsl_vector_alloc(ncoeffs);
    cov = gsl_matrix_alloc(ncoeffs, ncoeffs);
    mw = gsl_multifit_linear_alloc(n, ncoeffs);

    gsl_bspline_knots_uniform(0.0, 15.0, bw);   // use uniform breakpoints on [0, 15] 

    for (i = 0; i < n; ++i) {                   // construct the fit matrix X 
        double xi = gsl_vector_get(x, i);

        gsl_bspline_eval(xi, B, bw);            // compute B_j(xi) for all j 

        for (j = 0; j < ncoeffs; ++j) {         // fill in row i of X 
            double Bj = gsl_vector_get(B, j);
            gsl_matrix_set(X, i, j, Bj);
        }
    }

    gsl_multifit_wlinear(X, w, y, c, cov, &chisq, mw);  // do the fit 
    
    dof = n - ncoeffs;
    tss = gsl_stats_wtss(w->data, 1, y->data, 1, y->size);
    Rsq = 1.0 - chisq / tss;
    
    Rcpp::NumericVector FX(151), FY(151);       // output the smoothed curve 
    double xi, yi, yerr;
    for (xi = 0.0, i=0; xi < 15.0; xi += 0.1, i++) {
        gsl_bspline_eval(xi, B, bw);
        gsl_multifit_linear_est(B, c, cov, &yi, &yerr);
        FX[i] = xi;
        FY[i] = yi;
    }

    Rcpp::List res =
      Rcpp::List::create(Rcpp::Named("X")=FX,
                         Rcpp::Named("Y")=FY,
                         Rcpp::Named("chisqdof")=Rcpp::wrap(chisq/dof),
                         Rcpp::Named("rsq")=Rcpp::wrap(Rsq));

    gsl_bspline_free(bw);
    gsl_vector_free(B);
    gsl_matrix_free(X);
    gsl_vector_free(c);
    gsl_matrix_free(cov);
    gsl_multifit_linear_free(mw);
    
    y.free();
    x.free();
    w.free();

    return(res);   
}

The second function closely follows the second part of the GSL example and, given the input data, fits the output data. Data structures are set up, the spline basis is created, data is fit and then the fit is evaluated at a number of points. These two vectors are returned along with two goodness of fit measures.

We only need to load the Rcpp package and source a file containing the two snippets shown above, and we are ready to deploy this:

library(Rcpp)
sourceCpp("bSpline.cpp")                # compile two functions
dat <- genData()                        # generate the data
fit <- fitData(dat)                     # fit the model, returns fitted curve and gof measures

And with that, we generate a chart such as

Spline fitting example from GSL manual redone with Rcpp Attributes

via a simple four lines, or as much as it took to create the C++ functions, generate the data and fit it!

op <- par(mar=c(3,3,1,1))
plot(dat[,"x"], dat[,"y"], pch=19, col="#00000044")
lines(fit[[1]], fit[[2]], col="orange", lwd=2)
par(op)

The RcppArmadillo and RcppEigen packages support plugin use in the same way. Add an attribute to export a function, and an attribute for the depends -- and you're done. Extending R with (potentially much faster) C++ code has never been easier, and opens a whole new set of doors.

/code/snippets | permanent link

Fri, 07 Dec 2012

RcppArmadillo 0.3.6.0
Conrad launched the 3.6 series of Armadillo earlier today with a first 3.6.0 release. So RcppArmadillo, our wrapper for R and Armadillo, is now on CRAN with its corresponding version 0.3.6.0. No R level or interface changes were needed, and the upstream changes are summarized below.

Changes in RcppArmadillo version 0.3.6.0 (2012-12-07)

  • Upgraded to Armadillo release Version 3.6.0 (Piazza del Duomo)

    • faster handling of compound expressions with submatrices and subcubes

    • added support for loading matrices as text files with NaN and Inf elements

    • added stable_sort_index(), which preserves the relative order of elements with equivalent values

    • added handling of sparse matrices by mean(), var(), norm(), abs(), square(), sqrt()

    • added saving and loading of sparse matrices in arma_binary format
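To make the stable_sort_index() entry above concrete, here is a small standard-C++ sketch of what a stable sort index means. It does not use Armadillo; the function name stable_index() is a hypothetical illustration built on std::stable_sort.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Return the permutation of indices that sorts `values` in ascending order.
// std::stable_sort keeps indices of equal values in their original order.
std::vector<std::size_t> stable_index(const std::vector<double> &values) {
    std::vector<std::size_t> idx(values.size());
    std::iota(idx.begin(), idx.end(), std::size_t(0));   // 0, 1, 2, ...
    std::stable_sort(idx.begin(), idx.end(),
                     [&values](std::size_t a, std::size_t b) {
                         assert(a < values.size() && b < values.size());
                         return values[a] < values[b];
                     });
    return idx;
}
```

For the input 2, 1, 2, 1 the result is 1, 3, 0, 2: the two ones keep their relative order (index 1 before 3), and so do the two twos (index 0 before 2). An unstable sort would be free to return the tied indices in either order.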

Courtesy of CRANberries, there is also a diffstat report for the most recent release. As always, more detailed information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

/code/rcpp | permanent link

Wed, 05 Dec 2012

RInside 0.2.10
The new maintenance release 0.2.10 of RInside is now on CRAN, including Windows binaries. RInside provides a set of convenience classes which facilitate embedding of R inside of C++ applications and programs, using the classes and functions provided by the Rcpp R and C++ integration package.

This release helps with an update to stack checking, required by a recent change in R itself. The NEWS extract below has more details.

Changes in RInside version 0.2.10 (2012-12-05)

  • Adjusted to change in R which requires turning checking of the stack limit off in order to allow for access from multiple threads as in the Wt examples. As there have been no side-effects, this is enabled by default on all platforms (with the exception of Windows).

  • Added new ‘threads’ example directory with a simple example based on a Boost mutex example.

  • Disabled two examples (passing an external function down) which do not currently work; external pointer use should still work.

CRANberries also provides a short report with changes from the previous release. More information is on the RInside page. Questions, comments etc should go to the rcpp-devel mailing list off the Rcpp R-Forge page.

/code/rinside | permanent link

Sun, 02 Dec 2012

RQuantLib 0.3.9
A minor feature release RQuantLib 0.3.9 is now on CRAN and in Debian. RQuantLib combines (some of) the quantitative analytics of QuantLib with the R statistical computing environment and language.

Bryan Lewis had suggested to enable another pricing engine for American Options in order to get (at least some) Greeks. This is now supported by picking engine="CrankNicolson" as shown in the default example for the AmericanOption function:

R> library(RQuantLib)
R> example(AmericanOption)

AmrcnOR> # simple call with unnamed parameters
AmrcnOR> AmericanOption("call", 100, 100, 0.02, 0.03, 0.5, 0.4)
Concise summary of valuation for AmericanOption 
  value   delta   gamma    vega   theta     rho  divRho 
11.3648      NA      NA      NA      NA      NA      NA 

AmrcnOR> # simple call with some explicit parameters
AmrcnOR> AmericanOption("put", strike=100, volatility=0.4, 100, 0.02, 0.03, 0.5)
Concise summary of valuation for AmericanOption 
  value   delta   gamma    vega   theta     rho  divRho 
10.9174      NA      NA      NA      NA      NA      NA 

AmrcnOR> # simple call with unnamed parameters, using Crank-Nicolons
AmrcnOR> AmericanOption("put", strike=100, volatility=0.4, 100, 0.02, 0.03, 0.5, engine="CrankNicolson")
Concise summary of valuation for AmericanOption 
  value   delta   gamma    vega   theta     rho  divRho 
10.9173 -0.4358  0.0140      NA      NA      NA      NA 
R> 

Thanks to CRANberries, there is also a diff to the previous release 0.3.8. Full changelog details, examples and more details about this package are at my RQuantLib page.

/code/rquantlib | permanent link

Tue, 27 Nov 2012

Rcpp 0.10.1
The new Rcpp release 0.10.1 arrived this morning on CRAN (and already has Windows binaries) and in Debian.

This is a follow-up to the recent 0.10.0 release which extends the exciting new Rcpp-attributes and Rcpp-sugar work further, as well as work in a number of other areas as detailed below in the NEWS sections.

This release brings a change to some of the binary interfaces. If you have packages using Rcpp, you will most likely have to reinstall them from source. Some changes were made to const correctness as well as other aspects, and it seems that we have temporarily broken the excellent RcppEigen and RcppOctave packages. We are looking into this, and are sorry about the bug.

The complete NEWS entry for 0.10.1 is below; more details are in the ChangeLog file in the package and on the Rcpp Changelog page.

Changes in Rcpp version 0.10.1 (2012-11-26)

  • Changes in Rcpp sugar:

    • New functions: setdiff, union_, intersect, setequal, in, min, max, range, match, table, duplicated

    • New function: clamp which combines pmin and pmax, e.g. clamp( a, x, b) is the same as pmax( a, pmin(x, b) )

    • New function: self_match which implements something similar to match( x, unique( x ) )

  • Changes in Rcpp API:

    • The Vector template class (hence NumericVector ...) gets the is_na and the get_na static methods.

    • New helper class no_init that can be used to create a vector without initializing its data, e.g. : IntegerVector out = no_init(n) ;

    • New exception constructor requiring only a message; stop function to throw an exception

    • DataFrame gains a nrows method

  • Changes in Rcpp attributes:

    • Ability to embed R code chunks (via specially formatted block comments) in C++ source files.

    • Allow specification of argument defaults for exported functions.

    • New scheme for more flexible mixing of generated and user composed C++ headers.

    • Print warning if no export attributes are found in source file.

    • Updated vignette with additional documentation on exposing C++ interfaces from packages and signaling errors.

  • Changes in Rcpp modules:

    • Enclose .External invocations in BEGIN_RCPP/END_RCPP

  • Changes in R code :

    • New function areMacrosDefined

    • Additions to Rcpp.package.skeleton:

      • attributes parameter to generate a version of rcpp_hello_world that uses Rcpp::export.

      • cpp_files parameter to provide a list of C++ files to include in the src directory of the package.

  • Miscellaneous changes:

    • New example 'pi simulation' using R and C++ via Rcpp attributes
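The clamp entry in the sugar section above reads most easily as "limit every element of x to the interval [a, b]". A minimal sketch of that reading in standard C++ follows; it does not use Rcpp, and the function name clamp_all() is a hypothetical illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Element-wise clamp of x to the interval [lo, hi]:
// the same result as taking max(lo, min(x_i, hi)) for every element.
std::vector<double> clamp_all(double lo, const std::vector<double> &x, double hi) {
    assert(lo <= hi);
    std::vector<double> out(x.size());
    std::transform(x.begin(), x.end(), out.begin(),
                   [lo, hi](double v) { return std::max(lo, std::min(v, hi)); });
    return out;
}
```

With lo = 0 and hi = 1, the values -1, 0.5 and 2 become 0, 0.5 and 1: values below the interval are raised to the lower bound, values above it are cut to the upper bound, and everything in between passes through unchanged.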

Thanks to CRANberries, you can also look at a diff to the previous release 0.10.0. As always, even fuller details are on the Rcpp Changelog page and the Rcpp page which also leads to the downloads, the browseable doxygen docs and zip files of doxygen output for the standard formats. A local directory has source and documentation too. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

/code/rcpp | permanent link

Sun, 25 Nov 2012

digest 0.6.0
A new version of the digest package (which generates hash function summaries for arbitrary (and possibly nested) R objects using any of the standard md5, sha-1, sha-256, sha-512 or crc32 algorithms) is now on CRAN, and I will push the corresponding version into Debian in a moment.

For this release, Hannes Muehleisen added support for sha-512 using an older standalone function by Aaron D. Gifford which I had to whip into slightly more portable shape to work on Windows. (Hint: uint32_t from stdint.h, not u_int32_t)

CRANberries provides the usual summary of changes to version 0.5.2.

As usual, our package is available via the R-Forge page leading to svn and tarball access, my digest page and the local directory here.

/code/digest | permanent link

Tue, 20 Nov 2012

Rcpp attributes: A simple example 'making pi'
We introduced Rcpp 0.10.0 with a number of very nice new features a few days ago, and the activity on the rcpp-devel mailing list has been pretty responsive which is awesome.

But because few things beat a nice example, this post tries to build some more excitement. We will illustrate how Rcpp attributes make it really easy to add C++ code to an R session, and that that code is as easy to grasp as R code.

Our motivating example is everybody's favourite introduction to Monte Carlo simulation: estimating π. A common method uses the fact that the unit circle has a surface area equal to π. We draw two uniform random numbers x and y, each between zero and one. We then check for the distance of the corresponding point (x,y) relative to the origin. If less than one (or equal), it is in the circle (or on it); if more than one it is outside. As the points fall in the unit square of area one, and the first quadrant holds a quarter of the circle whose whole area is π, the proportion of points inside the circle approximates π over four. The following figure, kindly borrowed from Wikipedia with full attribution and credit, illustrates this:

Example of simulating pi

Now, a vectorized version (drawing N such pairs at once) of this approach is provided by the following R function.

piR <- function(N) {
    x <- runif(N)
    y <- runif(N)
    d <- sqrt(x^2 + y^2)
    return(4 * sum(d < 1.0) / N)
}

And in C++ we can write almost exactly the same function thanks to the Rcpp sugar vectorisation available via Rcpp:

#include <Rcpp.h>

using namespace Rcpp;

// [[Rcpp::export]]
double piSugar(const int N) {
    RNGScope scope;		// ensure RNG gets set/reset
    NumericVector x = runif(N);
    NumericVector y = runif(N);
    NumericVector d = sqrt(x*x + y*y);
    return 4.0 * sum(d < 1.0) / N;
}
Sure, there are small differences: C++ is statically typed, R is not. We need one include file for declaration, and we need one instantiation of the RNGScope object to ensure random number draws remain coordinated between the calling R process and the C++ function calling into its (compiled C-code based) random number generators. That way we even get the exact same draws for the same seed. But the basic approach is identical: draw a vector x and vector y, compute the distance to the origin and then obtain the proportion within the unit circle -- which we scale by four. Same idea, same vectorised implementation in C++.

But the real key here is the one short line with the [[Rcpp::export]] attribute. This is all it takes (along with sourceCpp() from Rcpp 0.10.0) to get the C++ code into R.

The full example (which assumes the C++ file is saved as piSugar.cpp in the same directory) is now:

#!/usr/bin/r

library(Rcpp)
library(rbenchmark)

piR <- function(N) {
    x <- runif(N)
    y <- runif(N)
    d <- sqrt(x^2 + y^2)
    return(4 * sum(d < 1.0) / N)
}

sourceCpp("piSugar.cpp")

N <- 1e6

set.seed(42)
resR <- piR(N)

set.seed(42)
resCpp <- piSugar(N)

## important: check results are identical with RNG seeded
stopifnot(identical(resR, resCpp))

res <- benchmark(piR(N), piSugar(N), order="relative")

print(res[,1:4])

and it does a few things: set up the R function, source the C++ function (and presto: we have a callable C++ function just like that), compute two simulations given the same seed and ensure they are in fact identical -- and proceed to compare the timing in a benchmarking exercise. That last aspect is not even that important -- we end up being almost-but-not-quite twice as fast on my machine for different values of N.
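For readers curious what the vectorised expressions amount to, the same estimator can be written as an explicit loop in standard C++. This is a sketch under stated assumptions rather than a replacement for the Rcpp version: it uses the C++11 <random> generator instead of R's, so it will not reproduce R's draws for a given seed, and the function name pi_loop() and its seed argument are hypothetical.

```cpp
#include <cassert>
#include <random>

// Monte Carlo estimate of pi: share of uniform points in the unit square
// that fall inside the quarter circle, scaled by four.
double pi_loop(int n, unsigned int seed) {
    assert(n > 0);
    std::mt19937 gen(seed);
    std::uniform_real_distribution<double> unif(0.0, 1.0);
    int inside = 0;
    for (int i = 0; i < n; ++i) {
        double x = unif(gen);
        double y = unif(gen);
        if (x * x + y * y < 1.0) ++inside;   // no sqrt needed: d < 1 iff d^2 < 1
    }
    return 4.0 * inside / n;
}
```

A call such as pi_loop(1000000, 42u) should land close to 3.14, with the simulation error shrinking roughly with the square root of the number of draws.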

The real takeaway here is the ease with which we can get a C++ function into R --- and the new process completely takes care of passing parameters in, results out, and does the compilation, linking and loading.

More details about Rcpp attributes are in the new vignette. Now enjoy the π.

Update: One somewhat bad typo fixed.

Update: Corrected one background tag.

/code/snippets | permanent link

Fri, 16 Nov 2012

RcppArmadillo 0.3.4.4
A minor bug-fix release 3.4.4 of Armadillo came out upstream a few days ago. RcppArmadillo, our wrapper for R and Armadillo, is now on CRAN with its corresponding version 0.3.4.4. No R level or interface changes were made and the upstream changes are summarized below.

Changes in RcppArmadillo version 0.3.4.4 (2012-11-15)

  • Upgraded to Armadillo release 3.4.4

    • fix for handling complex numbers by sparse matrices

    • fix for minor memory leak by sparse matrices

Courtesy of CRANberries, there is also a diffstat report for 0.3.4.4 relative to 0.3.4.3. As always, more detailed information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

/code/rcpp | permanent link

Wed, 14 Nov 2012

Rcpp and the new R:: namespace for Rmath.h
We released Rcpp 0.10.0 earlier today. This post will just provide a simple example for one of the smaller new features -- the new namespace for functions from Rmath.h -- and illustrate one of the key features (Rcpp attributes) in passing.

R, as a statistical language and environment, has very well written and tested statistical distribution functions providing probability density, cumulative distribution, quantiles and random number draws for dozens of common and not so common distribution functions. This code is used inside R, and available for use from standalone C or C++ programs via the standalone R math library which Debian / Ubuntu have as a package r-mathlib (and which can be built from R sources).

Users sometimes write code against this interface, and then want to combine the code with other code, possibly even with Rcpp. We allowed for this, but it required a bit of an ugly interface. R provides a C interface, and C has no namespaces. Identifiers can clash, and to be safe one can enable a generic prefix Rf_. So functions which could clash such as length or error become Rf_length and Rf_error and are less likely to conflict with symbols from other libraries. Unfortunately, the side-effect is that calling, say, the cumulative distribution function for the Normal distribution becomes Rf_pnorm5() (with the 5 denoting the five parameters: quantile, mean, std.deviation, lowerTail, logValue). Not pretty, and not obvious.

So one of the things we added was another layer of indirection by adding a namespace R with a bunch of inline'd wrapper functions (as well as several handfuls of unit tests to make sure we avoided typos and argument transposition and what not).
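To make the idea of a namespaced inline wrapper concrete without requiring R headers, here is a small self-contained sketch. Everything in it is a stand-in: demo_pnorm5 plays the role of a prefixed C-level routine (its erfc-based body is my assumption, not R's implementation), and Rdemo is a hypothetical namespace standing in for R.

```cpp
#include <cmath>

// Stand-in for a prefixed C-level routine in the style of Rf_pnorm5();
// the name and this erfc-based body are illustrative assumptions.
extern "C" double demo_pnorm5(double x, double mu, double sigma,
                              int lower_tail, int log_p) {
    const double z = (x - mu) / sigma;
    // P(Z <= z) = erfc(-z / sqrt(2)) / 2; the upper tail follows by symmetry
    const double p = 0.5 * std::erfc((lower_tail ? -z : z) / std::sqrt(2.0));
    return log_p ? std::log(p) : p;
}

// Hypothetical namespace in the role of R:: with a thin inline wrapper
// that simply forwards to the prefixed function.
namespace Rdemo {
    inline double pnorm(double x, double mu, double sigma,
                        int lower_tail, int log_p) {
        return ::demo_pnorm5(x, mu, sigma, lower_tail, log_p);
    }
}
```

The wrapper adds no computation of its own. Callers write Rdemo::pnorm(x, 0.0, 1.0, 1, 0) and the compiler can inline the forwarding call away, which is the design choice described above.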

The short example below shows this for a simple function taking a vector, and returning its pnorm computed three different ways:

#include <Rcpp.h>

// [[Rcpp::export]]
Rcpp::DataFrame mypnorm(Rcpp::NumericVector x) {
    int n = x.size();
    Rcpp::NumericVector y1(n), y2(n), y3(n);

    for (int i=0; i<n; i++) {

        // the way we used to do this
        y1[i] = ::Rf_pnorm5(x[i], 0.0, 1.0, 1, 0);

        // the way we can do it now
        y2[i] = R::pnorm(x[i], 0.0, 1.0, 1, 0);

    }
    // or using Rcpp sugar in one go
    y3 = Rcpp::pnorm(x);

    return Rcpp::DataFrame::create(Rcpp::Named("Rold")  = y1,
                                   Rcpp::Named("Rnew")  = y2,
                                   Rcpp::Named("sugar") = y3);
}

This example also uses the new Rcpp attributes described briefly in the announcement blog post and of course in more detail in the corresponding vignette. Let us just state here that we simply provide a complete C++ function, using standard Rcpp types -- along with one 'attribute' declaration of an export via Rcpp. That's it -- even easier than using inline.

Now in R we simply do

R> sourceCpp("mypnorm.cpp")
to obtain a callable R function with the C++ code just shown behind it. No Makefile, no command-line tool invocation -- nothing but a single call to sourceCpp() which takes care of things --- and brings us a compiled C++ function to R just given the source file with its attribute declaration.

We can now use the new function to compute the probability distribution both the old way, the new way with the 'cleaner' R::pnorm(), and of course the Rcpp sugar way in a single call. We build a data frame in C++, and assert that all three variants are the same:

R> x <- seq(0, 1, length=1e3)
R> res <- mypnorm(x)
R> head(res)
      Rold     Rnew    sugar
1 0.500000 0.500000 0.500000
2 0.500399 0.500399 0.500399
3 0.500799 0.500799 0.500799
4 0.501198 0.501198 0.501198
5 0.501597 0.501597 0.501597
6 0.501997 0.501997 0.501997
R> isTRUE(all.equal(res[,1], res[,2])) && isTRUE(all.equal(res[,1], res[,3]))
[1] TRUE
R> 
This example hopefully helped to illustrate how Rcpp 0.10.0 brings both something really powerful (Rcpp attributes -- more on this another time, hopefully) and convenient in the new namespace for statistical functions.

/code/snippets | permanent link

Rcpp 0.10.0
Rcpp release 0.10.0 is now on CRAN and being uploaded to Debian.

This is a new feature release, and we are very excited about the changes, notably Rcpp attributes which make using C++ from R even easier than inline (see below as well as the new vignette for details and first examples), the extensions to Rcpp modules (see below) and more, for example new Rcpp sugar functions, a new error output device syncing to R, and a new namespace R for the statistical functions from Rmath.h.

The complete NEWS entry for 0.10.0 is below; more details are in the ChangeLog file in the package and on the Rcpp Changelog page.

Changes in Rcpp version 0.10.0 (2012-11-13)

  • Support for C++11 style attributes (embedded in comments) to enable use of C++ within interactive sessions and to automatically generate module declarations for packages:

    • Rcpp::export attribute to export a C++ function to R

    • sourceCpp() function to source exported functions from a file

    • cppFunction() and evalCpp() functions for inline declarations and execution

    • compileAttributes() function to generate Rcpp modules from exported functions within a package

    • Rcpp::depends attribute for specifying additional build dependencies for sourceCpp()

    • Rcpp::interfaces attribute to specify the external bindings compileAttributes() should generate (defaults to R-only but a C++ include file using R_GetCCallable can also be generated)

    • New vignette "Rcpp-attribute"

  • Rcpp modules feature set has been expanded:

    • Functions and methods can now return objects from classes that are exposed through modules. This uses the make_new_object template internally. This feature requires that some class traits are declared to indicate to Rcpp's wrap/as system that these classes are covered by modules. The macros RCPP_EXPOSED_CLASS and RCPP_EXPOSED_CLASS_NODECL can be used to declare these type traits.

    • Classes exposed through modules can also be used as parameters of exposed functions or methods.

    • Exposed classes can declare factories with ".factory". A factory is a C++ function that returns a pointer to the target class. It is assumed that these objects are allocated with new in the factory. On the R side, factories are called just like other constructors, with the "new" function. This feature allows an alternative way to construct objects.

    • "converter" can be used to declare a way to convert an object of a type to another type. This gets translated to the appropriate "as" method on the R side.

    • Inheritance. A class can now declare that it inherits from another class with the .derives<Parent>( "Parent" ) notation. As a result the exposed class gains methods and properties (fields) from its parent class.

  • New sugar functions:

    • which_min implements which.min. It traverses the sugar expression and returns the index of the first occurrence of the minimum value.

    • which_max idem

    • unique uses unordered_set to find unique values. In particular, the version for CharacterVector is found to be more efficient than R's version

    • sort_unique calculates unique values and then sorts them.

  • Improvements to output facilities:

    • Implemented sync() so that flushing output streams works

    • Added Rcerr output stream (forwarding to REprintf)

  • Provide a namespace 'R' for the standalone Rmath library so that Rcpp users can access those functions too; also added unit tests

  • Development releases set the variable RunAllRcppTests to yes to run all tests (unless it was already set to 'no'); CRAN releases do not and still require setting it – which helps with the desired CRAN default of less testing at the CRAN server farm.
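The new sugar functions listed above (which_min, unique, sort_unique) are easiest to picture through plain standard-library analogues. The sketch below is an assumption-laden illustration on std::vector, not Rcpp's implementation; the helper names whichMin, uniqueValues and sortUnique are made up, and the index returned here is zero-based.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <string>
#include <unordered_set>
#include <vector>

// Zero-based index of the first minimum; assumes a non-empty input.
template <typename T>
std::size_t whichMin(const std::vector<T>& x) {
    return static_cast<std::size_t>(
        std::distance(x.begin(), std::min_element(x.begin(), x.end())));
}

// Unique values in order of first appearance, using a hash set for lookups.
template <typename T>
std::vector<T> uniqueValues(const std::vector<T>& x) {
    std::unordered_set<T> seen;
    std::vector<T> out;
    for (const T& v : x) {
        // insert() reports true only the first time a value is seen
        if (seen.insert(v).second) out.push_back(v);
    }
    return out;
}

// Unique values, then sorted ascending.
template <typename T>
std::vector<T> sortUnique(const std::vector<T>& x) {
    std::vector<T> out = uniqueValues(x);
    std::sort(out.begin(), out.end());
    return out;
}
```

The hash set gives expected constant-time membership checks, which is one plausible reason a hashing approach can do well on string data.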

Thanks to CRANberries, you can also look at a diff to the previous release 0.9.15. As always, even fuller details are on the Rcpp Changelog page and the Rcpp page which also leads to the downloads, the browseable doxygen docs and zip files of doxygen output for the standard formats. A local directory has source and documentation too. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

Update: One link corrected.

/code/rcpp | permanent link

Mon, 05 Nov 2012

RInside 0.2.9
A new version 0.2.9 of RInside arrived on CRAN earlier today; Windows binaries have already been built too. RInside provides a set of convenience classes which facilitate embedding of R inside of C++ applications and programs, using the classes and functions provided by the Rcpp R and C++ integration package.

This release adds a few new features as detailed in the extract from the NEWS below.

A key new feature is the added support for resilience to bad user input based on discussions and an initial (but altered) patch by Theodore Lytras. There are a few ways that this can be deployed so we also added two more example programs detailing it. The featured Qt example and the Wt example have been updated to use this too.

Another feature that may be quite useful for some is the additional attempt to find a value for R_HOME if none has been set. The value is typically found at compile time of the RInside package which poses a problem for those using the Windows build -- which is shipped as a binary reflecting the value of the build system. One alternative has always been to build the package locally too to get the local value, or to set it explicitly. But because the error behaviour – a cryptic message of Cannot open base package – is confusing to many, we now try to call the R function which gets this value from the registry. This may need more tweaking and testing, and if you use RInside on the Windows platform I would appreciate feedback.

The complete list of changes since the last release is summarized below in the corresponding NEWS file entry:

Changes in RInside version 0.2.9 (2012-11-04)

  • Applied (modified) patch by Theodore Lytras which lets RInside recover from some parsing errors and makes RInside applications more tolerant of errors

  • Added non-throwing variants of parseEval() and parseEvalQ()

  • Modified Qt and Wt examples of density estimation applications to be much more resilient to bad user input

  • On Windows, have RInside use R's get_R_HOME() function to get R_HOME value from registry if not set by user

  • Added note to examples/standard/Makefile.win that R_HOME may need to be set to run the executables – so either export your local value, or re-install RInside from source to have it reflected in the library build of libRinside

  • Updated CMake build support for standard, armadillo and eigen

  • Improved CMake builds of examples/standard, examples/eigen and examples/armadillo by detecting architecture
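The pairing of a throwing evaluator with a non-throwing variant is a general C++ pattern, and a toy version may help show why it makes GUI applications more tolerant of bad input. The sketch below is not RInside code and does not use its API: evalOrThrow and evalNoThrow are hypothetical names, and "evaluation" is reduced to parsing a number from a string.

```cpp
#include <cstddef>
#include <exception>
#include <stdexcept>
#include <string>

// Hypothetical throwing evaluator: turns user text into a number and
// throws on input that cannot be parsed completely.
double evalOrThrow(const std::string& input) {
    std::size_t used = 0;
    const double value = std::stod(input, &used);  // throws on garbage
    if (used != input.size())
        throw std::invalid_argument("trailing characters in: " + input);
    return value;
}

// Non-throwing counterpart: reports failure through the return value so
// that an event loop can keep running after bad user input.
bool evalNoThrow(const std::string& input, double& result) noexcept {
    try {
        result = evalOrThrow(input);
        return true;
    } catch (const std::exception&) {
        return false;  // result is left untouched on failure
    }
}
```

A caller can then write if (evalNoThrow(text, x)) { ... } and simply ignore or flag bad input instead of wrapping every call in its own try block.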

CRANberries also provides a short report with changes from the previous release. More information is on the RInside page. Questions, comments etc should go to the rcpp-devel mailing list off the Rcpp R-Forge page.

/code/rinside | permanent link

Thu, 25 Oct 2012

Accelerating R code: Computing Implied Volatilities Orders of Magnitude Faster
This blog, together with Romain's, is one of the main homes of stories about how Rcpp can help with getting code to run faster in the context of the R system for statistical programming and analysis. By making it easier to get already existing C or C++ code to R, or equally to extend R with new C++ code, Rcpp can help in getting stuff done. And it is often fairly straightforward to do so.

In this context, I have a nice new example. And for once, it is work-related. I generally cannot share too much of what we do there as this is, well, proprietary, but this one I can describe. The other day, I was constructing (large) time series of implied volatilities. Implied volatilities can be thought of as the complement to an option's price: given a price (and all other observables which can be thought of as fixed), we compute an implied volatility (typically via the standard Black-Scholes model). Given a changed implied volatility, we infer a new price -- see this Wikipedia page for more details. In essence, it opens the door to all sorts of arbitrage and relative value pricing adventures.

Now, we observe prices fairly frequently to create somewhat sizeable time series of option prices. And each price corresponds to one matching implied volatility, and for each such price we have to solve a small and straightforward optimization problem: to compute the implied volatility given the price. This is usually done with an iterative root finder.

The problem comes from the fact that we have to do this (i) over and over and over for large data sets, and (ii) that there are a number of callbacks from the (generic) solver to the (standard) option pricer.

So our first approach was to just call the corresponding function GBSVolatility from the fOptions package from the trusted Rmetrics project by Diethelm Wuertz et al. This worked fine, but even with the usual tricks of splitting over multiple cores/machines, it simply took too long for the resolution and data amount we desired. One of the problems is that this function (which uses the proper uniroot optimizer in R) is not inefficient per se, but simply makes too many function calls back to the option pricer as can be seen from a quick glance at the code. The helper function .fGBSVolatility gets called time and time again:

R> GBSVolatility
function (price, TypeFlag = c("c", "p"), S, X, Time, r, b, tol = .Machine$double.eps, 
    maxiter = 10000) 
{
    TypeFlag = TypeFlag[1]
    volatility = uniroot(.fGBSVolatility, interval = c(-10, 10), 
        price = price, TypeFlag = TypeFlag, S = S, X = X, Time = Time, 
        r = r, b = b, tol = tol, maxiter = maxiter)$root
    volatility
}
<environment: namespace:fOptions>
R> 
R> .fGBSVolatility
function (x, price, TypeFlag, S, X, Time, r, b, ...) 
{
    GBS = GBSOption(TypeFlag = TypeFlag, S = S, X = X, Time = Time, 
        r = r, b = b, sigma = x)@price
    price - GBS
}
<environment: namespace:fOptions>

So the next idea was to try the corresponding function from my RQuantLib package which brings (parts of) QuantLib to R. That turned out to be a lot faster already. Now, QuantLib is pretty big and so is RQuantLib, and we felt it may not make sense to install it on a number of machines just for this simple problem. So one evening this week I noodled around for an hour or two and combined (i) a basic Black/Scholes calculation and (ii) a standard univariate zero finder (both of which can be found or described in numerous places) to minimize the difference between the observed price and the price given an implied volatility. With about one hundred lines in C++, I had something which felt fast enough. So today I hooked this into R via a two-line wrapper in a quickly-created package using Rcpp.

I had one more advantage here. For our time series problem, the majority of the parameters (strike, time to maturity, rate, ...) are fixed, so we can structure the problem to be vectorised right from the start. I cannot share the code or more of the details of my new implementation. However, both GBSVolatility and EuropeanOptionImpliedVolatility are on CRAN (and as I happen to maintain these for Debian, also just one sudo apt-get install r-cran-foptions r-cran-rquantlib away if you're on Debian or Ubuntu). And writing the other solver is really not that involved.
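To give a flavour of what such a solver can look like, here is a textbook-style sketch. It is emphatically not the code referred to above: it prices a plain European call on a non-dividend-paying asset, uses simple bisection as the zero finder (a stand-in for whatever root finder one prefers), and all function names and default bounds are my assumptions.

```cpp
#include <cmath>
#include <limits>
#include <vector>

// Standard normal CDF via the complementary error function.
inline double normCdf(double z) {
    return 0.5 * std::erfc(-z / std::sqrt(2.0));
}

// Black-Scholes price of a European call (no dividends); sigma > 0, T > 0.
double bsCall(double S, double K, double T, double r, double sigma) {
    const double sd = sigma * std::sqrt(T);
    const double d1 = (std::log(S / K) + (r + 0.5 * sigma * sigma) * T) / sd;
    const double d2 = d1 - sd;
    return S * normCdf(d1) - K * std::exp(-r * T) * normCdf(d2);
}

// Implied volatility by bisection on [lo, hi]; NaN if price is not bracketed.
double impliedVol(double price, double S, double K, double T, double r,
                  double lo = 1e-6, double hi = 5.0, double tol = 1e-10) {
    if (bsCall(S, K, T, r, lo) > price || bsCall(S, K, T, r, hi) < price)
        return std::numeric_limits<double>::quiet_NaN();
    while (hi - lo > tol) {
        const double mid = 0.5 * (lo + hi);
        // the call price increases in sigma, so keep the half with the root
        if (bsCall(S, K, T, r, mid) < price) lo = mid; else hi = mid;
    }
    return 0.5 * (lo + hi);
}

// Vectorised variant: the loop over prices stays inside C++.
std::vector<double> impliedVols(const std::vector<double>& prices,
                                double S, double K, double T, double r) {
    std::vector<double> out;
    out.reserve(prices.size());
    for (double p : prices) out.push_back(impliedVol(p, S, K, T, r));
    return out;
}
```

As a sanity check, bsCall(100, 100, 1, 0.05, 0.2) is about 10.45, and feeding that price back into impliedVol() recovers a volatility of roughly 0.2. Keeping the loop in impliedVols() on the compiled side means the pricer is never called back through the interpreter.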

Anyway, here is the result, courtesy of a quick run via the rbenchmark package. We create a vector of length 500; the implied volatility computation will be performed at each point (and yes, our time series are much longer indeed). This is replicated 100 times (as is the default for rbenchmark) for each of the three approaches:

xyz@xxxxxxxx:~$ r xxxxR/packages/xxxxOptions/demo/timing.R
    test replications elapsed  relative user.self sys.self user.child sys.child
3 zzz(X)          100   0.038     1.000     0.040    0.000          0         0
2 RQL(X)          100   3.657    96.237     3.596    0.060          0         0
1 fOp(X)          100 448.060 11791.053   446.644    1.436          0         0
xyz@xxxxxxxx:~$ 
The new local solution is denoted by zzz(X). It is already orders of magnitude faster than the RQL(X) function using RQuantLib (which is, I presume, due to my custom solution internalising the loop). And the new approach is a laughable amount faster than the basic approach (shown as fOp) via fOptions. For one hundred replications of solving implied volatilities for all elements of a vector of size 500, the slow solution takes about 7.5 minutes --- while the fast solution takes 38 milliseconds. Which comes to a relative gain of over 11,000.

So sitting down with your C++ compiler to craft a quick one-hundred lines, combining two well-known and tested methods, can reap sizeable benefits. And Rcpp makes it trivial to call this from R.

/code/snippets | permanent link

Sun, 14 Oct 2012

Rcpp 0.9.15
Rcpp release 0.9.15 is now on CRAN and being uploaded to Debian.

Martin Morgan provided a clever fix for a header search needed between clang++ (especially on OS X) and g++ (which still provided libstdc++ and headers for clang++). This should hopefully put the clang issues to bed. Ben North noticed an unprotected string conversion when exception messages are turned into R errors which got fixed, and I expanded the coverage of Date (and Datetime) types to deal properly with non-finite values NA, NaN and Inf.

The complete NEWS entry for 0.9.15 is below; more details are in the ChangeLog file in the package and on the Rcpp Changelog page.

Changes in Rcpp version 0.9.15 (2012-10-13)

  • Untangling the clang++ build issue about the location of the exceptions header by directly checking for the include file – an approach provided by Martin Morgan in a kindly contributed patch as unit tests for them.

  • The Date and Datetime types now correctly handle NA, NaN and Inf representation; the Date type switched to an internal representation via double

  • Added Date and Datetime unit tests for the new features

  • An additional PROTECT was added for parsing exception messages before returning them to R, following a report by Ben North
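Why a double-based representation helps with NA, NaN and Inf can be shown with a toy type. This is a hypothetical sketch, not Rcpp's Date class: DemoDate and describe are made-up names, and the sketch does not distinguish R's NA from an ordinary NaN.

```cpp
#include <cmath>
#include <limits>
#include <string>

// Hypothetical date type: days since 1970-01-01 stored as a double, so that
// NaN and +/-Inf survive instead of being squeezed into an integer.
struct DemoDate {
    double days;
    bool isFinite() const { return std::isfinite(days); }
};

// Human-readable description that treats non-finite values explicitly.
std::string describe(const DemoDate& d) {
    if (std::isnan(d.days)) return "NA/NaN";
    if (std::isinf(d.days)) return d.days > 0 ? "Inf" : "-Inf";
    return "day " + std::to_string(static_cast<long>(d.days));
}
```

With an integer representation there is no bit pattern left over for infinities, which is presumably part of the motivation for the switch.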

Thanks to CRANberries, you can also look at a diff to the previous release 0.9.14. As always, even fuller details are on the Rcpp Changelog page and the Rcpp page which also leads to the downloads, the browseable doxygen docs and zip files of doxygen output for the standard formats. A local directory has source and documentation too. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

/code/rcpp | permanent link

Thu, 04 Oct 2012

RcppArmadillo 0.3.4.3
Another bug-fix release of Armadillo, now at version 3.4.3 while the 3.4.* series stabilizes, and with it a version 0.3.4.3 of RcppArmadillo, our wrapper for R and Armadillo. The new version is already on CRAN as of earlier today. Once again no R level or interface changes were made; the upstream changes are summarized below.

Changes in RcppArmadillo version 0.3.4.3 (2012-10-04)

  • Upgraded to Armadillo release 3.4.3

    • fix for aliasing issue in diagmat()

    • fix for speye() signature

Courtesy of CRANberries, there is also a diffstat report for 0.3.4.3 relative to 0.3.4.2. As always, more detailed information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

/code/rcpp | permanent link

RProtoBuf 0.2.6
Release 0.2.6 of RProtoBuf arrived on CRAN earlier this morning. RProtoBuf provides GNU R bindings for the Google Protobuf data encoding library used and released by Google.

This release was once more driven largely by Murray whom we have now added among the list of authors of the package too. The NEWS file entry follows below:

Changes in version 0.2.6 (2012-10-04)

  • Applied several more patches by Murray to

    • correct '_' and '__' mismatches in wrapper calls

    • update a few manual pages for style, and add examples

    • fix a bug where NAs were silently treated as TRUE for logical/bool types

    • fix a bug that caused crashes when adding vectors to optional fields

    • fix bugs in readASCII that returned empty protocol buffers when the file or connection could not be opened

    • distinguish between non-existent and not-set fields with has() by returning NULL in the former case.

    • fix a bug that caused non-deterministic behavior when setting a repeated message field in a protobuf to a single Message.

    • add unit tests for all of the above.

  • Added Murray to Authors: field in DESCRIPTION

  • Removed old and unconvincing example on RProtoBuf for storage and serialization in an imagined HighFrequencyFinance context

CRANberries also provides a diff to the previous release 0.2.5. More information is at the RProtoBuf page which has a draft package vignette, a 'quick' overview vignette and a unit test summary vignette. Questions, comments etc should go to the rprotobuf mailing list off the RProtoBuf page at R-Forge.

/code/rprotobuf | permanent link

Mon, 01 Oct 2012

Rcpp 0.9.14
Another release of Rcpp has just appeared on CRAN and was just uploaded to Debian.

It addresses yet another issue we had on OS X and should hopefully put the build issues to rest. Three new (vectorized) sugar functions were added, along with some new regression tests and more. The complete NEWS entry for 0.9.14 is below; more details are in the ChangeLog file in the package and on the Rcpp Changelog page.

Changes in Rcpp version 0.9.14 (2012-09-30)

  • Added new Rcpp sugar functions trunc(), round() and signif(), as well as unit tests for them

  • Be more conservative about where we support clang++ and the inclusion of exception_defines.h and prevent this from being attempted on OS X where it failed for clang 3.1

  • Corrected a typo in Module.h which now again permits use of finalizers

  • Small correction for (unexported) bib() function (which provides a path to the bibtex file that ships with Rcpp)

  • Converted NEWS to NEWS.Rd
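Of the three new sugar functions in the list above, signif() is the least obvious to write by hand, so here is a rough element-wise sketch in plain C++. It is an approximation for illustration, not Rcpp's or R's implementation: the names signifOne and signifAll are invented, digits is assumed to be at least 1, and ties are rounded away from zero by std::round, which may differ from R in edge cases.

```cpp
#include <cmath>
#include <vector>

// Round one value to `digits` significant digits (digits >= 1 assumed).
double signifOne(double x, int digits) {
    if (x == 0.0 || !std::isfinite(x)) return x;  // nothing to round
    // number of digits before the decimal point (can be zero or negative)
    const int magnitude =
        static_cast<int>(std::floor(std::log10(std::fabs(x)))) + 1;
    const double scale = std::pow(10.0, digits - magnitude);
    return std::round(x * scale) / scale;
}

// Element-wise application over a vector, in the spirit of a sugar function.
std::vector<double> signifAll(const std::vector<double>& x, int digits) {
    std::vector<double> out;
    out.reserve(x.size());
    for (double v : x) out.push_back(signifOne(v, digits));
    return out;
}
```

For example, signifOne(123456.0, 2) is 120000 up to floating-point error, and signifOne(0.0012345, 3) is 0.00123 up to floating-point error.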

Thanks to CRANberries, you can also look at a diff to the previous release 0.9.13. As always, even fuller details are on the Rcpp Changelog page and the Rcpp page which also leads to the downloads, the browseable doxygen docs and zip files of doxygen output for the standard formats. A local directory has source and documentation too. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

/code/rcpp | permanent link

Tue, 25 Sep 2012

RcppArmadillo 0.3.4.2
The development of Armadillo 3.4.* continues with bug fixes and more sparse matrix support. Conrad released 3.4.2 this morning. I wrapped up the corresponding RcppArmadillo 0.3.4.2 before leaving for work, and this version should now have reached all CRAN mirrors. Once again no R level or interface changes were made; the upstream changes are summarized below.

Changes in RcppArmadillo version 0.3.4.2 (2012-09-25)

  • Upgraded to Armadillo release 3.4.2

    • minor fixes for handling sparse submatrix views

    • minor speedups for sparse matrices

Courtesy of CRANberries, there is also a diffstat report for 0.3.4.2 relative to 0.3.4.1. As always, more detailed information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

/code/rcpp | permanent link

Tue, 18 Sep 2012

RcppArmadillo 0.3.4.1
Conrad released the bug-fix release 3.4.1 of Armadillo earlier today, and the corresponding RcppArmadillo package 0.3.4.1 is already on CRAN. No R level or interface changes were made; the upstream changes are summarized below.

Changes in RcppArmadillo version 0.3.4.1 (2012-09-18)

  • Upgraded to Armadillo release 3.4.1

    • workaround for a bug in the Mac OS X accelerate framework

    • fixes for handling empty sparse matrices

    • added documentation for saving & loading matrices in HDF5 format

    • faster dot() and cdot() for complex numbers

Courtesy of CRANberries, there is also a diffstat report for 0.3.4.1 relative to 0.3.4.0. As always, more detailed information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

/code/rcpp | permanent link

Sat, 08 Sep 2012

RInside 0.2.8
This morning version 0.2.8 of RInside arrived on the CRAN sites. RInside provides a set of convenience classes which facilitate embedding of R inside of C++ applications and programs, using the classes and functions provided by the Rcpp R and C++ integration package.

This release adds no new features but improves the build process a little and should make use on Windows a little easier. All changes since the last release are summarized below in the NEWS file entry:

Changes in RInside version 0.2.8 (2012-09-07)

  • Added CMake build support for armadillo and eigen examples, once again kindly contributed by Peter Aberline

  • Corrected Windows package build to always generate a 64 bit static library too

  • Updated package build to no longer require configure and configure.win to update the two header files supplying compile-time information; tightened build dependencies on headers in Makevars / Makevars.win

  • Improved examples/standard/Makefile.win by detecting architecture

CRANberries also provides a short report with changes from the previous release. More information is on the RInside page. Questions, comments etc should go to the rcpp-devel mailing list off the Rcpp R-Forge page.

/code/rinside | permanent link

Thu, 06 Sep 2012

RcppArmadillo 0.3.4.0
A new major release of Armadillo came out earlier today. I prepared the corresponding RcppArmadillo package 0.3.4.0 which also arrived on CRAN earlier today. This release contains a few performance improvements, the beginnings of support for sparse matrices and more, see below. We also post the NEWS entry for the beta release which was prepared, but not uploaded to CRAN to minimise the upload frequency there. On the RcppArmadillo side, two enhancements were made for the fastLm() function for faster linear model fits.

Changes in RcppArmadillo version 0.3.4.0 (2012-09-06)

  • Upgraded to Armadillo release 3.4.0 (Ku De Ta)

    • added economical QR decomposition: qr_econ()

    • added .each_col() & .each_row() for vector operations repeated on each column or row

    • added preliminary support for sparse matrices, contributed by Ryan Curtin et al. (Georgia Institute of Technology)

    • faster singular value decomposition via divide-and-conquer algorithm

    • faster .randn()

  • NEWS file converted to Rd format

Changes in RcppArmadillo version 0.3.3.91 (2012-08-30)

  • Upgraded to Armadillo release 3.3.91

    • faster singular value decomposition via "divide and conquer" algorithm

    • added economical QR decomposition: qr_econ()

    • added .each_col() & .each_row() for vector operations repeated on each column or row

    • added preliminary support for sparse matrices, contributed by Ryan Curtin, James Cline and Matthew Amidon (Georgia Institute of Technology)

  • Corrected summary method to deal with the no intercept case when using a formula; also display residual summary() statistics

  • Expanded unit tests for fastLm
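The phrase "vector operations repeated on each column or row" in the entries for .each_col() and .each_row() can be pictured with a hand-rolled equivalent. The sketch below does not use Armadillo at all: ColMajor and subtractFromEachCol are hypothetical names, and the helper only mimics the effect of subtracting one vector from every column of a column-major matrix.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Minimal column-major matrix: element (i, j) lives at data[i + j * rows].
struct ColMajor {
    std::size_t rows, cols;
    std::vector<double> data;
    double& at(std::size_t i, std::size_t j) { return data[i + j * rows]; }
};

// Subtract v from every column in place (hypothetical helper).
void subtractFromEachCol(ColMajor& m, const std::vector<double>& v) {
    if (v.size() != m.rows)
        throw std::invalid_argument("length of v must equal number of rows");
    for (std::size_t j = 0; j < m.cols; ++j)
        for (std::size_t i = 0; i < m.rows; ++i)
            m.at(i, j) -= v[i];  // same vector applied to each column
}
```

A typical use would be centering: pass the vector of row means to remove them from each column without writing the double loop at every call site.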

Courtesy of CRANberries, there is also a diffstat report for 0.3.4.0 relative to 0.3.2.4. As always, more detailed information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

/code/rcpp | permanent link

Sun, 02 Sep 2012

Faster creation of binomial matrices
Scott Chamberlain blogged about faster creation of binomial matrices the other day, and even referred to our RcppArmadillo package as a possible solution (though claiming he didn't get it to work, tsk tsk -- that is what the rcpp-devel list is here to help with).

The post also fell short of a good aggregated timing comparison for which we love the rbenchmark package. So in order to rectify this, and to see what we can do here with Rcpp, here is a quick post revisiting the issue.

As preliminaries, we need to load three packages: inline to create compiled code on the fly (which, I should mention, is also used together with Rcpp by the Stan / RStan MCMC sampler which is creating some buzz this week), the compiler package included with R to create byte-compiled code and lastly the aforementioned rbenchmark package to do the timings. We also set row and column dimension, and set them a little higher than the original example to actually have something measurable:

library(inline)
library(compiler)
library(rbenchmark)

n <- 500
k <- 100
The first suggestion was the one by Scott himself. We will wrap this one, and all the following ones, in a function so that all approaches are comparable as being in a function of two dimension arguments:
scott <- function(N, K) {
    mm <- matrix(0, N, K)
    apply(mm, c(1, 2), function(x) sample(c(0, 1), 1))
}
scottComp <- cmpfun(scott)
We also immediately compute a byte-compiled version (just because we now can) to see if this helps at all with the code. As there are no (explicit !) loops, we do not expect a big pickup. Scott's function works, but sweeps the sample() function across all rows and columns which is probably going to be (relatively) expensive.

Next is the first improvement suggested to Scott which came from Ted Hart.

ted <- function(N, K) {
    matrix(rbinom(N * K, 1, 0.5), ncol = K, nrow = N)
}
This is quite a bit smarter as it vectorises the approach, generating N times K elements at once which are then reshaped into a matrix.

Another suggestion came from David Smith as well as Rafael Maia. We rewrite it slightly to make it a function with two arguments for the desired dimensions:

david <- function(m, n) {
    matrix(sample(0:1, m * n, replace = TRUE), m, n)
}
This is very clever as it uses sample() over zero and one rather than making (expensive) draws from a random number generator.

Next we have a version from Luis Apiolaza:

luis <- function(m, n) {
     round(matrix(runif(m * n), m, n))
}
It draws from a random uniform and rounds to zero and one, rather than deploying the binomial.

Then we have the version using RcppArmadillo hinted at by Scott, but with actual arguments and a correction for row/column dimensions. Thanks to inline we can write the C++ code as an R character string; inline takes care of everything and we end up with a C++-based solution directly callable from R:

arma <- cxxfunction(signature(ns="integer", ks="integer"), plugin = "RcppArmadillo", body='
   int n = Rcpp::as<int>(ns);
   int k = Rcpp::as<int>(ks);
   return wrap(arma::randu(n, k));
')
This works, and is pretty fast. The only problem is that it answers the wrong question as it returns U(0,1) draws and not binomials. We need to truncate or round. So a corrected version is
armaFloor <- cxxfunction(signature(ns="integer", ks="integer"), plugin = "RcppArmadillo", body='
   int n = Rcpp::as<int>(ns);
   int k = Rcpp::as<int>(ks);
   return wrap(arma::floor(arma::randu(n, k) + 0.5));
')
which uses the old rounding approximation of adding 1/2 before truncating.
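The rounding trick is easy to check in isolation: for u uniform on [0,1), u + 0.5 lies in [0.5, 1.5), so the floor is 0 when u < 0.5 and 1 otherwise. Here is a plain C++ sketch of filling an n-by-k matrix that way. It uses std::mt19937_64 rather than Armadillo's or R's RNG, and the function name binaryMatrix is an assumption for illustration.

```cpp
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// n-by-k matrix of 0/1 draws stored as a flat column-major vector.
// Each cell is floor(u + 0.5) for u uniform on [0, 1), i.e. 0 or 1
// with equal probability. Hypothetical helper using the standard RNG.
std::vector<double> binaryMatrix(std::size_t n, std::size_t k,
                                 unsigned long seed) {
    std::mt19937_64 gen(seed);
    std::uniform_real_distribution<double> unif(0.0, 1.0);
    std::vector<double> m(n * k);
    for (double& cell : m) cell = std::floor(unif(gen) + 0.5);
    return m;
}
```

All n times k cells are filled in one pass over a flat buffer, which is the same vectorise-then-reshape idea as in the R versions above.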

With Armadillo in the picture, we do wonder how Rcpp sugar would do. Rcpp sugar, described in one of the eight vignettes of the Rcpp package, is using template meta-programming to provide R-like expressiveness (aka "syntactic sugar") at the C++ level. In particular, it gives access to R's RNG functions using the exact same RNGs as R making the results directly substitutable (whereas Armadillo uses its own RNG).

sugar <- cxxfunction(signature(ns="integer", ks="integer"), plugin = "Rcpp", body='
   int n = Rcpp::as<int>(ns);
   int k = Rcpp::as<int>(ks);
   Rcpp::RNGScope tmp;
   Rcpp::NumericVector draws = Rcpp::runif(n*k);
   return Rcpp::NumericMatrix(n, k, draws.begin());
')
Here Rcpp::RNGScope deals with setting/resetting the R RNG state. This draws a vector of N times K uniforms similar to Luis' function -- and just like Luis' R function does so without looping -- and then shapes a matrix of dimension N by K from it.
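The draw-then-reshape step has a direct base R analogue, sketched below. The sizes and the seed are illustrative assumptions. The sketch shows only the R side: the vector fills the matrix column by column, and the same seed reproduces the same draws.

```r
# Sketch: draw n*k uniforms as one vector, then shape them into an n x k matrix.
n <- 2; k <- 3                      # small illustrative dimensions
set.seed(1)                         # arbitrary seed
draws <- runif(n * k)
m1 <- matrix(draws, n, k)           # filled column by column (column-major)

m1[2, 1] == draws[2]                # second draw lands in row 2, column 1
m1[1, 2] == draws[3]                # third draw starts the second column

set.seed(1)                         # same seed, same RNG stream
m2 <- matrix(runif(n * k), n, k)
identical(m1, m2)                   # same state in, same matrix out
```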

And it does of course have the same problem as the RcppArmadillo approach earlier and we can use the same solution:

sugarFloor <- cxxfunction(signature(ns="integer", ks="integer"), plugin = "Rcpp", body='
   int n = Rcpp::as<int>(ns);
   int k = Rcpp::as<int>(ks);
   Rcpp::RNGScope tmp;
   Rcpp::NumericVector draws = Rcpp::floor(Rcpp::runif(n*k)+0.5);
   return Rcpp::NumericMatrix(n, k, draws.begin());
')

Now that we have all the pieces in place, we can compare:

res <- benchmark(scott(n, k), scottComp(n,k),
                 ted(n, k), david(n, k), luis(n, k),
                 arma(n, k), sugar(n,k),
                 armaFloor(n, k), sugarFloor(n, k),
                 order="relative", replications=100)
print(res[,1:4])
With all the above code examples in a small R script we call via littler, we get
edd@max:~/svn/rcpp/pkg$ r /tmp/scott.r 
Loading required package: methods
              test replications elapsed   relative
7      sugar(n, k)          100   0.072   1.000000
9 sugarFloor(n, k)          100   0.088   1.222222
6       arma(n, k)          100   0.126   1.750000
4      david(n, k)          100   0.136   1.888889
8  armaFloor(n, k)          100   0.138   1.916667
3        ted(n, k)          100   0.384   5.333333
5       luis(n, k)          100   0.410   5.694444
1      scott(n, k)          100  33.045 458.958333
2  scottComp(n, k)          100  33.767 468.986111
We can see several takeaways:
  • Rcpp sugar wins, which is something we have seen in previous posts on this blog. One hundred replications take only 72 milliseconds (or 88 in the corrected version), which is less than one millisecond per matrix creation.
  • RcppArmadillo does well too, and I presume that the small difference is due not to code in Armadillo but to the fact that we need one more 'mapping' of data types on the way back to R.
  • The sample() idea by David and Rafael is very, very fast too. This proves once again that well-written R code can be competitive. It also suggests how to make the C++ solution faster by forgoing (expensive) RNG draws in favour of sampling.
  • The approaches by Ted and Luis are also pretty good. In practice, they are probably good enough.
  • Scott's function is not looking so hot (particularly as we increased the problem dimensions) and byte-compilation does not help at all.
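The 'relative' column and the per-matrix timings can be recomputed from the elapsed times alone. The sketch below simply copies the elapsed values from the table above; the variable names are ours.

```r
# Sketch: derive the 'relative' column and per-matrix cost from elapsed seconds.
elapsed <- c(sugar = 0.072, sugarFloor = 0.088, arma = 0.126,
             david = 0.136, armaFloor = 0.138, ted = 0.384,
             luis = 0.410, scott = 33.045, scottComp = 33.767)
replications <- 100

relative <- elapsed / min(elapsed)          # fastest entry is scaled to 1
round(relative, 6)

per_matrix_ms <- 1000 * elapsed / replications
per_matrix_ms["sugar"]                      # 0.72 milliseconds per matrix
per_matrix_ms["scott"]                      # 330.45 milliseconds per matrix
```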
Thanks to Scott and everybody for suggesting this interesting problem. Trying the rbinom() Rcpp sugar function, or implementing sample() at the C++ level is, as the saying goes, left as an exercise to the reader.

/code/snippets | permanent link

Thu, 16 Aug 2012

Follow-up to Counting CRAN Package Depends, Imports and LinkingTo
A few days ago, I blogged about visualizing CRAN dependency ranks which turned out to be a somewhat popular post. David Smith followed up at the REvo blog, suggesting that we exclude packages already shipping with R (which is indicated by their 'Recommended' priority). Good idea!

So here is an updated version, where we limit the display to the top twenty packages counted by reverse 'Depends:', and excluding those already shipping with R such as MASS, lattice, survival, Matrix, or nlme.

[Figure: CRAN package chart of Reverse Depends relations, excluding Recommended packages]

The mvtnorm package is still out by a wide margin, but we can note that (cough, cough) our Rcpp package for seamless R and C++ is now tied for second with the coda package for MCMC analysis. Also of note is the fact that CRAN keeps growing relentlessly and moved from 3969 packages to 3981 packages in the space of these few days...

Lastly, I have been asked about the code and/or data behind this. It is really pretty simple as the main data.frame can be had from CRAN (where I also found the initial few lines to load it). After that, one only needs a little bit of subsetting as shown below. I look forward to seeing other people riff on this data set.

#!/usr/bin/r
##
## Initial db download from http://developer.r-project.org/CRAN/Scripts/depends.R and adapted

require("tools")

## this function is essentially the same as R Core's from the URL
## http://developer.r-project.org/CRAN/Scripts/depends.R
getDB <- function() {
    contrib.url(getOption("repos")["CRAN"], "source") # trigger chooseCRANmirror() if required
    description <- sprintf("%s/web/packages/packages.rds", getOption("repos")["CRAN"])
    con <- if(substring(description, 1L, 7L) == "file://") {
        file(description, "rb")
    } else {
        url(description, "rb")
    }
    on.exit(close(con))
    db <- readRDS(gzcon(con))
    rownames(db) <- db[,"Package"]

    db
}

db <- getDB()

## count packages
getCounts <- function(db, col) {
    ## number of comma-separated entries per package, or NA if the field is empty
    sapply(db[,col],
           function(s) { if (is.na(s)) NA else length(strsplit(s, ",")[[1]]) } )
}

## build a data.frame with the number of entries for reverse depends, reverse imports,
## reverse linkingto and reverse suggests; also keep Recommended status
ddall <- data.frame(pkg=db[,1],
                    RDepends=getCounts(db, "Reverse depends"),
                    RImports=getCounts(db, "Reverse imports"),
                    RLinkingTo=getCounts(db, "Reverse linking to"),
                    RSuggests=getCounts(db, "Reverse suggests"),
                    Recommended=db[,"Priority"]=="recommended"
                    )

## Subset to non-Recommended packages as in David Smith's follow-up post
dd <- subset(ddall, is.na(ddall[,"Recommended"]) | ddall[,"Recommended"] != TRUE)

labeltxt <- paste("Analysis as of", format(Sys.Date(), "%d %b %Y"),
                  "covering", nrow(db), "total CRAN packages")

cutOff <- 20
doPNG <- TRUE

if (doPNG) png("/tmp/CRAN_ReverseDepends.png", width=600, height=600)
z <- dd[head(order(dd[,2], decreasing=TRUE), cutOff),c(1,2)]
dotchart(z[,2], labels=z[,1], cex=1, pch=19,
         main="CRAN Packages sorted by Reverse Depends:",
         sub=paste("Limited to top", cutOff, "packages, excluding 'Recommended' ones shipped with R"),
         xlab=labeltxt)
if (doPNG) dev.off()

if (doPNG) png("/tmp/CRAN_ReverseImports.png", width=600, height=600)
z <- dd[head(order(dd[,3], decreasing=TRUE), cutOff),c(1,3)]
dotchart(z[,2], labels=z[,1], cex=1, pch=19,
         main="CRAN Packages sorted by Reverse Imports:",
         sub=paste("Limited to top", cutOff, "packages, excluding 'Recommended' ones shipped with R"),
         xlab=labeltxt)
if (doPNG) dev.off()

# no cutOff but rather a na.omit
if (doPNG) png("/tmp/CRAN_ReverseLinkingTo.png", width=600, height=600)
z <- na.omit(dd[head(order(dd[,4], decreasing=TRUE), 30),c(1,4)])
dotchart(z[,2], labels=z[,1], pch=19,
         main="CRAN Packages sorted by Reverse LinkingTo:",
         xlab=labeltxt)
if (doPNG) dev.off()
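To see the counting and subsetting logic without downloading the CRAN database, here is a minimal sketch on a toy matrix. The package names and field contents are hypothetical stand-ins, and only the 'Reverse depends' column is modelled.

```r
# Toy stand-in for the CRAN db: a character matrix with hypothetical entries.
db <- cbind(Package = c("pkgA", "pkgB", "pkgC"),
            Priority = c(NA, "recommended", NA),
            "Reverse depends" = c("x, y, z", "u, v", NA))
rownames(db) <- db[, "Package"]

# Same idea as getCounts() in the script: count comma-separated entries, keep NA.
getCounts <- function(db, col) {
    sapply(db[, col],
           function(s) if (is.na(s)) NA else length(strsplit(s, ",")[[1]]))
}
counts <- getCounts(db, "Reverse depends")   # 3 for pkgA, 2 for pkgB, NA for pkgC

ddall <- data.frame(pkg = db[, "Package"],
                    RDepends = counts,
                    Recommended = db[, "Priority"] == "recommended")

# Priority is NA for ordinary packages, so the comparison yields NA rather than
# FALSE; the is.na() test is what keeps those rows in the subset.
dd <- subset(ddall, is.na(Recommended) | Recommended != TRUE)

# Top entries by reverse depends; order() places NA counts last.
top <- dd[head(order(dd[, "RDepends"], decreasing = TRUE), 2), c("pkg", "RDepends")]
top
```

Only the recommended package (pkgB here) is dropped, and the package with no reverse depends sorts to the end.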

/code/snippets | permanent link

Mon, 13 Aug 2012

RInside 0.2.7
A new version 0.2.7 of RInside is now available via CRAN. RInside provides a set of convenience classes which facilitate embedding of R inside of C++ applications and programs, using the classes and functions provided by the Rcpp R and C++ integration package.

This release adds two new example subdirectories demonstrating use of RInside with, respectively, RcppArmadillo and RcppEigen. We extended the 'web application' example using the Wt toolkit by adding CSS and XML support, and added another new example motivated by a StackOverflow question. CMake support has been added for Windows as well thanks to Peter Aberline---he also contributed CMake code for the two new example directories but that contribution made it only into SVN and not this release.

All changes since the last release are summarized below in the NEWS file entry:

Changes in RInside version 0.2.7 (2012-08-12)

  • New fifth examples subdirectory 'armadillo' with two new examples showing how to combine RInside with RcppArmadillo

  • New sixth examples subdirectory 'eigen' with two new examples showing how to combine RInside with RcppEigen

  • Prettified the Wt example 'web application' with CSS use, also added an XML file with simple headers and description text

  • New example rinside_sample12 motivated by StackOverflow question on using sample() from C

  • Added CMake build support on Windows for the examples

CRANberries also provides a short report with changes from the previous release. More information is on the RInside page. Questions, comments etc should go to the rcpp-devel mailing list off the Rcpp R-Forge page.

/code/rinside | permanent link

Fri, 10 Aug 2012

RcppExamples 0.1.4
An updated version 0.1.4 of the RcppExamples package is now on CRAN. RcppExamples contains a few illustrations of how to use Rcpp.

The NEWS entry is below: a new example was added illustrating use of the (vectorised) random-number generators for three of the different distributions --- and showing how it perfectly reproduces the values one gets in R.

Changes in RcppExamples version 0.1.4 (2012-08-09)

  • Added new example for Rcpp sugar and vectorised draws of RNGs

  • Minor updates to reflect newer CRAN Policy

Thanks to CRANberries, you can also look at a diff to the previous release 0.1.3. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

/code/rcpp | permanent link

Thu, 09 Aug 2012

RProtoBuf 0.2.5
A new release 0.2.5 of RProtoBuf is now on CRAN. RProtoBuf provides GNU R bindings for the Google Protobuf data encoding library used and released by Google.

This release once again contains a number of patches kindly contributed by Murray Stokely, as well as some updates to conform to CRAN Policy changes.

The NEWS file entry follows below:

Changes in version 0.2.5 (2012-08-08)

  • Applied patches by Murray to

    • correctly deal with nested Protocol Buffer definitions, and also add new unit test for this

    • test a protocol buffer for missing required fields before serializing it, also add a unit test

    • add a small stylistic fix and examples to the 'add.Rd' manual page

  • Moved inst/doc/ to vignettes/ per newer CRAN Policy

CRANberries also provides a diff to the previous release 0.2.4. More information is at the RProtoBuf page which has a draft package vignette, a 'quick' overview vignette and a unit test summary vignette. Questions, comments etc should go to the rprotobuf mailing list off the RProtoBuf page at R-Forge.

/code/rprotobuf | permanent link

Wed, 08 Aug 2012

RcppBDT 0.2.1
A new bug-fix release of the RcppBDT package appeared on CRAN earlier today. David Reiner noticed that the functions getEndOfMonth and getEndOfBizWeek were not working right. These are convenience wrappers around the real functionality provided as a member function to the reference class built by Rcpp modules---which works off a reference instance of the class, and these two convenience functions were not updating the date. This is now fixed.

The complete NEWS entry is below:

Changes in version 0.2.1 (2012-08-08)

  • Bug fix for getEndOfBizWeek() and getEndOfMonth(), which were lacking a call to fromDate(date) to actually pass the date for which the functions are computing the end of business week or month.

Courtesy of CRANberries, there is also a diffstat report for 0.2.1 relative to 0.2.0. As always, feedback is welcome and the rcpp-devel mailing list off the R-Forge page for Rcpp is the best place to start a discussion.

/code/rcpp | permanent link