Dirk Eddelbuettel Thinking inside the box
 
Sat, 27 Jun 2009

R 2.9.1, CRANberries outage, and missing Java support
Just a short note that version 2.9.1 of R was released yesterday. And a corresponding Debian release went out as usual on the same day. One sour note: as the Java toolchain is currently broken, I had to disable compile-time support for Java. Just run R CMD javareconf once installed if you need it.

Speaking of broken, I had neither noticed that this R version now returns an additional field (for the repository) in the per-package metadata via available.packages(), nor that this change had broken my oh-so-useful and increasingly popular CRANberries html and rss summaries of CRAN changes. So with the usual beta and rc releases of R 2.9.1 in Debian starting a week prior, CRANberries had been silent for six days from Friday the 21st to last Thursday. I rectified it once I noticed, and changed the code to no longer fall on its nose at that spot. Sorry for the few days without service.
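
One common defence against this kind of change is to look fields up by name rather than by position, so that an extra column cannot shift anything. A minimal sketch of that idea in C follows. It is not the actual CRANberries code, and the helper find_field() is a hypothetical name used only for illustration:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return the index of `name` among the n column
 * names, or -1 if no such column exists. Looking fields up this way
 * keeps working when new columns are appended. */
int find_field(const char *const *names, size_t n, const char *name)
{
    size_t i;
    for (i = 0; i < n; i++) {
        if (strcmp(names[i], name) == 0)
            return (int) i;
    }
    return -1;
}
```

With illustrative column names such as Package, Version and Depends, find_field() returns the same index for Version whether or not a Repository column has been added at the end, and -1 tells the caller that an expected field is missing.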

/computers/R | permanent link

Sun, 31 May 2009

Ubuntu Developer Summit in Barcelona
Due to some things falling into place, I had an opportunity to attend the first two days of last week's Ubuntu Developer Summit in beautiful Barcelona. Somehow, I had never managed to attend a Debian conference either, so it was good to meet a few of the old Debian hands now moving Ubuntu along, as well as a few of the Ubuntu folks. I also gave a short presentation on R in Debian / Ubuntu and the plans for the upcoming Ubuntu release. More on that another time.

All told, a well-organised conference in a nice setting -- two stone throws from the legendary Camp Nou. Unfortunately, I had to leave by Wednesday so I missed what was undoubtedly quite a scene in Barcelona following Barca's dismantling of Man U in this year's Champions League final.

/computers/misc | permanent link

Sat, 30 May 2009

JPM Chase Corporate Challenge 2009
The 28th annual JP Morgan Chase Corporate Challenge race took place a couple of days ago, on May 21. At around 17,125 runners, participation was down from the record of 23,000 set last year. With splendid weather, it is always a nice way to start the Memorial Day weekend.

We fielded a small but spirited team of nine runners. I finished with a decent (hand-stopped) time of 22 minutes and 27.93 seconds for the 3.5 miles -- or a 6:25 min/mile pace. That is among the faster times but not quite the fastest compared to the other six times I have run this.

Most importantly, everybody seems to have had a blast. And we did set a record for the longest post-race party, which sets a nice precedent for 2010.

/sports/running | permanent link

Sat, 23 May 2009

Temporary Debian mail outage
It would appear that debian.org rejected mail for maybe up to twelve hours from late yesterday afternoon (Central timezone) to some time shortly after I got up this morning. Things appear to be back to normal, so a big Thanks to the mail admins.

If you happened to have sent me mail to my debian.org address during that time period, you may have gotten a hard reject ('550 Administrative prohibition') as did a test mail of mine. In this case mail may not be respooled, so please do send it again.

My alternate address, formed by my first name followed by the family name and the commercial top-level domain, remained functional as a fallback.

/computers/misc | permanent link

Fri, 01 May 2009

Brad Mehldau at the CSO
Just got home from a wonderful concert by Brad Mehldau at the CSO. This was long overdue as I kept reading about Mehldau. And even though he performs quite regularly around here, I had never seen him. Big mistake.

The first set was performed as a (strictly acoustic) trio with Larry Grenadier on bass and Jeff Ballard on drums. After several compositions by Mehldau and a Brazilian samba piece, the first set closed with a rendition of 'Holland' from Sufjan Stevens' album Michigan which was truly beautiful. The second set had Mehldau performing solo, again with several compositions of his own as well as Neil Young's classic 'The Needle and the Damage Done', leading to two pieces from the Sound of Music including an amazing, yet really different 'My favourite things' that just hushed a piece of the central melody along with a strong rhythmic element. Lovely. And then to cap it all off, four encores.

Highly recommended.

/music/jazz/live | permanent link

Thu, 30 Apr 2009

GSoC 2009 Chicago area meeting
Thanks to the effort of the tireless Borja Sotomayor for the local ACM chapter as well as the good folks at Google in Chicago, we had a local kickoff meeting for this year's Google Summer of Code at Google's Chicago office. A few accepted GSoC students, mentors, and a few Google engineers gave short presentations to a bunch of ACM-affiliated students from U of C, Northwestern, DePaul, IIT, UIC, ... I gave my first ever 'lightning talk' --- on R and the GSoC --- and after chatting a bit more rushed home to catch the end of the amazing triple-overtime win of the Bulls over the Celtics. Go Bulls for the improbable game seven in Boston!

/computers/misc | permanent link

Wed, 29 Apr 2009

Slides from most recent R and HPC tutorial
A little earlier I put the slides from my Introduction to High-Performance Computing with R tutorial at the R / Finance conference last week onto my talks / presentation page. Other tutorials, talks and keynotes are also being posted on the conference program page.

This tutorial was a shorter format of just an hour which did not allow for any parallel computing with R. However, parallel computing with R via MPI, snow, nws, ... is covered in the slides from December's workshop at the BoC.

/computers/R | permanent link

Tue, 28 Apr 2009

Google Summer of Code 2009: R / Quantlib
With everything that has been going on of late, I have yet to mention that Khanh Nguyen, a Ph.D. student in Computer Science at U Mass / Boston, will be working with me on RQuantLib as part of the Google Summer of Code program this year.

We had twenty-two applications to review for the R project, including three for the RQuantLib topic I had proposed. Khanh's application was clearly among the best, and I look forward to helping him do cool stuff over the summer. He already posted two short emails on the r-sig-finance and the quantlib-user lists soliciting suggestions and comments. So if you have comments regarding R and QuantLib, please get in touch with him or me!

/computers/misc | permanent link

Mon, 27 Apr 2009

Review of 'Analysis of Integrated and Cointegrated Time Series with R (2nd ed)' in JSS
A few weeks ago I wrote up a short review of Bernhard Pfaff's nice (but somewhat dry) Analysis of Integrated and Cointegrated Time Series with R (2nd ed) on unit root and cointegration modeling with R. This is now online at the Journal of Statistical Software.

/computers/R | permanent link

Sun, 26 Apr 2009

R / Finance 2009
Our inaugural R / Finance conference, mentioned here twice, is now over.

We were fortunate to get seven outstanding invited keynote speakers, as well as eleven excellent presentations. This was preceded by four short tutorials (and I'll post slides from my Introduction to High-Performance Computing with R soon). With about 150 registered participants, plus keynoters, presenters, committee members, representatives from the sponsors (a quick shout of Thanks! to them), some folks from UIC (especially Holly without whom few things would have happened), we were probably around 200 people gathered at UIC. And then there was an extended social program at Jaks which is rather appropriate as we had numerous important committee meetings there over the preceding months. All in all it seems like a successful event. We may even do it again.

/computers/R | permanent link

Thu, 23 Apr 2009

Real nice Boston Marathon writeup
This article from the Boston Globe gives an excellent description of what running Boston last Monday was like, and why we keep coming back, almost regardless. The writer, for whom it was the fifth Boston and ninth total, expresses the range of emotions, expectations, frustrations, excitements, ups, downs, ... all of which form the challenge of racing a marathon in general, and on this course on a windy day in particular. And just like she concludes that the finish of one marathon leads to planning the next, I registered last night for the Chicago Marathon on October 11, apparently around twelve hours before it sold out.

And so it is with regrets that I have to decline Christian's invitation to run Cologne with him on October 4. Another time, hopefully.

/sports/running | permanent link

Wed, 22 Apr 2009

Boston Marathon 2009
Monday was the 113th Boston Marathon. Just like in 2007 (and having skipped last year as I ran London the week before), I went as part of a group of local running friends. We once again had a blast---Boston on Marathon weekend is a real spectacle. Lots of people, lots of excitement, and this time even great weather.

The race itself was challenging. Having done it once before, and having come off a really decent last marathon, I may have underestimated the impact of the famous hills. This really is a wonderful but challenging course. Combined with the poor training conditions during this last Chicago winter which forced us indoors for quite a few long runs, as well as a somewhat upset stomach which forced a two-minute break, I came up short and posted an underwhelming second half. The head wind was also a factor that was mentioned in a few reports on the comparatively slow times of the elite runners. So when all was said and done, I ended up with a time of 3:30:13 and a pace of 8:01 min/mile, which is a little slower than last time.

All in all a really great marathon weekend. As my time from Berlin qualifies me for Boston 2010, I may well be back next year.

/sports/running | permanent link

Fri, 10 Apr 2009

New Garmin Forerunner
Going for a run--or a race--almost always meant grabbing the GPS and often also meant setting a target pace and distance. For the last four and a half years, this was measured by a Garmin Forerunner 201. That's the large rectangular model in the original "brick" form factor. I've come to love the device. Training is great because the distance log, as well as the pacing, keeps you honest. Racing is probably better because it helps a lot with the pacing (but then I also had PRs in races where I had forgotten the GPS at home). But there is a downside. This model sometimes takes forever to find tracking, and has interference / weak signal problems when it is cloudy or moist. But worst of all, it just bonks out in the 'urban jungle' downtown where it drops signals too easily. And now after all these years the display had some minor damage, the strap didn't really hold anymore, and it was generally time for something new. On Tuesday it even lost a quarter mile when we were going for a short but really fast three miler.

But then a few days prior I had followed fellow Debian marathoner Christian and used my birthday at the end of this month (as well as the upcoming Boston Marathon) as an excuse for conspicuous consumption. After some price comparison, I ordered a factory reconditioned Garmin Forerunner 405 from this web discounter at a nice rebate to the regular price. It arrived this afternoon, seemingly shining new and I have been fiddling with it for the last little while.

This device features wireless data transfer to a USB stick. This meant booting the laptop in windoze for the first time in years to load the 'client software' after which data transfer proceeded. The Garmin Connect site has very slick presentation and aggregation of the data. The trouble is of course how to get the data there when running Linux... Christian had mentioned the garmin-forerunner-tools package. Unfortunately, this seems to really be written for the Forerunner 305 models as it doesn't see the device at all. Some more googling led to this page and the gant tarball. All still fairly raw, but with some prodding in the settings of the 405 ('pairing' set to 'on', 'force send' set to 'yes'; which may have to be reset each time?) I got my two xml files off the gps watch. Yay. We'll see what mode I will settle on. With the 201 and its ancient serial port, I basically just dropped the run and training histories with their fairly limited data collections.

Last but not least, fellow Oak Park runner Peter Sagal had a humorous Runner's World column on the whole GPS geekyness. If he'd only known how to pair it with programming geekyness...

/gadgets | permanent link

Wed, 01 Apr 2009

Rcpp 0.6.5
A minor new maintenance release 0.6.5 of Rcpp just went off to CRAN and Debian. This version corrects a small oversight for the OS X build, and adds the LGPL as file COPYING to the sources.

/computers/linux/debian/packages | permanent link

Mon, 16 Mar 2009

Dianne Reeves at Dominican
Yesterday afternoon, we had another chance to see Dianne Reeves (wikipedia). This time, it almost felt like she came to us as she was headlining at the annual trustee benefit concert at Dominican University, a small college about a mile from our place. And as in 2007 and 2003, she did not disappoint. Great voice, great stage presence. Highly recommended.

/music/jazz/live | permanent link

Sun, 15 Mar 2009

2009 March Madness Half Marathon in Cary
This morning it was once more time for the annual March Madness Half Marathon in Cary. This race is basically the start of the running season in Chicagoland. And we could not have asked for better weather. After a really cold and long winter, and a short snapback to really cold temperatures this week, it started to warm up a little yesterday with expectations of more of the same today. So while it was still cold at the start at around 37 degrees, most runners opted for shorts and by the finish temperatures were in the high 40s. Coupled with clear blue skies and no wind, it was a really nice morning for a run.

So how did it go? Well, I had pretty low expectations. Training has been difficult with too much snow, rain and plain cold weather. So like most of my running friends, motivation ran pretty low of late. I was really only trying to set a modest pace, and to hope to hang on to it and run steadily. That worked: I didn't walk a single water stop, and while my legs were getting really tired and sore I carried through to the end. Final time was 1:36:08.57 per my stop watch. That's a tad faster than last year, a lot slower than 2007, just a tad slower than 2006, and quite a bit faster than 2005.

Next stop: my second Boston Marathon in five weeks and given the underwhelming training this winter, it will be a challenge.

/sports/running | permanent link

Thu, 05 Mar 2009

Short introduction to R in Finance
Adam Gehr of DePaul University's Finance Department had organized a panel session about R in Finance at the Midwest Finance Association's 58th Annual Meeting which is happening this week here in Chicago.

I just posted my slides on my presentations page. The slides give a brief overview of R, the CRAN network and the by now over 1600 packages, mention the Finance Task View, briefly present four different packages (or package sets) and of course beat the drum for our upcoming R/Finance conference that will take place here in Chicago at the end of next month.

/computers/R | permanent link

Tue, 03 Mar 2009

RQuantLib 0.2.11
The changes in Rcpp that I blogged about a few days ago required a few small changes in RQuantLib. Not really much more than prefixing std:: in a number of variable declarations and a few member function calls -- so this is definitely a minor maintenance release. New source and binary packages have already been pushed to CRAN and Debian.

/computers/linux/debian/packages | permanent link

Sun, 01 Mar 2009

Rcpp 0.6.4
A new maintenance version of Rcpp (now at 0.6.4) was just pushed to CRAN and has been uploaded to Debian. Rcpp is a set of utility classes that provide interfaces for transferring the major R data types to C++ and back which makes it easier to extend R with dynamically loadable code written in C or C++.

This version changes how we use the std namespace: all usage is now properly prefixed. Likewise, we now define R_NO_REMAP to not let R define utility functions such as length() or error(), which occasionally creates trouble with other include files -- so we now use the more explicit forms Rf_length(), Rf_error() etc. Also, starting from this release, C++ class documentation created by Doxygen is included; it can also be seen from my box as both browsable html and a pdf file. Lastly, this version also adds a minor correction to the Windows build (spotted by Uwe and Simon).

/computers/linux/debian/packages | permanent link

Wed, 25 Feb 2009

Review of 'Applied Econometrics with R' in JSS
A short review of Kleiber and Zeileis' excellent Applied Econometrics with R is now out at the (online) Journal of Statistical Software.

/computers/R | permanent link

Mon, 23 Feb 2009

R/Finance conference in Chicago in April: Registration now open
Regarding the aforementioned R/Finance conference that will take place at the end of April here in Chicago, we announced earlier today that the conference website is now available. It provides information about the program, speakers and other details as well as a link to registration details.

See you in Chicago in April!

/computers/R | permanent link

Thu, 12 Feb 2009

New project: RInside
A few days ago, I started a new project called RInside by uploading a few files to a new SVN repo at R-Forge.

RInside makes it easy to embed R into your own C++ application by hiding the nitty gritty of initializing an R interpreter behind a simple abstraction. More information is at a (currently pretty simple) RInside page, and you may want to look at the related Rcpp and possibly littler projects. The former is helpful for data exchange, and the latter provided my first real use of R embedding which in some ways also led to RInside.

/computers/linux/debian/packages | permanent link

Tue, 03 Feb 2009

Correct Datetime / POSIXct behaviour for R and kdb+
We have started to look into kdb+ as a possible high-performance column-store backend. Kx offers free trials -- and so I have played with this for a day or two: the general system, data loads and dumps, and in particular the interface to R. Based on the few files (one C source with interface code, one R file to access the C code, one object file to link against, one header file and a simple Makefile), it took just a couple of minutes to turn this into a proper CRAN-style R package.

Anyway, the reason for this post was that the R / kdb+ glue code works well ... but not for datetimes. I really like to be able to pass date/time objects natively between systems as easily as, say, numbers or strings (and see e.g. my Rcpp package for doing this with R and C++) and I was a bit annoyed when the millisecond timestamps didn't move smoothly. Turns out that the basic converter function in the code had a number of problems: it converted to integer, only covered a single scalar rather than vectorised mode, and erroneously reduced a reference count. A better version, in my view, is as follows:

static SEXP from_datetime_kobject(K x)
{
	SEXP result;
	int i, length = x->n;
	/* 10957 days separate 1970-01-01 from 2000-01-01; 86400 seconds per day */
	if (scalar(x)) {
		/* single value: return a numeric vector of length one */
		result = PROTECT(allocVector(REALSXP, 1));
		REAL(result)[0] = (kF(x)[0] + 10957) * 86400;
	} else {
		/* vector: convert each element the same way */
		result = PROTECT(allocVector(REALSXP, length));
		for (i = 0; i < length; i++) {
			REAL(result)[i] = (kF(x)[i] + 10957) * 86400;
		}
	}
	/* set the class attribute so that R treats the doubles as date/times */
	SEXP datetimeclass = PROTECT(allocVector(STRSXP, 2));
	SET_STRING_ELT(datetimeclass, 0, mkChar("POSIXt"));
	SET_STRING_ELT(datetimeclass, 1, mkChar("POSIXct"));
	setAttrib(result, R_ClassSymbol, datetimeclass);
	UNPROTECT(2);
	return result;
}
This deals with vectors as well as scalars, converts Kdb's 'fractional days since Jan 1, 2000' to the Unix standard of seconds since the epoch -- including the R extension of fractional seconds -- and as importantly, sets the class attributes to POSIXt POSIXct as needed by R. With that, a simple select max datetime from table does just that, and vectors of timestamped records of trades or quotes or whatever also come with proper POSIXct behaviour into R. Note that it needs TZ to be set to UTC, though, or you get a timezone offset you may not want.
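
The arithmetic itself can be checked without either R or kdb+. The following is a stand-alone sketch of just the conversion step. The function names kdb_days_to_unix_seconds() and kdb_days_to_unix_seconds_vec() are made up for illustration and are not part of the kdb+ or R interfaces:

```c
#include <stddef.h>

/* Days between the Unix epoch (1970-01-01) and the kdb+ epoch (2000-01-01):
 * 30 years of 365 days plus 7 leap days. */
#define KDB_EPOCH_OFFSET_DAYS 10957.0
#define SECONDS_PER_DAY       86400.0

/* Convert one kdb+ datetime (fractional days since 2000-01-01) to
 * fractional seconds since 1970-01-01, the representation POSIXct uses. */
double kdb_days_to_unix_seconds(double days)
{
    return (days + KDB_EPOCH_OFFSET_DAYS) * SECONDS_PER_DAY;
}

/* Vectorised variant: convert n values from `days` into `out`. */
void kdb_days_to_unix_seconds_vec(const double *days, double *out, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++) {
        out[i] = kdb_days_to_unix_seconds(days[i]);
    }
}
```

An input of 0.0 (midnight on Jan 1, 2000) maps to 946684800 seconds, and 0.5 (noon the same day) maps to 946728000, which is what one expects to see as UTC times on the R side.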

/computers/R | permanent link

Fri, 30 Jan 2009

State-of-the-art in parallel computing with R: New paper
A few weeks ago, we finished a paper that surveys the current state of parallel computing with R. The paper was led by Markus Schmidberger and written while he was visiting the Fred Hutchinson Cancer Research Center in Seattle. The co-authors are Martin Morgan, myself, Hao Yu, Luke Tierney and Ulrich Mansmann. The paper is now available as a technical report from LMU Munich via open access, and also from my papers page.

/computers/R | permanent link

Sat, 24 Jan 2009

New CRAN Task View on HPC
A while back, I suggested to Achim to add a new CRAN Task View for High Performance Computing with R. And as of a day or two ago, we now have the new CRAN Task View for High Performance Computing with R providing an overview of available packages, grouped thematically, with a focus on the various parallel computing applications. I have already received a few great comments that even led to an entire new section on applications. Keep 'em coming!

/computers/R | permanent link

Wed, 14 Jan 2009

littler 0.1.2
Version 0.1.2 of r (pronounced littler) was just rolled up.

This version adds two new command-line switches:

  • -t selects per-session temporary directories in the same way as R does (with thanks to Paul Gilbert for the suggestion), and
  • -q skips autoloading of default libraries at startup for another small yet noticeable gain in startup speed (with thanks to Simon Urbanek).

As usual, our code is in our svn archive, on my r page, and in the local directory here. A fresh package is in Debian's incoming queue, and Jeff's littler page at Vanderbilt should reflect the new release soon too.

/computers/linux/debian/packages | permanent link

Fri, 09 Jan 2009

Rcpp 0.6.3
I just pushed Rcpp 0.6.3 out to CRAN and Debian.

This version adds a fix to the OS X installation (thanks to Simon Urbanek), adds some 'view-only' classes for R vectors, matrices and string vectors (kindly suggested/provided by David Reiss) as well as two short helper functions to derive compilation and linker flags for packages using Rcpp.

/computers/linux/debian/packages | permanent link

Wed, 07 Jan 2009

Google Summer of Code 2009
Word is out that there will be a 2009 edition of the Google Summer of Code. I have some follow-up ideas based on last year's mentoring for both Debian and R, but there will be a better time and place to discuss possible project ideas.

/computers/misc | permanent link

R featured in New York Times article
Today's New York Times carries a decent article about R. Predictably, this led to one (short), two (longest), three (short) threads on the main R mailing list. One aspect merits further highlighting. The reporter asked whether R would pose a threat to SAS:

"I think it addresses a niche market for high-end data analysts that want free, readily available code," said Anne H. Milley, director of technology product marketing at SAS. She adds, "We have customers who build engines for aircraft. I am happy they are not using freeware when I get on a jet."

That's silly on so many levels. A concise and rather appropriate follow-up came in early from Frank Harrell, a long-time S and R advocate:

This is great to see. It's interesting that SAS Institute feels that non-peer-reviewed software with hidden implementations of analytic methods that cannot be reproduced by others should be trusted when building aircraft engines.

Achim already added this (and two more posts from the aforementioned threads) to the fortunes package that collects such choice quotes.

R in Finance (the topic of our upcoming conference) gets mentioned as well. Now, as editor of the Finance task view, I find that second half of

The financial services community has demonstrated a particular affinity for R; dozens of packages exist for derivatives analysis alone.
to be a little off the mark. But that's minor as the article is broadly sympathetic, and mostly "gets it" where it matters. Recommended.

/computers/R | permanent link

Sat, 03 Jan 2009

Multiseat setup via Userful
As I had blogged a while back, multiseat use broke following the normal upgrade to Ubuntu 8.10. I had also suggested a fix but it turns out that the fix didn't work. So we had a clear regression -- multiseat use of a single Ubuntu workstation with two screens, two keyboards and two mice no longer worked. Consequently, the kids ended up 'serializing' access to their computer letting the second screen go idle.

Shortly after Christmas, that computer suffered a catastrophic disk failure (and as an aside, I hate LVM when that happens...). So I reinstalled, this time using the Ubuntu rather than the Kubuntu variant. This should allow use of Userful Multiplier --- a commercial multiseat solution with free two-seat licenses. The base package even comes via the Ubuntu repos.

I still had a couple of minor issues. One was possibly related to the Radeon card (as in: don't drive one monitor in dvi mode and one in analog mode but rather use both in analog mode via a dvi/analog dongle) so the live cdrom offered by Userful just went into a perpetual 'reconfigure, reboot, reconfigure, reboot, ...' loop. Another hitch was that their license manager no longer wanted to use the license key I had requested in November when I tried in vain to use Userful with KDE. And of course I wouldn't get a new key as the home ip address hadn't changed... Now, with a newly requested key from another IP address, things appear to work at last using the default Gnome setup --- and the kids are back in proper 'parallel' use of their workstation.

All in all, Userful Multiplier is a nice and useful product especially as long as stock XFree does them the favour of no longer competing in the basic two-seat case.

/computers/hardware | permanent link

Thu, 01 Jan 2009

R/Finance conference in Chicago in April: Call for Papers
The following went out to the R-announce and R-SIG-Finance mailing lists a few days ago. The conference already has a very strong lineup of invited speakers, and we are now asking R / Finance users from both academia and industry to submit suitable one-page abstracts:

Call for Papers

The Finance Department of the University of Illinois at Chicago (UIC),
the International Center for Futures and Derivatives at UIC, and
members of the R finance community are pleased to announce

R/Finance 2009: Applied Finance with R

on April 24 and 25, 2009, in Chicago, IL, USA

Confirmed keynote speakers include:

Patrick Burns (Burns Statistics)
David Kane (Kane Capital)
Roger Koenker (U of Illinois at Urbana/Champaign)
David Ruppert (Cornell)
Diethelm Wuertz (ETH Zuerich)
Eric Zivot (U of Washington)

We invite all users of R in Finance to submit one-page abstracts or
complete papers (in txt/pdf/doc format). We encourage papers both on
academic research topics and related to use of R by Finance practitioners.

Presenters are strongly encouraged to provide working R code to accompany
the presentation/paper. Datasets need not be made public.

Please send submissions to committee@RinFinance.com.
The submission deadline is January 31st, 2009.
Submissions will be evaluated and submitters notified via email on a rolling basis.

Additional details about the conference will be announced as available.

For the program committee:

Gib Bassett, Peter Carl, Dirk Eddelbuettel, John Miller,
Brian Peterson, Dale Rosenthal, Jeffrey Ryan

See you in Chicago in April!

/computers/R | permanent link

Mon, 29 Dec 2008

RQuantLib 0.2.10
Earlier this month and following the release of QuantLib 0.9.7, I updated RQuantLib to version 0.2.10. For once, there were no changes required to keep up with QuantLib. Rather, changes were internal as Rcpp had been spun off into its own package. As Rcpp is now in Debian, RQuantLib itself was also updated in Debian following the earlier upload to the CRAN network.

/computers/linux/debian/packages | permanent link

Thu, 25 Dec 2008

Rcpp now in Debian
Rcpp is an interface package that makes it easier to add C++ code to GNU R. It had started as a part of my RQuantLib project but has now come into its own as blogged here and here. As of today, Rcpp is now also available as a Debian package.

/computers/linux/debian/packages | permanent link

Wed, 24 Dec 2008

Very flattering
Good friend and fellow Oak Park / River Forest runner Paul Oppenheim used his column in the local weekly for some very flattering words about the marathon runners in our informal running group and even highlighting my personal World Marathon Majors adventure. I may just have to keep a copy for my epitaph.

/sports/running | permanent link

Tue, 23 Dec 2008

Updated 'Introduction to High-Performance Computing with R'
Fellow R user Paul Gilbert had invited me to come to Ottawa and the Bank of Canada to give a presentation/workshop on 'high-performance computing with R' similar to the UseR 2008 tutorial and talk.

I just posted the updated slides from this talk, and there is also an updated live cdrom on the Alioth server. Also, it looks like the tutorial will be held again at UseR 2009 in Rennes, see here for a brief synopsis.

It was nice to get back to Canada, even if it was a 24 hour whirlwind trip. Ottawa looked quite pretty in all the snow. And it seems that I got rather lucky with the travel dates as both the days before and after my trip had a large number of flight cancellations and delays due to snow storms.

/computers/R | permanent link

Tue, 02 Dec 2008

Rcpp relaunched with versions 0.6.0 and 0.6.1
I just announced Rcpp 0.6.0 and 0.6.1 on the low-volume R-packages list. Rcpp provides C++ classes that greatly facilitate interfacing C or C++ code in R packages using the .Call() interface provided by R.

Rcpp provides matching C++ classes for a large number of basic R data types. Hence, a package author can keep his data in normal R data structure without having to worry about translation or transfer to C++. At the same time, the data structures can be accessed as easily at the C++ level, and used in the normal manner.

The mapping of data types works in both directions. It is as straightforward to pass data from R to C++ as it is to return data from C++ to R.

Rcpp was initially written by Dominick Samperi in the context of the RQuantLib package and later released on its own, but had not seen any releases in twenty-four months. I have substantially expanded the documentation, simplified the build structure yet made it easier to use Rcpp from other packages, and started to add some new classes (notably microsecond time types). Rcpp is supported on Windows, Linux and Mac OS X (with special thanks to Simon for some extended help).

More information for Rcpp can be found at the package homepage, the R-forge repository or the package CRAN page.

/computers/linux/debian/packages | permanent link

Mon, 01 Dec 2008

CRANberries prettified
Judging from my html logs, a fair number of folks go to the html version of my CRANberries feed (which was originally announced here) of new or updated packages for R.

So I quickly put together some simple css formatting to make it look a little better than the default blosxom theme it sported previously. That said, you probably should read the rss version (more about rss here) anyway!

Update: Oops. And it even works with a correct path to the css file. Now fixed.

/computers/R | permanent link

Mon, 03 Nov 2008

Multiseat update under Ubuntu 08.10
The aforementioned multi-seat setup allowing two kids with two screens/keyboards/mice connected to one computer required a minor update under the new Ubuntu release. The Xephyr 'x11-server inside an x11-server' that is used to display two distinct sessions on two distinct monitors didn't seem to recognise the mice and keyboards anymore.

Somehow not explicitly specifying them helped. I.e. the calls to the script /usr/sbin/Xephyr-path.sh from /etc/gdm/gdm.conf now read

[server-Xephyr1]
name=Xephyr1
command=/usr/sbin/Xephyr-path.sh -display :0 -br -dpi 100 -xauthority /var/lib/gdm/:0.Xauth -screen 1280x1024
handled=true
flexible=false

[server-Xephyr2]
name=Xephyr2
command=/usr/sbin/Xephyr-path.sh -display :0 -br -dpi 100 -xauthority /var/lib/gdm/:0.Xauth -screen 1280x1024+1280+0
handled=true
flexible=false
Otherwise, the tutorial referenced in my earlier post still applies. And the kids are very impressed with new eye candy in KDE 4.1.

/computers/hardware | permanent link

Tue, 28 Oct 2008

Google Summer of Code 2008 Mentors Summit
Spent last weekend in Mountain View where Google had invited a number of mentors for the Summer of Code project that Google once again graciously sponsored. A rather impressive list of projects sent up to two people each, giving a probably unparalleled sample of major Open Source projects.

I had a blast. Chris, Leslie and the rest of Google's Open Source Programs Office facilitated a really nice unconference that spawned a few really good sessions, and they took very good care of us. And just about everybody met a number of folks in person that were previously known only via email or irc. As the saying goes: nothing like the bandwidth of a face-to-face meeting...

Last but not least I should issue a health warning. Sharing a room with the fearless Debian DPL is not for the faint of heart: his snoring is truly world-class.

/computers/misc | permanent link

Tue, 14 Oct 2008

RPostgreSQL 0.1.0
As part of the Google Summer of Code program for 2008 (which I mentioned here and here), Sameer and I are happy to announce that RPostgreSQL is now on the CRAN mirror network for R. RPostgreSQL provides a (DBI-compliant) interface between R and the PostgreSQL database system. I also just sent this short announcement to the r-packages list.

/computers/misc | permanent link

Thu, 09 Oct 2008

More running data visualization
A few years into this running hobby, I realized that my times were getting better. But I had no feel for by how much, or whether that was a constant rate of improvement etc pp. Long story short, I started to plot some of the data. What seemed natural was to record the date, the distance in miles as well as in a qualitative variable, and finally the average pace. Additionally, I played with groupings into just three categories 'short', 'mid' and 'long'.

This leads to a natural 'one-factor' model of pace as a function of race date grouped by race distance. And given how easy it is to do conditional plots in R, I quickly arrived at something that already resembled the following chart:

(pace by date given group lattice chart)

At first, some of the groups had too few data points to actually reliably construct regression lines, let alone non-parametric smoothers. But over time more and more data points were added as I kept running races. Including for example the somewhat disappointing result from last year's Chicago marathon in record heat that resulted in the outlier in the last panel. It actually made the smooth fit turn upwards! Luckily, the subsequent times in New York last fall, London in April, and of course in Berlin last month helped to dampen the effect of the one outlier, resulting in a more normal straight line for marathon performance that is comparable to the other four race lengths.

All in all I am now quite happy with the chart. The combination of the non-parametric loess smoother and the robust linear regression (using rlm from the MASS package for R) shows that most groups exhibit very little non-linearity as both regression curves are very close to each other. The curvature in the '10m' group is probably mostly a small-sample effect. And I am obviously happy with the fact that three of the five panels show their respective last race as a PR :)

The R script containing the data and code is available here but requires some familiarity with the lattice package for R (as the lattice book would provide).

/sports/running | permanent link

Sun, 05 Oct 2008

World Marathon Majors
Last Sunday's Berlin Marathon was my fifth and final piece in completing the World Marathon Majors during 2007/2008: after running Boston, Chicago and New York in 2007, and then London and now Berlin in 2008, the set is complete.

The idea was born after having run Chicago a few times, qualifying for Boston and winning a New York lottery entry. With friends to visit in New York, London and Berlin, it became feasible.

It's been a great experience to run these famous courses in front of large crowds. Conditions ranged from cold, windy and rainy in Boston to way too hot in Chicago, had mixed conditions including a solid rain shower in London and were just perfect in both New York and Berlin. The crowds were awesome in all five places. All in all, these races were a blast -- if you're into long-distance running, give each or all of them a shot.

/sports/running | permanent link

Wed, 01 Oct 2008

Berlin Marathon 2008
Last Sunday was the 35th Berlin Marathon. I had flown over to Berlin on Thursday after work, and had Friday and Saturday to 'chill'. The weather was already pretty nice before the race, and truly gorgeous on Sunday: sunny yet not too warm, blue skies, no wind. As has been widely reported, Haile Gebrselassie set a new world record breaking his own mark set the year before and becoming the first man to finish under two hours and four minutes. Truly impressive.

My race was pretty good too. I shaved over four and a half minutes off my own personal record (which was set in early 2006 at Sunburst) and finished in 3:13:09. That's a pace of 7:22 min/mile (or 4:35 min/km) which I am rather happy with. I held a fairly steady pace of under 7:30 almost all the way but had to fight off the onset of cramps with some short walks with less than two miles to go.

Coming back to Berlin after all those years is always a charm. The city has obviously changed a lot in some very visible areas. Yet it still recalls the Berlin of those years. The course was really nice, covering numerous neighbourhoods and starting and ending in Tiergarten.

Lastly, it was also good to see old friends who have now been there since the mid- to late 1980s. And I managed to pack a quick visit to my parents in as they are just a good 80 minute ICE train ride away. All in all a very nice trip even though the travel from Chicago (without a direct flight!) is a bit of a hike.

/sports/running | permanent link

Sun, 14 Sep 2008

Chicago Half Marathon 2008
Interesting conditions today for the 2008 edition of the Chicago Half Marathon, a race I have now done in 2003, 2004, 2005, 2006, and 2007 which is a personal record in itself.

While the weather story of the weekend is obviously the aftermath of hurricane Ike in Texas and neighbouring states where millions of people are still without power, we were also hit in a surprisingly hard way here in northeastern Illinois. According to the Tribune all of Chicago had a rain record day and the Chicago River crested causing evacuations. Not pretty.

The new race organisers (who had acquired the race since the 2007 event) were standing steadfast and guaranteeing the race 'come rain or shine'. Participation looked decent -- word was of a record turnout of sixteen thousand runners though I am sure some stayed home given yesterday's rain and the forecast for today. Given all that, it turned out to be not that bad. While we had steady rain the whole time, it rarely rained that hard. Shoes and socks did get wet towards the end, but it was tolerable overall. I had been worried about the gross humidity we had yesterday -- but today was much better with temperatures in the sixties and little wind.

As for the race, I went out somewhat fast but managed to hang on. The Garmin had every mile split below 7:00 min/mile, and I came in at a new personal record of 1:30:51.52. My GPS, an old Garmin 201, also showed the course long at 13.4 miles; a few other runners I talked to had it as correct or long by a lesser amount. That leaves the pace at 6:56 min/mile (or, for Christian, at 4:19 min/km :-) if the half marathon course length was in fact correct, and at 6:47 min/mile (4:13 min/km) if my Garmin had it right.
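
The two min/mile figures follow directly from the finishing time and the assumed course length. Here is a minimal sketch of that arithmetic in Python; the function name pace_per_mile and the constant for the official half marathon distance (21.0975 km converted to miles) are my additions for illustration and not part of the original post:

```python
KM_PER_MILE = 1.609344
# Official half marathon distance, about 13.11 miles (assumption added here).
HALF_MARATHON_MILES = 21.0975 / KM_PER_MILE


def pace_per_mile(hours, minutes, seconds, miles):
    """Return the average pace as an 'm:ss' string in minutes per mile."""
    total_seconds = hours * 3600 + minutes * 60 + seconds
    pace_seconds = round(total_seconds / miles)  # nearest whole second
    return f"{pace_seconds // 60}:{pace_seconds % 60:02d}"


# Finishing time of 1:30:51.52 under the two course lengths discussed above.
official = pace_per_mile(1, 30, 51.52, HALF_MARATHON_MILES)
garmin = pace_per_mile(1, 30, 51.52, 13.4)
print(official, garmin)
```

So a course that is about 0.3 miles long shifts the computed pace by roughly nine seconds per mile, which is why the question of which length is right matters for the PR.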

And from now on it's all tapering for the next big one in two weeks!

/sports/running | permanent link

Wed, 10 Sep 2008

RDieHarder 0.1.0 released
I just rolled up version 0.1.0 of RDieHarder, an R package providing an interface between GNU R and the DieHarder battery of tests for random number generators developed by Robert G. Brown. See the RDieHarder page for some introductory material and links to the talk at UseR! 2007.

Version 0.1.0 extends the functionality of the dieharder function quite substantially and catches up to a number of recent changes in DieHarder. In particular:

  • dieharder() generator selection changes along the same line as in the DieHarder release: ids 1 to 200 are reserved for GNU GSL generators, ids 201 to 400 for Dieharder, 401 to 500 for GNU R, 501 to 600 are hardware-based and ids over 600 are for user-contributed generators
  • dieharder() now supports new arguments 'inputfile' (for file_input and file_input_raw) and 'ntuple' (for tests with variable bit length)
  • dieharder() now also supports the 'rgb', 'sts' and 'user' tests
  • dieharder() now returns multiple Kuiper KS p-values for those tests that generate multiple p-values
  • dieharderGenerators() now returns a data.frame with two columns 'names' and 'ids' and generators can be selected via either a name (as e.g. 'mt19937') or a numeric id (e.g. 16)
  • dieharderTests() was added and also returns a data.frame with names and ids permitting a similar selection via test name or via test id.
  • Some misc. code organisation, a cleanup removing more files, updated vignette output files, and the actual test sources updated to DieHarder 2.8.1
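
The id ranges in the first item can be read as a simple lookup. The following Python sketch only illustrates that partition; the function generator_family is hypothetical and is not part of RDieHarder or DieHarder, and the range boundaries are taken from the list above:

```python
def generator_family(generator_id):
    """Map a numeric generator id to the family named in the list above."""
    if generator_id < 1:
        raise ValueError("generator ids start at 1 in this sketch")
    if generator_id <= 200:
        return "GNU GSL"
    if generator_id <= 400:
        return "Dieharder"
    if generator_id <= 500:
        return "GNU R"
    if generator_id <= 600:
        return "hardware"
    return "user-contributed"


# One sample id from each range.
for gid in (16, 201, 450, 501, 601):
    print(gid, generator_family(gid))
```

In R itself, the data.frame returned by dieharderGenerators() remains the place to look up the actual names and ids.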

This new version should show up at CRAN and its mirrors in due course; in the meantime, sources are also available on the RDieHarder page.

/computers/linux/debian/packages | permanent link

Mon, 01 Sep 2008

Easy multi-seat (two screens, keyboards, and mice off one computer) setup
This is 'back to school' season around here. So on Saturday we went and set up two desks for our two kids. And with that I finally converted 'their' computer to a working multi-seat setup. At first, I had fiddled with two distinct (ATI Radeon) graphics cards. Somehow I never got x11 to recognise both cards properly.

But two cards are not needed. As the machine is running a standard Kubuntu setup, I just followed this excellent three-part tutorial for Ubuntu multi-seat setup which describes the process using nothing but standard Ubuntu software. From setting up a 'big desktop' spanning two screens (which is easy enough using one card via the vga and dvi outputs), it is fairly straightforward to modify the gdm.conf setup to spawn two gdm greeter instances using the Xephyr nesting xserver.

So far, all is well. We'll see what possible shortcomings we will find. The GL extensions are not supported, so some eye-candy will be unavailable.

/computers/hardware | permanent link

Wed, 27 Aug 2008

littler 0.1.1 released
The new release 0.1.1 of r (pronounced littler) was just rolled up.

The only new feature is due to a suggestion by Paul Gilbert: r now reports the value of the optional status variable when calling q() at the end of a script:

$ r -e'q(status=42)'; echo $?
42
This can be very useful to signal exit codes and branch on those in other scripts or Makefiles. We also applied a patch to the manual page which adds some examples there (thanks, Seb!) and made some small changes to tests and examples.
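
As a rough sketch of what branching on such an exit status can look like from a calling script, here is a small Python example. It is an illustration under assumptions: the helper names run_and_report and describe are made up, and a Python child process that exits with status 42 stands in for r so that the snippet runs without littler installed. With littler available, the command list would be ["r", "-e", "q(status=42)"].

```python
import subprocess
import sys


def run_and_report(cmd):
    """Run a command and return its exit status."""
    return subprocess.run(cmd).returncode


def describe(status):
    """Turn an exit status into a short message a caller could branch on."""
    return "ok" if status == 0 else f"failed with status {status}"


# Stand-in child process exiting with status 42 (assumption: replaces the r call).
child = [sys.executable, "-c", "import sys; sys.exit(42)"]
status = run_and_report(child)
print(describe(status))
```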

As usual, our code is in our svn archive, on my r page, and in the local directory here. A fresh package is in Debian's incoming queue, and Jeff's littler page at Vanderbilt should reflect the new release soon too.

/computers/linux/debian/packages | permanent link

Tue, 19 Aug 2008

UseR! 2008 talk
Besides the slides from the tutorial at UseR! 2008 that were mentioned here previously, I also gave a short talk on scripting with R in high-performance computing using our littler frontend to R.

The talk introduces and extends an example related to some of the material from the tutorial itself. The slides from the talk are a little rough as the talk was somewhat ad-hoc: As session chair, I was confronted with a fairly last-minute cancellation and a 15 minute hole, and thought this would make a good little talk. It does show a nice trick for using littler with Open MPI (via snow) under the powerful slurm resource manager and batch/queue engine.

/computers/R | permanent link

Tue, 12 Aug 2008

UseR! 2008 tutorial
Earlier today, I presented a 3 1/2 hour tutorial Introduction to high-performance R (here is a brief description of the talk) at the UseR! 2008 conference at the TU Dortmund.

In a nutshell, the tutorial covered how to measure / profile R performance for speed and memory use, how to accelerate R using vectorised expressions and tools like Ra / jit, how to add compiled code to R using either the .C or .Call interface and using the inline and Rcpp packages, how to use R code in parallel (explicitly using NWS, Rmpi or snow as well as implicitly using pnmath / OpenMP), and how to script / automate R using littler, Rscript or RPy.

The final version of the slides is now available via my presentations page, and the live cdrom with software support for all the software used is at Alioth.

Update: Corrected link to presentations page thanks to heads-up by Charles. Thanks!

/computers/R | permanent link

Sat, 09 Aug 2008

RQuantLib 0.2.9
As version 0.9.6 of QuantLib, which was released a couple of days ago, is now in Debian, I just uploaded an updated version of RQuantLib. Only minor API changes to src/curves.cpp were needed. This new version 0.2.9 is currently in the queue at R's master CRAN host and should hit the CRAN mirrors shortly; likewise the Debian package has been uploaded and should also propagate to Debian mirrors in due course. As usual, sources are also available locally on my site. Lastly, RQuantLib is hosted on R-Forge and potential contributors are encouraged to register at R-Forge and to get in touch -- this is a great way to learn how to combine C++ and R.

/computers/linux/debian/packages | permanent link

Sat, 07 Jun 2008

Into the sunset
I finally got around to dropping four old computers off for recycling. Triton, a community college nearby, had a recycling event where students volunteered, which made it easy to part with two generations of old machines.

Old computers, I hear you ask, well how old? Real old. The older two were from an age where the bios didn't yet boot off cdroms -- circa 1995. We had bought those in Kingston just off the Queen's campus. These were respectively a pentium 90 and a pentium 100, which still have traces on the web as miles.econ.queensu.ca (e.g. in a number of Debian changelogs) and rosebud.sps.queensu.ca which was of course Lisa's office machine and for a while the only internet address showing SPS.

The next two were purchased around 1999 in Toronto on College St just north of U of T's main St George campus. Those, an AMD k6-2 300 and a Celeron overclocked to 450 MHz (woot :) lived happily in the basement of our Toronto home, forming the first lan I built. If I recall they were initially connected using a crossed ethernet cable and a second nic to the ISP. Oh boy.

At least those latter two still boot off Knoppix. And do they ever feel slow. To think now just how many Debian packages I must have built on at least three of these over the years... And each machine must have gotten at least five decent years of usage out of them. One of the second generation computers eventually morphed into the kids play computer but even retired from that a while ago.

In any event, it was good to have them recycled, and also good to have been able to do so without paying a fee as is increasingly common. So cheers to Triton. I may be back in a few years as there are still a few computers spread across the house.

/computers/hardware | permanent link

Fri, 06 Jun 2008

Wayne Shorter at the CSO
Just got home from the 'An Evening with Wayne Shorter' concert at the CSO, part of this year's tour apropos his 75th birthday. The man is a legend and one of my favourite musicians for both his own Blue Note work from the 60s and of course his participation in the legendary Miles Davis Quintet of the same period.

Shorter (ts, as) was playing with his quartet of recent years: Danilo Perez (p), John Patitucci (b) and Brian Blade (dr). And playing they did. Shorter has such a soft lyrical tone, which accentuates both the rhythmic and harmonic quality of the side men. Very enjoyable concert, fairly 'modern' and free in style. And no standards or old material. Oddly enough, not one spoken word: neither greeting nor good byes or just an introduction of the band. Recommended.

/music/jazz/live | permanent link

Thu, 05 Jun 2008

Adventures with Comcast: Part ohbynowIhavelostcount in an ongoing series
Regular readers of this blog (ed: oxymoron alert) may recall tales of woe with our beloved (ha!) cable internet provider such as this; then there are of course minor tales like this or this or this or the other stories on this page but I am probably forgetting others.

Anyway, yesterday's highlight was initiated with a mail, seemingly sent to all customers, informing me that

ACTION REQUIRED: Comcast has determined that your computer(s) have been used to send unsolicited email ("spam"), which is generally an indicator of a virus. For your own protection and that of other Comcast customers, we have taken steps to prevent further transmission of spam from your computer(s).
and the email went on to recommend some Windows anti-spam measures, including a reference to a page I could only open with IE at work and one URL to a page that doesn't exist. Nice. Not. Needless to say, there are no Windows computers sending mail (via Comcast) here (as the lone Windows box, my wife's work laptop, goes straight to her university webmail).

And obviously, they blocked port 25, so no more mail sending from home. So I grumpily logged a complaint after having been on hold and in telephony menu hell for fifteen or twenty minutes. I was promised to hear back in 72 hours. Hasn't happened yet, naturally, but we're only half way through...

Anyway, to make a long story short and this post constructive: Here is what you do on a Debian or Ubuntu system running exim as your mail transport:

  • sudo editor /etc/exim4/conf.d/transport/30_exim4-config_remote_smtp_smarthost and add a line port = submission in the remote_smtp_smarthost block (assuming you have the split configuration chosen for the exim4-config package). Setting port to 'submission' switches from plain old SMTP to the authenticated version running on port 587; submission is mapped to 587 in /etc/services.
  • sudo editor /etc/exim4/passwd.client and add your user id and password, e.g. the ones used for the Comcast web login
  • sudo update-exim4.conf to update the configuration
  • sudo /etc/init.d/exim4 restart to restart exim
And it may pay to check /var/log/exim4/mainlog for any irregularities. Barring those, you should now be sending mail to your smarthost using authenticated transfer over port 587.

In the meantime, it looks like they unblocked port 25 at some point today...

/computers/broadband | permanent link

Sat, 31 May 2008

Accelerated R in Debian
A few months ago, Stephen Milborrow started releasing a patched version of R that performs just-in-time compilation -- see his Ra page for some details and further pointers.

In a nutshell, Ra provides a modified R engine so that code preceded by a jit(1) function call, using his jit package from the CRAN archive, will run faster due to just-in-time compilation of loops and arithmetic expressions.

Ra offers to pick the low-hanging fruit for users as loops can be a bottleneck. Of course, as shown in Stephen's case study, using appropriate vectorised expressions will often be faster still. That said, for a certain class of problems, Ra should offer a decent speed boost.

Debian users can now just say

    sudo apt-get install r-base-core-ra r-cran-jit
as the Ra and jit packages are in Debian's unstable distribution (and in the case of jit, even in testing).

Lastly, version 1.1.0 of Ra was released by Stephen yesterday and is now also in Debian unstable.

/computers/linux/debian/packages | permanent link

Sun, 25 May 2008

Bike The Drive 2008
Memorial Day weekend, so time for the annual Bike The Drive in Chicago. Got the whole family up bright and early, and was it ever nice -- 60-some degrees, sunny blue skies and no wind. Perfect conditions. And the Chicagoist blog has some pictures up.

/sports/cycling | permanent link

smtm bug fix release 1.6.10
A new version of smtm just went to Debian and CPAN. Perl 5.10 required a small change in how we test whether certain arrays do, or do not, contain elements. No other changes were made.

/computers/linux/debian/packages | permanent link

Thu, 22 May 2008

JPM Chase Corporate Challenge 2008
Just got back a little earlier from running the 2008 edition of the JP Morgan Chase Corporate Challenge. And again a record crowd of now just over 23000 in Chicago -- announced to be bigger than those at the JP Morgan Chase races in Boston, San Francisco or New York! This year the weather wasn't quite as stunning as it has often been in the past. But at least, temperatures in the high 40s and an overcast sky make for good running conditions.

This time, two colleagues and I tried to make it close enough to the starting line to not waste too much time 'surfing' around slower runners who for whatever reason think they have to be up at the front. And that seems to have worked: despite a still crowded start, I ran even, steady and fast enough to beat the PR from 2005 by a decent margin with a (hand-stopped) time of 20 minutes and 46.65 seconds. That yields 5:56.19 min/mile (or for Christian, 3:41.32 min/km) which seems too fast given the splits I saw at miles two and three. Oh well -- same course as the other five times that I've run this, so I trust the course is USATF certified.
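
The pace figures can be checked from the hand-stopped time. The sketch below assumes a race distance of 3.5 miles, which is not stated in the post but is the distance under which the quoted paces work out; the function name pace is likewise just for illustration:

```python
KM_PER_MILE = 1.609344
RACE_MILES = 3.5  # assumed race distance; not stated in the post


def pace(total_seconds, distance):
    """Return average pace per unit of distance as 'm:ss.hh'."""
    per_unit = round(total_seconds / distance, 2)  # hundredths of a second
    minutes, seconds = divmod(per_unit, 60)
    return f"{int(minutes)}:{seconds:05.2f}"


finish = 20 * 60 + 46.65  # hand-stopped time of 20:46.65
per_mile = pace(finish, RACE_MILES)
per_km = pace(finish, RACE_MILES * KM_PER_MILE)
print(per_mile, per_km)
```

Under that assumption the result is 5:56.19 min/mile and 3:41.32 min/km, so any doubt about the pace comes down to the course length or the hand timing rather than the division.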

And as always, good to hang with folks from work for a cold one or two afterwards. Given the temperatures, I didn't last very long though.

/sports/running | permanent link

Sat, 10 May 2008

Quarryman Challenge 2008
This morning was the 2008 edition of Quarryman Challenge, a 5km and 10mile race in Lemont, which is southwest of Chicago along the Illinois-Michigan Canal.

Three of us ran the 10 mile race, which was nicely organised. But is it ever friggin' hilly there: the race course takes three turns from the lower levels near the canal up towards those hills. As the elevation chart (that I cut out of this pdf file with the course map) shows, it is not so much the total elevation but rather how steep the incline is.

Quarryman Challenge elevation profile

That said, I did okay: even though the legs were really tired throughout from those inclines, I finished in 1:12:08 for a pace of 7:13. And given the reasonably small field, that yielded 34th place overall and third in my age group.

/sports/running | permanent link

Fri, 09 May 2008

On modes of transportation
Something I never really mentioned was the purchase of the foldable bike: a Dahon with a reasonably lightweight aluminum frame, a seven-speed hub and high-pressure tires. It's great fun in the city for the rides to and from the commuter train, or across downtown for occasional errands after work.

I have had this foldable bike for nearly two years, and used it almost (work-)daily, even in the Chicago winters. 'Almost' because I did suffer from broken parts on a few occasions: a pedal broke (easy replacement), the axle in the front wheel broke (a good week for a new and inexpensive wheel) but the bummer was that a part of the frame-folding mechanism broke last fall. Given that the bike, which I bought used via craigslist, is a few years old, the part was no longer standard and so we waited for it to be shipped from the manufacturer. And waited and waited some more until Dan's decided to give me a matching part from a bike in their inventory. But apart from that episode, and the occasional problem with conductors on the Metra commuter trains, it has been a smooth ride. Highly recommended, and I do see a few more foldable bikes downtown.

Trek and Dahon bikes
But what is new now is that I finally gave in and bought a road bike, once again off craigslist. My daily commute is about ten miles one-way, which works out to about 35 to 40 minutes of cycling, plus a few minutes of locking/un-locking, changing, etc. I had used my trusted (yet heavier) touring bike with its steel frame a number of times, but felt that a road bike may make for a faster ride. While it saves a few minutes, it is not really a time saver as the bike-train-bike commute also takes around 40 to 50 minutes. That said, riding is simply a nice way to clear the head before or after work. I am back on the schedule I tried for a few weeks last summer / fall: running on Tuesday and Thursday leaves Monday, Wednesday and Friday for the bike commute. So far, I am 8 for 9 over the last three weeks. On the downside: one rather wet ride home, and already two minor flats that (luckily) still allowed riding home. The hardest part is meeting up with some other riders at 6:00am meaning that I am now getting up at 5:00am whether I am running or not. But all told a nice way to get some exercise in outside of running.

/sports/cycling | permanent link

Sun, 04 May 2008

On soccer, promises and hair cuts
Both my daughters have been playing soccer for a while now. And for another little while, I had been promising that if they ever scored three goals in a game, I'd shave my head.

As the attentive reader may have guessed by now, that day finally came. This weekend saw a suburban tournament in nearby Oak Brook, and lo and behold Anna scored three goals in the first game! So home we went, out came the tool and she rather professionally separated me from my hair. So today on day two of the new look, a friend took this picture of me (scaled down from 2.4mb to around 80kb) at the same tournament:

Dirk on 4 May 2008 with a new look

They actually played just about the best soccer I have seen them play, won their group (with three shutouts!) and lost a hard-fought and well-played final 2:4. And today the weather even cooperated as one can see from the photo. Nice weekend, all told. And yes, the head feels kinda nice ;-)

/misc | permanent link