Thu, 22 Oct 2009

From ORD Sessions to R-Forge in 12 hours with RProtoBuf

Yesterday, via an invitation from fellow Chicago-area Google Summer of Code mentor Borja Sotomayor, I attended the Second ORD Sessions. These are happening at the HQ of Inventable where a couple of technologists and Open Source geeks from the Chicagoland area get together and riff on code for a few hours after work over some pizza and beer.

Sounded good, and I needed an excuse to try to mix the awesome Protocol Buffers with my favourite data tool, R. What are Protocol Buffers? To quote from the Google overview page referenced above:

Protocol buffers are a flexible, efficient, automated mechanism for serializing structured data – think XML, but smaller, faster, and simpler. You define how you want your data to be structured once, then you can use special generated source code to easily write and read your structured data to and from a variety of data streams and using a variety of languages. You can even update your data structure without breaking deployed programs that are compiled against the "old" format.
and later on that page:
Protocol buffers are now Google's lingua franca for data – at time of writing, there are 48,162 different message types defined in the Google code tree across 12,183 .proto files. They're used both in RPC systems and for persistent storage of data in a variety of storage systems.
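
To give a rough feel for the "smaller" part of that quote, here is a minimal sketch of how a single integer field ends up on the wire. It is hand-rolled Python for illustration only: it does not use the protobuf library or RProtoBuf, the function names `encode_varint` and `encode_int_field` are made up for this example, and it assumes a non-negative integer stored as field number 1. The XML snippet it compares against is likewise just a hypothetical stand-in.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint (7 bits per byte)."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        low7 = value & 0x7F           # take the lowest 7 bits
        value >>= 7
        if value:
            out.append(low7 | 0x80)   # more bytes follow: set the continuation bit
        else:
            out.append(low7)          # last byte: continuation bit stays clear
            return bytes(out)


def encode_int_field(field_number: int, value: int) -> bytes:
    """Encode one integer field as a key varint followed by a value varint."""
    WIRE_TYPE_VARINT = 0              # wire type used for plain integers
    key = (field_number << 3) | WIRE_TYPE_VARINT
    return encode_varint(key) + encode_varint(value)


# Field number 1 holding the value 150, next to a hypothetical XML rendering.
encoded = encode_int_field(1, 150)
xml_equivalent = b"<a>150</a>"
print(encoded.hex(), len(encoded), len(xml_equivalent))  # 089601 3 10
```

Three bytes against ten is of course a toy comparison, but it shows the idea: field names live in the .proto definition rather than in every message, so only a small numeric tag and the value itself need to be transmitted.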

So three hours later, I had an implementation of the 'addressbook reader' C++ example wrapped in a tiny yet complete R package that passed R CMD check. And one lingua franca for data has met another.

So before going to bed, I quickly registered a new project at R-Forge, everybody's favourite R hosting site, and thanks to the tireless Stefan Theussl (and some favourable timezone differences) the project was approved and the stanza available by the time I got up. So I quickly filled the SVN repo and, presto, we had the RProtoBuf project at R-Forge within 12 hours of the ORD Sessions hackfest. I will try to follow up on RProtoBuf in a couple of days; this may lead to some changes in my Rcpp R / C++ interface package as well.
