… that I made my first upload to CRAN as demonstrated by the very bottom of the ChangeLog file of the RQuantLib package:
2002-02-25 Dirk Eddelbuettel <edd@debian.org>
* Initial 0.1.0 release
And quite a few more uploads have followed since.
(Also see the earlier twenty years ago … post about my initial contributions to the Debian R package I had by then adopted too.)
If you like this or other open-source work I do, you can now sponsor me at GitHub.
This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.
A few days ago, a friend and I were riffing about the wonderful stability of R and (subsets of) R packages. The rigorous ASAN/UBSAN/Valgrind/… checks, while at times frustrating for us package maintainers when we do not have easily replicable setups [1], really help in ensuring code quality. As do of course all other layers of quality control at CRAN, and for R. In passing, I mentioned there was an older blog post demonstrating a little power-law-alike behaviour between the most frequent R Core committer and everybody else.
So I was intrigued. Could we just pick up a blog post I had written in August of 2007, or almost fourteen years ago, and run it as is? [2]
Yes, we can.
Which is truly, truly awesome.
Back then I must have taken a minor shortcut and analysed just one calendar year of SVN that was pre-extracted (and a few more still exist here if one scrolls down). Maybe back then I did not yet have the r-devel SVN repo checkout. But these days (and for probably a decade now) I do, and just a few lines of bash get us a full log:
#!/bin/bash
## adjust as needed
svn=${HOME}/svn/r-devel
rev=$(cd ${svn} && svn info --show-item revision)
today=$(date +%Y-%m-%d)
echo -n "Extracting ${rev} revisions at ${today} ... "
(cd ${svn} && svn log --limit ${rev} ) | gzip -9 > svn-log-${today}.txt.gz
echo "done"
So that leads to one code adjustment given the different input source. But otherwise the first paragraph runs as is (and now gives us 49.2% for the amazing Prof Ripley):
logfile <- "svn-log-2021-03-20.txt.gz"

## cf http://dirk.eddelbuettel.com/blog/2007/08/11/
x <- readLines(logfile)
rx <- x[grep("^r",x)]
who <- gsub(" ","",sapply(strsplit(rx,"\\|"),"[",2))
twho <- table(who)
twho["ripley"]/sum(twho)
That is what one gets by trusting stable interfaces: code untouched for fourteen years runs unchanged.
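To see what the parsing step does without needing an SVN checkout, here is a minimal sketch run on a few made-up log lines. The revision numbers, timestamps and the author name "other" are invented for illustration; only the field layout follows what svn log prints.

```r
## synthetic 'svn log' output (all values made up for illustration)
x <- c("------------------------------------------------------------------------",
       "r101 | ripley | 2007-08-01 07:15:02 -0400 (Wed, 01 Aug 2007) | 1 line",
       "",
       "fix typo",
       "r102 | other | 2007-08-01 09:30:00 -0400 (Wed, 01 Aug 2007) | 2 lines",
       "r103 | ripley | 2007-08-02 06:05:41 -0400 (Thu, 02 Aug 2007) | 1 line")

## keep only the header lines, which start with the revision number
rx <- x[grep("^r", x)]

## the author is the second '|'-separated field; strip the padding blanks
who <- gsub(" ", "", sapply(strsplit(rx, "\\|"), "[", 2))

## tabulate commits per author and compute one author's share
twho <- table(who)
share <- twho["ripley"] / sum(twho)
print(share)
```

Note that the pattern "^r" would also pick up any commit message line that happens to start with a lower-case r; the synthetic message above avoids that case.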
R itself has had well over sixty releases since then, including two major and eighteen minor releases. Yet the code just runs, including the code for two graphs one can reproduce with the exact same code as we show next.
tod <- unlist(sapply(rx,function(x)strsplit(x,split=" ")[[1]][6]))
tod <- tod[who=="ripley"]

tz <- sub(pattern=".*(-[0-9]{4}).*",replacement="\\1",x=rx)
tz <- tz[who=="ripley"]
tz <- as.numeric(tz)/100
offset <- 3600*tz

z <- strptime(tod,format="%H:%M:%S")
hist(z,"hours",main="Ripley Commit Times in SVN TZ")

h <- z - offset
h <- format(h,format="%H")
h <- factor(as.numeric(h), levels=0:23)
## added as.vector() here to suppress a warning
dotchart(as.vector(table(h)), main="Ripley Commit Times, By Hour in GMT",
         labels=paste(0:23,1:24,sep=":"))
The code reproduces the chart from 2007, but this time uses the full twenty plus years of SVN history. I added just one as.vector() to suppress one new warning which appears under current R and which was presumably added in the fourteen years since (and the chart is produced without it too).
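The time-zone correction used for the by-hour chart can be traced on a single value. The following is a small sketch on one made-up header line with a -0400 offset; the explicit tz="UTC" argument is an addition for this illustration (the code in the post relies on the local time zone) so that the result does not depend on where it is run.

```r
## one synthetic 'svn log' header line (made up for illustration)
rx <- "r101 | ripley | 2007-08-01 07:15:02 -0400 (Wed, 01 Aug 2007) | 1 line"

## the time of day is the sixth blank-separated token
tod <- strsplit(rx, split = " ")[[1]][6]

## pull out the signed four-digit offset and turn hours into seconds
tz <- sub(pattern = ".*(-[0-9]{4}).*", replacement = "\\1", x = rx)
offset <- 3600 * as.numeric(tz) / 100

## parse the time (tz = "UTC" is an assumption of this sketch)
z <- strptime(tod, format = "%H:%M:%S", tz = "UTC")

## subtracting the negative offset moves the local time forward to GMT
h <- format(z - offset, format = "%H")
print(h)
```

The pattern only matches negative offsets, so a line logged with a positive or zero offset would need separate handling.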
The remainder of the code also runs. I just added one library(zoo) call that my blog post had omitted. No other changes.
## rather extract both date and time
dat <- unlist(sapply(rx, function(x) {
    txt <- strsplit(x,split=" ")[[1]]
    paste(txt[5], txt[6])
}))
## subset on Prof Ripley
dat <- dat[who == "ripley"]
## and convert to POSIXct, correcting by tz as well
datpt <- as.POSIXct(strptime(dat,format="%Y-%m-%d %H:%M:%S")) - offset

## turn into zoo -- we use a constant series of ones as each
## commit is taken as a timestamped event
library(zoo)
datzoo <- zoo(1, order.by=datpt)
## and use zoo to aggregate into commits per date
daily <- aggregate(datzoo, as.Date(index(datzoo)), sum)

## now plot as grey bars
plot(daily, col='darkgrey', type='h', lwd=2,
     ylab="Nb of SVN commits, three-week median",
     xlab="R release dates 2.5.0 and 2.5.1 shown in orange",
     main="The amazing Prof. Ripley")
## mark the two R releases of 2007
abline(v=c(as.Date("2007-04-24"),as.Date("2007-06-28")),col='orange',lwd=1.5)
## and do a quick centered rolling median
lines(rollmedian(daily, 21, align="center"), lwd=3)
It produces this chart spanning two decades of commits. [3]
The subtitle highlighting the then-most-recent releases is a little quaint now given that R has had eighteen major.minor releases, and over sixty total releases, since then.
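For readers who want to follow the per-day aggregation without zoo, the idea can be sketched in base R. This is only an illustration: it swaps zoo's aggregate() and rollmedian() for table() and stats::runmed(), uses a three-day window rather than 21, and runs on made-up timestamps.

```r
## made-up commit timestamps (UTC) standing in for 'datpt'
datpt <- as.POSIXct(c("2007-08-01 07:15:02", "2007-08-01 09:30:00",
                      "2007-08-02 06:05:41",
                      "2007-08-03 08:00:00", "2007-08-03 09:00:00",
                      "2007-08-03 10:00:00",
                      "2007-08-04 11:00:00",
                      "2007-08-05 12:00:00"), tz = "UTC")

## commits per calendar day; days without any commit would not appear
daily <- table(as.Date(datpt))

## running median with an odd window; endrule = "keep" leaves the
## first and last values as they are
smooth <- runmed(as.numeric(daily), k = 3, endrule = "keep")

print(daily)
print(smooth)
```

Unlike rollmedian(), which drops the ends of the series, runmed() returns a value for every day, so the two are not interchangeable line for line.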
Stable and rigorously maintained interfaces are a fantastic resource that is dramatically under-appreciated. Efforts such as the ten-year reproduction challenge demonstrate that this really is not a given. Maybe instead of celebrating band-aids (“look, I reproduce via code I have frozen in a virtual environment / container / machine / …”) we should celebrate languages, ecosystems, packages, … that allow us to rely on just the code itself.
Because we can.
And we should strengthen and reinforce that ability. And discourage rapid changes just for changes’ sake. Code running for a decade, or even longer, is a huge boon to everybody relying on it.
Three cheers to R Core.
[1] Docker containers would be really good, and a step above the specs in the README. Winston’s nice r-debug “sumo” container comes closest and helps a lot, and is updated regularly (which my earlier r-devel-san container is not).
[2] The post owes some of its code ideas to Ben Bolker and Simon Jackman, but links to now-stale prior affiliations of theirs.
[3] And the singularly impressive contributions charted remain unparalleled, but were already the focus of the previous post. Yet over a period more than three times as long, they remain even more stunning.
Edit 2021-03-21: Two minor fixes for grammar and typing.
… this week that I made a first cameo in the debian/changelog for the Debian R package:
r-base (0.63.1-1) unstable; urgency=low
- New upstream release
- Linked html directory to /usr/doc/r-base/doc/html (Dirk Eddelbuettel)
 -- Douglas Bates <bates@stat.wisc.edu>  Fri, 4 Dec 1998 14:22:19 -0600
For the next few years I assisted Doug here and there, and then formally took over in late 2001.
It’s been a really good and rewarding experience, and I hope to be able to help with this for a few more years to come.
R 3.5.0 was released a few weeks ago. As it changes some (important) internals, packages installed with a previous version of R have to be rebuilt. This was known and expected, and we took several measured steps to get R binaries to everybody without breakage.
The question of "but how do I upgrade without breaking my system" was asked a few times, e.g., on the r-sig-debian list as well as in this StackOverflow question.
Core Distribution
As usual, we packaged R 3.5.0 as soon as it was released – but only for the experimental distribution, awaiting a green light from the release masters to start the transition. A one-off repository [drr35](https://github.com/eddelbuettel/drr35) was created to provide R 3.5.0 binaries more immediately; this was used, e.g., by the r-base Rocker Project container / the official R Docker container which we also update after each release.
The actual transition was started last Friday, June 1, and concluded this Friday, June 8. Well over 600 packages have been rebuilt under R 3.5.0, and are now ready in the unstable distribution from which they should migrate to testing soon. The Rocker container r-base was also updated.
So if you use Debian unstable or testing, these are ready now (or will be soon once migrated to testing). This should include most Rocker containers built from Debian images.
Contributed CRAN Binaries
Johannes also provided backports with a -cran35 suffix in his CRAN-mirrored Debian backport repositories, see the README.
Core (Upcoming) Distribution
Ubuntu, for the upcoming 18.10, has undertaken a similar transition. Few users access this release yet, so the next section may be more important.
Contributed CRAN and PPA Binaries
Two new Launchpad PPA repositories were created as well. Given the rather large scope of thousands of packages, multiplied by several Ubuntu releases, this too took a moment but is now fully usable and should get mirrored to CRAN ‘soon’. It covers the most recent and still supported LTS releases as well as the current release 18.04.
One PPA contains base R and the recommended packages, RRutter3.5. This is the source of the packages that will soon be available on CRAN. The second PPA (c2d4u3.5) contains over 3,500 packages mainly derived from CRAN Task Views. Details on updates can be found at Michael’s R Ubuntu Blog.
This can be used for, e.g., Travis if you manage your own sources as Dirk’s r-travis does. We expect to use this relatively soon, possibly as an opt-in via a variable upon which run.sh selects the appropriate repository set. It will also be used for Rocker releases built based off Ubuntu.
In both cases, you may need to adjust the sources list for apt accordingly.
There may also be ongoing efforts within Arch and other Debian-derived distributions, but we are not really aware of what is happening there. If you use those, and coordination is needed, please feel free to reach out via the r-sig-debian list.
In case of questions or concerns, please consider posting to the r-sig-debian list.
Dirk, Michael and Johannes, June 2018
The tenth (!!) annual R/Finance conference will take place in Chicago on the UIC campus on June 1 and 2, 2018. Please see the call for papers below (or at the website) and consider submitting a paper.
We are once again very excited about our conference, thrilled about who we hope may agree to be our anniversary keynotes, and hope that many R / Finance users will not only join us in Chicago in June -- but also submit an exciting proposal.
So read on below, and see you in Chicago in June!
R/Finance 2018: Applied Finance with R
June 1 and 2, 2018
University of Illinois at Chicago, IL, USA
The tenth annual R/Finance conference for applied finance using R will be held June 1 and 2, 2018 in Chicago, IL, USA at the University of Illinois at Chicago. The conference will cover topics including portfolio management, time series analysis, advanced risk tools, high-performance computing, market microstructure, and econometrics. All will be discussed within the context of using R as a primary tool for financial risk management, portfolio construction, and trading.
Over the past nine years, R/Finance has included attendees from around the world. It has featured presentations from prominent academics and practitioners, and we anticipate another exciting line-up for 2018.
We invite you to submit complete papers in pdf format for consideration. We will also consider one-page abstracts (in txt or pdf format) although more complete papers are preferred. We welcome submissions for both full talks and abbreviated "lightning talks." Both academic and practitioner proposals related to R are encouraged.
All slides will be made publicly available at conference time. Presenters are strongly encouraged to provide working R code to accompany the slides. Data sets should also be made public for the purposes of reproducibility (though we realize this may be limited due to contracts with data vendors). Preference may be given to presenters who have released R packages.
Please submit proposals online at http://go.uic.edu/rfinsubmit. Submissions will be reviewed and accepted on a rolling basis with a final submission deadline of February 2, 2018. Submitters will be notified via email by March 2, 2018 of acceptance, presentation length, and financial assistance (if requested).
Financial assistance for travel and accommodation may be available to presenters. Requests for financial assistance do not affect acceptance decisions. Requests should be made at the time of submission. Requests made after submission are much less likely to be fulfilled. Assistance will be granted at the discretion of the conference committee.
Additional details will be announced via the conference website at http://www.RinFinance.com/ as they become available. Information on previous years' presenters and their presentations is also at the conference website. We will make a separate announcement when registration opens.
For the program committee:
Gib Bassett, Peter Carl, Dirk Eddelbuettel, Brian Peterson,
Dale Rosenthal, Jeffrey Ryan, Joshua Ulrich
Last week, Josh sent the call for papers to the R-SIG-Finance list making everyone aware that we will have our ninth annual R/Finance conference in Chicago in May. Please see the call for papers (at the link, below, or at the website) and consider submitting a paper.
We are once again very excited about our conference, thrilled about upcoming keynotes and hope that many R / Finance users will not only join us in Chicago in May 2017 -- but also submit an exciting proposal.
We also overhauled the website, so please see R/Finance. It should render well and fast on devices of all sizes: phones, tablets, desktops with browsers in different resolutions. The program and registration details still correspond to last year's conference and will be updated in due course.
So read on below, and see you in Chicago in May!
R/Finance 2017: Applied Finance with R
May 19 and 20, 2017
University of Illinois at Chicago, IL, USA
The ninth annual R/Finance conference for applied finance using R will be held on May 19 and 20, 2017 in Chicago, IL, USA at the University of Illinois at Chicago. The conference will cover topics including portfolio management, time series analysis, advanced risk tools, high-performance computing, market microstructure, and econometrics. All will be discussed within the context of using R as a primary tool for financial risk management, portfolio construction, and trading.
Over the past eight years, R/Finance has included attendees from around the world. It has featured presentations from prominent academics and practitioners, and we anticipate another exciting line-up for 2017.
We invite you to submit complete papers in pdf format for consideration. We will also consider one-page abstracts (in txt or pdf format) although more complete papers are preferred. We welcome submissions for both full talks and abbreviated "lightning talks." Both academic and practitioner proposals related to R are encouraged.
All slides will be made publicly available at conference time. Presenters are strongly encouraged to provide working R code to accompany the slides. Data sets should also be made public for the purposes of reproducibility (though we realize this may be limited due to contracts with data vendors). Preference may be given to presenters who have released R packages.
Financial assistance for travel and accommodation may be available to presenters; however, requests must be made at the time of submission. Assistance will be granted at the discretion of the conference committee.
Please submit proposals online at http://go.uic.edu/rfinsubmit.
Submissions will be reviewed and accepted on a rolling basis with a final deadline of February 28, 2017. Submitters will be notified via email by March 31, 2017 of acceptance, presentation length, and financial assistance (if requested).
Additional details will be announced via the conference website as they become available. Information on previous years' presenters and their presentations is also at the conference website. We will make a separate announcement when registration opens.
For the program committee:
Gib Bassett, Peter Carl, Dirk Eddelbuettel, Brian Peterson,
Dale Rosenthal, Jeffrey Ryan, Joshua Ulrich
Earlier today, Josh sent the text below in this message to the R-SIG-Finance list as the very first heads-up concerning the 2016 edition of our successful R/Finance series.
We are once again very excited about our conference, thrilled about upcoming keynotes (some of which are confirmed and some of which are in the works), and hope that many R / Finance users will not only join us in Chicago in May 2016 -- but also submit an exciting proposal.
So read on below, and see you in Chicago in May!
R/Finance 2016: Applied Finance with R
May 20 and 21, 2016
University of Illinois at Chicago, IL, USA
The eighth annual R/Finance conference for applied finance using R will be held on May 20 and 21, 2016, in Chicago, IL, USA at the University of Illinois at Chicago. The conference will cover topics including portfolio management, time series analysis, advanced risk tools, high-performance computing, market microstructure, and econometrics. All will be discussed within the context of using R as a primary tool for financial risk management, portfolio construction, and trading.
Over the past seven years, R/Finance has included attendees from around the world. It has featured presentations from prominent academics and practitioners, and we anticipate another exciting line-up for 2016.
We invite you to submit complete papers in pdf format for consideration. We will also consider one-page abstracts (in txt or pdf format) although more complete papers are preferred. We welcome submissions for both full talks and abbreviated "lightning talks." Both academic and practitioner proposals related to R are encouraged.
All slides will be made publicly available at conference time. Presenters are strongly encouraged to provide working R code to accompany the slides. Data sets should also be made public for the purposes of reproducibility (though we realize this may be limited due to contracts with data vendors). Preference may be given to presenters who have released R packages.
The conference will award two (or more) $1000 prizes for best papers. A submission must be a full paper to be eligible for a best paper award. Extended abstracts, even if a full paper is provided by conference time, are not eligible for a best paper award. Financial assistance for travel and accommodation may be available to presenters, however requests must be made at the time of submission. Assistance will be granted at the discretion of the conference committee.
Please make your submission online at this link. The submission deadline is January 29, 2016. Submitters will be notified via email by February 29, 2016 of acceptance, presentation length, and financial assistance (if requested).
Additional details will be announced via the R/Finance conference website as they become available. Information on previous years' presenters and their presentations is also at the conference website.
For the program committee:
Gib Bassett, Peter Carl, Dirk Eddelbuettel, Brian Peterson,
Dale Rosenthal, Jeffrey Ryan, Joshua Ulrich
The announcement below just went to the R-SIG-Finance list. More information is as usual at the R / Finance page.
The conference will take place on May 29 and 30, at UIC in Chicago. Building on the success of the previous conferences in 2009-2014, we expect more than 250 attendees from around the world. R users from industry, academia, and government will be joining 30+ presenters covering all areas of finance with R.
We are very excited about the four keynote presentations given by Emanuel Derman, Louis Marascio, Alexander McNeil, and Rishi Narang.
The conference agenda (currently) includes 18 full presentations and 19 shorter "lightning talks". As in previous years, several (optional) pre-conference seminars are offered on Friday morning.
There is also an (optional) conference dinner at The Terrace at Trump Hotel. Overlooking the Chicago river and skyline, it is a perfect venue to continue conversations while dining and drinking.
Registration information and agenda details can be found on the conference website as they are being finalized.
Registration is also available directly at the registration page.
We would like to thank our 2015 sponsors for the continued support enabling us to host such an exciting conference:
International Center for Futures and Derivatives at UIC
Revolution Analytics
MS-Computational Finance and Risk Management at University of Washington
Ketchum Trading
OneMarketData
RStudio
SYMMS
On behalf of the committee and sponsors, we look forward to seeing you in Chicago!
For the program committee:
See you in Chicago in May!
What the world needs right now is
Given the Warren Center's focus, the workshop centered around Big Data and Open Science with R. Yihui Xie and I alternated on delivering four units on an Introduction to R, Writing R packages, Dynamic Documents with R, and HPC with Rcpp and RcppArmadillo.
So I had to come up with a plan for teaching R CMD ... commands and only then switching to taking advantage of an environment such as the RStudio IDE.
The resulting slides are now available on my presentations page. The code examples are in a repo subdirectory on GitHub as well. While both were designed to support the parallel live instruction offered in the workshop, I would be interested in feedback (preferably via email) about how useful the slides are by themselves.
Earlier today, Josh sent the text below to the R-SIG-Finance list, and I updated the R/Finance website, including its Call for Papers page, accordingly.
We are once again very excited about our conference, thrilled about the four confirmed keynotes, and hope that many R / Finance users will not only join us in Chicago in May 2015 -- but also submit an exciting proposal.
So read on below, and see you in Chicago in May!
R/Finance 2015: Applied Finance with R
May 29 and 30, 2015
University of Illinois at Chicago, IL, USA
The seventh annual R/Finance conference for applied finance using R will be held on May 29 and 30, 2015 in Chicago, IL, USA at the University of Illinois at Chicago. The conference will cover topics including portfolio management, time series analysis, advanced risk tools, high-performance computing, market microstructure, and econometrics. All will be discussed within the context of using R as a primary tool for financial risk management, portfolio construction, and trading.
Over the past six years, R/Finance has included attendees from around the world. It has featured presentations from prominent academics and practitioners, and we anticipate another exciting line-up for 2015. This year will include invited keynote presentations by Emanuel Derman, Louis Marascio, Alexander McNeil, and Rishi Narang.
We invite you to submit complete papers in pdf format for consideration. We will also consider one-page abstracts (in txt or pdf format) although more complete papers are preferred. We welcome submissions for both full talks and abbreviated "lightning talks." Both academic and practitioner proposals related to R are encouraged.
All slides will be made publicly available at conference time. Presenters are strongly encouraged to provide working R code to accompany the slides. Data sets should also be made public for the purposes of reproducibility (though we realize this may be limited due to contracts with data vendors). Preference may be given to presenters who have released R packages.
The conference will award two (or more) $1000 prizes for best papers. A submission must be a full paper to be eligible for a best paper award. Extended abstracts, even if a full paper is provided by conference time, are not eligible for a best paper award. Financial assistance for travel and accommodation may be available to presenters, however requests must be made at the time of submission. Assistance will be granted at the discretion of the conference committee.
Please make your submission online at this link. The submission deadline is January 31, 2015. Submitters will be notified via email by February 28, 2015 of acceptance, presentation length, and financial assistance (if requested).
Additional details will be announced via the R/Finance conference website as they become available. Information on previous years' presenters and their presentations is also at the conference website.
For the program committee:
Gib Bassett, Peter Carl, Dirk Eddelbuettel, Brian Peterson, Dale Rosenthal,
Jeffrey Ryan, Joshua Ulrich
Earlier this evening I gave a short talk about R and Docker at the September Meetup of the Docker Chicago group.
Thanks to Karl Grzeszczak for setting up the meeting, and for providing a pretty thorough intro talk regarding CoreOS and Docker.
My slides are now up on my presentations page.
http://www.RinFinance.com/agenda/
Registration information is available at
http://www.RinFinance.com/register/
and can also be directly accessed by going to
http://www.regonline.com/RFinance2014
We would like to thank our 2014 Sponsors for the continued support enabling us to host such an exciting conference:
International Center for Futures and Derivatives at UIC
Revolution Analytics
MS-Computational Finance at University of Washington
Some participants, myself included, had already posted on their personal websites (though had forgotten to mention it here). In any event, I just updated the website with links to the pdf (or ppt) slides of all presenters who shared their material with us. Supplemental material may be made available too at a later date.
We hope you find these slides useful. Please do spread the word about the R/Finance conference as we expect to have a sixth edition in May 2014---and we do look forward to receiving even more outstanding submissions. Dates, details, call for papers, etc will be forthcoming over the next few months.
Now open for registrations:
R / Finance 2013: Applied Finance with R
May 17 and 18, 2013
Chicago, IL, USA
The registration for R/Finance 2013 -- which will take place May 17 and 18 in Chicago -- is NOW OPEN!
Building on the success of the previous conferences in 2009, 2010, 2011 and 2012, we expect more than 250 attendees from around the world. R users from industry, academia, and government will be joining 30+ presenters covering all areas of finance with R.
We are very excited about the four keynotes by Sanjiv Das, Attilio Meucci, Ryan Sheftel and Ruey Tsay. The main agenda (currently) includes seventeen full presentations and fifteen shorter "lightning talks". We are also excited to offer five optional pre-conference seminars on Friday morning.
To celebrate the fifth year of the conference in style, the dinner will be held at The Terrace of the Trump Hotel. Overlooking the Chicago river and skyline, it is a perfect venue to continue conversations while dining and drinking.
More details of the agenda are available at:
http://www.RinFinance.com/agenda/
Registration information is available at
http://www.RinFinance.com/register/
and can also be directly accessed by going to
http://www.regonline.com/RFinance2013
We would like to thank our 2013 Sponsors for the continued support enabling us to host such an exciting conference:
International Center for Futures and Derivatives at UIC
Revolution Analytics
MS-Computational Finance at University of Washington
lemnica
OpenGamma
OneMarketData
RStudio
On behalf of the committee and sponsors, we look forward to seeing you in Chicago!
Gib Bassett, Peter Carl, Dirk Eddelbuettel, Brian Peterson,
Dale Rosenthal, Jeffrey Ryan, Joshua Ulrich
See you in Chicago in May!!
Call for Papers:
R/Finance 2013: Applied Finance with R
May 17 and 18, 2013
University of Illinois, Chicago, IL, USA
The fifth annual R/Finance conference for applied finance using R will be held on May 17 and 18, 2013 in Chicago, IL, USA at the University of Illinois at Chicago. The conference is expected to cover topics including portfolio management, time series analysis, advanced risk tools, high-performance computing, market microstructure, and econometrics. All will be discussed within the context of using R as a primary tool for financial risk management, portfolio construction, and trading.
Over the past four years, R/Finance has included attendees from around the world. It featured presentations from prominent academics and practitioners, and we anticipate another exciting line-up for 2013.
We invite you to submit complete papers in pdf format for consideration. We will also consider one-page abstracts (in txt or pdf format) although more complete papers are preferred. We welcome submissions for full talks and abbreviated "lightning talks". Both academic and practitioner proposals related to R are encouraged.
Presenters are strongly encouraged to provide working R code to accompany the presentation/paper. Data sets should also be made public for the purposes of reproducibility (though we realize this may be limited due to contracts with data vendors). Preference may be given to presenters who have released R packages.
The conference will award two (or more) $1000 prizes for best papers. A submission must be a full paper to be eligible for a best paper award. Extended abstracts, even if a full paper is provided by conference time, are not eligible for a best paper award. Financial assistance for travel and accommodation may be available to presenters at the discretion of the conference committee. Requests for assistance should be made at the time of submission.
Please send submissions to: committee at RinFinance.com. The submission deadline is February 15, 2013. Submitters will be notified of acceptance via email by February 28, 2013. Notification of whether a presentation will be a long presentation or a lightning talk will also be made at that time.
Additional details will be announced at this website as they become available. Information on previous years' presenters and their presentations is also at the conference website.
For the program committee:
Gib Bassett, Peter Carl, Dirk Eddelbuettel, Brian Peterson,
Dale Rosenthal, Jeffrey Ryan, Joshua Ulrich
So see you in Chicago in May!
Now open for registrations:
R / Finance 2012: Applied Finance with R
May 11 and 12, 2012
Chicago, IL, USA
The registration for R/Finance 2012 -- which will take place May 11 and 12 in Chicago -- is NOW OPEN!
Building on the success of the three previous conferences in 2009, 2010, and 2011, we expect more than 250 attendees from around the world. R users from industry, academia, and government will join 40+ presenters covering all areas of finance with R.
This year's conference will start earlier in the day on Friday, to accommodate the tremendous line up of speakers for 2012, as well as to provide more time between talks for networking.
We are very excited about the four keynotes by Paul Gilbert, Blair Hull, Rob McCulloch, and Simon Urbanek. The main agenda includes nineteen full presentations and eighteen shorter "lightning talks". We are also excited to offer six optional pre-conference seminars on Friday morning.
Once again, we are hosting the R/Finance conference dinner on Friday evening, where you can continue conversations while dining and drinking atop a West Loop restaurant overlooking the Chicago skyline.
More details of the agenda are available at:
http://www.RinFinance.com/agenda/
Registration information is available at
http://www.RinFinance.com/register/
and can also be directly accessed by going to
http://www.regonline.com/RFinance2012
On behalf of the committee and sponsors, we look forward to seeing you in Chicago!
Gib Bassett, Peter Carl, Dirk Eddelbuettel, Brian Peterson,
Dale Rosenthal, Jeffrey Ryan, Joshua Ulrich
Our 2012 Sponsors:
International Center for Futures and Derivatives at UIC
Revolution Analytics
Sybase
MS-Computational Finance at University of Washington
lemnica
OpenGamma
OneTick
RStudio
Tick Data
See you in Chicago in May!!
Call for Papers:
R/Finance 2012: Applied Finance with R
May 11 and 12, 2012
University of Illinois, Chicago, IL, USA
The fourth annual R/Finance conference for applied finance using R will be held on May 11 and 12, 2012 in Chicago, IL, USA on the campus of the University of Illinois at Chicago. The two-day conference will cover topics including portfolio management, time series analysis, advanced risk tools, high-performance computing, market microstructure, and econometrics. All will be discussed within the context of using R as a primary tool for financial risk management, portfolio construction, and trading.
Over the past three years, R/Finance has included attendees from around the world and featured keynote presentations from prominent academics and practitioners. We anticipate another exciting line-up for 2012 --- including keynote presentations from Blair Hull, Paul Gilbert, Rob McCulloch, and Simon Urbanek.
We invite you to submit complete papers or one-page abstracts (in txt or pdf format) for consideration. Academic and practitioner proposals related to R are encouraged. We welcome submissions for full talks, abbreviated "lightning talks", and for a limited number of (longer) pre-conference seminar sessions.
Presenters are strongly encouraged to provide working R code to accompany the presentation/paper. Data sets should also be made public for the purposes of reproducibility (though we realize this may be limited due to contracts with data vendors). Preference may be given to presenters who have released R packages.
Travel and accommodation grants may be available for selected presenters at the discretion of the committee. In addition, the conference will award prizes for best papers. To be eligible for a best paper award, a submission must be a full paper. Extended abstracts, even if a full paper by conference time, are not eligible for a best paper award.
Please send submissions to: committee at RinFinance.com.
The submission deadline is January 31, 2012. Submitters will be notified of acceptance via email by February 28, 2012. Notification of whether a presentation will be a long presentation or a lightning talk will also be made at that time.
Additional details will be announced at this website as they become available. Information on previous years' presenters and their presentations is also available at the conference website for R/Finance 2009, 2010 and 2011.
For the program committee:
Gib Bassett, Peter Carl, Dirk Eddelbuettel, Brian Peterson,
Dale Rosenthal, Jeffrey Ryan, Joshua Ulrich
So see you in Chicago in May!
Update: Corrected urls to past conference thanks to heads-up by Josh. Thanks!
The organizing committee for the R/Finance 2011 conference is pleased to announce the availability of presentation slides from the 3rd annual R/Finance conference. This year's two-day conference once again attracted over 200 participants from across the globe. Academics, students and industry professionals enjoyed almost 30 talks covering trading, optimization, risk management and more --- all using R!
The majority of these presentations are now available for download at:
http://www.RinFinance.com/agenda/
This year we began offering prizes for the best paper submissions. The 2011 recipients are Robert Gramacy (University of Chicago) and David Matteson (Cornell University) who each won USD 1000. Also new was a graduate student travel award: Mikko Niemenmaa (Aalto University) and Clément Dunand-Châtellet (École Polytechnique) each received USD 500.
With this, the organizing committee would like to thank our lead conference sponsors, the International Center for Futures and Derivatives at UIC and Revolution Analytics, as well as our conference sponsors OneMarketData, RStudio and Lemnica for their continued support.
The organising committee would also like to thank all of the presenters and participants for making R/Finance 2011 so successful. We look forward to seeing you in 2012, with the prospective dates of May 17 - 19 to be confirmed.
For the organizing committee,
Gib Bassett, Peter Carl, Dirk Eddelbuettel, Brian Peterson,
Dale Rosenthal, Jeffrey Ryan, Joshua Ulrich
Enjoy!
o Package compiler is now provided as a standard package. See ?compiler::compile for information on how to use the compiler. This package implements a byte code compiler for R: by default the compiler is not used in this release. See the 'R Installation and Administration Manual' for how to compile the base and recommended packages.
so the compiler is not yet used for R's own base and recommended packages.
While working on my slides for the upcoming Rcpp workshop preceding R/Finance 2011, I thought of a nice test example to illustrate the compiler. Last summer and fall, Radford Neal had sent a very nice, long and detailed list of possible patches for R performance improvements to the development list. Some patches were integrated and some more discussion ensued. One strand was on the difference in parsing between normal parens and curly braces. In isolation, these seem too large, but (as I recall) this is due to some things 'deep down' in the parser.
However, some folks really harped on this topic. And it just doesn't die, as a post from last week demonstrated once more. Last year, Christian Robert had whittled it down to a set of functions, and I made a somewhat sarcastic post arguing that I'd rather use Rcpp to get an 80-fold speed increase than spend my time arguing over ten percent changes in code that could be made faster so easily.
So let us now examine what the compiler package can do for us. The starting point is the same as last year: five variants of computing 1/(1+x) for a scalar x inside an explicit for loop. Real code would never do it this way as vectorisation comes to the rescue. But for (Christian's) argument's sake, it is useful to highlight differences in the parser. We once again use the nice rbenchmark package to run, time and summarise alternatives:
> ## cf http://dirk.eddelbuettel.com/blog/2010/09/07#straight_curly_or_compiled
> f <- function(n, x=1) for (i in 1:n) x=1/(1+x)
> g <- function(n, x=1) for (i in 1:n) x=(1/(1+x))
> h <- function(n, x=1) for (i in 1:n) x=(1+x)^(-1)
> j <- function(n, x=1) for (i in 1:n) x={1/{1+x}}
> k <- function(n, x=1) for (i in 1:n) x=1/{1+x}
> ## now load some tools
> library(rbenchmark)
> ## now run the benchmark
> N <- 1e6
> benchmark(f(N,1), g(N,1), h(N,1), j(N,1), k(N,1),
+           columns=c("test", "replications",
+           "elapsed", "relative"),
+           order="relative", replications=10)
     test replications elapsed relative
5 k(N, 1)           10   9.764  1.00000
1 f(N, 1)           10   9.998  1.02397
4 j(N, 1)           10  11.019  1.12853
2 g(N, 1)           10  11.822  1.21077
3 h(N, 1)           10  14.560  1.49119
This replicates Christian's result. We find that function k() is the fastest using curlies, and that explicit exponentiation in function h() is the slowest with a relative penalty of 49%, or an absolute difference of almost five seconds between the 9.7 for the winner and 14.6 for the worst variant. On the other hand, function f(), the normal way of writing things, does pretty well.
So what happens when we throw the compiler into the mix? Let's first create compiled variants using the new cmpfun() function and then try again:
> ## R 2.13.0 brings this toy
> library(compiler)
> lf <- cmpfun(f)
> lg <- cmpfun(g)
> lh <- cmpfun(h)
> lj <- cmpfun(j)
> lk <- cmpfun(k)
> # now run the benchmark
> N <- 1e6
> benchmark(f(N,1), g(N,1), h(N,1), j(N,1), k(N,1),
+           lf(N,1), lg(N,1), lh(N,1), lj(N,1), lk(N,1),
+           columns=c("test", "replications",
+           "elapsed", "relative"),
+           order="relative", replications=10)
       test replications elapsed relative
9  lj(N, 1)           10   2.971  1.00000
10 lk(N, 1)           10   2.980  1.00303
6  lf(N, 1)           10   2.998  1.00909
7  lg(N, 1)           10   3.007  1.01212
8  lh(N, 1)           10   4.024  1.35443
1   f(N, 1)           10   9.479  3.19051
5   k(N, 1)           10   9.526  3.20633
4   j(N, 1)           10  10.775  3.62673
2   g(N, 1)           10  11.299  3.80310
3   h(N, 1)           10  14.049  4.72871
Now things have gotten interesting and substantially faster, for very little cost. Usage is straightforward: take your function and compile it, and reap more than a threefold speed gain. Not bad at all. Also of note, the difference between the different expressions essentially vanishes. The explicit exponentiation is still the loser, but there may be an additional explicit function call involved.
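As a small sanity check that compiling changes speed and not results, here is a sketch using a variant of f() that returns x. The name f2 is an assumption made for this example; the functions in the post return NULL, the value of the for loop, so their results cannot be compared directly.

```r
library(compiler)  # standard package since R 2.13.0

# Variant of f() that returns x after the loop (f2 is a made-up name)
f2 <- function(n, x = 1) {
  for (i in seq_len(n)) x <- 1 / (1 + x)
  x
}

lf2 <- cmpfun(f2)  # byte-compiled copy of f2

# Interpreted and compiled versions should agree on the value
same <- isTRUE(all.equal(f2(50), lf2(50)))

# The iteration x -> 1/(1+x) settles at the fixed point (sqrt(5) - 1) / 2
at_fixed_point <- isTRUE(all.equal(lf2(50), (sqrt(5) - 1) / 2))

print(c(same = same, at_fixed_point = at_fixed_point))
```

Note that recent R versions enable just-in-time byte compilation by default, so the timing gap between f2 and lf2 seen with R 2.13.0 may not show up there.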
So we do see the new compiler as a potentially very useful addition. I am sure more folks will jump on this and run more tests to find clearer corner cases. To finish, we have to of course once more go back to Rcpp for some
> ## now with Rcpp and C++
> library(inline)
> ## and define our version in C++
> src <- 'int n = as<int>(ns);
+         double x = as<double>(xs);
+         for (int i=0; i<n; i++) x=1/(1+x);
+         return wrap(x); '
> l <- cxxfunction(signature(ns="integer",
+                            xs="numeric"),
+                  body=src, plugin="Rcpp")
> ## now run the benchmark again
> benchmark(f(N,1), g(N,1), h(N,1), j(N,1), k(N,1),
+           l(N,1),
+           lf(N,1), lg(N,1), lh(N,1), lj(N,1), lk(N,1),
+           columns=c("test", "replications",
+           "elapsed", "relative"),
+           order="relative", replications=10)
       test replications elapsed relative
6   l(N, 1)           10   0.120   1.0000
11 lk(N, 1)           10   2.961  24.6750
7  lf(N, 1)           10   3.128  26.0667
8  lg(N, 1)           10   3.140  26.1667
10 lj(N, 1)           10   3.161  26.3417
9  lh(N, 1)           10   4.212  35.1000
5   k(N, 1)           10   9.500  79.1667
1   f(N, 1)           10   9.621  80.1750
4   j(N, 1)           10  10.868  90.5667
2   g(N, 1)           10  11.409  95.0750
3   h(N, 1)           10  14.077 117.3083
Rcpp still shoots the lights out by a factor of 80 (or even almost 120 to the worst manual implementation) relative to interpreted code. Relative to the compiled byte code, the speed difference is about 25-fold. Now, these are of course entirely unrealistic code examples that are in no way, shape or form representative of real R work. Effective speed gains will be smaller for both the (pretty exciting new) compiler package and also for our C++ integration package Rcpp.
Before I close, two more public service announcements. First, if you use Ubuntu see this post by Michael on r-sig-debian announcing his implementation of a suggestion of mine: we now have R alpha/beta/rc builds via his Launchpad PPA. Last Friday, I had the current R-rc snapshot of R 2.13.0 on my Ubuntu box only about six hours after I (as Debian maintainer for R) uploaded the underlying new R-rc package build to Debian unstable. This will be nice for testing of upcoming releases. Second, as I mentioned, the Rcpp workshop on April 28 preceding R/Finance 2011 on April 29 and 30 still has a few slots available, as has the conference itself.
One week ago, I sent the updated announcement below to the r-sig-finance list; this was kindly blogged about by fellow committee member Josh and by our pal Dave @ REvo. By now, I have also updated the R / Finance conference website. So to round things off, a quick post here is in order as well. It may even get a few of the esteemed readers to make a New Year's resolution about submitting a paper :)
Dear R / Finance community,
The preparations for R/Finance 2011 are progressing, and due to favourable responses from the different sponsors we contacted, we are now able to offer
More details are below in the updated Call for Papers. Please feel free to re-circulate this Call for Papers with colleagues, students and other associations.
Cheers, and Season's Greetings,
Dirk (on behalf of the organizing / program committee)
The third annual R/Finance conference for applied finance using R will be held this spring in Chicago, IL, USA on April 29 and 30, 2011. The two-day conference will cover topics including portfolio management, time series analysis, advanced risk tools, high-performance computing, market microstructure and econometrics. All will be discussed within the context of using R as a primary tool for financial risk management, portfolio construction, and trading.
Complete papers or one-page abstracts (in txt or pdf format) are invited to be submitted for consideration. Academic and practitioner proposals related to R are encouraged. We welcome submissions for full talks, abbreviated lightning talks, and for a limited number of pre-conference (longer) seminar sessions.
Presenters are strongly encouraged to provide working R code to accompany the presentation/paper. Data sets should also be made public for the purposes of reproducibility (though we realize this may be limited due to contracts with data vendors). Preference may be given to presenters who have released R packages.
The conference will award two $1000 prizes for best paper: one for best practitioner-oriented paper and one for best academic-oriented paper. Further, to defray costs for graduate students, two travel and expense grants of up to $500 each will be awarded to graduate students whose papers are accepted. To be eligible, a submission must be a full paper; extended abstracts are not eligible.
Please send submissions to: committee at RinFinance.com
The submission deadline is February 15th, 2011. Early submissions may receive early acceptance and scheduling. The graduate student grant winners will be notified by February 23rd, 2011.
Submissions will be evaluated and submitters notified via email on a rolling basis. Determination of whether a presentation will be a long presentation or a lightning talk will be made once the full list of presenters is known.
R/Finance 2009 and 2010 included attendees from around the world and featured keynote presentations from prominent academics and practitioners. The 2009-2010 presenters' names and presentations are online at the conference website. We anticipate another exciting line-up for 2011---including keynote presentations from John Bollinger, Mebane Faber, Stefano Iacus, and Louis Kates. Additional details will be announced via the conference website as they become available.
For the program committee:
Gib Bassett, Peter Carl, Dirk Eddelbuettel, Brian Peterson,
Dale Rosenthal, Jeffrey Ryan, Joshua Ulrich
The user group meetings have a meme of showing how to use R with different editors, UIs, IDEs,... It started with a presentation on Eclipse and its StatET plugin. So a while ago I had offered to present on ESS, the wonderful Emacs mode for R (as well as SAS, Stata, BUGS, JAGS, ...). And now I owe a big thanks to the ESS Core team for keeping all their documentation, talks, papers etc in their SVN archive, and particularly to Stephen Eglen for putting the source code to Tony Rossini's tutorial from useR! 2006 in Vienna there. This allowed me to quickly whip up a few slides, though a good part of the presentation did involve a live demo missing from the slides. Again, big thanks to Tony for the old slides and to Stephen for making them accessible when I mentioned the idea of this talk a while back -- it allowed me to put this together on short notice.
And for those going to useR! 2011 in Warwick next summer, Stephen will present a full three-hour ESS tutorial which will cover ESS in much more detail.
A video recording of our ninety-minute talk is already available via the YouTube channel for Google Tech Talks. The (large) pdf with slides (which Romain had already posted on slideshare) is also available from my presentations page.
The remainder of the weekend was nice too (with the notable exception of the extremely sucky weather). We got to spend some time at the Google Summer of Code Mentor Summit, which is always a fun event and a great way to meet other open source folks in person. And we also took one afternoon off to spend some time with John Chambers discussing further work involving Rcpp and the new ReferenceClasses that appeared in the just-released R version 2.12.0. This should be a nice avenue to further integrate R and C++ in the near future.
Call for Papers:
R/Finance 2011: Applied Finance with R
April 29 and 30, 2011
Chicago, IL, USA
The third annual R/Finance conference for applied finance using R will be held this spring in Chicago, IL, USA on April 29 and 30, 2011. The two-day conference will cover topics including portfolio management, time series analysis, advanced risk tools, high-performance computing, market microstructure and econometrics. All will be discussed within the context of using R as a primary tool for financial risk management, portfolio construction, and trading.
One-page abstracts or complete papers (in txt or pdf format) are invited to be submitted for consideration. Academic and practitioner proposals related to R are encouraged. We welcome submissions for full talks, abbreviated "lightning talks", and for a limited number of pre-conference (longer) seminar sessions.
Presenters are strongly encouraged to provide working R code to accompany the presentation/paper. Data sets should also be made public for the purposes of reproducibility (though we realize this may be limited due to contracts with data vendors). Preference may be given to presenters who have released R packages.
Please send submissions to: committee at RinFinance.com.
The submission deadline is February 15th, 2011. Early submissions may receive early acceptance and scheduling.
Submissions will be evaluated and submitters notified via email on a rolling basis. Determination of whether a presentation will be a long presentation or a lightning talk will be made once the full list of presenters is known.
R/Finance 2009 and 2010 included attendees from around the world and featured keynote presentations from prominent academics and practitioners. The 2009-2010 presenters' names and presentations are online at the conference website. We anticipate another exciting line-up for 2011 including keynote presentations from John Bollinger, Mebane Faber, Stefano Iacus, and Louis Kates. Additional details will be announced via the conference website as they become available.
For the program committee:
Gib Bassett, Peter Carl, Dirk Eddelbuettel, Brian Peterson,
Dale Rosenthal, Jeffrey Ryan, Joshua Ulrich
So see you in Chicago in April!
Now, let me prefix this by saying that I really enjoyed Radford's posts. He obviously put a lot of time into finding a number of (all somewhat small in isolation) inefficiencies in R which, when taken together, can make a difference in performance. I already spotted one commit by Duncan in the SVN logs for R so this is being looked at.
Yet Christian, on the other hand, goes a little overboard in bemoaning performance differences somewhere between ten and fifteen percent -- the difference between curly and straight braces (as noticed in Radford's first post). Maybe he spent too much time waiting for his MCMC runs to finish to realize the obvious: compiled code is evidently much faster.
And before everybody goes and moans and groans that that is hard, allow me to just interject and note that it is not. It really doesn't have to be. Here is a quick cleaned up version of Christian's example code, with proper assignment operators and a second variable x. We then get to the meat and potatoes and load our Rcpp package as well as inline to define the same little test function in C++. Throw in rbenchmark, which I am becoming increasingly fond of for these little timing tests, et voila, we have ourselves a horserace:
# Xian's code, using <- for assignments and passing x down
f <- function(n, x=1) for (i in 1:n) x=1/(1+x)
g <- function(n, x=1) for (i in 1:n) x=(1/(1+x))
h <- function(n, x=1) for (i in 1:n) x=(1+x)^(-1)
j <- function(n, x=1) for (i in 1:n) x={1/{1+x}}
k <- function(n, x=1) for (i in 1:n) x=1/{1+x}
# now load some tools
library(Rcpp)
library(inline)
# and define our version in C++
l <- cxxfunction(signature(ns="integer", xs="numeric"),
                 'int n = as<int>(ns); double x=as<double>(xs);
                  for (int i=0; i<n; i++) x=1/(1+x);
                  return wrap(x); ',
                 plugin="Rcpp")
# more tools
library(rbenchmark)
# now run the benchmark
N <- 1e6
benchmark(f(N, 1), g(N, 1), h(N, 1), j(N, 1), k(N, 1), l(N, 1),
          columns=c("test", "replications", "elapsed", "relative"),
          order="relative", replications=10)
And how does it do? Well, glad you asked. On my i7, with the other three cores standing around and watching, we get an eighty-fold increase relative to the best interpreted version:
/tmp$ Rscript xian.R
Loading required package: methods
     test replications elapsed relative
6 l(N, 1)           10   0.122    1.000
5 k(N, 1)           10   9.880   80.984
1 f(N, 1)           10   9.978   81.787
4 j(N, 1)           10  11.293   92.566
2 g(N, 1)           10  12.027   98.582
3 h(N, 1)           10  15.372  126.000
/tmp$
So do we really want to spend time arguing about the ten and fifteen percent differences? Moore's law gets you those gains in a couple of weeks anyway. I'd much rather have a conversation about how we can get people speed increases that are orders of magnitude, not fractions. Rcpp is one such tool. Let's get more of them.
As at the preceding useR! 2008 in Dortmund and useR! 2009 in Rennes, I presented a three-hour tutorial on high-performance computing with R. This covers scripting/automation, profiling, vectorisation, interfacing compiled code, parallel computing and large-memory approaches. The slides, as well as a condensed 2-up version, are now on my presentations page.
On Wednesday, Romain and I had a chance to talk about recent work on Rcpp, our R and C++ integration. Thursday, we followed up with a presentation on RProtoBuf -- a project integrating Google's Protocol Buffers with R which much to our delight already seems to be in use at Google itself! It was quite fun to do these two talks jointly with Romain. But my other coauthor Khanh had to be at a conference related to his actual PhD work. So on Friday it was just me to give a presentation about RQuantLib which brings QuantLib to R.
Slides from all these talks have now been added to my presentations page. I will also upload them via the conference form so that they can be part of the conference's collection of presentations which should be forthcoming.
On Friday, I also gave an informal lecture / tutorial / workshop to some of the Stats and Finance Ph.D. students, drawing largely from the section on parallel computing of the most recent Introduction to High-Performance Computing with R tutorial.
My sincere thanks to Kurt Hornik and Stefan Theussl for the invite -- it was a great trip, notwithstanding the mostly unseasonally cold and wet weather.
As a co-organizer, it was a great pleasure to see so many users of R in Finance---from both industry and academia---come to Chicago to discuss and share recent work. There is a lot going on, and it is always good to exchange ideas with others sharing the same infrastructure. Participants appeared to enjoy the conference. My thanks to everybody who helped to put it together, from the local committee to the helping hands at UIC and of course the sponsors.
I just put my slides from the Extending and Embedding R with C++ tutorial preceding the conference, as well as the RQuantLib: Interfacing QuantLib from R presentation (with Khanh Nguyen), up onto my presentations page. I do have a usb-drive with all conference presentations and will provide them via the R / Finance site in a few days.
The only truly sour note is the fact that several presenters from Europe had their travel schedules turned upside down by the disruption to international air travel caused by the Icelandic volcano eruption and the resulting ash clouds. While we are glad to have had them for a little longer in Chicago, we understand that they are getting eager to return home. I hope this extended stay in the Windy City does not take away from the overall usefulness of the trip.
Thanks also to David Smith (at the REvolutions blog) and Drew Conway (at his blog) for spreading the word about the presentation video and slides -- quite a few folks have come to my presentations page to get them.
The talks centered around R and C++ integration using both Rcpp and RInside and summarise where both projects stand after all the recent work Romain and I put in over the last few months. The presentations went fairly well; I received some favourable comments.
Szilard and the R User Group had also suggested a group discussion about CRAN, its growth and how to maximise its usefulness. Given my CRANberries feed, my work on the CRAN Task Views for Empirical Finance and High-Performance Computing with R as well as our cran2deb binary package generator, I had some views and ideas that helped frame the discussion, which turned out to be very useful and informed. So maybe we should do this User Group thing in Chicago too!
Special thanks to Jan de Leeuw and Szilard Pafka for organising the meeting, talks and discussion.
But what everybody seems to be forgetting is that R has had a Sudoku solver for years, thanks to the sudoku package by David Brahm and Greg Snow which was first posted four years ago. What comes around, goes around.
With that, and about one minute of Emacs editing to get the Le Monde puzzle into the required ascii-art form, all we need to do is this:
R> library(sudoku)
R> s <- readSudoku("/tmp/sudoku.txt")
R> s
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
 [1,]    8    0    0    0    0    1    2    0    0
 [2,]    0    7    5    0    0    0    0    0    0
 [3,]    0    0    0    0    5    0    0    6    4
 [4,]    0    0    7    0    0    0    0    0    6
 [5,]    9    0    0    7    0    0    0    0    0
 [6,]    5    2    0    0    0    9    0    4    7
 [7,]    2    3    1    0    0    0    0    0    0
 [8,]    0    0    6    0    2    0    1    0    9
 [9,]    0    0    0    0    0    0    0    0    0
R> system.time(solveSudoku(s))
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
 [1,]    8    4    9    6    7    1    2    5    3
 [2,]    6    7    5    2    4    3    9    1    8
 [3,]    3    1    2    9    5    8    7    6    4
 [4,]    1    8    7    4    3    2    5    9    6
 [5,]    9    6    4    7    8    5    3    2    1
 [6,]    5    2    3    1    6    9    8    4    7
 [7,]    2    3    1    8    9    4    6    7    5
 [8,]    4    5    6    3    2    7    1    8    9
 [9,]    7    9    8    5    1    6    4    3    2
   user  system elapsed
  5.288   0.004   5.951
R>
That took all of five seconds while my computer was also compiling a particularly resource-hungry C++ package....
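The printed solution can be double-checked with a few lines of base R, without the sudoku package. This is a sketch only: the helper is_valid_sudoku is a made-up name and not part of that package, and the two matrices are typed in from the output above.

```r
# Puzzle and solution as printed in the post, entered row by row
puzzle <- matrix(c(8,0,0,0,0,1,2,0,0,
                   0,7,5,0,0,0,0,0,0,
                   0,0,0,0,5,0,0,6,4,
                   0,0,7,0,0,0,0,0,6,
                   9,0,0,7,0,0,0,0,0,
                   5,2,0,0,0,9,0,4,7,
                   2,3,1,0,0,0,0,0,0,
                   0,0,6,0,2,0,1,0,9,
                   0,0,0,0,0,0,0,0,0), nrow = 9, byrow = TRUE)
sol <- matrix(c(8,4,9,6,7,1,2,5,3,
                6,7,5,2,4,3,9,1,8,
                3,1,2,9,5,8,7,6,4,
                1,8,7,4,3,2,5,9,6,
                9,6,4,7,8,5,3,2,1,
                5,2,3,1,6,9,8,4,7,
                2,3,1,8,9,4,6,7,5,
                4,5,6,3,2,7,1,8,9,
                7,9,8,5,1,6,4,3,2), nrow = 9, byrow = TRUE)

# Hypothetical helper: TRUE if every row, column and 3x3 block holds 1..9
is_valid_sudoku <- function(m) {
  ok <- function(v) identical(sort(as.integer(v)), 1:9)
  rows <- all(apply(m, 1, ok))
  cols <- all(apply(m, 2, ok))
  blocks <- all(vapply(0:8, function(b) {
    r  <- 3 * (b %/% 3) + 1:3   # rows of block b
    cc <- 3 * (b %% 3) + 1:3    # columns of block b
    ok(m[r, cc])
  }, logical(1)))
  rows && cols && blocks
}

valid <- is_valid_sudoku(sol)
# The solution must also keep every given (non-zero) cell of the puzzle
keeps_givens <- all(sol[puzzle != 0] == puzzle[puzzle != 0])
print(c(valid = valid, keeps_givens = keeps_givens))
```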
Just in case we needed another illustration that it is hard to navigate the riches and wonders that is CRAN...
Now open for registrations:
R / Finance 2010: Applied Finance with R
April 16 and 17, 2010
Chicago, IL, USA
The second annual R / Finance conference for applied finance using R, the premier free software system for statistical computation and graphics, will be held this spring in Chicago, IL, USA on Friday April 16 and Saturday April 17.
Building on the success of the inaugural R / Finance 2009 event, this two-day conference will cover topics as diverse as portfolio theory, time-series analysis, as well as advanced risk tools, high-performance computing, and econometrics. All will be discussed within the context of using R as a primary tool for financial risk management and trading.
Invited keynote presentations by Bernhard Pfaff, Ralph Vince, Mark Wildi and Achim Zeileis are complemented by over twenty talks (both full-length and 'lightning') selected from the submissions. Four optional tutorials are also offered on Friday April 16.
R / Finance 2010 is organized by a local group of R package authors and community contributors, and hosted by the International Center for Futures and Derivatives (ICFD) at the University of Illinois at Chicago.
Conference registration is now open. Special advanced registration pricing is available, as well as discounted pricing for academic and student registrations.
More details and registration information can be found at the website at
http://www.RinFinance.com
For the program committee:
Gib Bassett, Peter Carl, Dirk Eddelbuettel, John Miller,
Brian Peterson, Dale Rosenthal, Jeffrey Ryan
See you in Chicago in April!
As mentioned yesterday, I spent a few days last week in Japan as I had an opportunity to present the Introduction to High-Performance Computing with R tutorial at the Institute for Statistical Mathematics in Tachikawa near Tokyo thanks to an invitation by Junji Nakano.
An updated version of the presentation slides (with a few typos corrected) is now available, as is a 2-up handout version. Compared to previous versions, and reflecting the fact that this was the 'all-day variant' of almost five hours of lectures, the following changes were made:
Comments and suggestions are, as always, appreciated.
So without further ado, and given the success of our initial R / Finance 2009 conference about R in Finance, here is the call for papers for next spring:
Call for Papers:
R/Finance 2010: Applied Finance with R
April 16 and 17, 2010
Chicago, IL, USA
The second annual R/Finance conference for applied finance using R will be held this spring in Chicago, IL, USA on April 16 and 17, 2010. The two-day conference will cover topics including portfolio management, time series analysis, advanced risk tools, high-performance computing, market microstructure and econometrics. All will be discussed within the context of using R as a primary tool for financial risk management and trading.
One-page abstracts or complete papers (in txt or pdf format) are invited for consideration. Academic and practitioner research proposals related to R are encouraged. We will accept submissions for full talks, abbreviated "lightning talks", and a limited number of pre-conference tutorial sessions. Please indicate with your submission if you would be willing to produce a formal paper (10-15 pages) for a peer-reviewed conference proceedings publication.
Presenters are strongly encouraged to provide working R code to accompany the presentation/paper. Data sets should also be made public for the purposes of reproducibility (though we realize this may be limited due to contracts with data vendors). Preference may be given to presenters who have released R packages.
Please send submissions to: committee at RinFinance.com
The submission deadline is December 31st, 2009.
Submissions will be evaluated and submitters notified via email on a rolling basis. Determination of whether a presentation will be a long presentation or a lightning talk will be made once the full list of presenters is known.
R/Finance 2009 included keynote presentations by Patrick Burns, Robert Grossman, David Kane, Roger Koenker, David Ruppert, Diethelm Wuertz, and Eric Zivot. Attendees included practitioners, academics, and government officials. We anticipate another exciting line-up for 2010 and will announce details at the conference website http://www.RinFinance.com as they become available.
For the program committee:
Gib Bassett, Peter Carl, Dirk Eddelbuettel, John Miller,
Brian Peterson, Dale Rosenthal, Jeffrey Ryan
See you in Chicago in April!
One such case just happened a few minutes ago. The aforementioned Garmin Forerunner 405 can cooperate quite nicely with Linux using the gant reader for the ant wireless communication protocol between the usb hardware dongle and the Garmin 405. (Sources for gant are both this file and this git archive.) I had meant to blog about this tool and the resulting files one of these days anyway, but today I just want to mention that the default filenames created by the program were somewhat horrid, such as 20.09.2009 101112.TCX to denote the 20th of September of this year at 10:11h and 12 seconds. As we all know, filenames with spaces are bad for the environment as well as plain annoying. So I had made the simple change in the C sources to switch to a saner format such as 20090920-101112.TCX (and I see that the git archive now contains a similar fix). But that still left me with some 80+ files with the dreaded names.
There are of course many ways to skin this cat and to rename the files in bulk. However, I found the following four lines to be fairly succinct
#!/usr/bin/r
files <- dir(".", pattern=".*\\.TCX$")
res <- lapply(files, function(f) {
    pt <- strptime(f, "%d.%m.%Y %H%M%S.TCX")   # parsed time
    ft <- strftime(pt, "%Y%m%d-%H%M%S.TCX")    # formatted time
    file.rename(f, ft)
})
as they show, among other things,
Lastly, I do not mean to imply that Python or Perl or Ruby or (insert favourite tool here) cannot do it equally well. I simply meant to say that programmatically creating new filenames is definitely easier in R than it would have been in shell. And as an added bonus, we even get fully parsed time objects that I could have tested for. But then tests and documentation never get written on a Saturday.
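The name conversion in the renaming script can be tried on plain strings without touching any files. In this sketch the helper new_name is a made-up name, and fixing tz = "UTC" is an assumption added so the result does not depend on the local time zone; the two format strings are the ones from the script.

```r
# Hypothetical helper wrapping the parse and format steps from the script
new_name <- function(f) {
  pt <- strptime(f, "%d.%m.%Y %H%M%S.TCX", tz = "UTC")  # parsed time
  if (is.na(pt)) return(NA_character_)                  # name did not match
  format(pt, "%Y%m%d-%H%M%S.TCX")                       # formatted time
}

converted <- new_name("20.09.2009 101112.TCX")
unmatched <- new_name("README.txt")
print(converted)
print(unmatched)
```

Returning NA for names that do not parse is one way to use the parsed time object as the test mentioned above: a real run could skip file.rename() for those entries.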
This is essentially a '2.0' version of earlier work with Steffen Moeller and David Vernazobres which we had presented in 2007. Then, the approach was top-down and monolithic which started to show its limits. This time, the idea was to borrow the successful bottom-up approach of my CRANberries feed.
The bulk of the work was done by Charles Blundell as part of his Google Summer of Code 2008 project which I had suggested and mentored. After that project had concluded, we both felt we should continue with it and bring it to 'production'. The CRAN hosts provided us with a (virtual Xen) machine to build on, and we are now ready to more publicly announce the availability of the repositories for i386 and amd64:
deb http://debian.cran.r-project.org/cran2deb/debian-i386 testing/
and
deb http://debian.cran.r-project.org/cran2deb/debian-amd64 testing/
A few more details are provided in our presentation slides. We look forward to hearing from folks using them; the r-sig-debian list may be a good venue for this.
As last year (and again at the BoC in December), I presented a three-hour tutorial on high-performance computing with R. This covers profiling, vectorisation, interfacing compiled code, debugging, parallel computing, as well as scripting and automation. Slides, and a 2-up version, are now on my presentations page.
I also gave two regular conference presentations. The first was on my Rcpp and RInside packages which facilitate interfacing R and C++. The second talk, based on joint work with Charles Blundell, describes our cran2deb system for creating Debian packages of essentially all CRAN packages. I will try to follow up on this with another post. Slides from these talks are also on my presentations page.
Speaking of broken, I had neither noticed that this R version now returns an additional field (for the repository) in the per-package metadata via available.packages(), nor that this change had broken my oh-so-useful and increasingly popular CRANberries html and rss summaries of CRAN changes. So with the usual beta and rc releases of R 2.9.1 in Debian starting a week prior, CRANberries had been silent for six days from Friday the 21st to last Thursday. I rectified it once I noticed, and changed the code to no longer fall on its nose at that spot. Sorry for the few days without service.
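The post does not show the actual fix. One defensive pattern, assuming the breakage came from relying on the number or position of columns, is to select metadata fields by name. The sketch below uses a small hand-made matrix as a hypothetical stand-in for the result of available.packages(), so that it runs without network access; the package rows and the repository URL are made up.

```r
# Hypothetical stand-in for the character matrix returned by
# available.packages(); rows and URL are invented for illustration
pkgs <- matrix(c("zoo",  "1.5-8", "http://cran.example.org/src/contrib",
                 "Rcpp", "0.6.5", "http://cran.example.org/src/contrib"),
               nrow = 2, byrow = TRUE,
               dimnames = list(c("zoo", "Rcpp"),
                               c("Package", "Version", "Repository")))

# Select the fields we need by name, so that an extra column such as
# "Repository" appearing in a later R version does not matter
wanted <- c("Package", "Version")
missing_cols <- setdiff(wanted, colnames(pkgs))
if (length(missing_cols) > 0)
    stop("missing fields: ", paste(missing_cols, collapse = ", "))

core <- pkgs[, wanted, drop = FALSE]
print(core)
```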
This tutorial was a shorter format of just an hour which did not allow for any parallel computing with R. However, parallel computing with R via MPI, snow, nws, ... is covered in the slides from December's workshop at the BoC.
We were fortunate to get seven outstanding invited keynote speakers, as well as eleven excellent presentations. This was preceded by four short tutorials (and I'll post slides from my Introduction to High-Performance Computing with R soon). With about 150 registered participants, plus keynoters, presenters, committee members, representatives from the sponsors (a quick shout of Thanks! to them), some folks from UIC (especially Holly without whom few things would have happened), we were probably around 200 people gathered at UIC. And then there was an extended social program at Jaks which is rather appropriate as we had numerous important committee meetings there over the preceding months. All in all it seems like a successful event. We may even do it again.
I just posted my slides on my presentations page. The slides give a brief overview of R, the CRAN network and the by now over 1600 packages, mention the Finance Task View, briefly present four different packages (or package sets) and of course beat the drum for our upcoming R/Finance conference that will take place here in Chicago at the end of next month.
See you in Chicago in April!
Anyway, the reason for this post was that the R / kdb+ glue code works well ... but not for datetimes. I really like to be able to pass date/time objects natively between systems as easily as, say, numbers or strings (and see e.g. my Rcpp package for doing this with R and C++) and I was a bit annoyed when the millisecond timestamps didn't move smoothly. Turns out that the basic converter function in the code had a number of problems: it converted to integer, only covered a single scalar rather than vectorised mode, and erroneously reduced a reference count. A better version, in my view, is as follows:
static SEXP from_datetime_kobject(K x) {
    SEXP result;
    int i, length = x->n;
    if (scalar(x)) {
        result = PROTECT(allocVector(REALSXP, 1));
        REAL(result)[0] = (kF(x)[0] + 10957) * 86400;
    } else {
        result = PROTECT(allocVector(REALSXP, length));
        for(i = 0; i < length; i++) {
            REAL(result)[i] = (kF(x)[i] + 10957) * 86400;
        }
    }
    SEXP datetimeclass = PROTECT(allocVector(STRSXP,2));
    SET_STRING_ELT(datetimeclass, 0, mkChar("POSIXt"));
    SET_STRING_ELT(datetimeclass, 1, mkChar("POSIXct"));
    setAttrib(result, R_ClassSymbol, datetimeclass);
    UNPROTECT(2);
    return result;
}
This deals with vectors as well as scalars, converts Kdb's 'fractional days since Jan 1, 2000' to the Unix standard of seconds since the epoch -- including the R extension of fractional seconds -- and as importantly, sets the class attributes to POSIXt POSIXct as needed by R. With that, a simple select max datetime from table does just that, and vectors of timestamped records of trades or quotes or whatever also come with proper POSIXct behaviour into R. Note that it needs TZ to be set to UTC, though, or you get a timezone offset you may not want.
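The arithmetic in the converter can be checked from the R side. This is a small sketch, not part of the glue code: the function name kdb_days_to_posixct is made up for illustration, and the constant 10957 is simply the number of days between 1970-01-01 and 2000-01-01.

```r
# Days between the Unix epoch and the Kdb epoch of Jan 1, 2000
epoch_shift <- as.numeric(as.Date("2000-01-01"))

# Hypothetical helper mirroring the C code: fractional days since
# 2000-01-01 become seconds since 1970-01-01, returned as POSIXct
kdb_days_to_posixct <- function(days) {
    secs <- (days + 10957) * 86400
    as.POSIXct(secs, origin = "1970-01-01", tz = "UTC")
}

# Works on vectors as well as scalars; 0.5 days is noon on Jan 1, 2000
x <- kdb_days_to_posixct(c(0, 0.5, 1))
print(format(x, "%Y-%m-%d %H:%M:%S", tz = "UTC"))
```

Setting tz = "UTC" explicitly here plays the same role as setting TZ to UTC for the C version.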
"I think it addresses a niche market for high-end data analysts that want free, readily available code," said Anne H. Milley, director of technology product marketing at SAS. She adds, "We have customers who build engines for aircraft. I am happy they are not using freeware when I get on a jet."
That's silly on so many levels. A concise and rather appropriate follow-up came in early from Frank Harrell, a long-time S and R advocate:
This is great to see. It's interesting that SAS Institute feels that non-peer-reviewed software with hidden implementations of analytic methods that cannot be reproduced by others should be trusted when building aircraft engines.
Achim already added this (and two more posts from the aforementioned threads) to the fortunes package that collects such choice quotes.
R in Finance (the topic of our upcoming conference) gets mentioned as well. Now, as editor of the Finance task view, I find that second half of
The financial services community has demonstrated a particular affinity for R; dozens of packages exist for derivatives analysis alone.
to be a little off the mark. But that's minor as the article is broadly sympathetic, and mostly "gets it" where it matters. Recommended.
Call for Papers
The Finance Department of the University of Illinois at Chicago (UIC),
the International Center for Futures and Derivatives at UIC, and
members of the R finance community are pleased to announce
R/Finance 2009: Applied Finance with R
on April 24 and 25, 2009, in Chicago, IL, USA
Confirmed keynote speakers include:
Patrick Burns (Burns Statistics)
David Kane (Kane Capital)
Roger Koenker (U of Illinois at Urbana/Champaign)
David Ruppert (Cornell)
Diethelm Wuertz (ETH Zuerich)
Eric Zivot (U of Washington)
We invite all users of R in Finance to submit one-page abstracts or
complete papers (in txt/pdf/doc format). We encourage papers both on
academic research topics and related to use of R by Finance practitioners.
Presenters are strongly encouraged to provide working R code to accompany
the presentation/paper. Datasets need not be made public.
Please send submissions to committee@RinFinance.com.
The submission deadline is January 31st, 2009.
Submissions will be evaluated and submitters notified via email on a rolling basis.
Additional details about the conference will be announced as available.
For the program committee:
Gib Bassett, Peter Carl, Dirk Eddelbuettel, John Miller,
Brian Peterson, Dale Rosenthal, Jeffrey Ryan
See you in Chicago in April!
I just posted the updated slides from this talk, and there is also an updated live cdrom on the Alioth server. Also, it looks like the tutorial will be held again at UseR 2009 in Rennes, see here for a brief synopsis.
It was nice to get back to Canada, even if it was a 24 hour whirlwind trip. Ottawa looked quite pretty in all the snow. And it seems that I got rather lucky with the travel dates as both the days before and after my trip had a large number of flight cancellations and delays due to snow storms.
So I quickly put together some simple css formatting to make it look a little better than the default blosxom theme it sported previously. That said, you probably should read the rss version (more about rss here) anyway!
Update: Oops. And it even works with a correct path to the css file. Now fixed.
The talk introduces and extends an example related to some of the material from the tutorial itself. The slides from the talk are a little rough as the talk was somewhat ad-hoc: As session chair, I was confronted with a fairly last-minute cancellation and a 15 minute hole, and thought this would make a good little talk. It does show a nice trick for using littler with Open MPI (via snow) under the powerful slurm resource manager and batch/queue engine.
In a nutshell, the tutorial covered how to measure / profile R performance for speed and memory use, how to accelerate R using vectorised expressions and tools like Ra / jit, how to add compiled code to R using either the .C or .Call interface and using the inline and RCpp packages, how to use R code in parallel (explicitly using NWS, Rmpi or snow as well as implicitly using pnmath / OpenMP), and how to script / automate R using littler, Rscript or RPy.
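As a small illustration of the first two topics (measuring, then vectorising), here is a sketch in base R. It is not taken from the tutorial slides; the function names and the problem size are made up, and the timings will vary by machine.

```r
# Sum of squares computed with an explicit loop ...
loop_sum_sq <- function(x) {
    total <- 0
    for (xi in x) total <- total + xi^2
    total
}

# ... and with a single vectorised expression
vec_sum_sq <- function(x) sum(x^2)

x <- seq_len(1e5)

# system.time() is the simplest way to measure elapsed time
t_loop <- system.time(r_loop <- loop_sum_sq(x))
t_vec  <- system.time(r_vec  <- vec_sum_sq(x))

# Both approaches have to agree before the timings mean anything
same <- isTRUE(all.equal(r_loop, r_vec))
print(rbind(loop = t_loop[1:3], vectorised = t_vec[1:3]))
```

For anything beyond a one-off comparison, Rprof() and summaryRprof() give a per-function breakdown rather than a single number.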
The final version of the slides is now available via my presentations page, and the live cdrom with software support for all the software used is at Alioth.
Update: Corrected link to presentations page thanks to heads-up by Charles. Thanks!
But these changes also affected my CRANberries (see the html or better yet rss view) summaries of new packages as some of the source information moved. So I just updated the (surprisingly short at 189 lines including plenty of whitespace and comments) script, and things should work now come the next update.
While updating the 'more info' link for new and updated posts to point to the new-style entry at CRAN, I also took the opportunity to update the format of the `blog' entry for updates where we now show title and description along with the diffstat output.
I also manually copied in two of the recent entries: the new package emu where CRANberries had fallen over as we could not find the package description (in the new spot), and the existing package GEOmap where diffstat failed as we somehow didn't have a proper tarball of the previous sources.
x <- readLines("http://developer.r-project.org/R.svnlog.2007")
rx <- x[grep("^r",x)]
who <- gsub(" ","",sapply(strsplit(rx,"\\|"),"[",2))
twho <- table(who)
twho["ripley"]/sum(twho)
In five lines (that could be shortened to three at the expense of some readability), the SVN log for R is downloaded directly from the website, the revision authors are extracted and then tabulated by submitter. The relative percentage of Brian Ripley is found to be a staggering 74.8% -- or about three times as much as the other fifteen committers combined. Smokes.
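The same steps can be tried offline on a few lines in the usual svn log layout. The sketch below uses invented log entries (the revision numbers, the second author name and the messages are made up), and it tightens the pattern to "^r[0-9]+ \\|" so that a commit message which happens to start with the letter r is not mistaken for a revision header.

```r
# Invented sample in the usual 'svn log' layout
x <- c("------------------------------------------------------------",
       "r40001 | ripley | 2007-01-02 02:15:30 -0600 (Tue, 02 Jan 2007) | 1 line",
       "",
       "remove unused variable",
       "------------------------------------------------------------",
       "r40002 | ripley | 2007-01-02 03:40:00 -0600 (Tue, 02 Jan 2007) | 1 line",
       "",
       "typo",
       "------------------------------------------------------------",
       "r40003 | someone | 2007-01-03 10:00:00 -0600 (Wed, 03 Jan 2007) | 1 line",
       "",
       "update docs",
       "------------------------------------------------------------",
       "r40004 | ripley | 2007-01-03 11:30:00 -0600 (Wed, 03 Jan 2007) | 1 line",
       "",
       "sync")

# Keep only the revision header lines
rx <- x[grep("^r[0-9]+ \\|", x)]

# The author is the second '|'-separated field; strip the padding
who <- gsub(" ", "", sapply(strsplit(rx, "\\|"), "[", 2))

# Tabulate and compute one committer's share
twho <- table(who)
share <- as.numeric(twho["ripley"] / sum(twho))
print(share)
```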
[ Oh, and for those who don't know him, he's also got a day job which presumably entails looking after his graduate students at Oxford. Who knows, he may even teach. Kidding aside, he's actually one of the nicest persons you'll ever meet in real life. ]
Now yesterday, Simon Jackman, who had at first simply repeated Ben's analysis on his own blog, followed up with a nice analysis (albeit typeset in a way that rendered the code inoperational, which has now been fixed) that creates both a histogram and a dotplot of commits per hour of the day. Omitting Ben's code which Simon reuses, we have the following for histogram and dotchart:
tod <- unlist(sapply(rx,function(x)strsplit(x,split=" ")[[1]][6]))
tod <- tod[who=="ripley"]
tz <- sub(pattern=".*(-[0-9]{4}).*",replacement="\\1",x=rx)
tz <- tz[who=="ripley"]
tz <- as.numeric(tz)/100
offset <- 3600*tz
z <- strptime(tod,format="%H:%M:%S")
hist(z,"hours",main="Ripley Commit Times in SVN TZ")
h <- z - offset
h <- format(h,format="%H")
h <- factor(as.numeric(h), levels=0:23)
dotchart(table(h), main="Ripley Commit Times, By Hour in GMT", labels=paste(0:23,1:24,sep=":"))
This extracts the commit times, subsets to the ones by Prof. Ripley, extracts the timezone component (as strptime seemingly doesn't do that which is a pain), extracts the tz-less time via strptime into a variable 'z' for which the histogram is drawn. He then corrects the times by the tz offset expressed in seconds, formats it as hour of the day and turns it into a 'factor' (an R data type for qualitative variables which may be ordered as is the case here) and draws a dotplot. This results in the following chart:
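The timezone correction is the fiddly part of the code above, so here is a self-contained sketch of it on three invented log lines. It differs from Simon's version in two labelled ways: the offset is read from the seventh space-separated field so that positive offsets such as +0100 are handled too, and hours and minutes of the offset are converted separately rather than dividing by 100.

```r
# Invented revision header lines in 'svn log' layout
rx <- c("r40001 | ripley | 2007-01-02 02:15:30 -0600 (Tue, 02 Jan 2007) | 1 line",
        "r40002 | ripley | 2007-01-02 23:40:00 -0600 (Tue, 02 Jan 2007) | 1 line",
        "r40003 | ripley | 2007-07-01 09:05:00 +0100 (Sun, 01 Jul 2007) | 1 line")

# Fields 5, 6 and 7 are date, time and UTC offset respectively
fields <- strsplit(rx, " ")
stamp  <- sapply(fields, function(f) paste(f[5], f[6]))
tz     <- as.numeric(sapply(fields, "[", 7))      # e.g. -600 or 100

# Offset in seconds: hours and minutes are converted separately
offset <- sign(tz) * ((abs(tz) %/% 100) * 3600 + (abs(tz) %% 100) * 60)

# Parse the wall-clock time as if it were UTC, then remove the offset
local <- as.POSIXct(stamp, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
gmt   <- local - offset

# Hour of day in GMT as a factor with all 24 levels, then tabulate
hours  <- as.numeric(format(gmt, "%H", tz = "UTC"))
h      <- factor(hours, levels = 0:23)
counts <- table(h)
print(counts[counts > 0])
```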
Now, nobody has looked at the time series. So we correct this and add the following:
library(zoo)
## rather extract both date and time
dat <- unlist(sapply(rx, function(x) {
    txt <- strsplit(x,split=" ")[[1]]
    paste(txt[5], txt[6])
}))
## subset on Prof Ripley
dat <- dat[who == "ripley"]
## and convert to POSIXct, correcting by tz as well
datpt <- as.POSIXct(strptime(dat,format="%Y-%m-%d %H:%M:%S")) - offset
## turn into zoo -- we use a constant series of ones as each
## commit is taken as a timestamped event
datzoo <- zoo(1, order.by=datpt)
## and use zoo to aggregate into commits per date
daily <- aggregate(datzoo, as.Date(index(datzoo)), sum)
## now plot as grey bars
plot(daily, col='darkgrey', type='h', lwd=2,
     ylab="Nb of SVN commits, three-week median",
     xlab="R release dates 2.5.0 and 2.5.1 shown in orange",
     main="The amazing Prof. Ripley")
## mark the two R releases of 2007
abline(v=c(as.Date("2007-04-24"),as.Date("2007-06-28")),col='orange',lwd=1.5)
## and do a quick centered rolling median
lines(rollmedian(daily, 21, align="center"), lwd=3)
This extracts both date and time, creates a proper R time object (a so-called POSIXct type) from it, fills a zoo ('the' magic class for time series) object with it, uses zoo to aggregate commits per day and plots those in a barchart-alike (I know, I know, ...) plot to which we add the two releases as well as a rolling and centered three-week median (as a real quick hack rather than a proper smooth).
This shows that Prof Ripley averaged about ten commits a day before and after the release of R 2.5.0, and that he has slowed down ever so slightly since then to end up at around a mere seven commits a day. Every day. For the seven-plus months we looked at.
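For readers without zoo at hand, the two core steps (counting commits per day and running a centered median over the counts) can be approximated in base R. This is a swapped-in technique rather than the code used for the chart: table() stands in for aggregate() on a zoo object, and runmed() for rollmedian(). The timestamps are invented, and a window of three is used only because the toy series is short.

```r
# Invented commit timestamps: 1, 5, 2, 8 and 3 commits on five days
days   <- as.Date("2007-04-22") + 0:4
ncomm  <- c(1, 5, 2, 8, 3)
stamps <- as.POSIXct(paste(format(rep(days, times = ncomm)), "12:00:00"),
                     tz = "UTC")

# Commits per calendar day (days without commits simply do not appear)
daily  <- table(as.Date(stamps, tz = "UTC"))
counts <- as.vector(daily)
dates  <- as.Date(names(daily))

# Centered running median; endrule = "keep" leaves the two end values
# untouched, whereas zoo's rollmedian() drops them
smooth <- as.vector(runmed(counts, k = 3, endrule = "keep"))

print(data.frame(date = dates, commits = counts, median3 = smooth))

# Plot only when run interactively
if (interactive()) {
    plot(dates, counts, type = "h", lwd = 2, col = "darkgrey",
         xlab = "", ylab = "Nb of commits")
    lines(dates, smooth, lwd = 3)
}
```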
So, anyone for analysing his r-help posting frequencies?
The hope is that this proves helpful for keeping tabs on the amazing growth of CRAN (which is now at over one thousand packages) as well as the number of updates to existing packages. The feed(s) can be consumed standalone, or via the brand new Planet R aggregator that Elijah announced today too.
As some of my points didn't seem to make it across, I will reiterate them more plainly:
Sven also addresses the fact that what we really want is to see the quantiles of the data set. Quite right, and taking logs makes that easier. Consider the two charts below which plot the 'package age in days' as an empirical cumulative distribution function using the built-in R functions ecdf and plot.stepfun (rather than redoing it ad-hoc as I had done), and also explicitly add quantiles. The two charts use the exact same instructions; however, the second chart transforms the x-axis to a logarithmic scale.
While it is close to impossible to find the 25th or 50th percentile on the first chart, it becomes a lot easier on the second chart because the x-axis is 'stretched' using the log transform. About one quarter of the distribution appears to have been rebuilt within the last 1.5 months, and about half is younger than four months (as a quick call to summary(pkgAge) confirms). Reading these proportions off the original chart, or the non-log chart, is much more difficult.
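The idea can be sketched as follows. The actual package ages are not reproduced here, so the pkgAge vector below is simulated (exponentially distributed with a made-up mean) purely to have something skewed to plot; the numbers it yields say nothing about the real archive.

```r
# Simulated stand-in for 'package age in days' (not the real data)
set.seed(42)
pkgAge <- rexp(500, rate = 1/150)

# Empirical cumulative distribution function and the quartiles
Fn <- ecdf(pkgAge)
q  <- quantile(pkgAge, c(0.25, 0.5, 0.75))
print(summary(pkgAge))

# By construction the ECDF evaluated at the median is one half
half <- Fn(q[["50%"]])

# Same step plot twice, the second time with a logarithmic x-axis
if (interactive()) {
    op <- par(mfrow = c(2, 1))
    s <- sort(pkgAge)
    plot(s, Fn(s), type = "s", xlab = "Package age in days", ylab = "ECDF")
    abline(v = q, lty = 2)
    plot(s, Fn(s), type = "s", log = "x",
         xlab = "Package age in days (log scale)", ylab = "ECDF")
    abline(v = q, lty = 2)
    par(op)
}
```

On the log-scale panel the dashed quartile lines spread out, which is what makes the lower quantiles readable.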