Welcome to the 44th post in the R^4 series.
A few weeks ago, following an informal ‘call for talks’ by James Lamb, I had the opportunity to talk about r2u to the Chicago ML and MLOps meetup groups. You can find the slides here.
Over the last two and a half years, r2u has become a widely deployed mechanism in a number of settings, including (but not limited to) software testing via continuous integration and deployment on cloud servers, besides of course more standard use on local laptops or workstations. Thirty million downloads illustrate this. My thesis for the talk was that this extends equally to ML(Ops), where ‘no surprises, no hiccups’ automated deployments are key for large-scale model training, evaluation, and of course production deployments.
In this context, I introduce r2u while giving credit to what came before it, to the existing alternatives (or ‘competitors’ for mindshare, if one prefers that phrasing), and of course to what lies underneath it.
The central takeaway, I argue, is that r2u can and does take advantage of a unique situation: we can ‘join’ the package management task for the underlying (operating) system and for the application domain, here R and its unique CRAN repository network. Other approaches can, and of course do, provide binaries, but by doing so outside the realm of the system package manager they can only arrive at a lesser integration (and I show a common error arising in that case). So where r2u is feasible, it dominates the alternatives (while the alternatives may well provide deployment on more platforms, which, even if less integrated, may be of greater importance for some). As always, it all depends.
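To make that ‘join’ concrete, here is a minimal sketch, assuming an Ubuntu machine or container already set up with r2u and bspm as described in the r2u documentation; the package name xgboost is merely an illustrative example. Once bspm bridges R’s own install.packages() to apt, a CRAN package and any system-level dependencies it needs are installed together by the one system package manager.

```r
## Minimal sketch: assumes Ubuntu with r2u and bspm already configured per
## the r2u documentation; 'xgboost' is just an illustrative package choice.
bspm::enable()                # route install.packages() through apt (the r2u setup normally enables this)
install.packages("xgboost")   # apt installs the pre-built binary plus any required system libraries
```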
But the talk, and its slides, motivate and illustrate why we keep calling r2u by its slogan of r2u: Fast. Easy. Reliable. Pick All Three.
This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. If you like this or other open-source work I do, you can now sponsor me at GitHub. Please report excessive re-aggregation in third-party for-profit settings.