Thu, 23 Oct 2014

Introducing Rocker: Docker for R

You only know two things about Docker. First, it uses Linux
containers. Second, the Internet won't shut up about it.

-- attributed to Solomon Hykes, Docker CEO

So what is Docker?

Docker is a relatively new open source application and service, which is seeing interest across a number of areas. It uses recent Linux kernel features (containers, namespaces) to shield processes. While its use (superficially) resembles that of virtual machines, it is much more lightweight as it operates at the level of a single process (rather than an emulation of an entire OS layer). This also allows it to start almost instantly, require very little resources and hence permits an order of magnitude more deployments per host than a virtual machine.

Docker offers a standard interface to creation, distribution and deployment. The shipping container analogy is apt: just how shipping containers (via their standard size and "interface") allow global trade to prosper, Docker is aiming for nothing less for deployment. A Dockerfile provides a concise, extensible, and executable description of the computational environment. Docker software then builds a Docker image from the Dockerfile. Docker images are analogous to virtual machine images, but smaller and built in discrete, extensible and reuseable layers. Images can be distributed and run on any machine that has Docker software installed---including Windows, OS X and of course Linux. Running instances are called Docker containers. A single machine can run hundreds of such containers, including multiple containers running the same image.

There are many good tutorials and introductory materials on Docker on the web. The official online tutorial is a good place to start; this post can not go into more detail in order to remain short and introductory.

So what is Rocker?

rocker logo

At its core, Rocker is a project for running R using Docker containers. We provide a collection of Dockerfiles and pre-built Docker images that can be used and extended for many purposes.

Rocker is the the name of our GitHub repository contained with the Rocker-Org GitHub organization.

Rocker is also the name the account under which the automated builds at Docker provide containers ready for download.

Current Rocker Status

Core Rocker Containers

The Rocker project develops the following containers in the core Rocker repository

  • r-base provides a base R container to build from
  • r-devel provides the basic R container, as well as a complete R-devel build based on current SVN sources of R
  • rstudio provides the base R container as well an RStudio Server instance

We have settled on these three core images after earlier work in repositories such as docker-debian-r and docker-ubuntu-r.

Rocker Use Case Containers

Within the Rocker-org organization on GitHub, we are also working on

  • Hadleyverse which extends the rstudio container with a number of Hadley packages
  • rOpenSci which extends hadleyverse with a number of rOpenSci packages
  • r-devel-san provides an R-devel build for "Sanitizer" run-time diagnostics via a properly instrumented version of R-devel via a recent compiler build
  • rocker-versioned aims to provided containers with 'versioned' previous R releases and matching packages

Other repositories will probably be added as new needs and opportunities are identified.

Deprecation

The Rocker effort supersedes and replaces earlier work by Dirk (in the docker-debian-r and docker-ubuntu-r GitHub repositories) and Carl. Please use the Rocker GitHub repo and Rocker Containers from Docker.com going forward.

Next Steps

We intend to follow-up with more posts detailing usage of both the source Dockerfiles and binary containers on different platforms.

Rocker containers are fully functional. We invite you to take them for a spin. Bug reports, comments, and suggestions are welcome; we suggest you use the GitHub issue tracker.

Acknowledgments

We are very appreciative of all comments received by early adopters and testers. We also would like to thank RStudio for allowing us the redistribution of their RStudio Server binary.

Published concurrently at rOpenSci blog and Dirk's blog.

Authors

Dirk Eddelbuettel and Carl Boettiger

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/rocker | permanent link