Sat, 29 Oct 2016

drat 0.1.2: Mostly harmless

CRAN requested a release updating any URLs for Omegahat to the (actually working) omegahat.net URL. So that caused this 0.1.2 release which arrived on CRAN yesterday. It contains the requested change along with one or two other mostly minor changes which accumulated since the last release.

drat stands for drat R Archive Template, and helps with easy-to-create and easy-to-use repositories for R packages. Since its inception in early 2015 it has found reasonably widespread adoption among R users because repositories is what we use. In other words, friends don't let friends use install_github(). Just kidding. Maybe. Or not.

The NEWS file summarises the release as follows:

Changes in drat version 0.1.2 (2016-10-28)

  • Changes in drat documentation

    • The FAQ vignette added a new question Why use drat

    • URLs were made canonical, omegahat.net was updated from .org

    • Several files (README.md, Description, help pages) were edited

Courtesy of CRANberries, there is a comparison to the previous release. More detailed information is on the drat page.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/drat | permanent link

Sun, 07 Aug 2016

drat 0.1.1: Updates schmupdates!

One year ago (tomorrow) drat 0.1.0 was released. It held up rather well, but a number of small fixes and enhancements piled up, along with somewhat-finished to still-raw additions to the examples/ sections. With that, we are happy to announce drat release 0.1.1 which arrived on CRAN earlier today.

drat stands for drat R Archive Template, and helps with easy-to-create and easy-to-use repositories for R packages. Since its inception in early 2015 it has found reasonably widespread adoption among R users because repositories is what we use. In other words, friends don't let friends use install_github(). Just kidding. Maybe. Or not.

This version 0.1.1 builds on the previous release from one year ago. Several users sent in nicely focused pull request, and I added a bit of spit and polish here and there.

The NEWS file (added belatedly in this release) summarises the release as follows:

Changes in drat version 0.1.1 (2016-08-07)

  • Changes in drat functionality

    • Use dir.exists, leading to versioned Depends on R (>= 3.2.0)

    • Optionally pull remote before insert (Mark in PR #38)

    • Fix support for dots (Jan G. in PR #40)

    • Accept dots in package names (Antonio in PR #48)

    • Switch to htpps URLs at GitHub (Colin in PR #50)

    • Support additional fields in PACKAGE file (Jan G. in PR #54)

  • Changes in drat documentation

    • Further improvements and clarifications to vignettes

    • Travis script switched to run.sh from our fork

    • This NEWS file was (belatedly) added

Courtesy of CRANberries, there is a comparison to the previous release. More detailed information is on the drat page.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/drat | permanent link

Sat, 08 Aug 2015

drat 0.1.0: Even more repository support

drat and double drat

A new version 0.1.0 of the drat package arrived on CRAN today. Its name stands for drat R Archive Template, and it helps with easy-to-create and easy-to-use repositories for R packages, and is finding increasing by other projects.

This version 0.1.0 builds on the previous releases and now adds complete support for binaries for both Windows and OS X. This builds on what had been added in the previous release 0.0.4.

This and other new features are listed below:

  • updated vignettes
  • more complete support for binaries thanks to work by Jan Schulz, Matt Jones and myself
  • new support to (optionally) archive existing packages thanks to Thomas Leeper
  • various smaller fixes.

Also of note: The rOpenSci project now uses drat to distribute their code, and Carl Boettiger has written a nice blog post about it.

Courtesy of CRANberries, there is a comparison to the previous release. More detailed information is on the drat page.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/drat | permanent link

Wed, 27 May 2015

drat 0.0.4: Yet more features and documentation

A new version, now at 0.0.4, of the drat package arrived on CRAN yesterday. Its name stands for drat R Archive Template, and it helps with easy-to-create and easy-to-use repositories for R packages, and is finding increasing by other projects.

Version 0.0.4 brings both new code and more documentation:

  • support for binary repos on Windows and OS X thanks to Jan Schulz;
  • new (still raw) helper functions initRepo() to create a git-based repository, and pruneRepo() to remove older versions of packages;
  • the insertRepo() functions now uses tryCatch() around git commands (with thanks to Carl Boettiger);
  • when adding a file to a drat repo we ensure that the repo path does not contains spaces (with thank to Stefan Bache);
  • stress that file-based repos need a URL of the form file:/some/path with one colon but not two slashes (also thanks to Stefan Bache);
  • new Using Drat with Travis CI vignette thanks to Colin Gillespie;
  • new Drat FAQ vignette;
  • other fixes and extensions.

Courtesy of CRANberries, there is a comparison to the previous release. More detailed information is on the drat page.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/drat | permanent link

Tue, 21 Apr 2015

Introducing ghrr: GitHub Hosted R Repository

ghrr logo

Background

R relies on package repositories for initial installation of a package via install.packages(). A crucial second step is update.packages(): For all currently installed packages, a list of available updates is constructed or offered for either one-by-one or bulk updates. This keeps the local packages in sync with upstream, and provides for a very convenient way to obtain new features, bug fixes and other improvements. So by installing from a repository, we automatically have the ability to track the repository for updates.

Enter drat

Fairly recently, the drat package was added to the R ecosystem. It makes both aspects of package distribution easy: providing a package (if you are an author) as well as installing it (if you are a user). Now, because drat is at the same time source code (as it is also a package providing the functionality), and a repository (using what drat provides ib features), the "namespace" becomes a little cluttered.

But because a key feature of drat is the "one variable" unique identification via the GitHub, I opted to create a drat repository in the name of a new organisation: ghrr. This is a simple acronym for GitHub Hosted R Repository.

Use cases

We can outline several use case for packages in ghrr:

  • packages not published in a repo by their authors: I already use two like that:
    • fasttime, an impeccably fast parser for ISO datetimes by Simon Urbanek which was however never released into a repo by Simon, and
    • RcppR6, a very nice extension to both R6 (by Winston) and Rcpp, by Rich FitzJohn; similarly never released beyond GitHub;
  • packages possibly unsuitable for mainline repos:
    • Rblpapi is a great package by Whit Armstong and John Laing to which I have been contributing quite a bit of late. As it requires a free-to-use but not open source library and headers from Bloomberg, it will never make it to the mainline repository for R, but hosting it in ghrr is perfect as I can easily update several machines at work once I cut a new development release;
    • winsorize is a small package I needed a few weeks ago; it is spun out of robustHD but does not yet contain new code so Andreas and I are content to keep it in this drat for now;
  • packages in pre-relase mode:
    • RcppArmadillo where I announced both a release candidate before Armadillo 5.000 came out, as well as the actual RcppArmadillo 0.500.0.0 which is not (yet) on the mainline repository as two affected packages need a small update first. Users, however, can get RcppArmadillo already from the sibling Rcpp drat repo.
    • RcppToml is a new package I am currently working on implementing a toml parser based on cpptoml. It works, but it not quite ready for public announcements yet, and hence perfect for ghrr.

Going forward

ghrr is meant to be open. While anybody can open a drat repository, particularly on GitHub, it may be beneficial to somehow group packages. This is however not something that can be planned ex-ante: it may just happen if others who see similar benefits in this can in fact contribute. In that spirit, I strongly encourage pull requests.

Early on, I made my commit messages conform to a pattern of package version sha1 repourl to make code provenance of every commit very clear. Ideally, subsequent commits would conform to such a scheme, or replace it with a better one.

Some Resources

A few links to learn more about drat and ghrr:

Comments and questions via email or issue tickets are more than welcome. We hope that others find ghrr to be a useful tool for easy repository management and use via GitHub.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/drat | permanent link

Fri, 10 Apr 2015

drat 0.0.3: More features, more fixes, more documentation

Several weeks ago we introduced the drat package. Its name stands for drat R Archive Template, and it helps with easy-to-create and easy-to-use repositories for R packages. Two early blog posts describe drat: First Steps Towards Lightweight Repositories, and Publishing a Package, and since the previous release, a a guest post on drat was also added to the blog.

Several people have started to use drat to publish their packages, and this is a very encouraging sign. I also created a new repository and may have more to say about this in another post once I get time to write something up.

Today version 0.0.3 arrived on CRAN. It rounds out functionality, adds some fixes and brings more documentation:

  • git support via git2r is improved; it is still optional (ie commit=TRUE is needed when adding packages) but plan to expand it
  • several small bugs got fixed, including issues #9 and #7,
  • four new vignettes got added, including two guests posts by Steven and Colin as well as two focusing, respectively, on drat for authors and and drat for users.

The work on the vignettes is clearly in progress as Colin's guest post isn't really finished yet, and I am not too impressed with how the markdown is rendered at CRAN so some may still become pdf files instead.

Courtesy of CRANberries, there is a comparison to the previous release. More detailed information is on the drat page.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/drat | permanent link

Fri, 13 Mar 2015

Why Drat? A Guest Post by Steven Pav

Editorial Note: The following post was kindly contributed by Steven Pav.

Why Drat?

After playing around with drat for a few days now, my impressions of it are best captured by Dirk's quote:

It just works.

Demo

To get some idea of what I mean by this, suppose you are a happy consumer of R packages, but want access to, say, the latest, greatest releases of my distribution package, sadist. You can simply add the following to your .Rprofile file:

drat:::add("shabbychef")

After this, you instantly have access to new releases in the github/shabbychef drat store via the package tools you already know and tolerate. You can use

install.packages('sadists')

to install the sadists package from the drat store, for example. Similarly, if you issue

update.packages(ask=FALSE)

all the drat stores you have added will be checked for package updates, along with their dependencies which may well come from other repositories including CRAN.

Use cases

The most obvious use cases are:

  1. Micro releases. For package authors, this provides a means to get feedback from the early adopters, but also allows one to push small changes and bug fixes without burning through your CRAN karma (if you have any left). My personal drat store tends to be a few minor releases ahead of my CRAN releases.

  2. Local repositories. In my professional life, I write and maintain proprietary packages. Pushing package updates used to involve saving the package .tar.gz to a NAS, then calling something like R CMD INSTALL package_name_0.3.1.9001.tar.gz. This is not something I wanted to ask of my colleagues. With drat, they can instead add the following stanza to .Rprofile: drat:::addRepo('localRepo','file:///mnt/NAS/r/local/drat'), and then rely on update.packages to do the rest.

I suspect that in the future, drat might be (ab)used in the following ways:

  1. Rolling your own vanilla CRAN mirror, though I suspect there are better existing ways to accomplish this.

  2. Patching CRAN. Suppose you found a bug in a package on CRAN (inconceivable!). As it stands now, you email the maintainer, and wait for a fix. Maybe the patch is trivial, but suppose it is never delivered. Now, you can simply make the patch yourself, pick a higher revision number, and stash it in your drat store. The only downside is that eventually the package maintainer might bump their revision number without pushing a fix, and you are stuck in an arms race of version numbers.

  3. Forgoing CRAN altogether. While some package maintainers might find this attractive, I think I would prefer a single huge repository, warts and all, to a landscape of a million microrepos. Perhaps some enterprising group will set up a CRAN-like drat store on github, and accept packages by pull request (whether github CDN can or will support the traffic that CRAN does is another matter), but this seems a bit too futuristic for me now.

My wish list

In exchange for writing this blog post, I get to lobby Dirk for some features in drat:

  1. I shudder at the thought of hundreds of tiny drat stores. Perhaps there should be a way to aggregate addRepo commands in some way. This would allow curators to publish their suggested lists of repos.

  2. Drat stores are served in the gh-pages branch of a github repo. I wish there were some way to keep the index.html file in that directory reflect the packages present in the sources. Maybe this could be achieved with some canonical RMarkdown code that most people use.

Update:Two typos fixed in code examples on 2015-Mar-27.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/drat | permanent link

Sun, 01 Mar 2015

drat 0.0.2: Improved Support for Lightweight R Repositories

A few weeks ago we introduced the drat package. Its name stands for drat R Archive Template, and it helps with easy-to-create and easy-to-use repositories for R packages. Two early blog posts describe drat: First Steps Towards Lightweight Repositories, and Publishing a Package.

A new version 0.0.2 is now on CRAN. It adds several new features:

  • beginnings of native git support via the excellent new git2r package,
  • a new helper function to prune a repo of older versions of packages (as R repositories only show the newest release of a package),
  • improved core functionality in inserting a package, and adding a repo.

Courtesy of CRANberries, there is a comparison to the previous release. More detailed information is on the drat page.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/drat | permanent link

Sun, 22 Feb 2015

drat Tutorial: Publishing a package

Introduction

The drat package was released earlier this month, and described in a first blog post. I received some helpful feedback about what works and what doesn't. For example, Jenny Bryan pointed out that I was not making a clear enough distinction between the role of using drat to publish code, and using drat to receive/install code. Very fair point, and somewhat tricky as R aims to blur the line between being a user and developer of statistical analyses, and hence packages. Many of us are both. Both the main point is well taken, and this note aims to clarify this issue a little by focusing on the former.

Another point make by Jenny concerns the double use of repository. And indeed, I conflated repository (in the sense of a GitHub code repository) with repository for a package store used by a package manager. The former, a GitHub repository, is something we use to implement a personal drat with: A GitHub repository happens to be uniquely identifiable just by its account name, and given an (optional) gh-pages branch also offers a stable and performant webserver we use to deliver packages for R. A (personal) code repository on the other hand is something we implement somewhere---possibly via drat which supports local directories, possibly on a network share, as well as anywhere web-accessible, e.g. via a GitHub repository. It is a little confusing, but I will aim to make the distinction clearer.

Just once: Setting up a drat repository

So let us for the remainder of this post assume the role of a code publisher. Assume you have a package you would like to make available, which may not be on CRAN and for which you would like to make installation by others easier via drat. The example below will use an interim version of drat which I pushed out yesterday (after fixing a bug noticed when pushing the very new RcppAPT package).

For the following, all we assume (apart from having a package to publish) is that you have a drat directory setup within your git / GitHub repository. This is not an onerous restriction. First off, you don't have to use git or GitHub to publish via drat: local file stores and other web servers work just as well (and are documented). GitHub simply makes it easiest. Second, bootstrapping one is trivial: just fork my drat GitHub repository and then create a local clone of the fork.

There is one additional requirement: you need a gh-pages branch. Using the fork-and-clone approach ensures this. Otherwise, if you know your way around git you already know how to create a gh-pages branch.

Enough of the prerequisities. And on towards real fun. Let's ensure we are in the gh-pages branch:

edd@max:~/git/drat(master)$ git checkout gh-pages
Switched to branch 'gh-pages'
Your branch is up-to-date with 'origin/gh-pages'.
edd@max:~/git/drat(gh-pages)$ 

Publish: Run one drat command to insert a package

Now, let us assume you have a package to publish. In my case this was version 0.0.1.2 of drat itself as it contains a fix for the very command I am showing here. So if you want to run this, ensure you have this version of drat as the CRAN version is currently behind at release 0.0.1 (though I plan to correct that in the next few days).

To publish an R package into a code repository created via drat running on a drat GitHub repository, just run insertPackage(packagefile) which we show here with the optional commit=TRUE. The path to the package can be absolute are relative; the easists is often to go up one directory from the sources to where R CMD build ... has created the package file.

edd@max:~/git$ Rscript -e 'library(drat); insertPackage("drat_0.0.1.2.tar.gz", commit=TRUE)'
[gh-pages 0d2093a] adding drat_0.0.1.2.tar.gz to drat
 3 files changed, 2 insertions(+), 2 deletions(-)
 create mode 100644 src/contrib/drat_0.0.1.2.tar.gz
Counting objects: 7, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (7/7), done.
Writing objects: 100% (7/7), 7.37 KiB | 0 bytes/s, done.
Total 7 (delta 1), reused 0 (delta 0)
To git@github.com:eddelbuettel/drat.git
   206d2fa..0d2093a  gh-pages -> gh-pages
edd@max:~/git$ 

You can equally well run this as insertPackage("drat_0.0.1.2.tar.gz"), then inspect the repo and only then run the git commands add, commit and push. Also note that future versions of drat will most likely support git operations directly by relying on the very promising git2r package. But this just affect package internals, the user-facing call of e.g. insertPackage("drat_0.0.1.2.tar.gz", commit=TRUE) will remain unchanged.

And in a nutshell that really is all there is to it. With the newly drat-ed package pushed to your GitHub repository with a single function call), it is available via the automatically-provided gh-pages webserver access to anyone in the world. All they need to do is to point R's package management code (which is built into R itself and used for e.g._ CRAN and BioConductor R package repositories) to the new repo---and that is also just a single drat command. We showed this in the first blog post and may expand on it again in a follow-up.

So in summary, that really is all there is to it. After a one-time setup / ensuring you are on the gh-pages branch, all it takes is a single function call from the drat package to publish your package to your drat GitHub repository.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/drat | permanent link

Sat, 07 Feb 2015

drat Tutorial: First Steps towards Lightweight R Repositories

Now that drat is on CRAN and I got a bit of feedback (or typo corrections) in three issue tickets, I thought I could show how to quickly post such an interim version in a drat repository.

Now, I obviously already have a checkout of drat. If you, dear reader, wanted to play along and create your own drat repository, one rather simple way would be to simply clone my repo as this gets you the desired gh-pages branch with the required src/contrib/ directories. Otherwise just do it by hand.

Back to a new interim version. I just pushed commit fd06293 which bumps the version and date for the new interim release based mostly on the three tickets addresses right after the initial release 0.0.1. So by building it we get a new version 0.0.1.1:

edd@max:~/git$ R CMD build drat
* checking for file ‘drat/DESCRIPTION’ ... OK
* preparing ‘drat’:
* checking DESCRIPTION meta-information ... OK
* checking for LF line-endings in source and make files
* checking for empty or unneeded directories
* building ‘drat_0.0.1.1.tar.gz’

edd@max:~/git$ 

Because I want to use the drat repo next, I need to now switch from master to gh-pages; a step I am omitting as we can assume that your drat repo will already be on its gh-pages branch.

Next we simply call the drat function to add the release:

edd@max:~/git$ r -e 'drat:::insert("drat_0.0.1.1.tar.gz")'
edd@max:~/git$ 

As expected, now have two updated PACKAGES files (compressed and plain) and a new tarball:

edd@max:~/git/drat(gh-pages)$ git status
On branch gh-pages
Your branch is up-to-date with 'origin/gh-pages'.
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

        modified:   src/contrib/PACKAGES
        modified:   src/contrib/PACKAGES.gz

Untracked files:
  (use "git add <file>..." to include in what will be committed)

        src/contrib/drat_0.0.1.1.tar.gz

no changes added to commit (use "git add" and/or "git commit -a")
edd@max:~/git/drat(gh-pages)$

All is left to is to add, commit and push---either as I usually via the spectacularly useful editor mode, or on the command-line, or by simply adding commit=TRUE in the call to insert() or insertPackage().

I prefer to use littler's r for command-line work, so I am setting the desired repos in ~/.littler.r which is read on startup (since the recent littler release 0.2.2) with these two lines:


## add RStudio CRAN mirror
drat:::add("CRAN", "http://cran.rstudio.com")

## add Dirk's drat
drat:::add("eddelbuettel")

After that, repos are set as I like them (at home at least):

edd@max:~/git$ r -e'print(options("repos"))'
$repos
                                 CRAN                          eddelbuettel 
            "http://cran.rstudio.com" "http://eddelbuettel.github.io/drat/" 

edd@max:~/git$ 

And with that, we can just call update.packages() specifying the package directory to update:

edd@max:~/git$ r -e 'update.packages(ask=FALSE, lib.loc="/usr/local/lib/R/site-library")'                                                                                                                            
trying URL 'http://eddelbuettel.github.io/drat/src/contrib/drat_0.0.1.1.tar.gz'
Content type 'application/octet-stream' length 5829 bytes
opened URL
==================================================
downloaded 5829 bytes

* installing *source* package ‘drat’ ...
** R
** inst
** preparing package for lazy loading
** help
*** installing help indices
** building package indices
** testing if installed package can be loaded
* DONE (drat)

The downloaded source packages are in
        ‘/tmp/downloaded_packagesedd@max:~/git$

and presto, a new version of a package we have installed (here the very drat interim release we just pushed above) is updated.

Writing this up made me realize I need to update the handy update.r script (see e.g. the littler examples page for more) and it hard-wires just one repo which needs to be relaxed for drat. Maybe in install2.r which already has docopt support...

/code/drat | permanent link

Thu, 05 Feb 2015

Introducing drat: Lightweight R Repositories

A new package of mine just got to CRAN in its very first version 0.0.1: drat. Its name stands for drat R Archive Template, and an introduction is provided at the drat page, the GitHub repository, and below.

drat builds on a core strength of R: the ability to query multiple repositories. Just how one could always query, say, CRAN, BioConductor and OmegaHat---one can now adds drats of one or more other developers with ease. drat also builds on a core strength of GitHub. Every user automagically has a corresponding github.io address, and by appending drat we are getting a standardized URL.

drat combines both strengths. So after an initial install.packages("drat") to get drat, you can just do either one of

library(drat)
addRepo("eddelbuettel")

or equally

drat:::add("eddelbuettel")

to register my drat. Now install.packages() will work using this new drat, as will update.packages(). The fact that the update mechanism works is a key strength: not only can you get a package, but you can gets its updates once its author replaces them into his drat.

How does one do that? Easy! For a package foo_0.1.0.tar.gz we do

library(drat)
insertPackage("foo_0.1.0.tar.gz")

The default git repository locally is taken as the default ~/git/drat/ but can be overriden as both a local default (via options()) or directly on the command-line. Note that this also assumes that you a) have a gh-pages branch and b) have that branch as the currently active branch. Automating this / testing for this is left for a subsequent release. Also available is an alternative unexported short-hand function:

drat:::insert("foo_0.1.0.tar.gz", "/opt/myWork/git")

show here with the alternate use case of a local fileshare you can copy into and query from---something we do at work where we share packages only locally.

The easiest way to obtain the corresponding file system layout may be to just fork the drat repository.

So that is it. Two exported functions, and two unexported (potentially name-clobbering) shorthands. Now drat away!

Courtesy of CRANberries, there is also a copy of the DESCRIPTION file for this initial release. More detailed information is on the drat page.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

/code/drat | permanent link