31 January 2013

repmis: misc. tools for reproducible research in R

I've started to put together an R package called repmis. It has miscellaneous tools for reproducible research with R. The idea behind the package is to collate commands that simplify some of the common R code used within knitr-type reproducible research papers.

It's still in the early stages of development and currently has two commands:

  • LoadandCite: a command to load all of the R packages used in a paper and create a BibTeX file containing citation information for them. It can also install the packages if they are on CRAN.
  • source_GitHubData: a command for downloading plain-text formatted data stored on GitHub or at any other secure (https) URL.

I've written about why you might want to use source_GitHubData before (see here and here).

You can use LoadandCite in a code chunk near the beginning of a knitr reproducible research document to load all of the R packages you will use in the document and automatically generate a BibTeX file you can draw on to cite them. Here's an example:

# Create vector of package names
PackagesUsed <- c("knitr", "xtable")

# Load and Cite
repmis::LoadandCite(PackagesUsed, file = "PackageCitations.bib")

LoadandCite draws on knitr's write_bib command to create the bibliographies, so each citation is given a BibTeX key of the form R-package_name. For example, the key for the xtable package is R-xtable. Be careful to save the citations in a new .bib file, because LoadandCite overwrites any existing file with the same name.
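To see the key format for yourself, you can call knitr's write_bib directly. This is only a minimal sketch of the underlying step, not LoadandCite itself. It assumes knitr is installed, and the temporary file and variable names (bib_file, entry_lines, keys) are just for illustration:

```r
# Write the citation(s) for knitr to a temporary .bib file.
# A temporary file is used so that no existing .bib file is overwritten.
bib_file <- tempfile(fileext = ".bib")
knitr::write_bib("knitr", file = bib_file)

# Each BibTeX entry starts with a line like "@Manual{R-knitr,".
# Pull out those lines and strip everything except the key.
entry_lines <- grep("^@", readLines(bib_file), value = TRUE)
keys <- sub("^@[A-Za-z]+\\{([^,]+),.*$", "\\1", entry_lines)

# The keys should include R-knitr; knitr may also add entries
# from its own CITATION file, which get different keys.
print(keys)
```

In a LaTeX document you would then cite the package with the R-knitr key, in the same way as any other BibTeX entry.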

Citation of R packages is very inconsistent in academic publications. Hopefully, making it easier to cite packages will lead more people to do so.


Instructions for how to install repmis are available here.

Please feel free to fork the package and suggest additional commands that could be included.


Toby Dylan Hocking said...

Have you considered a method to specify package versions in LoadandCite? Since packages change over time, it is useful to specify which versions were used. I proposed dealing with this problem using a works_with_R() header in "The difficulty of reproducible research using R".

Christopher Gandrud said...

Great idea! I just implemented it. Please download the newest version and let me know if there is anything else I can improve.