Sharing analysis code

“An article about computational science in a scientific publication is not the scholarship itself, it is merely advertising of the scholarship. The actual scholarship is the complete software development environment and the complete set of instructions which generated the figures.” - Buckheit & Donoho, 1995 (read the full article here)

This quote nicely summarizes the importance of sharing data, methods and code so that others can evaluate the actual work behind a research paper. Luckily, web-based technologies make it very easy to share these materials (and even to share the complete software environment using Docker and Singularity containers). However, releasing your code without annotation is not very informative, because others (and future you!) will struggle to make sense of it. Two helpful tools for annotating your code are R Markdown and Jupyter notebooks.

How do tools like R Markdown and Jupyter notebooks make research more reproducible?

R Markdown

If you do your data analysis in R / RStudio, R Markdown is a very useful tool for keeping your text and analysis together in one place. It is basically R + Markdown (a markup language for formatting text). It can be used to write a whole paper, including the code that generates the figures. The resulting document can be rendered to many output formats, such as HTML, PDF and Word. For full documentation see also the R Markdown documentation and this neat cheatsheet (pdf).

Figure: Example of an R Markdown chunk in RStudio with the associated HTML output (from the R Markdown docs)
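
To make the structure of such a file concrete, here is a minimal sketch of an R Markdown document: a YAML header, a line of prose, and one R code chunk. Python is used only to assemble the file as plain text; the function name build_rmd, the title, the section heading and the chunk label speed-plot are hypothetical choices for illustration, and cars is a small dataset that ships with R.

```python
# Three backticks open and close a code chunk in R Markdown. They are built
# here from single characters so that this example's own fence stays intact.
FENCE = "`" * 3


def build_rmd(title: str) -> str:
    """Return the text of a minimal R Markdown document (illustrative only)."""
    lines = [
        "---",                                # start of the YAML header
        f'title: "{title}"',
        "output: html_document",              # render to HTML
        "---",                                # end of the YAML header
        "",
        "## Results",
        "",
        "The chunk below plots the built-in cars dataset.",
        "",
        FENCE + "{r speed-plot, echo=TRUE}",  # R chunk with a label and an option
        "plot(cars)",
        FENCE,
        "",
    ]
    return "\n".join(lines)


rmd_text = build_rmd("Example analysis")
print(rmd_text)
```

Saving the printed text under a name such as example.Rmd and calling rmarkdown::render("example.Rmd") in R should then produce the HTML output; changing the output field in the header selects a different format.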

Installing R Markdown

Reference lists using Zotero in R Markdown

When writing papers, it is also very useful to connect RStudio with Zotero. Zotero is a free and open-source reference manager with a very handy browser plugin. If you have never used a reference manager before: it is a great way to keep a library of all your literature (including PDFs) together, and it helps you cite papers correctly and automatically produces reference lists in the right format. This can be done in a word processor like Microsoft Word, but also in R Markdown.

The basic steps you need to make this work:

  1. install Zotero and import references (e.g., using the browser plugin)
  2. install the Better BibTeX plugin for Zotero by clicking Tools > Add-ons within Zotero and following these instructions
  3. install the citr R package

Now when writing text in an R Markdown file in RStudio:

Jupyter notebooks

For analyses that are conducted using Python, Jupyter notebooks are a great way to keep executable code and annotation in one place (note that many other programming languages are also supported by Jupyter notebooks: the name is a reference to the three core languages Julia, Python, and R). For full documentation see the Jupyter Notebook docs; https://jupyter.org/ has more information about the larger Project Jupyter ecosystem.

When you open a Jupyter notebook, you start an interactive session. Here you can add different sorts of cells: code cells that can be executed (after execution the results are displayed in the notebook), and Markdown cells for descriptive text that can be formatted using the Markdown language.

Figure: Example GIF of a Jupyter notebook for the Qoala-T tool. See notebook here
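
The two cell types described above are also visible in the notebook file itself, which is stored as JSON. The sketch below builds a tiny notebook by hand, using only the Python standard library, to show how a Markdown cell and a code cell sit side by side. The helper names markdown_cell and code_cell and the cell contents are made up for illustration; in practice the Jupyter interface (or the nbformat package) writes these files for you.

```python
import json


def markdown_cell(text: str) -> dict:
    """A cell holding descriptive text written in Markdown."""
    return {"cell_type": "markdown", "metadata": {}, "source": text}


def code_cell(source: str) -> dict:
    """A cell holding executable code that has not been run yet."""
    return {
        "cell_type": "code",
        "execution_count": None,  # filled in once the cell has been executed
        "metadata": {},
        "outputs": [],            # results appear here after execution
        "source": source,
    }


notebook = {
    "cells": [
        markdown_cell("# Mean of some values\n\nThe cell below averages made-up numbers."),
        code_cell("values = [1.2, 0.9, 1.5]\nsum(values) / len(values)"),
    ],
    "metadata": {},
    "nbformat": 4,        # major version of the notebook format
    "nbformat_minor": 4,
}

# Serialize to the JSON text that would be stored in an .ipynb file.
notebook_json = json.dumps(notebook, indent=1)
print(notebook_json)
```

Saving the printed JSON with an .ipynb extension should give a file that Jupyter can open, with the annotation and the code it describes kept together.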

Installing Jupyter notebooks