Tech news 8: Sharing code through your own R package

experience
R
package
technews
Published

February 16, 2024

This month, I thought of writing a little bit about sharing code with each other.

Collaborating on code

Let’s imagine having a workshop with a group of colleagues and over the course of the week, some will develop tools and some will use said tools for various analyses. The workflow will most likely be:

  1. let’s define our needs: some functions, the data they take as input and what we all need as output,
  2. some begin writing functions, each their own maybe, and then share them with everyone,
  3. the code is emailed around, saved on a USB key or shared on Dropbox,
    1. users test the functions, find bugs, ask for improvements or more details on the input format, and give feedback,
    2. in the meantime, programmers have fixed some things, changed functions, created others, and back to 2.

It’s hard keeping track of the versions and consistently and easily distributing updates to users. It’s hard distributing documentation, testing functions reproducibly and controlling the dependencies. And if you’re teaching and want to distribute code and data all in one?

Make your life easier, make a package

If the functions are written in a package, documentation and tests can easily be written and processed thanks to RStudio’s built-in tools. The users can install the package from the shared Dropbox or from GitHub/GitLab, and load it with library(yourPackage) as usual. They can access the help documentation you wrote with “?” as they are used to, read examples, enjoy auto-completion of the arguments, etc. When they give you feedback, the version is clear and the code changes can be tested in seconds with a shortcut.
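To illustrate where those help pages come from, here is a minimal sketch of a documented function, assuming roxygen2-style comments (the kind devtools::document() processes). The function count_missing() and its argument are hypothetical, made up for this example.

```r
#' Count missing values
#'
#' Hypothetical example function, not part of any real package.
#'
#' @param x A vector.
#' @return The number of `NA` values in `x`, as an integer.
#' @examples
#' count_missing(c(1, NA, 3))
#' @export
count_missing <- function(x) {
  # is.na() flags each missing element; sum() counts the TRUEs
  sum(is.na(x))
}
```

Once the package is documented and installed, ?count_missing would show this text as a regular help page.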

You can also share data inside the package, and again, the data is tied to the package version, which works like a timestamp and earns you a reproducibility badge. It’s also going to be very easy to access for the users:

library(yourPackage)
dataName
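
One possible way to get a dataset into a package, sketched with base R only. The package folder, the content of dataName and the file name are all placeholders; in a real package, usethis::use_data(dataName) is the usual helper for this step.

```r
# Hypothetical dataset and package folder, for illustration only
dataName <- data.frame(site = c("A", "B"), count = c(12L, 7L))
pkg_dir <- file.path(tempdir(), "yourPackage")

# Datasets shipped with a package live in its data/ folder as .rda files
dir.create(file.path(pkg_dir, "data"), recursive = TRUE, showWarnings = FALSE)
save(dataName, file = file.path(pkg_dir, "data", "dataName.rda"))
```

For dataName to be available right after library(yourPackage), the DESCRIPTION file also needs the line LazyData: true; without it, users would call data(dataName) first.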

How to make a package?

There are so many great (better) resources online! But let me just give a couple of tips:

The absolute minimum that makes an R package an R package is one DESCRIPTION file and one R script, that’s it!
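
To make that concrete, here is a sketch that writes such a bare-bones package to a temporary folder using base R only. The package name, the field values and the say_hello() function are placeholders invented for the example.

```r
# Create a throwaway package folder (name and location are placeholders)
pkg_dir <- file.path(tempdir(), "yourPackage")
dir.create(file.path(pkg_dir, "R"), recursive = TRUE, showWarnings = FALSE)

# The DESCRIPTION file: package metadata in "Field: value" format
writeLines(
  c(
    "Package: yourPackage",
    "Title: Tools Shared During Our Workshop",
    "Version: 0.0.1",
    "Description: Functions and data shared among colleagues."
  ),
  file.path(pkg_dir, "DESCRIPTION")
)

# One R script holding one (hypothetical) function
writeLines(
  "say_hello <- function(name) paste0(\"Hello, \", name, \"!\")",
  file.path(pkg_dir, "R", "say_hello.R")
)

# Read the metadata back to check that the file is well-formed
read.dcf(file.path(pkg_dir, "DESCRIPTION"), fields = c("Package", "Version"))
```

This is only the starting point: R CMD check will ask for more fields, such as a license and an author, before it is happy.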

Begin small, discover the thrill, clarity and comfort of writing documentation and tests for your functions. This will push you to imagine edge cases, e.g. missing values, and to control how your functions handle them. Make a change, Ctrl+Shift+T(est), make another change, Ctrl+Shift+T, etc!
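
As an example of tests that cover edge cases, here is a short sketch assuming the testthat package is installed. The function mean_no_na() is hypothetical and only there to have something to test.

```r
library(testthat)

# Hypothetical function: a mean that drops missing values
# and rejects non-numeric input
mean_no_na <- function(x) {
  if (!is.numeric(x)) stop("`x` must be numeric")
  mean(x, na.rm = TRUE)
}

test_that("missing values are ignored", {
  expect_equal(mean_no_na(c(1, NA, 3)), 2)
})

test_that("non-numeric input is rejected", {
  expect_error(mean_no_na("a"), "must be numeric")
})

test_that("an all-missing vector gives NaN", {
  expect_true(is.nan(mean_no_na(c(NA_real_, NA_real_))))
})
```

In a package, the function would live in the R/ folder and the test_that() blocks in a file under tests/testthat/.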

Take advantage of tools you already have on your computer: RStudio, the devtools and testthat packages for example.

For whatever questions you might have while building, documenting or testing, the first DuckDuckGo result will very likely come from Wickham’s book (https://r-pkgs.org/), it’s just so complete…

Wanna do it differently? Build the tests first to build up the framework of exactly how you want your function to behave, then write the function.
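
A small sketch of this test-first order, again assuming testthat. The name standardise() and the behaviour asked of it are made up for the example: the expectations come first and describe what we want, the function body comes second.

```r
library(testthat)

# Step 1: write down the expected behaviour before any implementation exists.
# Wrapped in a function here only so the whole sketch runs top to bottom.
run_specs <- function() {
  test_that("standardise() centres and scales", {
    z <- standardise(c(2, 4, 6))
    expect_equal(mean(z), 0)
    expect_equal(sd(z), 1)
  })
  test_that("standardise() keeps missing values in place", {
    z <- standardise(c(2, NA, 6))
    expect_true(is.na(z[2]))
    expect_equal(length(z), 3)
  })
}

# Step 2: write the function until the specs pass
standardise <- function(x) {
  (x - mean(x, na.rm = TRUE)) / sd(x, na.rm = TRUE)
}

run_specs()
```

In a real package the wrapper is not needed: the test_that() blocks would sit in a test file, fail at first, and turn green once the function is written.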

Some useful tools

  • devtools::document() will build the documentation: the help pages and the NAMESPACE file, from the roxygen2 comments above your functions
  • devtools::install(), devtools::install("/mydropbox/myPackage") or devtools::install_github("yourRepo/yourPackage") will install the package from source, from wherever the code lives.
  • devtools::test() (what Ctrl+Shift+T runs in RStudio) will run all of your tests and give you encouraging words when they fail, how thoughtful! testthat::test_package("yourPackage") does the same for an installed package.
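
These tools chain together nicely. Below is a sketch of a small helper that runs them in order, assuming devtools is installed; refresh_package() is a hypothetical name and devtools::test() is used here as the source-package test runner.

```r
# Hypothetical helper: `pkg` is the path to the package source folder
refresh_package <- function(pkg = ".") {
  devtools::document(pkg)  # regenerate help pages from the roxygen2 comments
  devtools::test(pkg)      # run the tests found in tests/testthat/
  devtools::install(pkg)   # install the package from source
}
```

Calling refresh_package("/mydropbox/myPackage") would then document, test and install the package in one go.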

I hope this lowered the barrier to trying out building a package. I hope, however, that it did not make it any less impressive!

Best wishes,

Alban