This tutorial will take you through the steps of creating a new package in R.
There are five primary tasks, which correspond to steps 1 through 5 in the script below.
The entire script you need to build a package will look something like this.
# set your working directory
# if you don't want to create
# the package in your default
# working directory
getwd() # default directory, usually my documents
library(devtools)
# step 1
usethis::create_package( "montyhall" )
# step 2 move R script to montyhall/R folder
# after completing documentation fields
# step 3
setwd( "montyhall" )
devtools::document()
# step 4
setwd( ".." )
devtools::install( "montyhall" )
library( montyhall )
create_game()
# step 5: close R and re-open new console
devtools::install_github( "yourGitHubName/montyhall" )
These five steps are explained in detail below. The annotated script explains some of the i’s that need dotting and t’s that need to be crossed while you create a package. For example, different steps are executed while you are inside different folders, so you need to mind your working directories. Roxygen comments are sensitive to formatting, etc. Similar to errors you have encountered when learning how to use RMD documents, these are the types of details that will trip you up the first time you create a package.
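Because each step expects a particular working directory, it can help to confirm where you are before running it. The following is a minimal sketch using base R only; `in_package_root()` and `in_package_parent()` are hypothetical helper names invented for this illustration, not functions from devtools or usethis.

```r
# Hypothetical helpers (not part of devtools or usethis): report where a
# folder sits relative to the package folder.

# TRUE when `path` is the package root, i.e. it contains a DESCRIPTION file
in_package_root <- function( path = getwd() ) {
  file.exists( file.path( path, "DESCRIPTION" ) )
}

# TRUE when `path` is the folder that holds the package folder
in_package_parent <- function( pkg = "montyhall", path = getwd() ) {
  file.exists( file.path( path, pkg, "DESCRIPTION" ) )
}

# Demonstration on a throwaway folder so nothing real is touched
demo <- file.path( tempdir(), "wd-demo" )
dir.create( file.path( demo, "montyhall" ), recursive = TRUE, showWarnings = FALSE )
file.create( file.path( demo, "montyhall", "DESCRIPTION" ) )

in_package_parent( "montyhall", demo )              # TRUE: the place to call install( "montyhall" )
in_package_root( file.path( demo, "montyhall" ) )   # TRUE: the place to call document()
in_package_root( demo )                             # FALSE
```

Called with no arguments, both helpers check your current working directory, so `in_package_root()` is a quick test before `devtools::document()`.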
Once you have completed the process, a key takeaway from the exercise is that packages are NOT hard to create. R has become a successful language because packages are easy to build, which means that barriers to sharing cool ideas are low, which means more people create packages, which means that the R ecosystem becomes dynamic and robust.
R has been much more successful as a social network than as a computer language[*]. Packages are not only useful for thought leaders who want to share innovative solutions with a global community by uploading packages to CRAN. They are also great for amplifying expertise within organizations (senior analysts sharing existing code with junior analysts), for documentation (institutionalization of knowledge that prevents disruption as a result of turnover), and for quality control. In these cases packages might be used internally and not shared broadly. They can be a powerful management tool, not just a tool for elite R users.
ASIDE ON R AS A LANGUAGE
[*]
There is an important caveat to the statement that R
is not successful because of its prowess as a computer language.
Specifically, R is a powerful language designed for data
programming, statistical analysis, and
scientific computing. For a high-level language that is
easy to learn it does A LOT OF THINGS well, which makes it a great
general toolkit for data scientists.
For example, it can utilize object-oriented frameworks, it has packages that optimize speed on specific tasks, and you can run database queries directly in R. However, if object-oriented features are the most important requirement for your project then Java is better suited, if speed is the most important requirement then you might write some of the code in C++, and if database queries are the most important requirement then SQL will be useful. R can do all of the things these other languages can do, but it can’t do them as well as a language designed for a specific purpose.
You will often see the argument made that Python is a better
https://github.com/matloff/R-vs.-Python-for-Data-Science
https://qz.com/1661487/hadley-wickham-on-the-future-of-r-python-and-the-tidyverse
https://towardsdatascience.com/python-vs-r-for-data-science-cf2699dfff4b
When installing packages there are a few file management and path navigation details that you need to understand. This annotated version of the steps above is more explicit than you will eventually need, but it will be helpful your first time through:
ANNOTATED PACKAGE CREATION SCRIPT (right-click and save)
It is recommended to complete this lab in a regular R console, NOT in R Studio.
R Studio exerts more control over file paths and working directories, and its behavior depends on your operating system and R Studio settings, so you may see different behavior from what these instructions describe.
The basic R console is ‘dumb’ - it does not try to guess what you want. That makes it more predictable for tutorials like this.
You can eventually create packages using RMD docs and other R Studio tools that help manage the process. But the steps will be different from this tutorial.
Note, if you need to Google any steps for additional help try to avoid instructions that use R Studio options. Their template for packages is helpful, but it will follow a different process than the one described here.
Note, if you make a mistake or encounter an error, you can always delete the new montyhall folder and start over at Step 1 (just keep a copy of your R script if you have already completed the roxygen comments).
Packages in R are just folders with R files and some extra documentation. They are designed to be minimalist.
You will need to download the Monty Hall Problem functions that we completed during Labs 01 and 02 from the link below. This script contains some roxygen text to get you started with the process of documentation.
DOWNLOAD THE TEMPLATE: monty-hall-pkg.R
Note that you will update this script and place it in the montyhall/R folder after you complete Step 01 and the package skeleton has been created (the working directories for your package are built).
The package code is provided - you just need to complete the documentation and develop the test code for each function (you can adapt these from the unit testing examples in the labs).
Once completed you can move monty-hall-pkg.R to the montyhall/R folder that is created in Step 1.
Note, you should NOT be using RMD files in this assignment. The only files in the R folder should be R scripts with the functions needed for the package and complete roxygen fields. This script becomes the library of functions inside of your new package and the roxygen fields are converted into formal documentation for the package.
# Windows users must also install Rtools from:
# https://cran.r-project.org/bin/windows/Rtools/
install.packages(
c( "devtools", "roxygen2",
"usethis", "testthat",
"knitr" ) )
You can check that you have everything installed and working by running the following code:
## Your system is ready to build packages!
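The code that produced the message above is not shown here. One likely candidate is `devtools::has_devel()`, which tries to compile a small test file and reports whether your build tools work. The sketch below wraps that call defensively; `check_build_setup()` is a hypothetical name, and the choice of `has_devel()` is an assumption rather than something this tutorial confirms.

```r
# Hypothetical wrapper: returns TRUE or FALSE when devtools is installed,
# and NA when it is not installed yet.
check_build_setup <- function() {
  if ( !requireNamespace( "devtools", quietly = TRUE ) ) {
    message( "devtools is not installed yet; run install.packages( 'devtools' ) first." )
    return( NA )
  }
  # has_devel() attempts a small test compilation; any error counts as "not ready"
  tryCatch( isTRUE( devtools::has_devel() ), error = function( e ) FALSE )
}

ready <- check_build_setup()
ready
```

If the result is FALSE on Windows, a missing or outdated Rtools installation is a common cause.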
# set your directory if you want the package
# created in a folder other than the default "documents"
setwd( "some/path/here" )
This example is using the default “documents” directory:
getwd() # documents in this example
usethis::create_package( "montyhall" )
documents # package created inside documents
├─ montyhall # new folder that will appear after create_package()
│ ├─ \R
│ ├─ DESCRIPTION
│ ├─ NAMESPACE
What you should see:
# > usethis::create_package( "montyhall" )
# ✔ Creating 'montyhall/'
# ✔ Setting active project to 'C:/Users/jdlecy/Documents/montyhall'
# ✔ Creating 'R/'
# ✔ Writing 'DESCRIPTION'
# Package: montyhall
# Title: What the Package Does (One Line, Title Case)
# Version: 0.0.0.9000
# Authors@R (parsed):
# * First Last <first.last@example.com> [aut, cre] (<https://orcid.org/YOUR-ORCID-ID>)
# Description: What the package does (one paragraph).
# License: What license it uses
# Encoding: UTF-8
# LazyData: true
# ✔ Writing 'NAMESPACE'
# ✔ Writing 'montyhall.Rproj'
# ✔ Adding '.Rproj.user' to '.gitignore'
# ✔ Adding '^montyhall\\.Rproj$', '^\\.Rproj\\.user$' to '.Rbuildignore'
# ✔ Opening 'montyhall/' in new RStudio session
# ✔ Setting active project to '<no active project>'
You will now have a directory in your current folder called montyhall. This will contain the files you need for your package.
Note, some of these files, like .gitignore, are used by Git and GitHub; depending on your setup they may not appear until after completing Step 05. Files may vary by operating system, but you should at the very least have the R folder, a DESCRIPTION file, and a NAMESPACE file.
Note that usethis::create_package( “montyhall” ) creates a new folder called “montyhall” inside your current directory.
documents # base project folder
├─ montyhall # new folder that will appear after create_package()
│ ├─ \R
│ ├─ DESCRIPTION
│ ├─ NAMESPACE
Do NOT create your own folder called “montyhall”, navigate to that folder, and then call create_package() or you will end up with this scenario:
documents
├─ montyhall # base project folder
│ ├─ montyhall # new folder that will appear after create_package()
│ │ ├─ \R
│ │ ├─ DESCRIPTION
│ │ ├─ NAMESPACE
If you try you should get the following warning message:
usethis::create_package( "montyhall" )
New project 'montyhall' is nested inside an existing project './', which is rarely a good idea.
If this is unexpected, the here package has a function, `here::dr_here()` that reveals why './' is regarded as a project.
Do you want to create anyway?
1: No
2: Absolutely not
3: I agree
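To avoid the nested scenario in the first place, you could check your location before calling create_package(). This is a minimal sketch in base R; `would_nest()` is a hypothetical helper written for this illustration, and it only covers the two cases described above (being inside a folder with the same name, or inside an existing package).

```r
# Hypothetical guard (not a usethis function): TRUE when calling
# create_package( pkg ) from `path` would nest the new package inside a
# folder of the same name or inside an existing package.
would_nest <- function( pkg = "montyhall", path = getwd() ) {
  same.name  <- basename( normalizePath( path, mustWork = FALSE ) ) == pkg
  inside.pkg <- file.exists( file.path( path, "DESCRIPTION" ) )
  same.name || inside.pkg
}

# Demonstration with throwaway paths
empty.dir <- tempfile()
dir.create( empty.dir )

would_nest( "montyhall", file.path( tempdir(), "montyhall" ) )   # TRUE
would_nest( "montyhall", empty.dir )                             # FALSE

# Possible use before Step 1:
# if ( !would_nest( "montyhall" ) ) usethis::create_package( "montyhall" )
```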
Add your roxygen comments to your R scripts:
#' @title
#' Sum of vector elements.
#'
#' @description
#' `sum(x)` returns the sum of all the values present in its arguments.
#'
#' @details
#' This is a generic function: methods can be defined for it directly
#' or via the [Summary] group generic. For this to work properly,
#' the arguments `...` should be unnamed, and dispatch is on the
#' first argument.
#'
#' @param x Numeric, complex, or logical vectors.
#' @param na.rm A logical scalar. Should missing values (including `NaN`)
#' be removed?
#' @return If all inputs are integer and logical, then the output
#' will be an integer. Otherwise it will be a length-one numeric or
#' complex vector.
#'
#' Zero-length vectors have sum 0 by definition. See
#' <http://en.wikipedia.org/wiki/Empty_sum> for more details.
#'
#' @examples
#' sum(1:10)
#' sum(1:5, 6:10)
#' sum(F, F, F, T, T)
#'
#' sum(.Machine$integer.max, 1L)
#' sum(.Machine$integer.max, 1)
#'
#' \dontrun{
#' sum("a")
#' }
sum <- function(..., na.rm = TRUE) {}
Note that good documentation describes all of the arguments needed by the function, including the required data type of each object. It also clearly describes what will be returned when the function runs (the type of object and what it contains).
The information you provide becomes the documentation that appears when you type help("function_name").
Roxygen, unlike R, is sensitive to the number of spaces you use. So don’t alter the formatting of the default comments in your script.
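Applied to this package, a documented function might look something like the sketch below. The function body is an assumption about what create_game() does (it shuffles two goats and one car, which matches the output shown elsewhere in this tutorial); the version in the template you downloaded may differ, and the wording of each field is only illustrative.

```r
#' @title
#' Create a new Monty Hall Problem game.
#'
#' @description
#' `create_game()` generates a new game with two goats and one car
#' placed behind three doors in random order.
#'
#' @details
#' The game setup follows the classic game show: a contestant picks
#' one of three doors, one of which hides a car.
#'
#' @return A length-three character vector giving the prize
#' ("goat" or "car") behind doors 1, 2, and 3.
#'
#' @examples
#' create_game()
#'
#' @export
create_game <- function() {
  # ASSUMPTION: shuffle the three prizes without replacement
  a.game <- sample( x = c( "goat", "goat", "car" ), size = 3, replace = FALSE )
  return( a.game )
}
```

The `@export` tag is what makes the function available to users after they call library( montyhall ), and the `@examples` section is the code that gets executed when the examples are tested.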
Place your documented R scripts into the “R” folder in your package directory, then try:
getwd() # should be '.../Documents/montyhall'
# OTHERWISE NAVIGATE THERE:
# move one level up: setwd( ".." )
# go into a folder: setwd( "montyhall" )
devtools::document()
documents
├─ montyhall # should be inside here now
│ ├─ \R # your R scripts should be in here
│ ├─ DESCRIPTION
│ ├─ NAMESPACE
You should see:
# Updating montyhall documentation
# Updating roxygen version in C:\Users\jdlecy\Documents\montyhall/DESCRIPTION
# Writing NAMESPACE
# Loading montyhall
# Writing create_game.Rd
documents
├─ montyhall
│ ├─ \R
│ ├─ \man # new *.rd files will be here
│ ├─ DESCRIPTION
│ ├─ NAMESPACE
You will see some errors as well if you have not yet finished documenting your functions. Ignore them for now.
Depending upon your OS and your R devtools version you may be required to complete ALL documentation before proceeding to the next steps. At the very least each function should have a title.
You will now have a new folder in your montyhall directory called “man”, short for “manuals”. The documentation files have an .Rd (R documentation) extension. The man folder should contain one .Rd file for each exported function in your script (change_door.Rd, create_game.Rd, etc.).
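A quick way to confirm that every function received a help file is to compare the function names against the files in the man folder. This is a sketch in base R; `missing_rd()` is a hypothetical helper, and it assumes each .Rd file is named after its function, which is the usual default but can be changed with roxygen tags.

```r
# Hypothetical helper: which functions have no matching .Rd file in man/?
missing_rd <- function( fun.names, pkg.dir = "montyhall" ) {
  rd.files   <- list.files( file.path( pkg.dir, "man" ), pattern = "\\.Rd$" )
  documented <- sub( "\\.Rd$", "", rd.files )
  setdiff( fun.names, documented )
}

# Demonstration on a throwaway folder that contains only create_game.Rd
demo.pkg <- file.path( tempdir(), "rd-demo", "montyhall" )
dir.create( file.path( demo.pkg, "man" ), recursive = TRUE, showWarnings = FALSE )
file.create( file.path( demo.pkg, "man", "create_game.Rd" ) )

missing_rd( c( "create_game", "change_door" ), demo.pkg )
# [1] "change_door"
```

An empty result, character(0), means every function you listed has a help file.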
Skip to the testing step below if you want to see if the package is now functional.
Navigate to the main “montyhall” package folder on your computer and open the file called “DESCRIPTION” in a text editor (your computer will have a text editor like Notepad). You will see something like this:
Package: montyhall
Title: What the Package Does (One Line, Title Case)
Version: 0.0.0.9000
Authors@R:
person(given = "First",
family = "Last",
role = c("aut", "cre"),
email = "first.last@example.com",
comment = c(ORCID = "YOUR-ORCID-ID"))
Description: What the package does (one paragraph).
License: What license it uses
Encoding: UTF-8
LazyData: true
RoxygenNote: 6.1.1
Since we use dplyr in the package, we need to add another line to import and attach it. Add:
Depends:
dplyr
Now complete the rest of the fields from “Title” to “Description”.
Package: montyhall
Title: What the Package Does (One Line, Title Case)
Version: 0.0.0.9000
Authors@R:
person(given = "First",
family = "Last",
role = c("aut", "cre"),
email = "first.last@example.com",
comment = c(ORCID = "YOUR-ORCID-ID"))
Description: What the package does (one paragraph).
Depends:
dplyr
License: What license it uses
Encoding: UTF-8
LazyData: true
RoxygenNote: 6.1.1
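Once you have edited the file, you can check whether any fields still contain the default placeholder text. The sketch below reads the file with base R's read.dcf(); `placeholder_fields()` is a hypothetical helper, and the title used in the demonstration is made up for illustration.

```r
# Hypothetical helper: names of DESCRIPTION fields that still hold the
# placeholder text shown in the default file above.
placeholder_fields <- function( path = file.path( "montyhall", "DESCRIPTION" ) ) {
  desc <- read.dcf( path )[ 1, ]     # named character vector of fields
  placeholders <- c(
    Title       = "What the Package Does (One Line, Title Case)",
    Description = "What the package does (one paragraph).",
    License     = "What license it uses" )
  shared <- intersect( names( placeholders ), names( desc ) )
  shared[ desc[ shared ] == placeholders[ shared ] ]
}

# Demonstration on a temporary file: only the title has been completed
tmp <- tempfile()
writeLines( c(
  "Package: montyhall",
  "Title: Simulate the Monty Hall Problem",      # made-up title
  "Version: 0.0.0.9000",
  "Description: What the package does (one paragraph).",
  "Depends:",
  "    dplyr",
  "License: What license it uses" ), tmp )

placeholder_fields( tmp )
# [1] "Description" "License"
```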
Go up one level in your directory so you are outside of the package folder, but in the same folder where the package folder lives.
documents # should be back here
├─ montyhall
│ ├─ R
│ ├─ man
Then run the command to install your new package.
The install() command is looking for a folder called “montyhall”. It can’t find the folder if we are currently inside the folder. Thus we move up one level in the directory first:
setwd( ".." ) # move up one level with two periods
getwd() # should be /documents NOT /montyhall
devtools::install( "montyhall" )
If successful you will see messages like this:
# Updating montyhall documentation
# Updating roxygen version in C:\Users\jdlecy\Documents\montyhall/DESCRIPTION
# Writing NAMESPACE
# Loading montyhall
# Writing create_game.Rd
# > getwd()
# [1] "C:/Users/jdlecy/Documents/montyhall"
# > setwd( ".." )
# > devtools::install( "montyhall" )
# √ checking for file 'C:\Users\jdlecy\Documents\montyhall/DESCRIPTION'
# - preparing 'montyhall':
# √ checking DESCRIPTION meta-information ...
# - checking for LF line-endings in source and make files and shell scripts
# - checking for empty or unneeded directories
# - building 'montyhall_0.0.0.9000.tar.gz'
#
# Running "C:/PROGRA~1/R/R-36~1.1/bin/x64/Rcmd.exe" INSTALL \
# "C:\Users\jdlecy\AppData\Local\Temp\RtmpKetkbm/montyhall_0.0.0.9000.tar.gz" --install-tests
# * installing to library 'C:/Users/jdlecy/Documents/R/win-library/3.6'
# * installing *source* package 'montyhall' ...
# ** using staged installation
# ** R
# ** byte-compile and prepare package for lazy loading
# ** help
# *** installing help indices
# converting help for package 'montyhall'
# finding HTML links ... done
# create_game html
# ** building package indices
# ** testing if installed package can be loaded from temporary location
# *** arch - i386
# *** arch - x64
# ** testing if installed package can be loaded from final location
# *** arch - i386
# *** arch - x64
# ** testing if installed package keeps a record of temporary installation path
# * DONE (montyhall)
In a new R session try:
library( montyhall )
create_game()
## [1] "goat" "car" "goat"
You should be able to preview the help files that you created with your roxygen comments.
If you encounter an error or the package is not working properly, you might need to fix your code and try again.
You should be able to update the package then simply reinstall it. R will recognize that there are changes to the package and will install the most recent version.
If you run into problems you can force a package deletion then install fresh.
# library() attaches the package to the current environment
# detach is the opposite of library() - closes the package
detach( "package:montyhall" ) # closes the package so not locked
remove.packages( "montyhall" ) # deletes from your computer
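Note that detach() throws an error if the package is not currently attached. A slightly more defensive version checks the search path first. This is a sketch; `safe_detach()` is a hypothetical helper name, not a base R function.

```r
# Hypothetical helper: detach a package only if it is currently attached.
# Returns TRUE if the package was detached, FALSE if it was not attached.
safe_detach <- function( pkg = "montyhall" ) {
  search.name <- paste0( "package:", pkg )
  if ( !( search.name %in% search() ) ) return( FALSE )
  detach( search.name, character.only = TRUE, unload = TRUE )
  TRUE
}

# Returns FALSE in a session where montyhall has not been attached
was.detached <- safe_detach( "montyhall" )

# Then, if you want to delete the package:
# remove.packages( "montyhall" )
```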
If that fails, you probably have the package loaded in another R session or have package files locked by editing them in another program.
If all else fails, you can delete the package manually from your personal R packages library, which usually lives in your Documents folder.
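If you are not sure where that library lives, R can tell you. The lines below use only base R functions; the result depends on your machine, so no specific path is shown.

```r
# Folders where R looks for installed packages;
# the first entry is usually your personal library
lib.paths <- .libPaths()
lib.paths

# Folder where montyhall is installed, or character(0) if it is not installed
pkg.path <- find.package( "montyhall", quiet = TRUE )
pkg.path
```

If `pkg.path` is not empty, that is the folder you would delete by hand.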
I will grade assignments by running the script below, which generates a report with printouts of all of the help file text that you added, and also executes all of the test code included in the examples sections. I will look to ensure you have added documentation for all functions and that the examples are able to execute without error.
You can check your own work ahead of time by generating the report using the following script after adding your GitHub username.
pkgs <- c("tools","devtools","purrr","rmarkdown")
install.packages( pkgs )
wd <- getwd()
dir.create("MONTY")
################
git.hub.name <- "yourGitHubName" # replace with YOUR GITHUB USERNAME
###############
## GENERATE PACKAGE REPORT
filepath <- paste0( wd, "/MONTY/montyhall-test-", toupper(git.hub.name), ".HTML" )
download.file(
url="https://raw.githubusercontent.com/Watts-College/paf-514-template/main/labs/create-r-package-test.rmd",
destfile="./MONTY/create-r-package-test.rmd" )
rmarkdown::render(
input = "./MONTY/create-r-package-test.rmd",
output_file = filepath,
params = list( name=git.hub.name ) )
# file location
filepath
# preview file
browseURL( filepath ) # on Windows, shell( filepath ) also works
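Before generating the full report, you may want to confirm that the examples for a single function run cleanly. The sketch below uses example() from the utils package on the installed package; `examples_run()` is a hypothetical helper, and it is a rough local check rather than the grading procedure itself.

```r
# Hypothetical helper: run the examples of one help topic from an installed package.
# Returns TRUE if they ran without an error or warning, FALSE if an error or
# warning occurred (including "no help found"), and NA if the package is not installed.
examples_run <- function( topic, pkg = "montyhall" ) {
  if ( !requireNamespace( pkg, quietly = TRUE ) ) return( NA )
  tryCatch( {
    example( topic, package = pkg, character.only = TRUE, ask = FALSE )
    TRUE
  },
  warning = function( w ) FALSE,
  error   = function( e ) FALSE )
}

examples_run( "create_game" )
```

Treating warnings as failures is a deliberately strict choice here, since a missing help page produces a warning rather than an error.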
These are a few good resources for reference:
A nice tutorial by Fong Chun Chan
The official R Packages book by Hadley Wickham and Jenny Bryan