I have provided you with a LAB-01 RMD template:
You will use the following functions for this lab:
names() # variable names
head() # preview dataset
length() # vector length (number of elements)
dim(), nrow(), ncol() # dataset dimensions
sum(), summary() # summarize numeric vectors
table() # summarize factors / character vectors
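As a quick orientation, the functions above can be tried on a small made-up data frame before applying them to the real data. This is only a sketch: the object `toy` and its values are hypothetical and are not part of the Syracuse dataset.

```r
# Hypothetical toy data frame, used only to illustrate the functions listed above
toy <- data.frame(
  acres = c( 0.25, 0.50, 1.00 ),
  land_use = c( "Single Family", "Vacant Land", "Single Family" ),
  stringsAsFactors = FALSE
)

names( toy )            # variable names
head( toy, 2 )          # preview the first two rows
length( toy$acres )     # number of elements in one vector
dim( toy )              # number of rows and columns together
nrow( toy )             # number of rows
ncol( toy )             # number of columns
sum( toy$acres )        # total of a numeric vector
summary( toy$acres )    # min, quartiles, mean, and max
table( toy$land_use )   # count of rows in each category
```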
This lab uses city tax parcel data from Syracuse, NY. [ Data Dictionary ]
You can load the dataset by including the following code chunk in your file:
URL <- "https://raw.githubusercontent.com/DS4PS/Data-Science-Class/master/DATA/syr_parcels.csv"
dat <- read.csv( URL, stringsAsFactors=FALSE )
Note that referencing variables in R requires both the dataset name and variable name, separated by the $ operator:
summary( dat$acres )
Unlike other stats programs, you can have several datasets loaded at
the same time in R. They will often have variables with the same name
(if you create a subset, for example, and save it as a new object, you
will have two datasets with identical variable names). To avoid
conflicts, R forces you to use the dataset$variable convention.
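The following sketch shows why the convention matters. The objects `dat.all` and `dat.big` and their values are hypothetical, and `subset()` is used here only as one convenient way to create a second dataset with the same variable name.

```r
# Hypothetical example: a dataset and a subset of it share the variable name "acres"
dat.all <- data.frame( acres = c( 0.2, 1.5, 3.0, 0.4 ) )
dat.big <- subset( dat.all, acres > 1 )   # subset saved as a new object

sum( dat.all$acres )   # total acres in the full dataset
sum( dat.big$acres )   # total acres in the subset only
```

Both objects contain a variable called acres, so the dataset name in front of the $ is what tells R which one you mean.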
Answer the following questions using the Syracuse parcels dataset and the functions listed.
Your solution should include a written response to the question, as well as the code used to generate the result.
dataset dimensions: dim() or nrow()
sum() over the numeric acres vector
sum() over the vacantbuil logical vector
sum() plus length() functions with the logical tax.exempt vector
table() with the neighborhood variable
table() with the neighborhood and land_use variables
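To illustrate the difference between the last two hints, here is a minimal sketch of table() with one variable and with two. The data frame `toy` and its values are made up for illustration and are not taken from the Syracuse data.

```r
# Hypothetical values, not taken from the Syracuse parcels data
toy <- data.frame(
  neighborhood = c( "A", "A", "B", "B", "B" ),
  land_use = c( "Vacant", "Home", "Home", "Home", "Vacant" ),
  stringsAsFactors = FALSE
)

# One variable: number of rows in each neighborhood
table( toy$neighborhood )

# Two variables: neighborhoods in rows, land uses in columns
table( toy$neighborhood, toy$land_use )
```

With two variables, each cell of the result counts the rows that share that combination of values.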
HELPFUL HINTS:
When you apply a sum() function to a numeric vector it returns the sum of all elements in the vector.
sum( c(10,20,5) ) # 35
When you apply a sum() function to a logical vector, it will count all of the TRUEs:
x <- c( TRUE, TRUE, FALSE, FALSE, FALSE )
sum( x ) # number of TRUEs
sum( x ) / length( x ) # proportion of TRUEs
R wants to make sure you are aware of missing values, so it will return NA (not available) for functions performed on vectors with missing values.
Add the ‘NA remove’ argument (na.rm=TRUE) to functions to ignore missing values:
sum( dat$star, na.rm=TRUE )
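A small made-up vector shows the behavior directly. The vector `y` is hypothetical, and is.na() is an extra base R function, not one of the functions listed for this lab, included only to show how missing values can be counted.

```r
y <- c( 10, 20, NA, 5 )   # hypothetical vector with one missing value

sum( y )                  # NA, because one element is missing
sum( y, na.rm=TRUE )      # 35, the missing value is ignored
sum( is.na( y ) )         # 1, the number of missing values
```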
Use the following instructions to submit your assignment, which may vary depending on your course’s platform.
When you have completed your assignment, click the “Knit” button to render your .RMD file into a .HTML report.
Perform one of the following, depending on your course’s platform:
Upload your .RMD and .HTML files to the appropriate link.
Bundle your .RMD and .HTML files in a .ZIP file and upload it to the appropriate link.
.HTML files are preferred but not allowed by all platforms.
Remember to ensure the following before submitting your assignment.
head()
See Google’s R Style Guide for examples of common conventions.
.RMD files are knit into .HTML and other formats procedurally, or line by line.
install.packages() or setwd() are bound to cause errors in knitting.
Make sure any package you use is loaded with library() in a previous chunk.
If All Else Fails: If you cannot determine and fix the errors in a code chunk that’s preventing you from knitting your document, add eval = FALSE inside the brackets of {r} at the beginning of a chunk to ensure that R does not attempt to evaluate it, that is: {r eval = FALSE}. This will prevent an erroneous chunk of code from halting the knitting process.