In this lab you will practice working with the US census data searching for variables, loading them into R, performing basic calculations and data wrangling tasks, visualizing the data with figures and maps, and offer substantive feedback on the data analysis.
The topic for our analysis will be to compare the house-price-to-income ratio across US counties and over time. The ratio tells the number of years it would take for the median income household to buy the median household price. Under healthy economic conditions, the rule of thumb is that a buyer can afford a house if its price is equivalent to a house-price-to-income ratio of 2.6. Read the Citylab report report by Richard Florida for more background on the ratio and its importance and US county rankings.
For loading, manipulating, and plotting Census data in R, refer to Lecture 2 video and notes, and in particular, the part of lecture that introduces tidycensus
. In that portion of lecture, the exact codes are already given for you. All you need to do typically is to change variable name or year value, and re-run the code.
Remember to set census_api_key("your_key_here")
to access data
You can get a Census API Key at: https://api.census.gov/data/key_signup.html
Use tidycensus::get_acs()
to download information on median household income and median household value for ACS 5-year estimate, 2013-2017 period. The notation tidycensus::get_acs()
means the get_acs()
function is located inside the tidycensus
package, i.e. you need that package loaded to use the function.
Call the dataset CenDF
Notes:
Use load_variables
to view and search for variables. In filter, search for Median value
and use first row to get median household price variable name.
Remember that year=2017
and survey="acs5"
reports 5-year estimates for the last five years from the specified date (i.e. year=2017 uses data from 2013-2017).
Use spread()
to transform the dataset from long to wide and save the dataset back into CenDF
Use mutate()
to create the ratio of median household value divided by household median income, and call it HHInc_HousePrice_Ratio
Notes
HHIncome
and HouseValue
. You can do this using mutate()
and case_when()
.MOE
variable using select()
Use base R function order()
and rev(order)
to explore which counties have the lowest and highest house-price-to-income ratio.
CenDF[ order(CenDF$HHInc_HousePrice_Ratio) , ]
sorts the data by HHInc_HousePrice_Ratio low to highCenDF[ rev(order(CenDF$HHInc_HousePrice_Ratio)) , ]
sorts the data by HHInc_HousePrice_Ratio high to lowThere is always more than one way to accomplish the same task in R. For example, to get the lowest and highest house-price-to-income ratios
arrange( CenDF, HHInc_HousePrice_Ratio )
sorts the df low to higharrange( CenDF, desc(HHInc_HousePrice_Ratio) )
sorts the df high to lowBonus: Use datatable()
function in the DT
library to get interactive table that allows you to order columns
datatable(CenDF)
1a. In step 3 above, what is the county with the highest (and lowest) house-price-to-income ratio? What is the average house-price-to-income ratio across all US counties over the time period?
- Answer:
1b. Where is Maricopa County on the list? How many years of median income will it take to buy a home in Maricopa County?
- Answer:
1c. Where is Los Angeles on the list and how does its ranking compare to the list reported in the Citylab report report by Richard Florida?
- Answer:
1a. Go back to Step 1 above, and enter in 2012 as the year value, which refers to the 2008-2012 5-year ACS estimate. The time period corresponds to the aftermath of the 2007/08 Great Financial Crisis. Repeat Steps 2-4, and re-answer Question 1 above for the 2008-2012 time period.
- Answer:
2b. Compare and contrast the findings over the different time periods. Calculate the change in the housing value-to-income ratio from 2008-2012 to 2013-2017. What is the change? Did the average house-price-to-income ratio increase or decrease? Did the house-price-to-income ratio increase (or decrease) for Maricopa County and Los Angeles county?
- Answer:
3a. Go back to Step 1 above and keep 2017 as the year value. This time change geography value from county
to tract.
Also add state = "AZ"
and county = "Maricopa County"
within the get_acs
function. Repeat Steps 2-4. What are the summary statistics (min, max, median, mean, sd) for the house-price-to-income ratio in Maricopa county?
- Answer:
3b. Compare and contrast the findings for the different levels of geography. How does the minimum, maxim, and mean value for census tracts within Maricopa county compare to results found in Question 1 looking at county-level data?
- Answer: