Introduction

This report uses a variety of statistical techniques to investigate the impact of tax credit programs on social vulnerability. The project uses the CDC/ATSDR Social Vulnerability Index (SVI) to assess geographic risk from hazards based on social characteristics. The SVI uses 16 indicators from the American Community Survey to calculate scores for four themes and for overall vulnerability. The project also looks at two tax credit programs. The New Markets Tax Credit (NMTC) Program incentivizes business investment in disadvantaged neighborhoods. The Low-Income Housing Tax Credit (LIHTC) Program incentivizes the development of affordable housing.

The project uses several analytical methods, including waffle charts, choropleth mapping, bivariate mapping, correlation analyses, k-means clustering, and difference-in-differences regression analysis. Although project awards were strongly correlated with social vulnerability flags, the regression models did not find a statistically significant effect of the tax credit programs on social vulnerability. Participation in the NMTC Program was, however, associated with a statistically significant increase in median income. More research is needed.

Data

The SVI uses data from the American Community Survey by the U.S. Census Bureau. The national SVI data sets contained 73,057 census tracts each. The Pacific Division contains 10,867 census tracts. Census tracts in Los Angeles County and Fresno County were consistently among the most vulnerable geographies. Similarly, California was the most vulnerable state.

This project looks specifically at census tracts that had not previously received tax dollars, so that the tax credits could be studied as the treatment. A total of 4,237 tracts were eligible for the NMTC, and 584 tracts were eligible for the LIHTC. For the diff-in-diff models, tracts were grouped at the core-based statistical area (CBSA) level. For clustering, correlation, and mapping, tracts were grouped at the county level.

The project also looked at the House Price Index (HPI), median income, and median home value. Median income and home value are available from the U.S. Census Bureau, and the HPI comes from the Federal Housing Finance Agency. The HPI tracks changes in the prices of single-family homes.

Analysis

The project uses several analytical methods, including waffle charts, choropleth mapping, bivariate mapping, correlation analyses, k-means clustering, and difference-in-differences regression analysis.

Waffle charts visually explore the SVI data using proportions. The choropleth maps display the number of SVI flags per 1,000 people in the population, a field calculated from population and flag counts. This adjusts for population variation and enables fairer comparisons between geographies. The bivariate maps show the relationship between 2010 SVI flag counts and tax credit project award totals. These maps highlight geographic clusters with similar characteristics, unique geographies, and areas of greatest interest (high flag count/high award totals, low flag count/high award totals, etc.).
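The flags-per-1,000 field is a single line of arithmetic. The sketch below is illustrative Python rather than the project's own R code; the function name and the sample geographies are hypothetical.

```python
def flags_per_1000(flag_count, population):
    """Return SVI flags per 1,000 residents, or None when population is not positive."""
    if population <= 0:
        return None  # the ratio is undefined for an unpopulated geography
    return flag_count * 1000 / population


# Hypothetical geographies: (name, total SVI flags, total population)
sample = [
    ("County A", 120, 60_000),
    ("County B", 120, 600_000),
    ("County C", 5, 0),
]

for name, flags, population in sample:
    print(name, flags_per_1000(flags, population))
```

Counties A and B carry the same raw flag count, but the smaller county has ten times the flags per 1,000 residents, which is the adjustment the choropleth maps rely on.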

The correlation analyses examine the strength and direction of relationships between SVI flag counts and project award totals. K-means clustering investigates relationships among the data points themselves, identifying groups of geographies with similar attributes. An elbow plot is used to select the number of clusters for the final k-means solution.
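To show what an elbow plot summarizes, the sketch below runs a plain k-means (Lloyd's algorithm) for several values of k and records the within-cluster sum of squares (WSS). It is illustrative Python, not the project's R code. The data points are made up, and the deterministic starting centroids are an assumption chosen for reproducibility; library implementations usually use random or k-means++ starts.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(cluster):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(coord) / len(cluster) for coord in zip(*cluster))


def kmeans(points, k, iterations=50):
    """Plain Lloyd's algorithm; returns (centroids, within-cluster sum of squares)."""
    ordered = sorted(points)
    n = len(ordered)
    # Assumption: start from k evenly spaced points of the sorted data.
    centroids = [ordered[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Keep the old centroid if a cluster ends up empty.
        updated = [mean_point(c) if c else centroids[i] for i, c in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    wss = sum(min(sq_dist(p, c) for c in centroids) for p in points)
    return centroids, wss


# Hypothetical two-attribute data with three visible groups.
points = [
    (1, 1), (1, 2), (2, 1), (2, 2),
    (10, 10), (10, 11), (11, 10), (11, 11),
    (20, 1), (20, 2), (21, 1), (21, 2),
]

wss_by_k = {k: kmeans(points, k)[1] for k in range(1, 5)}
for k, wss in wss_by_k.items():
    print(k, round(wss, 2))
```

In this toy data, WSS falls sharply up to k = 3 and barely changes afterward; that bend is the "elbow" used to choose the number of clusters.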

Lastly, a difference-in-differences regression model is used to study the difference in trends over time between counties that received tax credit project awards (treatment/policy group) and counties that did not (control group). The model assumes that, in the absence of project dollars, the two groups would have followed similar trends. This parallel-trends assumption is what allows the counterfactual and the program effect to be estimated.
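The counterfactual logic can be made concrete with the simplest two-group, two-period case. The sketch below is illustrative Python with made-up numbers, not the project's R code. The regression models reported in the Results section also include a cbsa term, so their estimates are not computed this way; the group-mean version shows only the underlying idea.

```python
from statistics import mean

# Hypothetical (treat, post, outcome) records; the values are made up for illustration.
records = [
    (0, 0, 4.0), (0, 0, 6.0),    # control group, 2010
    (0, 1, 5.0), (0, 1, 7.0),    # control group, 2020
    (1, 0, 8.0), (1, 0, 10.0),   # treatment group, 2010
    (1, 1, 8.0), (1, 1, 9.0),    # treatment group, 2020
]


def group_mean(rows, treat, post):
    """Mean outcome for one treatment-by-period cell."""
    return mean(y for t, p, y in rows if t == treat and p == post)


def diff_in_diff(rows):
    """Return (program effect, counterfactual treated mean in the post period)."""
    control_change = group_mean(rows, 0, 1) - group_mean(rows, 0, 0)
    treated_change = group_mean(rows, 1, 1) - group_mean(rows, 1, 0)
    # Parallel trends: without the program, the treated group would have
    # changed by the same amount as the control group.
    counterfactual = group_mean(rows, 1, 0) + control_change
    return treated_change - control_change, counterfactual


effect, counterfactual = diff_in_diff(records)
print("counterfactual treated mean (post):", counterfactual)
print("difference-in-differences estimate:", effect)
```

In a regression of the form outcome ~ treat + post + treat * post with no other covariates, the coefficient on the interaction term equals this difference of differences.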

Results

NMTC Diff-In-Diff Models

Socioeconomic SVI

We fitted a linear model (estimated using OLS) to predict SVI_FLAG_COUNT_SES with treat, post, and cbsa (formula: SVI_FLAG_COUNT_SES ~ treat + post + treat * post + cbsa), where treat indicates NMTC program participation, post indicates the 2020 observation (relative to the 2010 baseline), and cbsa controls for metro-level effects.

The model explains a statistically significant and substantial proportion of variance (R² = 0.27, F(69, 8170) = 43.83, p < .001, adj. R² = 0.26).

The effect of treat × post is statistically non-significant and negative (beta = -0.05, 95% CI [-0.32, 0.22], t(8170) = -0.35, p = 0.724; Std. beta = -3.34e-03, 95% CI [-0.02, 0.02]).

Because the effect of treat × post is not statistically significant, we cannot conclude that the NMTC program had a measurable impact on social vulnerability related to socioeconomic status.
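As a consistency check on the fit statistics reported for this model, adjusted R² and the F statistic can be approximately recovered from R² and the degrees of freedom. The sketch below is illustrative Python, not part of the project's R workflow. It assumes the model includes an intercept, and because it starts from the rounded R² of 0.27, small differences from the reported values are expected.

```python
def adjusted_r2(r2, df_model, df_resid):
    """Adjusted R-squared from R-squared and the F-test degrees of freedom."""
    n_minus_1 = df_model + df_resid  # n - 1, assuming an intercept is fitted
    return 1 - (1 - r2) * n_minus_1 / df_resid


def f_statistic(r2, df_model, df_resid):
    """Overall F statistic implied by R-squared."""
    return (r2 / df_model) / ((1 - r2) / df_resid)


# Values reported above: R2 = 0.27 with F(69, 8170).
r2, df_model, df_resid = 0.27, 69, 8170
print(round(adjusted_r2(r2, df_model, df_resid), 2))
print(round(f_statistic(r2, df_model, df_resid), 2))
```

The adjusted value rounds to 0.26, matching the report, and the implied F statistic is close to the reported 43.83.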

Household Characteristics SVI

We fitted a linear model (estimated using OLS) to predict SVI_FLAG_COUNT_HHCHAR with treat, post, and cbsa (formula: SVI_FLAG_COUNT_HHCHAR ~ treat + post + treat * post + cbsa), where treat indicates NMTC program participation, post indicates the 2020 observation (relative to the 2010 baseline), and cbsa controls for metro-level effects.

The model explains a statistically significant and weak proportion of variance (R² = 0.11, F(69, 8170) = 14.88, p < .001, adj. R² = 0.10).

The effect of treat × post is statistically non-significant and negative (beta = -0.05, 95% CI [-0.25, 0.15], t(8170) = -0.48, p = 0.634; Std. beta = -4.97e-03, 95% CI [-0.03, 0.02]).

Because the effect of treat × post is not statistically significant, we cannot conclude that the NMTC program had a measurable impact on social vulnerability related to household characteristics.

Racial and Ethnic Minority SVI

We fitted a linear model (estimated using OLS) to predict SVI_FLAG_COUNT_REM with treat, post, and cbsa (formula: SVI_FLAG_COUNT_REM ~ treat + post + treat * post + cbsa), where treat indicates NMTC program participation, post indicates the 2020 observation (relative to the 2010 baseline), and cbsa controls for metro-level effects.

The model explains a statistically significant and substantial proportion of variance (R² = 0.30, F(69, 8170) = 51.08, p < .001, adj. R² = 0.30).

The effect of treat × post is statistically non-significant and negative (beta = -0.01, 95% CI [-0.09, 0.07], t(8170) = -0.32, p = 0.749; Std. beta = -2.96e-03, 95% CI [-0.02, 0.02]).

Because the effect of treat × post is not statistically significant, we cannot conclude that the NMTC program had a measurable impact on social vulnerability related to racial and ethnic minority status.

Housing and Transportation SVI

We fitted a linear model (estimated using OLS) to predict SVI_FLAG_COUNT_HOUSETRANSPT with treat, post, and cbsa (formula: SVI_FLAG_COUNT_HOUSETRANSPT ~ treat + post + treat * post + cbsa), where treat indicates NMTC program participation, post indicates the 2020 observation (relative to the 2010 baseline), and cbsa controls for metro-level effects.

The model explains a statistically significant and weak proportion of variance (R² = 0.06, F(69, 8170) = 7.56, p < .001, adj. R² = 0.05).

The effect of treat × post is statistically non-significant and negative (beta = -0.01, 95% CI [-0.22, 0.19], t(8170) = -0.12, p = 0.901; Std. beta = -1.33e-03, 95% CI [-0.02, 0.02]).

Because the effect of treat × post is not statistically significant, we cannot conclude that the NMTC program had a measurable impact on social vulnerability related to housing and transportation access.

Overall SVI

We fitted a linear model (estimated using OLS) to predict SVI_FLAG_COUNT_OVERALL with treat, post, and cbsa (formula: SVI_FLAG_COUNT_OVERALL ~ treat + post + treat * post + cbsa), where treat indicates NMTC program participation, post indicates the 2020 observation (relative to the 2010 baseline), and cbsa controls for metro-level effects.

The model explains a statistically significant and moderate proportion of variance (R² = 0.23, F(69, 8170) = 36.00, p < .001, adj. R² = 0.23).

The effect of treat × post is statistically non-significant and negative (beta = -0.12, 95% CI [-0.67, 0.42], t(8170) = -0.44, p = 0.659; Std. beta = -4.27e-03, 95% CI [-0.02, 0.01]).

Because the effect of treat × post is not statistically significant, we cannot conclude that the NMTC program had a measurable impact on overall social vulnerability (the socioeconomic, household characteristics, racial and ethnic minority status, and housing and transportation themes combined).

Median Income Economic Outcomes

We fitted a linear model (estimated using OLS) to predict MEDIAN_INCOME with treat, post, and cbsa (formula: MEDIAN_INCOME ~ treat + post + treat * post + cbsa), where treat indicates NMTC program participation, post indicates the 2020 observation (relative to the 2010 baseline), and cbsa controls for metro-level effects.

The model explains a statistically significant and moderate proportion of variance (R² = 0.18, F(69, 8166) = 26.31, p < .001, adj. R² = 0.17).

The effect of treat × post is statistically significant and positive (beta = 0.06, 95% CI [4.76e-03, 0.11], t(8166) = 2.14, p = 0.033; Std. beta = 0.02, 95% CI [1.77e-03, 0.04]).

Because the effect of treat × post is statistically significant, the results indicate that NMTC program participation was associated with a measurable increase in median income.

Median Home Value Economic Outcomes

We fitted a linear model (estimated using OLS) to predict MEDIAN_HOME_VALUE with treat, post, and cbsa (formula: MEDIAN_HOME_VALUE ~ treat + post + treat * post + cbsa), where treat indicates NMTC program participation, post indicates the 2020 observation (relative to the 2010 baseline), and cbsa controls for metro-level effects.

The model explains a statistically significant and substantial proportion of variance (R² = 0.42, F(68, 7847) = 82.65, p < .001, adj. R² = 0.41).

The effect of treat × post is statistically non-significant and positive (beta = 2.91e-03, 95% CI [-0.08, 0.08], t(7847) = 0.07, p = 0.943; Std. beta = 6.13e-04, 95% CI [-0.02, 0.02]).

Because the effect of treat × post is not statistically significant, we cannot conclude that the NMTC program had a measurable impact on median home value.

House Price Index Economic Outcomes

We fitted a linear model (estimated using OLS) to predict HOUSE_PRICE_INDEX with treat, post, and cbsa (formula: HOUSE_PRICE_INDEX ~ treat + post + treat * post + cbsa), where treat indicates NMTC program participation, post indicates the 2020 observation (relative to the 2010 baseline), and cbsa controls for metro-level effects.

The model explains a statistically significant and substantial proportion of variance (R² = 0.49, F(67, 5458) = 77.52, p < .001, adj. R² = 0.48).

The effect of treat × post is statistically non-significant and negative (beta = -0.02, 95% CI [-0.12, 0.09], t(5458) = -0.36, p = 0.716; Std. beta = -3.52e-03, 95% CI [-0.02, 0.02]).

Because the effect of treat × post is not statistically significant, we cannot conclude that the NMTC program had a measurable impact on the House Price Index.

LIHTC Diff-In-Diff Models

Socioeconomic SVI

We fitted a linear model (estimated using OLS) to predict SVI_FLAG_COUNT_SES with treat, post, and cbsa (formula: SVI_FLAG_COUNT_SES ~ treat + post + treat * post + cbsa), where treat indicates LIHTC program participation, post indicates the 2020 observation (relative to the 2010 baseline), and cbsa controls for metro-level effects.

The model explains a statistically significant and moderate proportion of variance (R² = 0.24, F(35, 1110) = 10.06, p < .001, adj. R² = 0.22).

The effect of treat × post is statistically non-significant and positive (beta = 7.39e-03, 95% CI [-0.32, 0.34], t(1110) = 0.04, p = 0.965; Std. beta = 1.15e-03, 95% CI [-0.05, 0.05]).

Because the effect of treat × post is not statistically significant, we cannot conclude that the LIHTC program had a measurable impact on social vulnerability related to socioeconomic status.

Household Characteristics SVI

We fitted a linear model (estimated using OLS) to predict SVI_FLAG_COUNT_HHCHAR with treat, post, and cbsa (formula: SVI_FLAG_COUNT_HHCHAR ~ treat + post + treat * post + cbsa), where treat indicates LIHTC program participation, post indicates the 2020 observation (relative to the 2010 baseline), and cbsa controls for metro-level effects.

The model explains a statistically significant and moderate proportion of variance (R² = 0.21, F(35, 1110) = 8.59, p < .001, adj. R² = 0.19).

The effect of treat × post is statistically non-significant and negative (beta = -0.05, 95% CI [-0.33, 0.23], t(1110) = -0.36, p = 0.720; Std. beta = -9.57e-03, 95% CI [-0.06, 0.04]).

Because the effect of treat × post is not statistically significant, we cannot conclude that the LIHTC program had a measurable impact on social vulnerability related to household characteristics.

Racial and Ethnic Minority SVI

We fitted a linear model (estimated using OLS) to predict SVI_FLAG_COUNT_REM with treat, post, and cbsa (formula: SVI_FLAG_COUNT_REM ~ treat + post + treat * post + cbsa), where treat indicates LIHTC program participation, post indicates the 2020 observation (relative to the 2010 baseline), and cbsa controls for metro-level effects.

The model explains a statistically significant and substantial proportion of variance (R² = 0.33, F(35, 1110) = 15.92, p < .001, adj. R² = 0.31).

The effect of treat × post is statistically non-significant and positive (beta = 8.81e-03, 95% CI [-0.10, 0.12], t(1110) = 0.16, p = 0.877; Std. beta = 3.80e-03, 95% CI [-0.04, 0.05]).

Because the effect of treat × post is not statistically significant, we cannot conclude that the LIHTC program had a measurable impact on social vulnerability related to racial and ethnic minority status.

Housing and Transportation SVI

We fitted a linear model (estimated using OLS) to predict SVI_FLAG_COUNT_HOUSETRANSPT with treat, post, and cbsa (formula: SVI_FLAG_COUNT_HOUSETRANSPT ~ treat + post + treat * post + cbsa), where treat indicates LIHTC program participation, post indicates the 2020 observation (relative to the 2010 baseline), and cbsa controls for metro-level effects.

The model explains a statistically significant and weak proportion of variance (R² = 0.06, F(35, 1110) = 2.11, p < .001, adj. R² = 0.03).

The effect of treat × post is statistically non-significant and positive (beta = 0.10, 95% CI [-0.17, 0.38], t(1110) = 0.73, p = 0.463; Std. beta = 0.02, 95% CI [-0.04, 0.08]).

Because the effect of treat × post is not statistically significant, we cannot conclude that the LIHTC program had a measurable impact on social vulnerability related to housing and transportation access.

Overall SVI

We fitted a linear model (estimated using OLS) to predict SVI_FLAG_COUNT_OVERALL with treat, post, and cbsa (formula: SVI_FLAG_COUNT_OVERALL ~ treat + post + treat * post + cbsa), where treat indicates LIHTC program participation, post indicates the 2020 observation (relative to the 2010 baseline), and cbsa controls for metro-level effects.

The model explains a statistically significant and substantial proportion of variance (R² = 0.27, F(35, 1110) = 11.51, p < .001, adj. R² = 0.24).

The effect of treat × post is statistically non-significant and positive (beta = 0.07, 95% CI [-0.60, 0.73], t(1110) = 0.20, p = 0.842; Std. beta = 5.14e-03, 95% CI [-0.05, 0.06]).

Because the effect of treat × post is not statistically significant, we cannot conclude that the LIHTC program had a measurable impact on overall social vulnerability (the socioeconomic, household characteristics, racial and ethnic minority status, and housing and transportation themes combined).

Median Income Economic Outcomes

We fitted a linear model (estimated using OLS) to predict MEDIAN_INCOME with treat, post, and cbsa (formula: MEDIAN_INCOME ~ treat + post + treat * post + cbsa), where treat indicates LIHTC program participation, post indicates the 2020 observation (relative to the 2010 baseline), and cbsa controls for metro-level effects.

The model explains a statistically significant and moderate proportion of variance (R² = 0.23, F(35, 1110) = 9.35, p < .001, adj. R² = 0.20).

The effect of treat × post is statistically non-significant and positive (beta = 0.02, 95% CI [-0.07, 0.10], t(1110) = 0.40, p = 0.688; Std. beta = 0.01, 95% CI [-0.04, 0.06]).

Because the effect of treat × post is not statistically significant, we cannot conclude that the LIHTC program had a measurable impact on median income.

Median Home Value Economic Outcomes

We fitted a linear model (estimated using OLS) to predict MEDIAN_HOME_VALUE with treat, post, and cbsa (formula: MEDIAN_HOME_VALUE ~ treat + post + treat * post + cbsa), where treat indicates LIHTC program participation, post indicates the 2020 observation (relative to the 2010 baseline), and cbsa controls for metro-level effects.

The model explains a statistically significant and substantial proportion of variance (R² = 0.35, F(34, 1031) = 16.30, p < .001, adj. R² = 0.33).

The effect of treat × post is statistically non-significant and positive (beta = 2.73e-03, 95% CI [-0.14, 0.15], t(1031) = 0.04, p = 0.970; Std. beta = 9.33e-04, 95% CI [-0.05, 0.05]).

Because the effect of treat × post is not statistically significant, we cannot conclude that the LIHTC program had a measurable impact on median home value.

House Price Index Economic Outcomes

We fitted a linear model (estimated using OLS) to predict HOUSE_PRICE_INDEX with treat, post, and cbsa (formula: HOUSE_PRICE_INDEX ~ treat + post + treat * post + cbsa), where treat indicates LIHTC program participation, post indicates the 2020 observation (relative to the 2010 baseline), and cbsa controls for metro-level effects.

The model explains a statistically significant and substantial proportion of variance (R² = 0.55, F(31, 586) = 22.81, p < .001, adj. R² = 0.52).

The effect of treat × post is statistically non-significant and negative (beta = -0.05, 95% CI [-0.20, 0.09], t(586) = -0.70, p = 0.483; Std. beta = -0.02, 95% CI [-0.07, 0.04]).

Because the effect of treat × post is not statistically significant, we cannot conclude that the LIHTC program had a measurable impact on the House Price Index.

Discussion and Recommendations

The waffle charts revealed unique characteristics of the Pacific Division. Despite a decline in the number of cost-burdened households, the Pacific Division has a higher rate of cost burden than the nation. Additionally, more people in the division live in crowded living spaces compared to the nation. This was the first indicator that the Pacific Division may use the tax credit programs more frequently than other divisions.

The SVI flag-to-population ratio maps also show interesting trends. In 2010, Alaska had the densest concentration of counties with high ratios. The Yukon-Koyukuk Census Area ranked highest in the division. Alaska still had the densest concentration of counties with high ratios in 2020, but Yakutat City and Borough took over the top spot in the division.

For the NMTC Program, the correlation between flag count and award totals is very strong and positive, meaning counties with more social vulnerability flags in 2010 received more NMTC dollars. The Pacific Division contains several outliers, and Los Angeles County (LAC) is an influential data point. Excluding LAC drops the correlation meaningfully but still returns a moderate, positive correlation. For the LIHTC Program, the correlation between flag count and award totals is very strong and positive in the Pacific Division. Excluding LAC drops the correlation strength, but it’s still strong and positive.
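The effect of a single influential point on a correlation can be illustrated by computing Pearson's r with and without it. The sketch below is illustrative Python with made-up county values, not the project's data or R code; "Large County" is a hypothetical stand-in for an influential observation.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical counties: (name, SVI flag count, award total in $ millions)
counties = [
    ("County A", 2, 3.0),
    ("County B", 4, 1.0),
    ("County C", 6, 4.0),
    ("County D", 8, 2.0),
    ("County E", 10, 5.0),
    ("Large County", 60, 90.0),  # influential point far from the rest
]


def correlation(rows):
    """Correlation between flag counts and award totals for the given rows."""
    return pearson([flags for _, flags, _ in rows], [award for _, _, award in rows])


r_all = correlation(counties)
r_without = correlation([row for row in counties if row[0] != "Large County"])
print("with influential point:   ", round(r_all, 3))
print("without influential point:", round(r_without, 3))
```

In this toy example the correlation is very strong with the influential point included and only moderate once it is removed, which is the pattern a with-and-without comparison is designed to expose.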

Only one diff-in-diff model finding is statistically significant: NMTC participation was associated with an increase in median income in the Pacific Division. None of the models return statistically significant effects on social vulnerability. However, this study is not enough to rule out potential effects. Further investigation is needed.

The analysis highlights the outsized effect of Los Angeles County on results for the entire Pacific Division. LAC is an influential data point, and its x and y values in both the NMTC and LIHTC data sets skewed the results. While it makes sense to include LAC as a natural part of the population under study, the analysis was not able to account for LAC’s influence. Techniques like k-means clustering are not appropriate given the abundance of outliers in this census division. Future studies using NMTC and LIHTC data should use statistical methods that are less sensitive to outliers. Certain parts of the Pacific Division report used the common technique of running analyses with and without an outlier. In future studies, it would be useful to do this consistently throughout the project.

References

R Version

Analyses were conducted using the R Statistical language (version 4.4.2; R Core Team, 2024) on Windows 10 x64 (build 19045).

R Packages

Data

Readings