Lab 1: Census Data Quality for Policy Decisions

Evaluating Data Reliability for Algorithmic Decision-Making

Author

Gab Chen

Published

January 29, 2026

Assignment Overview

Scenario

You are a data analyst for the Philadelphia Department of Human Services. The department is considering implementing an algorithmic system to identify communities that should receive priority for social service funding and outreach programs. Your supervisor has asked you to evaluate the quality and reliability of available census data to inform this decision.

Drawing on our Week 2 discussion of algorithmic bias, you need to assess not just what the data shows, but how reliable it is and what communities might be affected by data quality issues.

Learning Objectives

Apply dplyr functions to real census data for policy analysis
Evaluate data quality using margins of error
Connect technical analysis to algorithmic decision-making
Identify potential equity implications of data reliability issues
Create professional documentation for policy stakeholders

Submission Instructions

Submit by posting your updated portfolio link on Canvas. Your assignment should be accessible at your-portfolio-url/labs/lab_1/

Make sure to update your _quarto.yml navigation to include this assignment under a “Labs” menu.

Part 1: Portfolio Integration

Create this assignment in your portfolio repository under a labs/lab_1/ folder structure. Update your navigation menu to include:

- text: Assignments
  menu:
    - href: labs/lab_1/your_file_name.qmd
      text: "Lab 1: Census Data Exploration"

If the text contains a special character such as a colon, wrap it in double quotes so that Quarto reads it as text.

Setup


State Selection: I have chosen Pennsylvania for this analysis for consistency with the Philadelphia scenario above.
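The code in the rest of this lab relies on functions such as get_acs(), str_remove(), and kable(). A minimal setup sketch is shown below, assuming the tidycensus, tidyverse, and knitr packages are installed. The key string is a placeholder, not a real key.

```r
# Packages assumed by the code chunks in this lab
library(tidycensus)  # get_acs(), census_api_key()
library(tidyverse)   # dplyr verbs, stringr helpers, the %>% pipe
library(knitr)       # kable() for formatted tables

# Register a Census API key for the current session.
# "YOUR_CENSUS_API_KEY" is a placeholder; substitute your own key.
census_api_key("YOUR_CENSUS_API_KEY")
```

Because the rendered page is public, it is safer to keep the real key out of the printed output, for example by hiding this chunk's results.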

Part 2: County-Level Resource Assessment

2.1 Data Retrieval

Your Task: Use get_acs() to retrieve county-level data for your chosen state.

Requirements:
- Geography: county level
- Variables: median household income (B19013_001) and total population (B01003_001)
- Year: 2022
- Survey: acs5
- Output format: wide

Hint: Remember to give your variables descriptive names using the variables = c(name = "code") syntax.

# Write your get_acs() code here
pa <- get_acs(geography = "county", variables = c(totpop = "B01003_001", medinc = "B19013_001"),
        state = "PA", survey = "acs5", year = 2022, output = "wide")

# Clean the county names to remove state name and "County" 
# Hint: use mutate() with str_remove()
pa_clean <- pa %>%
  mutate(
    county_name = str_remove(NAME, " County, Pennsylvania")
  )

# Display the first few rows
head(pa_clean)

# A tibble: 6 × 7
  GEOID NAME                         totpopE totpopM medincE medincM county_name
  <chr> <chr>                          <dbl>   <dbl>   <dbl>   <dbl> <chr>      
1 42001 Adams County, Pennsylvania    104604      NA   78975    3334 Adams      
2 42003 Allegheny County, Pennsylva… 1245310      NA   72537     869 Allegheny  
3 42005 Armstrong County, Pennsylva…   65538      NA   61011    2202 Armstrong  
4 42007 Beaver County, Pennsylvania   167629      NA   67194    1531 Beaver     
5 42009 Bedford County, Pennsylvania   47613      NA   58337    2606 Bedford    
6 42011 Berks County, Pennsylvania    428483      NA   74617    1191 Berks

2.2 Data Quality Assessment

Your Task: Calculate margin of error percentages and create reliability categories.

Requirements:
- Calculate MOE percentage: (margin of error / estimate) * 100
- Create reliability categories:
  - High Confidence: MOE < 5%
  - Moderate Confidence: MOE 5-10%
  - Low Confidence: MOE > 10%
- Create a flag for unreliable estimates (MOE > 10%)

Hint: Use mutate() with case_when() for the categories.

# Calculate MOE percentage and reliability categories using mutate()
pa_reliability <- pa_clean %>%
  mutate(
    moe_percentage = round((medincM / medincE) * 100, 2),
    reliability = case_when(
      moe_percentage < 5 ~ "High Confidence",
      moe_percentage >= 5 & moe_percentage <= 10 ~ "Moderate",
      moe_percentage > 10 ~ "Low Confidence"
    )
  )

# Create a summary showing count of counties in each reliability category
# Hint: use count() and mutate() to add percentages
pa_reliability %>%
  count(reliability) %>%
  mutate(
    percent = round(n / sum(n) * 100, 2)
  )

# A tibble: 2 × 3
  reliability         n percent
  <chr>           <int>   <dbl>
1 High Confidence    57    85.1
2 Moderate           10    14.9
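The MOE percentage can be read alongside two related measures. The sketch below recomputes it for the first three counties shown in section 2.1, adds the unreliable-estimate flag named in the requirements, and derives a confidence interval and a coefficient of variation. It assumes ACS margins of error are published at the 90% confidence level (z = 1.645). The column names unreliable, ci_lower, ci_upper, se, and cv_percent are illustrative and do not appear in the analysis above.

```r
library(dplyr)

# First three rows of the county table from section 2.1
counties <- data.frame(
  county_name = c("Adams", "Allegheny", "Armstrong"),
  medincE = c(78975, 72537, 61011),  # median household income estimate
  medincM = c(3334, 869, 2202)       # margin of error
)

quality <- counties %>%
  mutate(
    moe_percentage = round(medincM / medincE * 100, 2),
    unreliable = moe_percentage > 10,          # flag for MOE above 10%
    ci_lower = medincE - medincM,              # lower bound of the 90% interval
    ci_upper = medincE + medincM,              # upper bound of the 90% interval
    se = medincM / 1.645,                      # standard error implied by a 90% MOE
    cv_percent = round(se / medincE * 100, 2)  # coefficient of variation
  )

quality
```

For Adams County this gives an interval of 75,641 to 82,309 dollars around the 78,975 estimate, which is the range an algorithm would need to treat as equally plausible.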

2.3 High Uncertainty Counties

Your Task: Identify the 5 counties with the highest MOE percentages.

Requirements:
- Sort by MOE percentage (highest first)
- Select the top 5 counties
- Display: county name, median income, margin of error, MOE percentage, reliability category
- Format as a professional table using kable()

Hint: Use arrange(), slice(), and select() functions.

# Create table of top 5 counties by MOE percentage
pa_reliability %>%
  arrange(desc(moe_percentage)) %>%
  slice(1:5) %>%
  select(county_name, medincE, moe_percentage, reliability) %>%
# Format as table with kable() - include appropriate column names and caption
kable(col.names = c(
  "County", 
  "Median Household Income", 
  "MOE Percentage (%)", 
  "Reliability Category"), 
  caption = "Top 5 Pennsylvania Counties with Highest Income Estimate Uncertainty"
)

Top 5 Pennsylvania Counties with Highest Income Estimate Uncertainty
County	Median Household Income	MOE Percentage (%)	Reliability Category
Forest	46188	9.99	Moderate
Sullivan	62910	9.25	Moderate
Union	64914	7.32	Moderate
Montour	72626	7.09	Moderate
Elk	61672	6.63	Moderate

Data Quality Commentary:

[Counties such as Forest, Sullivan, and Union show relatively high uncertainty in median household income estimates (MOE of roughly 7% to 10% of the estimate), which means algorithms relying on these data may make less reliable decisions for these areas. This could lead to misallocation of resources or misclassification in models used for funding, eligibility, or planning. Higher uncertainty in these counties is likely driven by small populations, rural characteristics, and limited survey samples, which increase margins of error in ACS estimates.]

Part 3: Neighborhood-Level Analysis

3.1 Focus Area Selection

Your Task: Select 2-3 counties from your reliability analysis for detailed tract-level study.

Strategy: Choose counties that represent different reliability levels (e.g., 1 high confidence, 1 moderate, 1 low confidence) to compare how data quality varies.

# Select one county per reliability category from pa_reliability
# slice(1) keeps the first row in each group (rows are in alphabetical county order)
# Store the selected counties in a variable called selected_counties
selected_counties <- pa_reliability %>%
  group_by(reliability) %>%
  slice(1)

# Display the selected counties with their key characteristics
# Show: county name, median income, MOE percentage, reliability category
print(selected_counties)

selected_counties %>%
  select(county_name, medincE, moe_percentage, reliability)

# A tibble: 2 × 9
# Groups:   reliability [2]
  GEOID NAME          totpopE totpopM medincE medincM county_name moe_percentage
  <chr> <chr>           <dbl>   <dbl>   <dbl>   <dbl> <chr>                <dbl>
1 42001 Adams County…  104604      NA   78975    3334 Adams                 4.22
2 42023 Cameron Coun…    4536      NA   46186    2605 Cameron               5.64
# ℹ 1 more variable: reliability <chr>

# A tibble: 2 × 4
# Groups:   reliability [2]
  county_name medincE moe_percentage reliability    
  <chr>         <dbl>          <dbl> <chr>          
1 Adams         78975           4.22 High Confidence
2 Cameron       46186           5.64 Moderate

Comment on the output: [Using filter() to pick specific counties requires a basic understanding of the data frame. Here, group_by() and slice(1) keep the first county listed in each reliability group, which is the alphabetically first one (Adams and Cameron) rather than the one with the highest MOE percentage.]
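The difference between taking the first row per group and taking the highest-MOE row per group can be sketched with four counties whose values appear in the tables of this lab. This is an illustration only; the object names first_rows and highest_moe are not part of the analysis.

```r
library(dplyr)

# Four counties with MOE percentages taken from the county tables in this lab
counties <- data.frame(
  county_name = c("Adams", "Cameron", "Forest", "Pike"),
  moe_percentage = c(4.22, 5.64, 9.99, 4.90),
  reliability = c("High Confidence", "Moderate", "Moderate", "High Confidence")
)

# slice(1): the first row of each group, in the order the rows appear
first_rows <- counties %>%
  group_by(reliability) %>%
  slice(1) %>%
  ungroup()

# slice_max(): the row with the largest moe_percentage in each group
highest_moe <- counties %>%
  group_by(reliability) %>%
  slice_max(moe_percentage, n = 1) %>%
  ungroup()

first_rows$county_name   # "Adams" "Cameron"
highest_moe$county_name  # "Pike"  "Forest"
```

Choosing by slice_max() would target the least reliable county in each category, while slice(1) depends only on row order.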

3.2 Tract-Level Demographics

Your Task: Get demographic data for census tracts in your selected counties.

Requirements:
- Geography: tract level
- Variables: white alone (B03002_003), Black/African American (B03002_004), Hispanic/Latino (B03002_012), total population (B03002_001)
- Use the same state and year as before
- Output format: wide
- Challenge: You’ll need county codes, not names. Look at the GEOID patterns in your county data for hints.

# Define your race/ethnicity variables with descriptive names
race_vars <- c(
  white = "B03002_003",  # white alone
  black = "B03002_004", # Black/African American
  his_lat = "B03002_012", # Hispanic/Latino
  totpop = "B03002_001" # total population
)
# Use get_acs() to retrieve tract-level data
# Census API bug, failed at Forest and Pike County
# To avoid the Census API bug, choose Adams and Cameron County

county_fip <- c("42001", "42023") # Adams and Cameron

tract_data <- get_acs(
  geography = "tract",
  variables = race_vars,
  state = "PA",
  year = 2022,
  survey = "acs5",
  output = "wide"
) %>%
# Clean county names
  mutate(
    County_clean1 = str_extract(NAME, "(?<=;).*?(?=;)") # extract whatever is between two ;
  ) %>%
  mutate(
    County_clean = str_extract(County_clean1, "(?<=\\s)\\S+(?=\\s)") # extract whatever is between two spaces
  )
# Hint: You may need to specify county codes in the county parameter
# Filter for Adams and Cameron Counties
# %in% tests membership; == would recycle the vector and silently drop rows
selected_tract <- tract_data %>%
  filter(County_clean %in% c("Adams", "Cameron"))

# Calculate percentage of each group using mutate()
# Create percentages for white, Black, and Hispanic populations
pct_race <- selected_tract %>%
  mutate(
    pct_white = round(whiteE / totpopE, 3) * 100,
    pct_black = round(blackE / totpopE, 3) * 100,
    pct_his_lat = round(his_latE / totpopE, 3) * 100
  )

# Add readable tract and county name columns using str_extract() or similar
pct_race_clean <- pct_race %>%
  mutate(
    TRACT = str_extract(NAME, "\\d+(\\.\\d+)?"),  # tract number with optional decimal part
    COUNTY = str_extract(NAME, "(?<=; )[^ ]+")    # first word after "; ", without a leading space
  )
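When several counties are selected at once, the choice between == and %in% matters, and the GEOID offers a name-free alternative. A tract GEOID has 11 digits: 2 for the state, 3 for the county, and 6 for the tract. The sketch below uses made-up GEOID values to compare three filters; the object names with_equals, with_in, and by_geoid are illustrative.

```r
library(dplyr)

# Illustrative tract rows; the tract portions of these GEOIDs are made up
tracts <- data.frame(
  GEOID = c("42001000001", "42001000002", "42003000001",
            "42003000002", "42023000001", "42023000002"),
  county = c("Adams", "Adams", "Allegheny", "Allegheny", "Cameron", "Cameron")
)

# == recycles c("Adams", "Cameron") down the rows and compares pairwise,
# so only rows that happen to line up with the recycled value are kept
with_equals <- tracts %>% filter(county == c("Adams", "Cameron"))

# %in% keeps every row whose county is a member of the set
with_in <- tracts %>% filter(county %in% c("Adams", "Cameron"))

# The first five GEOID digits are the state (42) plus county (001, 023) codes
county_fip <- c("42001", "42023")
by_geoid <- tracts %>% filter(substr(GEOID, 1, 5) %in% county_fip)

nrow(with_equals)  # 2
nrow(with_in)      # 4
nrow(by_geoid)     # 4
```

The pairwise comparison raises no error or warning when the row count is a multiple of the vector length, so the missing rows are easy to overlook.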

3.3 Demographic Analysis

Your Task: Analyze the demographic patterns in your selected areas.

# Find the tract with the highest percentage of Hispanic/Latino residents
# Hint: use arrange() and slice() to get the top tract
pct_race_clean %>%
  arrange(desc(pct_his_lat)) %>%
  slice(1)

# A tibble: 1 × 17
  GEOID      NAME  whiteE whiteM blackE blackM his_latE his_latM totpopE totpopM
  <chr>      <chr>  <dbl>  <dbl>  <dbl>  <dbl>    <dbl>    <dbl>   <dbl>   <dbl>
1 420010315… Cens…   2856    314    107     74      816      265    3908     292
# ℹ 7 more variables: County_clean1 <chr>, County_clean <chr>, pct_white <dbl>,
#   pct_black <dbl>, pct_his_lat <dbl>, TRACT <chr>, COUNTY <chr>
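A percentage built from two estimates carries uncertainty from both. As a rough sketch, the Census Bureau's approximation for the margin of error of a proportion can be applied to the tract above, using the four values printed in its row. The variable names p, radicand, and moe_p are illustrative; tidycensus also provides a moe_prop() helper intended for this kind of calculation.

```r
# Estimates and margins of error for the tract shown above
his_latE <- 816;  his_latM <- 265   # Hispanic/Latino residents
totpopE  <- 3908; totpopM  <- 292   # total population

p <- his_latE / totpopE             # share of Hispanic/Latino residents

# Approximate MOE of a proportion:
#   sqrt(moe_num^2 - p^2 * moe_denom^2) / denom
# If the term under the root is negative, the ratio form (with +) is used.
radicand <- his_latM^2 - p^2 * totpopM^2
moe_p <- if (radicand >= 0) {
  sqrt(radicand) / totpopE
} else {
  sqrt(his_latM^2 + p^2 * totpopM^2) / totpopE
}

pct_his_lat     <- round(p * 100, 1)      # 20.9 (percent)
moe_pct_his_lat <- round(moe_p * 100, 1)  # 6.6 (percentage points)
```

Under these assumptions the tract's Hispanic/Latino share is about 20.9%, plus or minus 6.6 percentage points.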

# Calculate average demographics by county using group_by() and summarize()
# Show: number of tracts, average percentage for each racial/ethnic group
pct_race_clean %>%
  group_by(COUNTY) %>%
  summarize(
    n_tracts = n(),
    pct_white = round(mean(pct_white, na.rm = TRUE), 1),
    pct_black = round(mean(pct_black, na.rm = TRUE), 1),
    pct_his_lat = round(mean(pct_his_lat, na.rm = TRUE), 1)
  ) %>%

# Create a nicely formatted table of your results using kable()
kable(col.names = c(
  "County", 
  "Number of Tracts", 
  "Percent of White Residents (%)", 
  "Percent of Black Residents (%)", 
  "Percent of Hispanic or Latino Residents (%)"
  ),
  caption = "Average Demographics by County"
  )

Average Demographics by County
County	Number of Tracts	Percent of White Residents (%)	Percent of Black Residents (%)	Percent of Hispanic or Latino Residents (%)
Adams	14	87.6	1.2	8.0
Cameron	1	88.6	0.0	2.9

Part 4: Comprehensive Data Quality Evaluation

4.1 MOE Analysis for Demographic Variables

Your Task: Examine margins of error for demographic variables to see if some communities have less reliable data.

Requirements:
- Calculate MOE percentages for each demographic variable
- Flag tracts where any demographic variable has MOE > 15%
- Create summary statistics

# Calculate MOE percentages for white, Black, and Hispanic variables
# Hint: use the same formula as before (margin/estimate * 100)
pct_race_clean <- pct_race_clean %>%
  mutate(
    MOE_white = round(whiteM / whiteE * 100, 2),
    MOE_black = round(blackM / blackE * 100, 2),
    MOE_his_lat = round(his_latM / his_latE * 100, 2)
  ) 

# Create a flag for tracts with high MOE on any demographic variable
# The logical OR (|) already returns TRUE/FALSE, so no if_else() wrapper is needed
flag_tract <- pct_race_clean %>%
  mutate(
    MOE_issue = MOE_white > 15 | MOE_black > 15 | MOE_his_lat > 15
  )

# Create summary statistics showing how many tracts have data quality issues
flag_tract %>%
  summarise(
    tracts_with_MOE_issues = sum(MOE_issue, na.rm = TRUE)
  )

# A tibble: 1 × 1
  tracts_with_MOE_issues
                   <int>
1                     15
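The ratio margin / estimate is undefined when the estimate is zero, which can happen for small subgroups (the Cameron County row in section 3.3 shows 0.0% Black residents). The sketch below shows one way to guard against this. The first row uses the Black population values printed for the tract in section 3.3; the second row is hypothetical, including its margin of error of 13. The column names MOE_black_raw, MOE_black_safe, and zero_estimate are illustrative.

```r
library(dplyr)

tracts <- data.frame(
  tract  = c("A", "B"),
  blackE = c(107, 0),   # estimate; the zero in row B is hypothetical
  blackM = c(74, 13)    # margin of error; 13 is a made-up value
)

checked <- tracts %>%
  mutate(
    # Unguarded ratio: dividing by a zero estimate gives Inf
    MOE_black_raw = round(blackM / blackE * 100, 2),
    # Guarded ratio: leave the percentage missing when the estimate is zero
    MOE_black_safe = if_else(blackE > 0, round(blackM / blackE * 100, 2), NA_real_),
    # Keep track of which rows were affected
    zero_estimate = blackE == 0
  )

checked
```

An Inf value still satisfies a comparison such as MOE > 15, so zero-estimate tracts are counted as flagged unless they are handled separately.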

4.2 Pattern Analysis

Your Task: Investigate whether data quality problems are randomly distributed or concentrated in certain types of communities.

# Group tracts by whether they have high MOE issues
# Categorize tracts where any MOE percentage exceeds 100 as "high", all others as "low"
flag_tract <- flag_tract %>%
  mutate(
    MOE_category = case_when(
      MOE_white > 100 | MOE_black > 100 | MOE_his_lat > 100 ~ "high",
      TRUE ~ "low"
    )
  )

# Calculate average characteristics for each group:
# - population size, demographic percentages
# Use group_by() and summarize() to create this comparison
flag_tract %>%
group_by(MOE_category) %>%
summarise(
  avg_pop = round(mean(totpopE, na.rm = TRUE), 0),
  avg_pct_white = round(mean(pct_white, na.rm = TRUE), 1),
  avg_pct_black = round(mean(pct_black, na.rm = TRUE), 1),
  avg_pct_hispanic = round(mean(pct_his_lat, na.rm = TRUE), 1)
) %>%

# Create a professional table showing the patterns
kable(
  col.names = c(
    "MOE Issue Category",
    "Average Population Size",
    "Average percentage of white population (%)",
    "Average percentage of black population (%)",
    "Average percentage of hispanic or latino population (%)"
  ),
  caption = "Average Demographic Characteristics for Tracts with Different Levels of MOE Issues"
)

Average Demographic Characteristics for Tracts with Different Levels of MOE Issues
MOE Issue Category	Average Population Size	Average percentage of white population (%)	Average percentage of black population (%)	Average percentage of hispanic or latino population (%)
high	2896	89.7	0.7	5.9
low	4645	82.0	2.5	12.3

Pattern Analysis: [Tracts in the high-MOE group have smaller populations on average (2,896 vs. 4,645) and are more demographically homogeneous, with a larger share of White residents and smaller shares of Black and Hispanic or Latino residents. Smaller tracts have smaller survey samples, and in homogeneous tracts the minority groups being estimated are very small, so their margins of error are large relative to the estimates. Both factors increase statistical uncertainty and inflate margins of error.]

Part 5: Policy Recommendations

5.1 Analysis Integration and Professional Summary

Your Task: Write an executive summary that integrates findings from all four analyses.

Executive Summary Requirements:
1. Overall Pattern Identification: What are the systematic patterns across all your analyses?
2. Equity Assessment: Which communities face the greatest risk of algorithmic bias based on your findings?
3. Root Cause Analysis: What underlying factors drive both data quality issues and bias risk?
4. Strategic Recommendations: What should the Department implement to address these systematic issues?

Executive Summary:

[At the county level, census income data for Pennsylvania are generally reliable. Approximately 85% of the state’s 67 counties have a median household income margin of error below 5% (“High Confidence”), and approximately 15% have a margin of error between 5% and 10% (“Moderate”). No county exceeds 10% (“Low Confidence”).

One county was selected from each reliability category present (Adams and Cameron) for a tract-level analysis. The margins of error on racial and ethnic population estimates show that every tract analyzed in these two counties has at least one demographic estimate with an MOE above 15%. On average, tracts with the most severe MOE issues have smaller populations and more racially homogeneous compositions, with a higher proportion of White residents and smaller shares of Black and Hispanic or Latino residents. These patterns suggest that high MOEs are driven by both total population size and the interaction between survey sampling limitations and small subgroup populations, particularly in racially homogeneous tracts.

When census data inform policy decisions, estimates for these communities should be interpreted with special care to avoid introducing statistical bias into policy design and implementation. To address systematic MOE issues, the Department should integrate data quality metrics directly into analytic and algorithmic workflows, particularly for tract-level decision-making. The following strategic recommendations outline practical steps to mitigate data uncertainty and reduce the risk of biased policy outcomes:
1. Incorporate data quality thresholds into decision-making.
2. Supplement ACS data with local or administrative data.
3. Prioritize transparency and documentation.]

5.2 Specific Recommendations

Your Task: Create a decision framework for algorithm implementation.

# Create a summary table using your county reliability data
# Include: county name, median income, MOE percentage, reliability category

# Add a new column with algorithm recommendations using case_when():
# - High Confidence: "Safe for algorithmic decisions"
# - Moderate Confidence: "Use with caution - monitor outcomes"  
# - Low Confidence: "Requires manual review or additional data"
pa_recommendation <- pa_reliability %>%
  mutate(
    recommendation = case_when(
      reliability == "High Confidence" ~ "Safe for algorithmic decisions",
      reliability == "Moderate" ~ "Use with caution - monitor outcomes",
      reliability == "Low Confidence" ~ "Requires manual review or additional data",
      TRUE ~ NA_character_
    )
  ) %>%
  select(county_name, medincE, moe_percentage, reliability, recommendation)

# Format as a professional table with kable()
pa_recommendation %>%
  
kable(
  col.names = c(
    "County", 
    "Median Income", 
    "MOE Percentage (%)", 
    "MOE reliability", 
    "Algorithm Recommendation"
  ),
  caption = "Algorithmic Use Recommendations Based on County-Level Data Reliability"
)

Algorithmic Use Recommendations Based on County-Level Data Reliability
County	Median Income	MOE Percentage (%)	MOE reliability	Algorithm Recommendation
Adams	78975	4.22	High Confidence	Safe for algorithmic decisions
Allegheny	72537	1.20	High Confidence	Safe for algorithmic decisions
Armstrong	61011	3.61	High Confidence	Safe for algorithmic decisions
Beaver	67194	2.28	High Confidence	Safe for algorithmic decisions
Bedford	58337	4.47	High Confidence	Safe for algorithmic decisions
Berks	74617	1.60	High Confidence	Safe for algorithmic decisions
Blair	59386	3.47	High Confidence	Safe for algorithmic decisions
Bradford	60650	3.57	High Confidence	Safe for algorithmic decisions
Bucks	107826	1.41	High Confidence	Safe for algorithmic decisions
Butler	82932	2.61	High Confidence	Safe for algorithmic decisions
Cambria	54221	3.34	High Confidence	Safe for algorithmic decisions
Cameron	46186	5.64	Moderate	Use with caution - monitor outcomes
Carbon	64538	5.31	Moderate	Use with caution - monitor outcomes
Centre	70087	2.77	High Confidence	Safe for algorithmic decisions
Chester	118574	1.70	High Confidence	Safe for algorithmic decisions
Clarion	58690	4.37	High Confidence	Safe for algorithmic decisions
Clearfield	56982	2.79	High Confidence	Safe for algorithmic decisions
Clinton	59011	3.86	High Confidence	Safe for algorithmic decisions
Columbia	59457	3.76	High Confidence	Safe for algorithmic decisions
Crawford	58734	3.91	High Confidence	Safe for algorithmic decisions
Cumberland	82849	2.20	High Confidence	Safe for algorithmic decisions
Dauphin	71046	2.27	High Confidence	Safe for algorithmic decisions
Delaware	86390	1.53	High Confidence	Safe for algorithmic decisions
Elk	61672	6.63	Moderate	Use with caution - monitor outcomes
Erie	59396	2.55	High Confidence	Safe for algorithmic decisions
Fayette	55579	4.16	High Confidence	Safe for algorithmic decisions
Forest	46188	9.99	Moderate	Use with caution - monitor outcomes
Franklin	71808	3.00	High Confidence	Safe for algorithmic decisions
Fulton	63153	3.65	High Confidence	Safe for algorithmic decisions
Greene	66283	6.41	Moderate	Use with caution - monitor outcomes
Huntingdon	61300	4.72	High Confidence	Safe for algorithmic decisions
Indiana	57170	4.65	High Confidence	Safe for algorithmic decisions
Jefferson	56607	3.41	High Confidence	Safe for algorithmic decisions
Juniata	61915	4.79	High Confidence	Safe for algorithmic decisions
Lackawanna	63739	2.58	High Confidence	Safe for algorithmic decisions
Lancaster	81458	1.79	High Confidence	Safe for algorithmic decisions
Lawrence	57585	3.07	High Confidence	Safe for algorithmic decisions
Lebanon	72532	2.69	High Confidence	Safe for algorithmic decisions
Lehigh	74973	2.00	High Confidence	Safe for algorithmic decisions
Luzerne	60836	2.35	High Confidence	Safe for algorithmic decisions
Lycoming	63437	4.39	High Confidence	Safe for algorithmic decisions
McKean	57861	4.75	High Confidence	Safe for algorithmic decisions
Mercer	57353	3.63	High Confidence	Safe for algorithmic decisions
Mifflin	58012	3.43	High Confidence	Safe for algorithmic decisions
Monroe	80656	3.17	High Confidence	Safe for algorithmic decisions
Montgomery	107441	1.27	High Confidence	Safe for algorithmic decisions
Montour	72626	7.09	Moderate	Use with caution - monitor outcomes
Northampton	82201	1.93	High Confidence	Safe for algorithmic decisions
Northumberland	55952	2.67	High Confidence	Safe for algorithmic decisions
Perry	76103	3.17	High Confidence	Safe for algorithmic decisions
Philadelphia	57537	1.38	High Confidence	Safe for algorithmic decisions
Pike	76416	4.90	High Confidence	Safe for algorithmic decisions
Potter	56491	4.42	High Confidence	Safe for algorithmic decisions
Schuylkill	63574	2.40	High Confidence	Safe for algorithmic decisions
Snyder	65914	5.56	Moderate	Use with caution - monitor outcomes
Somerset	57357	2.78	High Confidence	Safe for algorithmic decisions
Sullivan	62910	9.25	Moderate	Use with caution - monitor outcomes
Susquehanna	63968	3.14	High Confidence	Safe for algorithmic decisions
Tioga	59707	3.23	High Confidence	Safe for algorithmic decisions
Union	64914	7.32	Moderate	Use with caution - monitor outcomes
Venango	59278	3.45	High Confidence	Safe for algorithmic decisions
Warren	57925	5.19	Moderate	Use with caution - monitor outcomes
Washington	74403	2.38	High Confidence	Safe for algorithmic decisions
Wayne	59240	4.79	High Confidence	Safe for algorithmic decisions
Westmoreland	69454	1.99	High Confidence	Safe for algorithmic decisions
Wyoming	67968	3.85	High Confidence	Safe for algorithmic decisions
York	79183	1.79	High Confidence	Safe for algorithmic decisions

Key Recommendations:

Your Task: Use your analysis results to provide specific guidance to the department.

Counties suitable for immediate algorithmic implementation: [The 57 counties classified as High Confidence (MOE below 5%) are appropriate for immediate use in algorithmic decision-making. In these counties, median income estimates are statistically reliable, reducing the risk that automated systems will misallocate resources or misclassify need. Algorithms applied in these contexts can be expected to perform consistently, provided routine validation checks remain in place. (Adams, Allegheny, Armstrong, Beaver, Bedford, etc.)]
Counties requiring additional oversight: [The 10 counties with Moderate Confidence data (MOE between 5% and 10%) should be included in algorithmic workflows but paired with active monitoring and evaluation. In these areas, algorithmic outputs should be reviewed periodically against observed outcomes to detect potential bias or instability. Incorporating performance audits or sensitivity checks can help ensure that moderate data uncertainty does not translate into systematic policy errors. (Cameron, Carbon, Elk, Forest, Greene, Montour, Snyder, Sullivan, Union, Warren)]
Counties needing alternative approaches: [No Pennsylvania county currently falls into the Low Confidence category (MOE above 10%). If a county does in a future data release, its high margin of error and limited data reliability would require an alternative approach: algorithmic outputs should not be used as the sole basis for decision-making. Instead, the Department should rely on manual review, aggregated or multi-year data, supplemental administrative records, or targeted local surveys to inform policy decisions and reduce the risk of statistical bias.]

Questions for Further Investigation

How do MOE patterns vary spatially across Pennsylvania, and are high-uncertainty tracts geographically clustered in rural, suburban, or peripheral urban areas?
How do margins of error change over time, and are certain counties or tracts becoming more or less reliable?

Technical Notes

Data Sources:
- U.S. Census Bureau, American Community Survey 2018-2022 5-Year Estimates
- Retrieved via tidycensus R package on 2026-02-10

Reproducibility:
- All analysis conducted in R version 4.5.2
- Census API key required for replication
- Complete code and documentation available at: https://gabyxchen.github.io/PPA_Portfolio/

Methodology Notes: [Adams and Cameron Counties were chosen for the detailed demographic analysis because each is the first county listed in its MOE reliability category and the tract-level request failed for Forest and Pike Counties. The selection is therefore somewhat arbitrary, and a different pair of counties could produce different results.]

Limitations: [Only one sample county was selected from each MOE risk category group, and the small sample size may limit the generalizability of the findings. Additionally, this analysis relies solely on 2022 ACS 5-year estimates, which capture conditions within a single time period. As a result, the study does not account for temporal variation or longer-term trends in demographic patterns and data reliability.]

Submission Checklist

Before submitting your portfolio link on Canvas:

All code chunks run without errors
All “[Fill this in]” prompts have been completed
Tables are properly formatted and readable
Executive summary addresses all four required components
Portfolio navigation includes this assignment
Census API key is properly set
Document renders correctly to HTML

Remember: Submit your portfolio URL on Canvas, not the file itself. Your assignment should be accessible at your-portfolio-url/labs/lab_1/your_file_name.html