final_exam_math208_2021_Q12_sols.pdf

Question 1 [50 points]

    library(ggplot2)  # provides the midwest dataset
    library(dplyr)    # provides %>%, select(), slice(), group_by(), summarise(), mutate()
    data(midwest)
    midwest_modified <- midwest %>%
      select(county, state, popdensity, popwhite, popblack,
             popamerindian, popasian, popother, inmetro)

The data for this question comes from a modified version of the midwest dataset from the ggplot2 library.

    str(midwest_modified)

    tibble [437 × 9] (S3: tbl_df/tbl/data.frame)
     $ county       : chr [1:437] "ADAMS" "ALEXANDER" "BOND" "BOONE" ...
     $ state        : chr [1:437] "IL" "IL" "IL" "IL" ...
     $ popdensity   : num [1:437] 1271 759 681 1812 324 ...
     $ popwhite     : int [1:437] 63917 7054 14477 29344 5264 35157 5298 16519 13384 146506 ...
     $ popblack     : int [1:437] 1702 3496 429 127 547 50 1 111 16 16559 ...
     $ popamerindian: int [1:437] 98 19 35 46 14 65 8 30 8 331 ...
     $ popasian     : int [1:437] 249 48 16 150 5 195 15 61 23 8033 ...
     $ popother     : int [1:437] 124 9 34 1139 6 221 0 84 6 1596 ...
     $ inmetro      : int [1:437] 0 0 0 1 0 0 0 0 0 1 ...

    midwest_modified %>% slice(1:5) %>% select(county:popblack)

    county      state   popdensity   popwhite   popblack
    <chr>       <chr>   <dbl>        <int>      <int>
    ADAMS       IL      1270.9615    63917      1702
    ALEXANDER   IL       759.0000     7054      3496
    BOND        IL       681.4091    14477       429
    BOONE       IL      1812.1176    29344       127
    BROWN       IL       324.2222     5264       547
    5 rows

    midwest_modified %>% slice(1:5) %>% select(county, popamerindian:popother)

    county      popamerindian   popasian   popother
    <chr>       <int>           <int>      <int>
    ADAMS       98              249         124
    ALEXANDER   19               48           9
    BOND        35               16          34
    BOONE       46              150        1139
    BROWN       14                5           6
    5 rows

The dataset contains population data from Midwest counties in five states in the United States from an unspecified year. There are identifying variables for both the county (the name) and the state (the postal abbreviation). The variable popdensity is a measure of density (population per unspecified area units). The variable inmetro is equal to 1 if the county is classified as a metropolitan area and 0 otherwise. The other variables contain counts of population size within self-identified racial classifications.

a. [5 pts] Write a line of code that will generate the following tibble (or data.frame) containing the highest population density from each state:

    state   Highest_Pop_Den
    <chr>   <dbl>
    IL      88018.40
    IN      34659.09
    MI      60333.91
    OH      54313.08
    WI      63951.67
    5 rows

Solution:

    midwest_modified %>%
      group_by(state) %>%
      summarise(Highest_Pop_Den = max(popdensity))

    state   Highest_Pop_Den
    <chr>   <dbl>
    IL      88018.40
    IN      34659.09
    MI      60333.91
    OH      54313.08
    WI      63951.67
    5 rows

b. [5 pts] Write a line of code that adds a new column to the midwest_modified tibble called Metro, where the elements of that column are equal to the string “Metro” if inmetro is equal to 1 and “NonMetro” if inmetro is equal to 0. The first five rows are given below for the county, state, inmetro and Metro columns:

    county      state   inmetro   Metro
    <chr>       <chr>   <int>     <chr>
    ADAMS       IL      0         NonMetro
    ALEXANDER   IL      0         NonMetro
    BOND        IL      0         NonMetro
    BOONE       IL      1         Metro
    BROWN       IL      0         NonMetro
    5 rows

Solution:

    midwest_modified <- midwest_modified %>%
      mutate(Metro = ifelse(inmetro == 1, "Metro", "NonMetro"))
    head(midwest_modified %>% select(county, state, inmetro, Metro), 5)

    county      state   inmetro   Metro
    <chr>       <chr>   <int>     <chr>
    ADAMS       IL      0         NonMetro
    ALEXANDER   IL      0         NonMetro
    BOND        IL      0         NonMetro
    BOONE       IL      1         Metro
    BROWN       IL      0         NonMetro
    5 rows

c. [5 pts] Write a line of code that will generate the following tibble (or data.frame) containing the highest population density from each state for metropolitan and non-metropolitan counties separately, using the modified tibble from part (b).

    dens_table
