Map Making: PCPs per Capita

July 10, 2017
howto R research tutorial

I used this project as an opportunity to work with R Markdown, especially since I plan on collaborating with others on it.

As part of my research on health care access, I wanted to create a map showing the number of doctors available at the state level. I came across the AAMC’s 2015 State Physician Workforce Data Book. Their site also has a nice interactive visualization that reminded me of some of the work I’ve done using highcharter in R.


Getting the data

I started out by scraping the data from the AAMC PDF linked above and saving it as a CSV file. I then loaded the file into R and cleaned it up a little.

library(tidyverse)
library(ggplot2)
library(maps)

#Load the CSV file
PCP <- read.table(file = "AAMC_PCP.csv", sep = ",", dec = ".", header = TRUE)

#Change States to all lowercase
#(tolower() works whether State was read in as a factor or as character)
PCP$State <- tolower(PCP$State)
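
To see the lowercasing step in isolation, here is a minimal sketch on a toy data frame. The column names State and PCP_percap match the ones used in this post, but the values are invented for illustration and are not the AAMC figures.

```r
# Toy stand-in for the cleaned CSV; the numbers are made up.
pcp_demo <- data.frame(
  State = c("Alabama", "New York", "United States"),
  PCP_percap = c(1.0, 2.0, 3.0),
  stringsAsFactors = FALSE
)

# Lowercase the names so they can later match the map's 'region' column.
pcp_demo$State <- tolower(pcp_demo$State)

# tolower() also accepts a factor and returns a character vector.
from_factor <- tolower(factor(c("Alabama", "New York")))

print(pcp_demo)
```

Applying tolower() to the column itself, rather than to its levels, gives the same result whether the CSV reader produced a factor or a character column.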

Going from State to lat, long coordinates

#Generate geocoords corresponding to each state
map_it <- map_data("state")

This returns a data frame of state boundary points in longitude and latitude. The first few rows look like this:

##        long      lat group order  region subregion
## 1 -87.46201 30.38968     1     1 alabama      <NA>
## 2 -87.48493 30.37249     1     2 alabama      <NA>
## 3 -87.52503 30.37249     1     3 alabama      <NA>
## 4 -87.53076 30.33239     1     4 alabama      <NA>
## 5 -87.57087 30.32665     1     5 alabama      <NA>
## 6 -87.58806 30.32665     1     6 alabama      <NA>
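
A quick way to get a feel for this data frame is to look at its columns and the distinct region names it contains. The sketch below assumes the ggplot2 and maps packages are installed; the helper names regions and points_per_region are mine.

```r
library(ggplot2)
library(maps)

map_it <- map_data("state")

# First few boundary points, as in the output shown above
head(map_it)

# Distinct state names available for matching
regions <- unique(map_it$region)
head(regions)

# Number of boundary points stored for each region
points_per_region <- table(map_it$region)
head(points_per_region)
```

Each state is stored as many rows, one per boundary point, which is why any table joined to it will repeat the state-level values many times.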

Merge State with ‘Region’ and Coordinates

As the output above shows, map_it is a data frame that holds the boundary coordinates, with each state’s lowercase name in the ‘region’ column.

#Add a new column called "region" to help merge coords data w/ each state
PCP$region <- PCP$State

#Remove US, PR and DC
PCP <- filter(PCP, region != "united states" & region != "puerto rico" & region != "district of columbia")


#Merge the States (from PCP) with the Coordinates data set, map_it
PCP <- merge(map_it, PCP, by="region")
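
Because merge() performs an inner join by default, any name that appears in only one of the two tables is dropped without a warning. The "state" outlines from the maps package cover the contiguous United States, so Alaska and Hawaii would be expected to fall out here. A minimal sketch of how to check for mismatches, using two invented miniature tables (pcp_demo and map_demo are hypothetical stand-ins, not the real data):

```r
# Hypothetical miniature versions of the two tables (values invented).
pcp_demo <- data.frame(
  region = c("alabama", "alaska", "hawaii", "new york"),
  PCP_percap = c(1, 2, 3, 4),
  stringsAsFactors = FALSE
)
map_demo <- data.frame(
  region = c("alabama", "alabama", "new york", "district of columbia"),
  long = c(-87.5, -87.6, -74.0, -77.0),
  lat = c(30.4, 30.3, 40.7, 38.9),
  stringsAsFactors = FALSE
)

# Names present in the data but missing from the map outlines
missing_from_map <- setdiff(pcp_demo$region, map_demo$region)

# Names present in the map but missing from the data
missing_from_data <- setdiff(map_demo$region, pcp_demo$region)

# merge() keeps only regions found in both tables
merged_demo <- merge(map_demo, pcp_demo, by = "region")

print(missing_from_map)
print(missing_from_data)
print(merged_demo)
```

Running the same two setdiff() calls on PCP$region and map_it$region before merging is a cheap way to catch spelling or capitalization mismatches.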

Map Making

ggplot(PCP, aes(map_id = region)) +
  geom_map(aes(fill = PCP_percap), map = map_it, color = "white") +
  expand_limits(x = map_it$long, y = map_it$lat) + 
  scale_fill_continuous(name = "# Per Capita", low = "thistle2", high = "darkred", guide="colorbar") + 
  labs(title = "PCPs per capita by State",
    x = "", y = "", 
    caption = "Mirza Khan. Source: AAMC 2015 State Physician Workforce Data Book.") +
  theme(axis.ticks = element_blank(), 
        axis.text = element_blank(), 
        panel.grid = element_blank(), 
        panel.background = element_blank())
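
For anyone who wants to try the plotting step without the CSV, here is a self-contained sketch of the same geom_map() approach. It assumes ggplot2 and maps are installed and fills the states with random numbers, so the colors are meaningless; the data frame demo and its values are invented and are not the AAMC figures.

```r
library(ggplot2)
library(maps)

map_it <- map_data("state")

# One row per state with a made-up value; these are NOT the AAMC figures.
set.seed(1)
demo <- data.frame(region = unique(map_it$region), stringsAsFactors = FALSE)
demo$PCP_percap <- runif(nrow(demo), min = 60, max = 140)

# map_id links each row of 'demo' to the matching outlines in 'map_it'
p <- ggplot(demo, aes(map_id = region)) +
  geom_map(aes(fill = PCP_percap), map = map_it, color = "white") +
  expand_limits(x = map_it$long, y = map_it$lat) +
  scale_fill_continuous(name = "Demo value", low = "thistle2", high = "darkred")

print(p)
```

Note that this version passes a table with one row per state; geom_map() looks up the outlines through the map argument, so the data does not need to carry the coordinates itself.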

Yes, I am aware that ggplot2 is in tidyverse, but I just like knowing I’ve called it up myself.

h/t cdesante and @hadleywickham for an excellent map_data() and ggplot() tutorial
