2.1 Data Wrangling for evictions
We will need to trim data to Philadelphia only. Take a look at the data dictionary for the descriptions
of the various columns in the file: eviction_lab_data_dictionary.txt.
The column names are shortened; see the end of the above file for the abbreviations. The numbers at the
end of the columns indicate the years. For example, e-16 is the number of evictions in 2016.
We are interested in the number of evictions by census tract for various years. Right now, each year has
its own column, so it will be easiest to transform the data to a tidy format. The tidy data frame will have
four columns: GEOID, geometry, a column holding the number of evictions, and a column telling you what the
name of the original column was for that value.
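
As a rough illustration of the wide-to-tidy reshape described above, here is a minimal sketch using pandas `melt`. The toy table, its GEOID values, the placeholder geometry strings, and the output column names `variable` and `evictions` are all assumptions made for the example; only the `e-` column naming pattern comes from the data dictionary.

```python
import pandas as pd

# Toy stand-in for the eviction table; all values are made up for illustration.
evictions = pd.DataFrame(
    {
        "GEOID": ["42101000100", "42101000200"],  # hypothetical tract IDs
        "geometry": ["POLYGON A", "POLYGON B"],  # placeholder strings, not real geometries
        "e-15": [10, 4],
        "e-16": [12, 3],
    }
)

# Assumption: every yearly eviction-count column starts with "e-"
year_cols = [col for col in evictions.columns if col.startswith("e-")]

# Reshape from one column per year to one row per (tract, year) pair
tidy = evictions.melt(
    id_vars=["GEOID", "geometry"],
    value_vars=year_cols,
    var_name="variable",  # name of the original column, e.g. "e-16"
    value_name="evictions",  # number of evictions for that tract and year
)

print(tidy)
```

The same call should work on the real data once it is trimmed to Philadelphia, though it may be worth checking afterwards that the result still carries a usable geometry column.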