End Activity Session (Day 7)
A cartoon panda is doing some gardening Midjourney5
Introduction
Today, we will look at the latest plant hardiness zone map distributed in 2023 by the US Department of Agriculture (USDA), and how it differs from the previous map, which came out in 2012.
The map is meant to help gardeners, and people interested in gardening and plants, know which plants will survive and thrive in different parts of the United States. The main data point used by the map is the mean low temperature recorded (in Fahrenheit) for a given location. Locations with similar mean low temperatures, within a 10-degree range, are categorized as being in the same zone. Each zone is then further divided into two sub-zones, with a 5-degree range for minimum temperatures.
The latest zone map showed that for many locations, the minimum temperature had risen somewhat. This doesnβt come as a massive surprise, given trends in climate change, but I thought that it would be interesting to explore the data and understand just where things had changed.
The data was collected by the PRISM research group at Oregon State University (https://prism.oregonstate.edu/projects/plant_hardiness_zones.php). A new data set was just released on November 15th by the US Department of Agriculture, at https://www.ars.usda.gov/news-events/news/research-news/2023/usda-unveils-updated-plant-hardiness-zone-map/.
The data is available in a variety of formats, but I decided that it would probably be easiest to work with it via zip codes, because they are spread out through the entire country. In order to figure out just where the zip codes are located, we will download an additional data set, a map of zip codes along with location information for each one, joining it together with our other data.
Datasets
You will work with three CSV datasets:
First, weβll use the latest (2023) Plant Hardiness Zone report from https://prism.oregonstate.edu/projects/plant_hardiness_zones.php . The data comes in several formats and parts; weβll use the CSV file that provides us with data per US zip code:
https://prism.oregonstate.edu/projects/phm_data/phzm_us_zipcode_2023.csv
Next, weβll download data from the previous survey in 2012, described at https://prism.oregonstate.edu/projects/plant_hardiness_zones_2012.php :
https://prism.oregonstate.edu/projects/public/phm/2012/phm_us_zipcode_2012.csv
Finally, weβll download and work with a CSV file containing US zip codes:
http://uszipcodelist.com/zip_code_database.csv
Tasks
1. Setup
First, import pandas, matplotlib, and seaborn and load the three datasets.
Next, display the first few rows and print out the dataset info to get an idea of the contents of each dataset.
You may have noticed that the zipcodes were read in as integers rather than strings, and therefore might not be 5 digits long. Ensure the zipcode
or zip
column in all datasets is a 5-character string, filling in any zeros that were dropped.
Combine the 2012 and 2023 data together by adding a year
column and then stacking them together.
In the combined plant hardiness dataframe, create two new columns, trange_min
and trange_max
, containing the min and max temperatures of the trange
column. Remove the original trange
column.
Hint: use str.split()
to split the trange
strings where they have spaces and retrieve the first and last components (min and max, respectively)
Tasks
2. Exploration and visualization
On average, how much has the minimum temperature in a zip code changed from 2012 to 2023?
Merge together the combined plant hardiness dataset and the zipcode dataset by zipcode. This will give us more informtaion in the plant hardiness dataset, such as the latitude and longitude for each zipcode.
Create two scatter plot where the x axis is the longitude, the y axis is the latitude, the color is based on the minimum temperature in 2012 for one and 2023 for the other. Only look at longitude < -60.
Now create a single scatter plot where you look at the difference between the minimum temperature in 2012 and 2023. Only look at longitude < -60. Color any zipcodes where you do not have information from both years in grey.
Create a bar plot showing the top 10 states where the average minimum temperature increased the most. Label your axes appropriately.