End Activity Session (Day 3)
Control Flows and Series Analysis
A cartoon panda looking over a year's worth of monthly class exams. The panda is doing great; A+! (Midjourney5)[https://www.midjourney.com/jobs/6b63c3ca-c64d-41b8-a791-7e4b2594c781?index=0]
In this end-of-day activity, we'll practice the concepts from today's sessions: control flow and pandas Series analysis. We'll work with a realistic set of student test scores and apply if/elif/else statements, for loops, and Series operations.
By completing this exercise, you will practice:
- if/elif/else statements for decision making
- for loops to iterate through data
- Series statistical methods (.mean(), .median(), .max(), .min())
Let's import the libraries we learned about today:
import pandas as pd
import numpy as np
Let's work with a realistic dataset of monthly test scores for one student across the academic year.
Create a pandas Series with the following monthly test scores:
Monthly Test Scores:
Sep 78
Oct 85
Nov 92
Dec 88
Jan 79
Feb 83
Mar 91
Apr 87
May 89
Jun 94
dtype: int64
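One way to build the Series shown above is sketched here. The variable names `months` and `values` are illustrative choices, not part of the original exercise; `scores` is the name the later code expects.

```python
import pandas as pd

# Monthly scores in academic-year order (September through June).
months = ["Sep", "Oct", "Nov", "Dec", "Jan", "Feb", "Mar", "Apr", "May", "Jun"]
values = [78, 85, 92, 88, 79, 83, 91, 87, 89, 94]

# The index argument attaches a month label to each score.
scores = pd.Series(values, index=months)

print("Monthly Test Scores:")
print(scores)
```

Passing a dictionary such as `pd.Series({"Sep": 78, "Oct": 85})` should work equally well; the two-list form simply keeps labels and values easy to scan.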
Using the Series methods you learned today, calculate the average, highest, lowest, and median scores:
# Calculate statistics using Series methods
average_score = scores.mean()
highest_score = scores.max()
lowest_score = scores.min()
median_score = scores.median()
print(f"Average score: {average_score:.2f}")
print(f"Highest score: {highest_score}")
print(f"Lowest score: {lowest_score}")
print(f"Median score: {median_score:.2f}")
Average score: 86.60
Highest score: 94
Lowest score: 78
Median score: 87.50
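The statistics above report the extreme scores but not the months in which they occurred. As a small extension beyond the methods listed in this activity, the sketch below uses the pandas methods `idxmax()` and `idxmin()`, which return the index label of the largest and smallest value. The variable names `best_month` and `worst_month` are illustrative.

```python
import pandas as pd

scores = pd.Series(
    [78, 85, 92, 88, 79, 83, 91, 87, 89, 94],
    index=["Sep", "Oct", "Nov", "Dec", "Jan", "Feb", "Mar", "Apr", "May", "Jun"],
)

# idxmax()/idxmin() return the label, not the value.
best_month = scores.idxmax()
worst_month = scores.idxmin()

print(f"Best month: {best_month} ({scores[best_month]})")     # Best month: Jun (94)
print(f"Worst month: {worst_month} ({scores[worst_month]})")  # Worst month: Sep (78)
```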
Now let's use the control flow concepts from today to analyze the scores.
Use if/elif/else statements to categorize the average score:
# Categorize performance based on average score
if average_score >= 90:
    performance = "Excellent"
elif average_score >= 80:
    performance = "Good"
elif average_score >= 70:
    performance = "Satisfactory"
else:
    performance = "Needs Improvement"
print(f"Overall performance: {performance} (Average: {average_score:.2f})")
Overall performance: Good (Average: 86.60)
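Python checks the branches from top to bottom and stops at the first condition that is true, so an average of exactly 90 counts as "Excellent" and exactly 80 counts as "Good". To see this at the boundaries, the sketch below wraps the same chain in a small function. `categorize` is a hypothetical helper introduced only for illustration and is not part of the original exercise.

```python
def categorize(average):
    """Return a performance label for an average score (hypothetical helper)."""
    # Branches are tested in order; the first true condition wins.
    if average >= 90:
        return "Excellent"
    elif average >= 80:
        return "Good"
    elif average >= 70:
        return "Satisfactory"
    else:
        return "Needs Improvement"


# Try values on and around each threshold.
for value in [95, 90, 86.6, 80, 79.9, 65]:
    print(f"{value}: {categorize(value)}")
```

Reordering the branches (for example, testing `>= 70` first) would label every passing score "Satisfactory", which is why the thresholds are listed from highest to lowest.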
Use a for loop to analyze each month's performance:
print("\nMonth-by-Month Performance Analysis:")
print("=" * 40)
for month in scores.index:
    score = scores[month]
    # Use if/else to provide feedback for each month
    if score >= 90:
        feedback = "Outstanding!"
    elif score >= 85:
        feedback = "Great work!"
    elif score >= 80:
        feedback = "Good job!"
    elif score >= 75:
        feedback = "Solid effort!"
    else:
        feedback = "Room for improvement"
    print(f"{month}: {score} - {feedback}")
Month-by-Month Performance Analysis:
========================================
Sep: 78 - Solid effort!
Oct: 85 - Great work!
Nov: 92 - Outstanding!
Dec: 88 - Great work!
Jan: 79 - Solid effort!
Feb: 83 - Good job!
Mar: 91 - Outstanding!
Apr: 87 - Great work!
May: 89 - Great work!
Jun: 94 - Outstanding!
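The loop above looks up each score by its label. As an alternative sketch, `Series.items()` yields each label together with its value, which avoids the separate lookup. The list name `outstanding_months` is an illustrative choice.

```python
import pandas as pd

scores = pd.Series(
    [78, 85, 92, 88, 79, 83, 91, 87, 89, 94],
    index=["Sep", "Oct", "Nov", "Dec", "Jan", "Feb", "Mar", "Apr", "May", "Jun"],
)

# Collect the months that reached the "Outstanding!" threshold.
outstanding_months = []
for month, score in scores.items():
    if score >= 90:
        outstanding_months.append(month)

print(f"Outstanding months: {outstanding_months}")
```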
Use basic Series indexing to explore the data:
Score in December: 88
Score in May: 89
First three months:
Sep 78
Oct 85
Nov 92
dtype: int64
Last three months:
Apr 87
May 89
Jun 94
dtype: int64
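The output above could be produced with code along the following lines. This is a sketch that uses only label indexing (`scores['Dec']`) and basic slicing (`scores[:3]`); the original cell may have been written differently.

```python
import pandas as pd

scores = pd.Series(
    [78, 85, 92, 88, 79, 83, 91, 87, 89, 94],
    index=["Sep", "Oct", "Nov", "Dec", "Jan", "Feb", "Mar", "Apr", "May", "Jun"],
)

# Look up single values by their index label.
print(f"Score in December: {scores['Dec']}")
print(f"Score in May: {scores['May']}")

# Slices select by position: [:3] is the first three, [-3:] the last three.
first_three = scores[:3]
last_three = scores[-3:]

print("First three months:")
print(first_three)
print("Last three months:")
print(last_three)
```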
Let's analyze different parts of the year using basic slicing:
# Fall semester (first 4 months)
fall_scores = scores[:4]
fall_average = fall_scores.mean()
# Spring semester (last 6 months)
spring_scores = scores[4:]
spring_average = spring_scores.mean()
print(f"Fall semester average: {fall_average:.2f}")
print(f"Spring semester average: {spring_average:.2f}")
# Use control flow to compare performance
if spring_average > fall_average:
    improvement = spring_average - fall_average
    print(f"Improvement in spring: +{improvement:.2f} points!")
elif fall_average > spring_average:
    decline = fall_average - spring_average
    print(f"Performance declined in spring: -{decline:.2f} points")
else:
    print("Performance remained consistent between semesters")
Fall semester average: 85.75
Spring semester average: 87.17
Improvement in spring: +1.42 points!
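The slices `scores[:4]` and `scores[4:]` select by position. If you prefer to make the intent explicit, the sketch below shows the same split with `.iloc` (position-based) and `.loc` (label-based). Note that a label slice includes its end label, so "Sep" to "Dec" covers four months. The variable names are illustrative.

```python
import pandas as pd

scores = pd.Series(
    [78, 85, 92, 88, 79, 83, 91, 87, 89, 94],
    index=["Sep", "Oct", "Nov", "Dec", "Jan", "Feb", "Mar", "Apr", "May", "Jun"],
)

# Position-based: the stop position (4) is excluded.
fall_by_position = scores.iloc[:4]

# Label-based: the stop label ("Dec") is included.
fall_by_label = scores.loc["Sep":"Dec"]
spring_by_label = scores.loc["Jan":"Jun"]

print(fall_by_position.equals(fall_by_label))  # True
print(f"Fall semester average: {fall_by_label.mean():.2f}")
print(f"Spring semester average: {spring_by_label.mean():.2f}")
```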
Apply NumPy functions to our Series (as learned in Session 3c):
# NumPy statistical functions work on Pandas Series!
np_mean = np.mean(scores)
np_std = np.std(scores)
np_sum = np.sum(scores)
print("Using NumPy functions:")
print(f"Mean: {np_mean:.2f}")
print(f"Standard deviation: {np_std:.2f}")
print(f"Total points: {np_sum}")
# Determine consistency using standard deviation
if np_std < 5:
    consistency = "Very consistent"
elif np_std < 8:
    consistency = "Moderately consistent"
else:
    consistency = "Inconsistent"
print(f"Performance consistency: {consistency} (std dev: {np_std:.2f})")
Using NumPy functions:
Mean: 86.60
Standard deviation: 5.08
Total points: 866
Performance consistency: Moderately consistent (std dev: 5.08)
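One detail worth knowing: the 5.08 above is the population standard deviation, because `np.std` defaults to `ddof=0`. The pandas method `scores.std()` defaults to `ddof=1` (the sample standard deviation) and gives a slightly larger number for the same data. The sketch below compares the two; the variable names are illustrative.

```python
import numpy as np
import pandas as pd

scores = pd.Series(
    [78, 85, 92, 88, 79, 83, 91, 87, 89, 94],
    index=["Sep", "Oct", "Nov", "Dec", "Jan", "Feb", "Mar", "Apr", "May", "Jun"],
)

# NumPy default: divide by n (ddof=0).
population_std = np.std(scores)

# pandas default: divide by n - 1 (ddof=1).
sample_std = scores.std()

# Passing ddof=0 to pandas reproduces the NumPy result.
pandas_population_std = scores.std(ddof=0)

print(f"np.std(scores):     {population_std:.2f}")
print(f"scores.std():       {sample_std:.2f}")
print(f"scores.std(ddof=0): {pandas_population_std:.2f}")
```

For this dataset both values fall between 5 and 8, so the consistency label does not change, but a value near a threshold could be labeled differently depending on which function is used.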
Combine control flows, Series operations, and NumPy functions:
print("Comprehensive Performance Report")
print("=" * 35)
# Count months above/below average using a for loop and control flow
above_average_count = 0
below_average_count = 0
print(f"Overall average: {average_score:.2f}")
print("\nMonthly performance relative to average:")
for month in scores.index:
    score = scores[month]
    if score > average_score:
        status = "Above average"
        above_average_count += 1
    elif score < average_score:
        status = "Below average"
        below_average_count += 1
    else:
        status = "At average"
    difference = score - average_score
    print(f"{month}: {score} ({status}, {difference:+.2f})")
print("\nSummary:")
print(f"Months above average: {above_average_count}")
print(f"Months below average: {below_average_count}")
# Final recommendation using control flow
if above_average_count > below_average_count:
    print("Recommendation: Strong performance overall! Keep up the good work.")
elif below_average_count > above_average_count:
    print("Recommendation: Focus on consistency. Consider study habit adjustments.")
else:
    print("Recommendation: Balanced performance. Work on achieving more peak months.")
Comprehensive Performance Report
===================================
Overall average: 86.60
Monthly performance relative to average:
Sep: 78 (Below average, -8.60)
Oct: 85 (Below average, -1.60)
Nov: 92 (Above average, +5.40)
Dec: 88 (Above average, +1.40)
Jan: 79 (Below average, -7.60)
Feb: 83 (Below average, -3.60)
Mar: 91 (Above average, +4.40)
Apr: 87 (Above average, +0.40)
May: 89 (Above average, +2.40)
Jun: 94 (Above average, +7.40)
Summary:
Months above average: 6
Months below average: 4
Recommendation: Strong performance overall! Keep up the good work.
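The counts in the report can be cross-checked without a loop. The sketch below compares the whole Series to the average at once, which produces a Series of True/False values; summing it counts the True entries. This goes slightly beyond the loop-based approach practiced above and is offered only as a way to verify the result.

```python
import pandas as pd

scores = pd.Series(
    [78, 85, 92, 88, 79, 83, 91, 87, 89, 94],
    index=["Sep", "Oct", "Nov", "Dec", "Jan", "Feb", "Mar", "Apr", "May", "Jun"],
)
average_score = scores.mean()

# Each comparison yields one True/False per month.
is_above = scores > average_score
is_below = scores < average_score

# True counts as 1 when summed.
above_average_count = int(is_above.sum())
below_average_count = int(is_below.sum())

print(f"Months above average: {above_average_count}")
print(f"Months below average: {below_average_count}")

# A boolean Series can also filter the original data.
print(f"Above-average months: {list(scores[is_above].index)}")
```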
In this activity, you practiced:
Control Flow Concepts:
- if/elif/else statements for categorizing and comparing data
- for loops for iterating through Series data
- Combining conditions with data analysis
Pandas Series Operations:
- Creating Series with custom indices
- Statistical methods (.mean(), .max(), .min(), .median())
- Basic indexing (series['key']) and slicing (series[:3])
NumPy Integration:
- Using NumPy statistical functions on Pandas Series
- Understanding how NumPy and Pandas work together
These are the foundational skills that will support all your future data science work. Great job applying today's concepts!
Try creating your own Series with different data (maybe daily temperatures, stock prices, or sports scores) and apply the same analysis techniques you used today.