first(), last(): First and last values in the group
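As a small illustration of first() and last(), the sketch below uses a hypothetical DataFrame (the column names category and value are assumptions, not taken from the original data) and pulls the first and last value of each group.

```python
import pandas as pd

# Hypothetical data: two groups with three rows each.
df = pd.DataFrame({
    "category": ["A", "A", "A", "B", "B", "B"],
    "value": [10, 20, 30, 5, 15, 25],
})

grouped = df.groupby("category")

# first() and last() return the first and last non-null value
# of the selected column within each group, in row order.
first_values = grouped["value"].first()
last_values = grouped["value"].last()

print(first_values)
print(last_values)
```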
Grouped Operations
You can apply operations to each group separately using transform() or apply().
Using transform() to alter each group in a GroupBy object
Code
# Transform: apply a function to each group and return a result with the same index as the input
def normalize(x):
    return (x - x.mean()) / x.std()

df['value_normalized'] = grouped['value'].transform(normalize)
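To make the transform() step easier to try out, here is a self-contained sketch. The DataFrame contents and the grouping column name category are assumptions made for illustration; only normalize and the value column come from the snippet above.

```python
import pandas as pd

# Hypothetical data: the 'category' column and the values are illustrative.
df = pd.DataFrame({
    "category": ["A", "A", "A", "B", "B", "B"],
    "value": [10.0, 20.0, 30.0, 5.0, 15.0, 25.0],
})
grouped = df.groupby("category")


def normalize(x):
    # x is the 'value' Series of a single group.
    return (x - x.mean()) / x.std()


# transform() returns a Series aligned with df's index, so it can be
# assigned straight back as a new column.
df["value_normalized"] = grouped["value"].transform(normalize)
print(df)
```

Because the result keeps the original row order and length, each row receives its own group-wise z-score rather than one value per group.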
Using apply() to alter each group in a GroupBy object
Code
# Apply: apply a function to each group and return a DataFrame or Series
def group_range(x):
    return x['value'].max() - x['value'].min()

# include_groups=False (pandas 2.2+) keeps the grouping columns out of the function input
result = grouped.apply(group_range, include_groups=False)
Note: without include_groups=False, pandas 2.2 and later emit a DeprecationWarning for this call, because DataFrameGroupBy.apply currently operates on the grouping columns as well. In a future version of pandas the grouping columns will be excluded from the operation. To silence the warning, either pass include_groups=False or explicitly select the grouping columns after groupby.
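One way to avoid the deprecation warning about grouping columns is to select the columns of interest before calling apply(). The sketch below assumes a hypothetical DataFrame with a category grouping column; only group_range and the value column come from the snippet above.

```python
import pandas as pd

# Hypothetical data: the 'category' column and the values are illustrative.
df = pd.DataFrame({
    "category": ["A", "A", "A", "B", "B", "B"],
    "value": [10, 20, 30, 5, 15, 45],
})


def group_range(x):
    # x is a DataFrame holding only the selected columns of one group.
    return x["value"].max() - x["value"].min()


# Selecting [["value"]] after groupby means apply() never sees the
# grouping column, so the deprecated behavior is not triggered.
result = df.groupby("category")[["value"]].apply(group_range)
print(result)
```

Unlike transform(), apply() here returns one value per group, indexed by the group key.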
Pivot Tables
Pivot tables are a powerful tool for reorganizing and summarizing data. They allow you to transform your data from a long format to a wide format, making it easier to analyze and visualize patterns.
             sum       mean
product        A    B      A      B
date
2023-01-01   100  150  100.0  150.0
2023-01-02   120  180  120.0  180.0
Note: the pd.pivot_table call that produced this table (assigned to pivot_multi) passed function objects for sum and mean as aggfunc, which triggers a FutureWarning in recent pandas versions. The callables are currently mapped to DataFrameGroupBy.sum and DataFrameGroupBy.mean, but in a future version of pandas they will be used directly. To keep the current behavior, pass the strings "sum" and "mean" instead.
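A minimal sketch of a pivot table with two aggregations follows. The contents of df are an assumption, chosen so that the result matches the table above, and the aggfunc argument is a guess at what the original call intended. Passing the strings "sum" and "mean" rather than callables avoids the FutureWarning.

```python
import pandas as pd

# Assumed long-format data: one row per (date, product) pair.
df = pd.DataFrame({
    "date": ["2023-01-01", "2023-01-01", "2023-01-02", "2023-01-02"],
    "product": ["A", "B", "A", "B"],
    "sales": [100, 150, 120, 180],
})

# A list of aggregation names yields a two-level column index:
# the aggregation name on top, the product underneath.
pivot_multi = pd.pivot_table(
    df,
    values="sales",
    index="date",
    columns="product",
    aggfunc=["sum", "mean"],
)
print(pivot_multi)
```

Individual cells can then be read with a tuple key, for example pivot_multi.loc["2023-01-01", ("sum", "A")].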
Key Pivot Table Parameters
values: Column(s) to aggregate
index: Column(s) to use as row labels
columns: Column(s) to use as column labels
aggfunc: Function(s) to use for aggregation (default is mean)
fill_value: Value to use for missing data
margins: Add row/column with subtotals (default is False)
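The parameters listed above can be combined in a single call. The sketch below uses hypothetical data with one missing (date, product) combination to show how fill_value and margins affect the result.

```python
import pandas as pd

# Hypothetical data: product B has no sales recorded on 2023-01-02.
df = pd.DataFrame({
    "date": ["2023-01-01", "2023-01-01", "2023-01-02"],
    "product": ["A", "B", "A"],
    "sales": [100, 150, 120],
})

pivot = pd.pivot_table(
    df,
    values="sales",     # column to aggregate
    index="date",       # row labels
    columns="product",  # column labels
    aggfunc="sum",      # aggregation (the default would be mean)
    fill_value=0,       # replace the missing B cell with 0
    margins=True,       # add an "All" row and column of totals
)
print(pivot)
```

With margins=True, pandas labels the extra row and column "All" by default.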
For more detailed information on grouping, aggregating, and pivot tables in Pandas, refer to the official Pandas documentation.