Pandas Series
This cheatsheet provides a quick reference for common operations on Pandas Series. It's designed for beginning data science students who are just starting to work with Pandas.
Always start by importing pandas:
import pandas as pd
a 1
b 2
c 3
d 4
e 5
dtype: int64
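The code that produced the Series printed above is not shown. A minimal sketch that reproduces it, assuming the values 1 to 5, the labels a to e, and the variable name s (all inferred from the output rather than taken from the original code):

```python
import pandas as pd

# Build a Series from a list of values plus explicit index labels.
# The values, labels, and the name `s` are assumptions inferred from the output.
s = pd.Series([1, 2, 3, 4, 5], index=["a", "b", "c", "d", "e"])
print(s)
```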
<class 'pandas.core.series.Series'>
Index: 5 entries, a to e
Series name: None
Non-Null Count Dtype
-------------- -----
5 non-null int64
dtypes: int64(1)
memory usage: 80.0+ bytes
None
count 5.000000
mean 3.000000
std 1.581139
min 1.000000
25% 2.000000
50% 3.000000
75% 4.000000
max 5.000000
dtype: float64
Index(['a', 'b', 'c', 'd', 'e'], dtype='object')
[1 2 3 4 5]
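The blocks of output above look like the results of the usual inspection calls on a Series. A sketch of those calls, assuming a Series s with values 1 to 5 and labels a to e (Series.info() needs pandas 1.4 or later):

```python
import pandas as pd

# Assumed example data, matching the output shown above.
s = pd.Series([1, 2, 3, 4, 5], index=list("abcde"))

s.info()                # prints a summary of index, dtype, and memory; returns None
summary = s.describe()  # count, mean, std, min, quartiles, max
print(summary)
print(s.index)          # the index labels
print(s.values)         # the underlying NumPy array
```

Because info() prints its summary and returns None, wrapping it in print() adds a trailing None line, which is probably where the None in the output above comes from.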
The pd.concat() function (short for concatenate) is now the preferred way to extend a Series, since Series.append() was deprecated in pandas 1.4 and removed in 2.0.
pd.concat() takes a list of pd.Series objects to concatenate. This means you must first put the new values in a pd.Series of their own before you can extend an existing pd.Series.
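A minimal sketch of this pattern, assuming an existing Series with labels b to e and three new values labelled f, g, and h (data chosen to match the output that follows; the variable names are illustrative):

```python
import pandas as pd

# Existing Series (assumed example data).
s = pd.Series([2, 3, 4, 5], index=["b", "c", "d", "e"])

# New values must be wrapped in their own Series, with their own labels.
new_values = pd.Series([3, 4, 5], index=["f", "g", "h"])

# concat returns a new Series; reassign to keep the result.
s = pd.concat([s, new_values])
print(s)
```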
b 2
c 3
d 4
e 5
f 3
g 4
h 5
dtype: int64
b 2
c 6
d 8
e 10
f 6
g 8
h 10
dtype: int64
This line uses the greater-than (>) comparison operator within the mask() method to update the series. It doubles the values in the series where the condition s > 2 is True, while leaving other values unchanged.
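The original line of code is not shown, so here is a sketch of a mask() call that reproduces the output above. The threshold of 2 is an assumption chosen so that the result matches that output:

```python
import pandas as pd

# Assumed starting data, matching the Series printed before the update.
s = pd.Series([2, 3, 4, 5, 3, 4, 5], index=list("bcdefgh"))

# mask() replaces values where the condition is True.
# Here every value greater than 2 (assumed threshold) is doubled.
doubled = s.mask(s > 2, s * 2)
print(doubled)
```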
b 2
c 6
d 8
e 10
f 6
g 8
h 10
dtype: int64
b 2
c 6
d 12
e 12
f 6
g 12
h 12
dtype: int64
This line of code updates the values in the series where the condition is False (i.e. where s is not less than 8), replacing them with 12. The values where the condition is True remain unchanged.
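This behaviour matches the where() method, which keeps values where the condition is True and replaces the rest (the opposite of mask()). A sketch that reproduces the output above, assuming the original used where() with the condition s < 8:

```python
import pandas as pd

# Assumed starting data, matching the Series printed before the update.
s = pd.Series([2, 6, 8, 10, 6, 8, 10], index=list("bcdefgh"))

# where() keeps values where the condition is True
# and replaces values where it is False with 12.
capped = s.where(s < 8, 12)
print(capped)
```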
Lambda functions let you apply a transformation with a small, temporary function instead of defining a named function separately. They are good for quick, one-off transformations.
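As an illustration, a lambda can be passed to apply() to transform each value in turn. The transformation below (adding 1 to every value) and the sample data are made up for this example and are not taken from the original:

```python
import pandas as pd

# Assumed example data.
s = pd.Series([2, 6, 12], index=["b", "c", "d"])

# apply() calls the lambda once per value and returns a new Series.
plus_one = s.apply(lambda x: x + 1)
print(plus_one)
```

For simple arithmetic like this, the vectorized form s + 1 gives the same result; a lambda is more useful when the logic does not map onto a built-in operation.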
Original Series:
b 2
c 6
d 12
e 12
f 6
g 12
h 12
dtype: int64
Re-indexed series:
a NaN
c 6.0
b 2.0
f 6.0
e 12.0
d 12.0
g 12.0
h 12.0
i NaN
b 2.0
dtype: float64
The reindex() method takes an ordered list of index labels to use for a new pd.Series object. The list should share at least some labels with the existing index, because any label that does not appear in the existing Series is set to NaN in the new Series. Labels may be repeated in the list. Introducing NaN also converts an integer Series to float64, as in the output above.
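A sketch of a reindex() call that reproduces the output shown above. The list of labels is read off that output, and the variable names are illustrative:

```python
import pandas as pd

# Assumed starting data, matching the "Original Series" output.
s = pd.Series([2, 6, 12, 12, 6, 12, 12], index=list("bcdefgh"))

# New ordering: 'a' and 'i' are not in the original index, and 'b' appears twice.
new_index = ["a", "c", "b", "f", "e", "d", "g", "h", "i", "b"]

reindexed = s.reindex(new_index)
print(reindexed)
```

reindex() returns a new Series and leaves the original untouched, so assign the result if you want to keep it.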
For more advanced operations and in-depth explanations, check out these resources:
Remember, practice is key! Try these operations with different datasets to become more comfortable with Pandas Series.