Count Unique Values in a Pandas DataFrame
Suraj Joshi
January 30, 2023
January 22, 2021
This tutorial explains how to count the unique values in a DataFrame using the Series.value_counts() and DataFrame.nunique() methods.
import pandas as pd
patients_df = pd.DataFrame({
'Name': ["Jennifer", "Travis", "Bob", "Emma", "Luna", "Anish"],
'Date': ["2020-12-01", "2020-12-01", "2020-12-02", "2020-12-02", "2020-12-02", "2020-12-03"],
'Age': [17, 18, 17, 16, 18, 16]
})
print(patients_df)
Output:
       Name        Date  Age
0  Jennifer  2020-12-01   17
1    Travis  2020-12-01   18
2       Bob  2020-12-02   17
3      Emma  2020-12-02   16
4      Luna  2020-12-02   18
5     Anish  2020-12-03   16
We will use the DataFrame patients_df, which holds each patient's name, appointment date, and age, to show how to count the unique values in a DataFrame.
Count Unique Values in a DataFrame Using Series.value_counts()
import pandas as pd
patients_df = pd.DataFrame({
'Name': ["Jennifer", "Travis", "Bob", "Emma", "Luna", "Anish"],
'Date': ["2020-12-01", "2020-12-01", "2020-12-02", "2020-12-02", "2020-12-02", "2020-12-03"],
'Age': [17, 18, 17, 16, 18, 16]
})
print("The DataFrame is:")
print(patients_df, "\n")
print("No of appointments for each date:")
print(patients_df["Date"].value_counts())
Output:
The DataFrame is:
       Name        Date  Age
0  Jennifer  2020-12-01   17
1    Travis  2020-12-01   18
2       Bob  2020-12-02   17
3      Emma  2020-12-02   16
4      Luna  2020-12-02   18
5     Anish  2020-12-03   16
No of appointments for each date:
2020-12-02 3
2020-12-01 2
2020-12-03 1
Name: Date, dtype: int64
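It shows how many times each unique value occurs in the Date column of the DataFrame. By default, the result is sorted by count in descending order.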
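As a small variation on the example above, the sketch below orders the counts by date instead of by frequency, and also reports each date's share of all appointments. The variable names counts_by_date and shares are only illustrative; sort_index() and the normalize parameter are standard pandas features.

```python
import pandas as pd

patients_df = pd.DataFrame({
    'Name': ["Jennifer", "Travis", "Bob", "Emma", "Luna", "Anish"],
    'Date': ["2020-12-01", "2020-12-01", "2020-12-02", "2020-12-02", "2020-12-02", "2020-12-03"],
    'Age': [17, 18, 17, 16, 18, 16]
})

# Counts ordered by date (the index) rather than by frequency.
counts_by_date = patients_df["Date"].value_counts().sort_index()
print(counts_by_date, "\n")

# Fraction of all appointments that fall on each date; the fractions sum to 1.
shares = patients_df["Date"].value_counts(normalize=True)
print(shares)
```

Sorting by index is handy when the unique values have a natural order, such as dates, while normalize=True is useful when proportions matter more than raw counts.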
Count Unique Values in a DataFrame Using DataFrame.nunique()
import pandas as pd
patients_df = pd.DataFrame({
'Name': ["Jennifer", "Travis", "Bob", "Emma", "Luna", "Anish"],
'Date': ["2020-12-01", "2020-12-01", "2020-12-02", "2020-12-02", "2020-12-02", "2020-12-03"],
'Age': [17, 18, 17, 16, 18, 16]
})
print(patients_df, "\n")
print(patients_df.groupby('Date').Name.nunique())
Output:
       Name        Date  Age
0  Jennifer  2020-12-01   17
1    Travis  2020-12-01   18
2       Bob  2020-12-02   17
3      Emma  2020-12-02   16
4      Luna  2020-12-02   18
5     Anish  2020-12-03   16
Date
2020-12-01 2
2020-12-02 3
2020-12-03 1
Name: Name, dtype: int64
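It splits the DataFrame by the values of the Date column, placing rows with the same Date in one group, and then counts the number of distinct names in each group. Because every name in this DataFrame appears only once, the result also equals the number of rows for each unique value of the Date column.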
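The example above calls nunique() on a grouped column. For comparison, here is a minimal sketch that calls DataFrame.nunique() on the whole DataFrame, and then shows how nunique() differs from a plain row count once a name repeats. The extra row (a second appointment for Jennifer) is hypothetical and added only for illustration, as are the variable names.

```python
import pandas as pd

patients_df = pd.DataFrame({
    'Name': ["Jennifer", "Travis", "Bob", "Emma", "Luna", "Anish"],
    'Date': ["2020-12-01", "2020-12-01", "2020-12-02", "2020-12-02", "2020-12-02", "2020-12-03"],
    'Age': [17, 18, 17, 16, 18, 16]
})

# Number of distinct values in every column of the DataFrame.
unique_per_column = patients_df.nunique()
print(unique_per_column, "\n")

# Hypothetical extra row: Jennifer books a second appointment on 2020-12-01.
extra = pd.DataFrame({'Name': ["Jennifer"], 'Date': ["2020-12-01"], 'Age': [17]})
with_repeat = pd.concat([patients_df, extra], ignore_index=True)

# size() counts rows per group; nunique() counts distinct names per group.
rows_per_date = with_repeat.groupby('Date').size()
names_per_date = with_repeat.groupby('Date').Name.nunique()
print(rows_per_date, "\n")
print(names_per_date)
```

With the repeated name, 2020-12-01 has three rows but only two distinct names, so the two results no longer agree. Use value_counts() or size() when you want occurrences, and nunique() when you want the number of distinct values.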
Author: Suraj Joshi
Suraj Joshi is a backend software engineer at Matrice.ai.