Pandas DataFrame.describe() Function

Minahil Noor, January 30, 2023


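Syntax of pandas.DataFrame.describe()
Example code: finding DataFrame statistics with DataFrame.describe()
Example code: finding statistics for every column with DataFrame.describe()
Example code: finding statistics for numeric columns with DataFrame.describe()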

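The Python Pandas `DataFrame.describe()` function returns descriptive statistics for a DataFrame, such as the count, mean, standard deviation, minimum, maximum, and selected percentiles of each column.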

Syntax of `pandas.DataFrame.describe()`

DataFrame.describe(
    percentiles=None, include=None, exclude=None, datetime_is_numeric=False
)

Parameters


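`percentiles`	A list-like of numbers specifying which percentiles to include in the output. Every value must lie between 0 and 1. The default is `[.25, .5, .75]`, which returns the 25th, 50th, and 75th percentiles.
`include`	The data types to include in the output. It has three options. `'all'`: every column of the input is included in the output. A list-like of data types: the result is limited to the given data types. `None` (default): the result includes only the numeric columns.
`exclude`	The data types to omit from the output. It has two options. A list-like of data types: the given data types are excluded from the result. `None` (default): nothing is excluded.
`datetime_is_numeric`	A Boolean parameter. It specifies whether datetime columns are treated as numeric. This parameter exists in pandas 1.1 through 1.5; it was removed in pandas 2.0, where datetime data is always treated as numeric.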

Return

It returns a statistical summary of the Series or DataFrame it is called on.
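As a rough illustration of the `percentiles` parameter described above, the following sketch requests the 10th and 90th percentiles in place of the default quartiles. The `scores` DataFrame and its `score` column are hypothetical sample data, not part of the original article.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
scores = pd.DataFrame({"score": [10, 20, 30, 40, 50]})

# Ask for the 10th, 50th, and 90th percentiles instead of the default quartiles.
summary = scores.describe(percentiles=[0.1, 0.5, 0.9])
print(summary)

# The percentile rows are labelled with their percentage, e.g. "10%".
print(summary.loc["10%", "score"])
```

The summary then contains `10%`, `50%`, and `90%` rows, and the default `25%` and `75%` rows no longer appear.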

Example code: finding DataFrame statistics with `DataFrame.describe()`

import pandas as pd

# Build a small DataFrame with two numeric columns and one text column.
dataframe = pd.DataFrame(
    {
        "Attendance": [60, 100, 80, 78, 95],
        "Name": ["Olivia", "John", "Laura", "Ben", "Kevin"],
        "Obtained Marks": [90, 75, 82, 64, 45],
    }
)

print("The Original Data frame is: \n")
print(dataframe)

# With no arguments, describe() summarizes only the numeric columns.
dataframe1 = dataframe.describe()
print("Statistics are: \n")
print(dataframe1)

Output:

The Original Data frame is: 

   Attendance    Name  Obtained Marks
0          60  Olivia              90
1         100    John              75
2          80   Laura              82
3          78     Ben              64
4          95   Kevin              45
Statistics are: 

       Attendance  Obtained Marks
count    5.000000        5.000000
mean    82.600000       71.200000
std     15.773395       17.484279
min     60.000000       45.000000
25%     78.000000       64.000000
50%     80.000000       75.000000
75%     95.000000       82.000000
max    100.000000       90.000000

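The function returned the statistical summary of the DataFrame. We did not pass any arguments, so the function used all of its default values. As a result, only the numeric columns `Attendance` and `Obtained Marks` are summarized, and the text column `Name` is left out.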
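Because the summary is itself a DataFrame, individual statistics can be read back with ordinary label-based indexing. The sketch below reuses the article's data; the variable names `stats`, `mean_attendance`, and `max_marks` are chosen here for illustration.

```python
import pandas as pd

dataframe = pd.DataFrame(
    {
        "Attendance": [60, 100, 80, 78, 95],
        "Name": ["Olivia", "John", "Laura", "Ben", "Kevin"],
        "Obtained Marks": [90, 75, 82, 64, 45],
    }
)

stats = dataframe.describe()

# Rows are statistic names and columns are the original column names,
# so .loc[row, column] picks out a single value.
mean_attendance = stats.loc["mean", "Attendance"]
max_marks = stats.loc["max", "Obtained Marks"]

print(mean_attendance)
print(max_marks)
```

These values correspond to the `mean` and `max` rows of the output shown above.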

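Example code: finding statistics for every column with `DataFrame.describe()`

We will use the `include` parameter to find statistics for all columns, including the non-numeric ones.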

import pandas as pd

# Build a small DataFrame with two numeric columns and one text column.
dataframe = pd.DataFrame(
    {
        "Attendance": [60, 100, 80, 78, 95],
        "Name": ["Olivia", "John", "Laura", "Ben", "Kevin"],
        "Obtained Marks": [90, 75, 82, 64, 45],
    }
)
print("The Original Data frame is: \n")
print(dataframe)

# include='all' summarizes every column, whatever its data type.
dataframe1 = dataframe.describe(include="all")
print("Statistics are: \n")
print(dataframe1)

Output:

The Original Data frame is: 

   Attendance    Name  Obtained Marks
0          60  Olivia              90
1         100    John              75
2          80   Laura              82
3          78     Ben              64
4          95   Kevin              45
Statistics are: 

        Attendance   Name  Obtained Marks
count     5.000000      5        5.000000
unique         NaN      5             NaN
top            NaN  Kevin             NaN
freq           NaN      1             NaN
mean     82.600000    NaN       71.200000
std      15.773395    NaN       17.484279
min      60.000000    NaN       45.000000
25%      78.000000    NaN       64.000000
50%      80.000000    NaN       75.000000
75%      95.000000    NaN       82.000000
max     100.000000    NaN       90.000000

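The function returned a statistical summary of every column in the DataFrame. The numeric statistics are `NaN` for the text column `Name`, and `unique`, `top`, and `freq` are `NaN` for the numeric columns. Because every name occurs exactly once, `top` is an arbitrary choice among the tied values and may differ from the one shown here.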
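Besides `'all'`, `include` also accepts a list-like of data types. The following sketch limits the summary to `float64` columns; the `Fee` column is a hypothetical addition to the article's data, introduced only so that there is a float column to select.

```python
import pandas as pd

dataframe = pd.DataFrame(
    {
        "Attendance": [60, 100, 80, 78, 95],
        "Name": ["Olivia", "John", "Laura", "Ben", "Kevin"],
        # Hypothetical float column, not part of the original example.
        "Fee": [120.5, 99.0, 150.25, 80.0, 110.0],
    }
)

# Only columns whose dtype is float64 are summarized.
float_stats = dataframe.describe(include=["float64"])
print(float_stats)
```

Here the integer column `Attendance` and the text column `Name` are both left out, since neither has the requested data type.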

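Example code: finding statistics for numeric columns with `DataFrame.describe()`

Now we will use the `exclude` parameter to find statistics for the numeric columns only.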

import pandas as pd

# Build a small DataFrame with two numeric columns and one text column.
dataframe = pd.DataFrame(
    {
        "Attendance": [60, 100, 80, 78, 95],
        "Name": ["Olivia", "John", "Laura", "Ben", "Kevin"],
        "Obtained Marks": [90, 75, 82, 64, 45],
    }
)
print("The Original Data frame is: \n")
print(dataframe)

# Leave columns of the object data type out of the summary.
dataframe1 = dataframe.describe(exclude=[object])
print("Statistics are: \n")
print(dataframe1)

Output:

The Original Data frame is: 

   Attendance    Name  Obtained Marks
0          60  Olivia              90
1         100    John              75
2          80   Laura              82
3          78     Ben              64
4          95   Kevin              45
Statistics are: 

       Attendance  Obtained Marks
count    5.000000        5.000000
mean    82.600000       71.200000
std     15.773395       17.484279
min     60.000000       45.000000
25%     78.000000       64.000000
50%     80.000000       75.000000
75%     95.000000       82.000000
max    100.000000       90.000000

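We excluded the `object` data type, so the text column `Name` is dropped from the summary. Because `describe()` already restricts itself to numeric columns by default, the result is the same as in the first example.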
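The `exclude` parameter can also be used the other way round, to drop the numeric columns and summarize only the remaining ones. This is a minimal sketch that assumes the string `"number"` as the dtype selector for all numeric types; the variable name `text_stats` is chosen here for illustration.

```python
import pandas as pd

dataframe = pd.DataFrame(
    {
        "Attendance": [60, 100, 80, 78, 95],
        "Name": ["Olivia", "John", "Laura", "Ben", "Kevin"],
        "Obtained Marks": [90, 75, 82, 64, 45],
    }
)

# Excluding every numeric dtype leaves only the text column.
text_stats = dataframe.describe(exclude=["number"])
print(text_stats)
```

For a non-numeric column the summary consists of the `count`, `unique`, `top`, and `freq` rows rather than the mean and percentiles.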
