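The describe() method computes summary statistics, such as percentiles, the mean, and the standard deviation (std), for the values of a Series or DataFrame. It handles numeric and object Series as well as DataFrame column sets of mixed data types.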
Syntax
DataFrame.describe(percentiles=None, include=None, exclude=None)
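Parameters
- percentiles - An optional list-like of numbers, each between 0 and 1. The default is [.25, .5, .75], which returns the 25th, 50th, and 75th percentiles.
- include - An optional list of data types to include when describing a DataFrame. The default is None, in which case a mixed-type DataFrame is summarized by its numeric columns only.
- exclude - An optional list of data types to leave out when describing a DataFrame. The default is None, in which case nothing is excluded.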
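The examples further down all rely on the default percentiles, so a minimal sketch of the percentiles parameter may help. The sample values are hypothetical and chosen only for illustration; 0.5 is listed explicitly so that the median appears in the result regardless of how a given pandas version treats it.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
values = pd.Series([1, 2, 3, 4, 5])

# Ask for the 10th, 50th, and 90th percentiles instead of the defaults.
summary = values.describe(percentiles=[0.1, 0.5, 0.9])

print(summary)
```

The percentile rows of the result are labelled 10%, 50%, and 90%, in place of the default 25%, 50%, and 75%.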
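Return value
It returns a Series or DataFrame containing the summary statistics: a Series when called on a Series, and a DataFrame when called on a DataFrame.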
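As a quick check of the return types, the following sketch calls describe() on a Series and on a one-column DataFrame. The variable names and the column name "numeric" are illustrative assumptions.

```python
import pandas as pd

# Summary of a Series: the result is itself a Series.
series_summary = pd.Series([1, 2, 3]).describe()

# Summary of a DataFrame: the result is a DataFrame with one column per described column.
frame_summary = pd.DataFrame({"numeric": [1, 2, 3]}).describe()

print(type(series_summary).__name__)
print(type(frame_summary).__name__)

# Individual statistics can be read by label.
print(series_summary["mean"])
print(frame_summary.loc["mean", "numeric"])
```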
Example 1
import pandas as pd
import numpy as np
# Describe a numeric Series
a1 = pd.Series([1, 2, 3])
a1.describe()
Output
count    3.0
mean     2.0
std      1.0
min      1.0
25%      1.5
50%      2.0
75%      2.5
max      3.0
dtype: float64
Example 2
import pandas as pd
import numpy as np
# Describe a Series of strings (object dtype)
a1 = pd.Series(['p', 'q', 'q', 'r'])
a1.describe()
Output
count     4
unique    3
top       q
freq      2
dtype: object
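Example 3
import pandas as pd
import numpy as np
a1 = pd.Series([1, 2, 3])
a1.describe()
a1 = pd.Series(['p', 'q', 'q', 'r'])
a1.describe()
# A DataFrame with one categorical, one numeric, and one object column
info = pd.DataFrame({'categorical': pd.Categorical(['s', 't', 'u']),
                     'numeric': [1, 2, 3],
                     'object': ['p', 'q', 'r']})
info.describe(include=[np.number])
# np.object was removed in NumPy 1.24; the built-in object is used instead
info.describe(include=[object])
info.describe(include=['category'])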
Output
       categorical
count            3
unique           3
top              u
freq             1
Example 4
import pandas as pd
import numpy as np
a1 = pd.Series([1, 2, 3])
a1.describe()
a1 = pd.Series(['p', 'q', 'q', 'r'])
a1.describe()
info = pd.DataFrame({'categorical': pd.Categorical(['s', 't', 'u']),
                     'numeric': [1, 2, 3],
                     'object': ['p', 'q', 'r']})
info.describe()
info.describe(include='all')
info.numeric.describe()
info.describe(include=[np.number])
# np.object was removed in NumPy 1.24; the built-in object is used instead
info.describe(include=[object])
info.describe(include=['category'])
info.describe(exclude=[np.number])
info.describe(exclude=[object])
Output
       categorical  numeric
count            3      3.0
unique           3      NaN
top              u      NaN
freq             1      NaN
mean           NaN      2.0
std            NaN      1.0
min            NaN      1.0
25%            NaN      1.5
50%            NaN      2.0
75%            NaN      2.5
max            NaN      3.0
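Only the result of the last call is displayed above. To compare several of the calls side by side, each summary can be stored and inspected separately. This is a minimal sketch that reuses the DataFrame from the example; the variable names numeric_only, everything, and no_numbers are illustrative assumptions.

```python
import numpy as np
import pandas as pd

info = pd.DataFrame({
    "categorical": pd.Categorical(["s", "t", "u"]),
    "numeric": [1, 2, 3],
    "object": ["p", "q", "r"],
})

# Default call: only the numeric column is summarized.
numeric_only = info.describe()

# include='all': every column is summarized; statistics that do not apply are NaN.
everything = info.describe(include="all")

# exclude=[np.number]: the numeric column is dropped from the summary.
no_numbers = info.describe(exclude=[np.number])

for name, result in [("default", numeric_only),
                     ("include='all'", everything),
                     ("exclude=[np.number]", no_numbers)]:
    print(name, "->", list(result.columns))
```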