3. Precision

Precision is the proportion of samples predicted as positive whose true label is also positive.
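As a quick illustration of this definition, here is a minimal sketch for a single binary label. The helper name `precision` and the toy vectors are hypothetical and are not taken from the AAPD files used later; returning 0 when nothing is predicted positive is an assumption that mirrors sklearn's default.

```python
def precision(y_true, y_pred):
    """Fraction of predicted positives (1s) whose true label is also 1."""
    predicted_positive = sum(1 for p in y_pred if p == 1)
    if predicted_positive == 0:
        # Assumption: treat the undefined case as 0, like sklearn's default
        return 0.0
    true_positive = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    return true_positive / predicted_positive


# Hypothetical toy data: 4 predicted positives, 2 of them correct
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 1]
print(precision(y_true, y_pred))
```

Here the model predicts four positives (positions 0, 1, 2 and 5) and two of them match the true label, so the result is 2/4.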

sklearn function documentation: sklearn.metrics.precision_score (scikit-learn 1.1.1 documentation)

3.1 Micro-P

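Micro-P is the proportion of correct predictions among all positive predictions, pooled over every sample and every label.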
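To make the pooling concrete, here is a small hand-checkable sketch. The function name `micro_precision` and the 3-sample, 3-label toy matrices are hypothetical; each row is assumed to be a list of 0/1 values, one per label.

```python
def micro_precision(labels, predictions):
    """Pool true positives and predicted positives over all samples and labels."""
    tp = 0  # cells where the label and the prediction are both 1
    pp = 0  # cells where the prediction is 1
    for label_row, pred_row in zip(labels, predictions):
        for t, p in zip(label_row, pred_row):
            if p == 1:
                pp += 1
                if t == 1:
                    tp += 1
    # Assumption: return 0 when there are no positive predictions at all
    return tp / pp if pp else 0.0


# Hypothetical toy data: 4 positive predictions in total, 3 of them correct
labels = [[1, 0, 1], [0, 1, 0], [1, 1, 0]]
predictions = [[1, 0, 0], [1, 1, 0], [1, 0, 0]]
print(micro_precision(labels, predictions))
```

Because the counts are pooled before dividing, labels that are predicted often dominate the result, while a label that is never predicted contributes nothing to either count.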

Implementation using Python built-ins:

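import json

# Load the gold labels and the model predictions (one 0/1 list per sample)
with open('data/cls/AAPD/label.json') as f:
    label = json.load(f)
with open('data/cls/AAPD/prediction.json') as f:
    prediction = json.load(f)

# PP: every positive prediction, pooled over all samples and labels
pp = sum(row.count(1) for row in prediction)
# TP: positions where the label and the prediction are both 1
tp = sum(
    t == 1 and p == 1
    for label_row, pred_row in zip(label, prediction)
    for t, p in zip(label_row, pred_row)
)
print(tp / pp)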

Implementation using sklearn:

import json

import numpy as np
from sklearn.metrics import precision_score

label=json.load(open('data/cls/AAPD/label.json'))
prediction=json.load(open('data/cls/AAPD/prediction.json'))

print(precision_score(np.array(label), np.array(prediction), average='micro'))

Output: 0.8247272727272728

3.2 Macro-P

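Macro-P computes, for each label, the proportion of that label's positive predictions that are correct, then averages the per-label P values over all labels. If a label has no positive predictions, sklearn by default sets its P value to 0 and emits a warning; the built-in implementation in this article handles the case the same way.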
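The per-label averaging, including the zero-prediction case, can be sketched on a toy example. The names `per_label_precision` and `macro_precision` and the data are hypothetical; rows are assumed to be lists of 0/1 values, and a label with no positive predictions is assigned 0 as described above.

```python
def per_label_precision(labels, predictions):
    """Precision of each label column; 0.0 where a label is never predicted."""
    result = []
    # zip(*rows) transposes the sample-major lists into per-label columns
    for true_col, pred_col in zip(zip(*labels), zip(*predictions)):
        pp = sum(1 for p in pred_col if p == 1)
        tp = sum(1 for t, p in zip(true_col, pred_col) if t == 1 and p == 1)
        result.append(tp / pp if pp else 0.0)
    return result


def macro_precision(labels, predictions):
    """Unweighted mean of the per-label precisions."""
    scores = per_label_precision(labels, predictions)
    return sum(scores) / len(scores)


# Hypothetical toy data: the third label is never predicted positive
labels = [[1, 0, 1], [0, 1, 0], [1, 1, 0]]
predictions = [[1, 0, 0], [1, 1, 0], [1, 0, 0]]
print(per_label_precision(labels, predictions))
print(macro_precision(labels, predictions))
```

The three labels score 2/3, 1 and 0, so the macro average is 5/9, while pooling the same data gives 3/4. Every label counts equally here, so labels the model never predicts pull the macro value down.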

Implementation using Python built-ins:

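import json
from statistics import mean

with open('data/cls/AAPD/label.json') as f:
    label = json.load(f)
with open('data/cls/AAPD/prediction.json') as f:
    prediction = json.load(f)

# Per-label precision; a label with no positive predictions keeps the default 0
p_list = [0 for _ in range(len(label[0]))]
for label_index in range(len(label[0])):
    l = [x[label_index] for x in label]       # gold column for this label
    p = [x[label_index] for x in prediction]  # predicted column for this label
    if p.count(1) == 0:
        print('Label at index ' + str(label_index) + ' has no positive predictions!')
    else:
        tp = [l[x] == 1 and p[x] == 1 for x in range(len(l))].count(True)
        p_list[label_index] = tp / p.count(1)
print(mean(p_list))

Output:

Label at index 26 has no positive predictions!
Label at index 28 has no positive predictions!
Label at index 30 has no positive predictions!
Label at index 32 has no positive predictions!
Label at index 35 has no positive predictions!
Label at index 36 has no positive predictions!
Label at index 37 has no positive predictions!
Label at index 41 has no positive predictions!
Label at index 42 has no positive predictions!
Label at index 44 has no positive predictions!
Label at index 45 has no positive predictions!
Label at index 46 has no positive predictions!
Label at index 47 has no positive predictions!
Label at index 48 has no positive predictions!
Label at index 49 has no positive predictions!
Label at index 50 has no positive predictions!
Label at index 51 has no positive predictions!
Label at index 52 has no positive predictions!
Label at index 53 has no positive predictions!
0.4440190824913562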

Implementation using sklearn:

import json
from sklearn.metrics import precision_score

label=json.load(open('data/cls/AAPD/label.json'))
prediction=json.load(open('data/cls/AAPD/prediction.json'))

print(precision_score(label,prediction, average='macro'))

Output:

env_path/lib/python3.8/site-packages/sklearn/metrics/_classification.py:1327: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 in labels with no predicted samples. Use `zero_division` parameter to control this behavior.
  _warn_prf(average, modifier, msg_start, len(result))
0.4440190824913562
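The warning above points to the `zero_division` parameter of `precision_score`. As a minimal sketch on hypothetical toy data (not the AAPD files), passing it explicitly fixes the value used for labels without positive predictions and avoids the warning:

```python
import numpy as np
from sklearn.metrics import precision_score

# Hypothetical toy data: the third label is never predicted positive
y_true = np.array([[1, 0, 1], [0, 1, 0], [1, 1, 0]])
y_pred = np.array([[1, 0, 0], [1, 1, 0], [1, 0, 0]])

# zero_division=0 scores the undefined label as 0, matching the default result
macro_zero = precision_score(y_true, y_pred, average='macro', zero_division=0)
# zero_division=1 scores the undefined label as 1 instead
macro_one = precision_score(y_true, y_pred, average='macro', zero_division=1)
print(macro_zero, macro_one)
```

On this toy data the per-label precisions are 2/3, 1 and undefined, so the macro value is 5/9 with `zero_division=0` and 8/9 with `zero_division=1`. Which value is appropriate depends on how labels the model never predicts should count in the evaluation.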
