Keeping the model with the highest f1-score in Keras (per epoch)

Keras is a very handy deep learning framework: simple and easy to pick up, much like sklearn.

However, its high level of abstraction makes customizing certain aspects of a model less convenient than in PyTorch.

Today I'm writing up a problem I ran into while using it, and how I solved it.

How to save the model with the highest f1-score on the validation data

The metrics that Keras supports natively do not include f1-score, yet f1-score is an important evaluation metric for classification problems.

I once came across an answer on Stack Overflow, "How to calculate F1 Macro in Keras?"

I think that answer misleads a lot of people. Here is how it approaches the problem:

from keras import backend as K

def f1(y_true, y_pred):
    def recall(y_true, y_pred):
        """Recall metric.

        Only computes a batch-wise average of recall.

        Computes the recall, a metric for multi-label classification of
        how many relevant items are selected.
        """
        true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
        possible_positives = K.sum(K.round(K.clip(y_true, 0, 1)))
        recall = true_positives / (possible_positives + K.epsilon())
        return recall

    def precision(y_true, y_pred):
        """Precision metric.

        Only computes a batch-wise average of precision.

        Computes the precision, a metric for multi-label classification of
        how many selected items are relevant.
        """
        true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
        predicted_positives = K.sum(K.round(K.clip(y_pred, 0, 1)))
        precision = true_positives / (predicted_positives + K.epsilon())
        return precision
    precision = precision(y_true, y_pred)
    recall = recall(y_true, y_pred)
    return 2*((precision*recall)/(precision+recall+K.epsilon()))


model.compile(loss='binary_crossentropy',
              optimizer='adam',
              metrics=[f1])

Then "val_f1" is used as the monitor of EarlyStopping:

from keras.callbacks import EarlyStopping
early = EarlyStopping(monitor="val_f1", mode="max", patience=4)

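It looks as if this solves the problem of getting the model with the best val_f1, but the f1 here is not actually what we want. As the docstrings in the code say, it "Only computes a batch-wise average of recall." (and likewise of precision). An F1 computed per batch and then averaged can differ a lot from an F1 computed once per epoch over the whole validation set.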
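To make the difference concrete, here is a minimal sketch in plain Python. The labels and predictions are made-up toy data, the helper name f1_from_labels is hypothetical, and the two batches are assumed to be the same size so that the batch-wise average is a simple mean.

```python
def f1_from_labels(y_true, y_pred):
    """F1 for binary 0/1 labels, computed from TP, FP and FN counts."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy validation set split into two batches of (y_true, y_pred).
batches = [
    ([1, 1, 1, 1], [1, 1, 1, 1]),  # every prediction is correct
    ([1, 0, 0, 0], [0, 1, 1, 1]),  # every prediction is wrong
]

# Batch-wise: compute F1 per batch, then average the batch values.
batch_f1s = [f1_from_labels(t, p) for t, p in batches]
mean_batch_f1 = sum(batch_f1s) / len(batch_f1s)

# Epoch-wise: pool all samples first, then compute F1 once.
all_true = [t for ts, _ in batches for t in ts]
all_pred = [p for _, ps in batches for p in ps]
epoch_f1 = f1_from_labels(all_true, all_pred)

print(mean_batch_f1, epoch_f1)
```

In this toy case the two batches score 1.0 and 0.0, so the batch-wise average is 0.5, while the F1 over all eight samples is 8/12, roughly 0.667.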

Solution

The solution is simple. Keras provides a powerful Callback class, and subclassing Callback is enough to meet the requirement above, as follows.

Taking a binary classification problem as an example:

from sklearn.metrics import f1_score, recall_score, precision_score
from keras.callbacks import Callback

def boolMap(arr):
    # Threshold one predicted probability at 0.5 to get a 0/1 label.
    if arr > 0.5:
        return 1
    else:
        return 0


class Metrics(Callback):
    """Compute F1, recall and precision on the whole validation set after
    each epoch, and save the weights whenever F1 improves."""

    def __init__(self, filepath):
        super().__init__()
        self.file_path = filepath

    def on_train_begin(self, logs=None):
        self.val_f1s = []
        self.best_val_f1 = 0
        self.val_recalls = []
        self.val_precisions = []

    def on_epoch_end(self, epoch, logs=None):
        # Assumes a Keras version that populates self.validation_data, and a
        # model with two inputs: items 0 and 1 are the inputs, item 2 the target.
        val_predict = list(map(boolMap, self.model.predict([self.validation_data[0], self.validation_data[1]])))
        val_targ = self.validation_data[2]
        # Scores over the full validation set, not a per-batch average.
        _val_f1 = f1_score(val_targ, val_predict)
        _val_recall = recall_score(val_targ, val_predict)
        _val_precision = precision_score(val_targ, val_predict)
        self.val_f1s.append(_val_f1)
        self.val_recalls.append(_val_recall)
        self.val_precisions.append(_val_precision)
        print(_val_f1, _val_precision, _val_recall)
        print("max f1")
        print(max(self.val_f1s))
        # Overwrite the saved weights only when this epoch beats the best F1 so far.
        if _val_f1 > self.best_val_f1:
            self.model.save_weights(self.file_path, overwrite=True)
            self.best_val_f1 = _val_f1
            print("best f1: {}".format(self.best_val_f1))
        else:
            print("val f1: {}, but not the best f1".format(_val_f1))
        return
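As a side note, the per-row thresholding done by boolMap can also be written as one vectorized NumPy expression. The sketch below uses made-up probabilities shaped like the (n_samples, 1) output of a single sigmoid unit; the array values and the name bool_map are assumptions for illustration only.

```python
import numpy as np

# Hypothetical sigmoid outputs, one row per validation sample.
probs = np.array([[0.91], [0.12], [0.50], [0.73]])


def bool_map(arr):
    # Same rule as boolMap above: strictly greater than 0.5 maps to 1.
    return 1 if arr > 0.5 else 0


# Row-by-row thresholding, as in the callback.
per_row = list(map(bool_map, probs))

# Vectorized equivalent: flatten to 1-D, compare, cast booleans to ints.
vectorized = (probs.ravel() > 0.5).astype(int)

print(per_row, vectorized.tolist())
```

Both forms give the same labels; note that a probability of exactly 0.5 maps to 0 under this rule.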

Using it is simple too:

metrics = Metrics(file_path)
callbacks_list = [metrics]

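Then pass callbacks_list to the model's training call, for example model.fit(..., callbacks=callbacks_list). Because the callback stores weights only, via save_weights, the best weights can be restored afterwards with model.load_weights(file_path).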