Multi-GPU training with Keras on the TensorFlow 1 and TensorFlow 2 backends

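I used to run Keras 2.3.1 on a TensorFlow 1.14 backend. Today I switched to TensorFlow 2.2, and because of the version difference something went wrong: on the server, nvidia-smi showed that although the program was written to use multiple GPUs, only one GPU was actually doing any work.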
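The nvidia-smi check can be made repeatable by parsing its CSV output instead of reading the table by eye. The following is only a minimal sketch, not part of the original setup: the helper names `parse_gpu_utilization`, `idle_gpus` and `query_gpu_utilization`, and the 5% idle threshold, are assumptions made for illustration. The `--query-gpu` and `--format` flags are standard nvidia-smi options. The parser assumes every GPU reports a numeric utilization.

```python
import subprocess

# Standard nvidia-smi query: prints one "index, utilization" line per GPU.
QUERY = ["nvidia-smi", "--query-gpu=index,utilization.gpu",
         "--format=csv,noheader,nounits"]

def parse_gpu_utilization(csv_text):
    """Turn 'index, utilization' lines into {gpu_index: utilization_percent}."""
    usage = {}
    for line in csv_text.strip().splitlines():
        if not line.strip():
            continue
        # Assumes numeric fields; a GPU reporting "[N/A]" would raise ValueError.
        index, util = [field.strip() for field in line.split(",")]
        usage[int(index)] = int(util)
    return usage

def idle_gpus(usage, threshold=5):
    """Return the GPU indices whose utilization is below threshold (hypothetical cutoff)."""
    return sorted(i for i, u in usage.items() if u < threshold)

def query_gpu_utilization():
    """Run nvidia-smi and parse its output; needs an NVIDIA driver on the machine."""
    out = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
    return parse_gpu_utilization(out)
```

Calling `idle_gpus(query_gpu_utilization())` a few times while training is running should return an empty list if every visible GPU is busy. A GPU that keeps showing up as idle is a hint that the multi-GPU setup is not taking effect.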

After some digging, it turned out that single-machine multi-GPU training is set up differently in each version.

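# TensorFlow 1.14.0 (Keras 2.3.1)
import os

# TF_KERAS is kept from the original setup; some third-party libraries
# (e.g. bert4keras) read it to choose tf.keras. Drop it if nothing in your stack uses it.
os.environ["TF_KERAS"] = "1"
# CUDA_VISIBLE_DEVICES selects which GPUs the process may use; set it before importing keras
os.environ["CUDA_VISIBLE_DEVICES"] = "0,1"

import keras
from keras.utils import multi_gpu_model

# input and output are the model's input and output tensors, defined elsewhere
model = keras.models.Model(input, output)
multi_model = multi_gpu_model(model, gpus=2)
# Compile (and later fit) the multi-GPU wrapper, since that is the model that gets trained
multi_model.compile(
    loss=...,
    optimizer=...,
    metrics=[...]
)
# Saving, option 1: if the weights will only be reloaded in this same multi-GPU
# environment, multi_model can be saved directly
multi_model.save_weights('../finetune_model/best_model.weights')
# Saving, option 2: for later inference on CPU or on a single GPU, save model instead
model.save_weights('../finetune_model/best_model.weights')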

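# TensorFlow 2.2.0
import tensorflow as tf
from tensorflow import keras  # assumes tf.keras, which is what tf.distribute works with

# Select the GPUs through the devices argument
strategy = tf.distribute.MirroredStrategy(devices=["/gpu:0", "/gpu:1"])
# Build and compile inside the scope so the model's variables are mirrored on every GPU
with strategy.scope():
    # input and output are the model's input and output tensors, defined elsewhere
    model = keras.models.Model(input, output)
    # model.summary()
    model.compile(
        loss=...,
        optimizer=...,
        metrics=[...]
    )
# There is no wrapper model here, so the weights are saved from model directly
model.save_weights('../finetune_model/best_model.weights')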
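Two details are easy to get wrong in either version. First, once CUDA_VISIBLE_DEVICES is set, CUDA renumbers the visible GPUs from 0, so the device strings passed to TensorFlow count from `/gpu:0` regardless of the physical indices. Second, with both multi_gpu_model and MirroredStrategy the batch size given to `fit` is the global batch, which is then split across the GPUs. The sketch below illustrates both points. The helper names `logical_gpu_devices` and `global_batch_size` are hypothetical, and the code assumes every entry in CUDA_VISIBLE_DEVICES is a valid device.

```python
import os

def logical_gpu_devices(environ=None):
    """Map CUDA_VISIBLE_DEVICES to the device strings TensorFlow will see.

    Returns None when the variable is unset, meaning no mask is applied.
    """
    environ = os.environ if environ is None else environ
    raw = environ.get("CUDA_VISIBLE_DEVICES")
    if raw is None:
        return None
    # Visible GPUs are renumbered from 0, so only the count matters here.
    count = len([entry for entry in raw.split(",") if entry.strip()])
    return ["/gpu:%d" % i for i in range(count)]

def global_batch_size(per_gpu_batch, num_gpus):
    """Batch size to pass to fit() so that each GPU receives per_gpu_batch samples."""
    return per_gpu_batch * max(num_gpus, 1)
```

With `CUDA_VISIBLE_DEVICES="2,3"`, `logical_gpu_devices()` yields `["/gpu:0", "/gpu:1"]`, which matches the `devices` list used for MirroredStrategy. In TensorFlow 2, `strategy.num_replicas_in_sync` reports how many replicas were actually created and can be passed as `num_gpus`.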
