MobileNet + TensorRT: A Blazing-Fast OpenPose Implementation

Original link: github.com

1. Motivation

OpenPose from CMU provides real-time 2D pose estimation, following the paper "Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields". However, its training code is based on Caffe and C++, which makes it hard to customize. In practice, developers need to customize their training sets and data augmentation methods to fit their requirements. For this reason, we reimplemented the project in TensorLayer fashion.

🚀 This repo will soon be moved into the example folder of TensorLayer for life-cycle management. More cool computer vision applications, such as super resolution and style transfer, can be found in this organization.

2. Project files

  • config.py: configuration of the training details.
    • set the training mode: datasetapi (single GPU, default), distributed (multi-GPU, TODO), placeholder (slow, for debugging only)
  • models.py: defines the model structures.
  • utils.py: utility functions.
  • train.py: trains the model.
  • TODO
    • Provide pretrained models
    • TensorRT Float16 and Int8 inference
    • Faster C++ post-processing
    • Distributed training
    • Faster data augmentation
    • Pose Proposal Networks, ECCV 2018
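
The three training modes listed under config.py above could be validated with a small helper like the one below. This is only a sketch: the names TRAIN_MODES, DEFAULT_TRAIN_MODE and resolve_train_mode are hypothetical and are not taken from the repo's config.py; only the mode strings and their descriptions come from the list above.

```python
# Hypothetical sketch: these names are illustrative, not the repo's actual config.py.
TRAIN_MODES = {
    "datasetapi": "single GPU (default)",
    "distributed": "multi-GPU (listed as TODO)",
    "placeholder": "slow, for debugging only",
}
DEFAULT_TRAIN_MODE = "datasetapi"


def resolve_train_mode(mode=None):
    """Return a validated training mode, falling back to the default."""
    if mode is None:
        return DEFAULT_TRAIN_MODE
    if mode not in TRAIN_MODES:
        raise ValueError(
            f"unknown train mode {mode!r}; expected one of {sorted(TRAIN_MODES)}"
        )
    return mode
```

Failing early on an unknown mode string keeps a typo in the config from silently selecting the wrong training path.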

3. Preparation

Build the C++ library for post-processing. See: github.com/ildoonet/tf…

cd inference/pafprocess
make

# ** before recompiling **
rm -rf build
rm *.so

4. Use pre-trained model

In this project, input images are RGB with values in the range 0~1. Run train.py; it will automatically download the default VGG19-based model from here and use it for inference. The performance of the pre-trained model is as follows:
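
To illustrate the input convention (RGB with values in 0~1), here is a minimal NumPy sketch. It assumes the image arrives as an HxWx3 uint8 array in BGR channel order, as OpenCV's cv2.imread would return it. That starting point is an assumption, and the helper name to_rgb_unit_range is hypothetical rather than a function from this project.

```python
import numpy as np


def to_rgb_unit_range(image_bgr_uint8):
    """Convert an HxWx3 uint8 BGR image to float32 RGB scaled to [0, 1]."""
    image = np.asarray(image_bgr_uint8)
    if image.ndim != 3 or image.shape[2] != 3:
        raise ValueError("expected an HxWx3 image")
    rgb = image[:, :, ::-1]  # reverse the channel axis: BGR -> RGB
    return rgb.astype(np.float32) / 255.0  # scale 0..255 down to 0..1


# Example: a 1x2 image holding a pure-blue and a pure-white pixel (BGR order).
sample = np.array([[[255, 0, 0], [255, 255, 255]]], dtype=np.uint8)
converted = to_rgb_unit_range(sample)
```

If the image loader already returns RGB (for example, PIL), the channel reversal should be dropped and only the scaling kept.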

5. Train a model

Run train.py; it will automatically download the MSCOCO 2017 dataset into dataset/coco17. The default model in models.py is based on VGG19, the same as in the original paper. To customize the model, simply change it in models.py; train.py will then train the model to completion.

6. Discussion