1. Directory structure

Path: /home/models/research/object_detection

object_detection
├── tai30
└── tai40  # project name
    ├── data
    │   ├── VOCdevkit2007  # project data
    │   │   └── VOC2007
    │   │       ├── Annotations
    │   │       ├── ImageSets
    │   │       ├── JPEGImages
    │   │       └── readme.txt
    │   ├── pascal_label_map.pbtxt
    │   ├── pascal_train.record  # generated TFRecord files
    │   └── pascal_val.record
    ├── ssd_mobilenet  # model/algorithm name
    │   ├── eval_logs  # logs produced by evaluation
    │   │   ├── ...
    │   │   └── events.out.tfevents....
    │   ├── logs
    │   │   ├── ...
    │   │   └── train_....txt
    │   ├── output  # exported trained models are saved here
    │   ├── ssd_mobilenet_v1_coco_2017_11_17  # pre-trained model
    │   │   ├── graph.pbtxt
    │   │   ├── frozen_inference_graph.pb
    │   │   ├── model.ckpt.data-00000-of-00001
    │   │   ├── model.ckpt.index
    │   │   └── model.ckpt.meta
    │   ├── train_logs  # checkpoints/logs produced by training
    │   │   ├── checkpoint
    │   │   ├── ...
    │   │   ├── model.ckpt-0.data-00000-of-00001
    │   │   ├── model.ckpt-0.index
    │   │   └── model.ckpt-0.meta
    │   ├── eval.sh
    │   ├── ssd_mobilenet_v1_coco.config
    │   └── train.sh
    ├── create_pascal_tf_record.py
    ├── eval.py
    ├── infer.py
    └── test.py

2. Prepare the data

The project reads data in TFRecord format. Use the script create_pascal_tf_record.py to convert a Pascal VOC-format dataset into TFRecord files.

2.1. Edit the number of classes

Open the pascal_label_map.pbtxt file and adjust the class entries; currently there is only one class:

item {
  id: 1
  name: 'object'
}
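If the dataset has more than one class, further item blocks are appended with consecutive ids; the second class name below is a hypothetical example, not part of this project:

```
item {
  id: 1
  name: 'object'
}
item {
  id: 2
  name: 'defect'
}
```

The id values must start at 1 (0 is reserved for background) and the number of entries must match num_classes in the pipeline config.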

2.2. Generate TFRecords with the script

The data directory layout is:

└── VOCdevkit2007
    └── VOC2007
        ├── Annotations
        ├── ImageSets
        ├── JPEGImages
        └── readme.txt

The data_dir argument points to the VOCdevkit2007 directory, which contains VOC2007; inside VOC2007, JPEGImages holds the images and Annotations holds the XML files with class labels and bounding-box coordinates.

The ImageSets/Main directory contains four files: test.txt is the test set, train.txt the training set, val.txt the validation set, and trainval.txt the combined training and validation set.

The set argument selects which split to use; here we use both train and val.
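If the split files do not exist yet, they can be generated from the images in JPEGImages. A minimal sketch, assuming an 80/20 train/val split (the helper name and ratio are ours, not part of the official tooling):

```python
import os
import random

def write_splits(voc_root, val_ratio=0.2, seed=0):
    """Write ImageSets/Main/{train,val,trainval}.txt from the image
    ids found in JPEGImages. Returns the size of each split."""
    jpeg_dir = os.path.join(voc_root, 'JPEGImages')
    ids = sorted(os.path.splitext(f)[0] for f in os.listdir(jpeg_dir))
    random.Random(seed).shuffle(ids)
    n_val = int(len(ids) * val_ratio)
    splits = {'val': ids[:n_val], 'train': ids[n_val:], 'trainval': ids}
    out_dir = os.path.join(voc_root, 'ImageSets', 'Main')
    os.makedirs(out_dir, exist_ok=True)
    for name, id_list in splits.items():
        with open(os.path.join(out_dir, name + '.txt'), 'w') as f:
            f.write('\n'.join(sorted(id_list)) + '\n')
    return {k: len(v) for k, v in splits.items()}
```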

python create_pascal_tf_record.py \
    --label_map_path=pascal_label_map.pbtxt \
    --data_dir=/home/models/research/object_detection/tai40/data/VOCdevkit2007 \
    --year=VOC2007 --set=train \
    --output_path=pascal_train.record

python create_pascal_tf_record.py \
    --label_map_path=pascal_label_map.pbtxt \
    --data_dir=/home/models/research/object_detection/tai40/data/VOCdevkit2007 \
    --year=VOC2007 --set=val \
    --output_path=pascal_val.record
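A quick sanity check on the generated files is to count the records they contain. TFRecord framing is simple (8-byte little-endian payload length, 4-byte length CRC, payload, 4-byte payload CRC), so the count can be read without TensorFlow; this dependency-free sketch skips CRC verification:

```python
import struct

def count_tfrecords(path):
    """Count records in a TFRecord file by walking its framing:
    8-byte length, 4-byte length CRC, payload, 4-byte payload CRC."""
    n = 0
    with open(path, 'rb') as f:
        while True:
            header = f.read(8)
            if len(header) < 8:
                break
            length, = struct.unpack('<Q', header)
            f.seek(4 + length + 4, 1)  # skip length CRC, payload, payload CRC
            n += 1
    return n
```

The count should match the number of ids in the corresponding train.txt or val.txt.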

3. Download a pre-trained model

To speed up training we fine-tune from a pre-trained model. Several official pre-trained models are listed at:

https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/detection_model_zoo.md

Find the model you need and copy its download link.
Here we use ssd_mobilenet_v1_coco as an example.

# download
wget http://download.tensorflow.org/models/object_detection/ssd_mobilenet_v1_coco_2017_11_17.tar.gz
# extract
tar zxf ssd_mobilenet_v1_coco_2017_11_17.tar.gz

4. Edit the pipeline config file

4.1. Copy the pipeline config

Copy the file ssd_mobilenet_v1_coco.config from the directory models/research/object_detection/samples/configs into the ssd_mobilenet directory.

Open the ssd_mobilenet_v1_coco.config file and change the following settings:

4.2. Change the number of classes

num_classes: 1

4.3. Point to the pre-trained model

fine_tune_checkpoint: "ssd_mobilenet_v1_coco_2017_11_17/model.ckpt"
from_detection_checkpoint: true

4.4. Set the number of training steps

num_steps: 200000

4.5. Set the training data paths

train_input_reader: {
  tf_record_input_reader {
    # path to the training TFRecord
    input_path: "/home/models/research/object_detection/tai40/data/pascal_train.record"
  }
  # path to the label map
  label_map_path: "/home/models/research/object_detection/tai40/data/pascal_label_map.pbtxt"
}

4.6. Set the evaluation data paths

eval_input_reader: {
  tf_record_input_reader {
    # path to the validation TFRecord
    input_path: "/home/models/research/object_detection/tai40/data/pascal_val.record"
  }
  # path to the label map
  label_map_path: "/home/models/research/object_detection/tai40/data/pascal_label_map.pbtxt"
  shuffle: false
  num_readers: 1
}
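A common failure mode at launch time is a mistyped path in the config. Before starting training it can help to verify that every file the config references exists on disk; a minimal sketch (the helper name is ours, not part of the API):

```python
import os
import re

def check_config_paths(config_path):
    """Return the list of paths referenced by a pipeline config that
    do not exist on disk. A fine_tune_checkpoint is a prefix
    (model.ckpt.*), so its .index file is tested instead."""
    with open(config_path) as f:
        text = f.read()
    missing = []
    for m in re.finditer(
            r'(fine_tune_checkpoint|input_path|label_map_path):\s*"([^"]+)"',
            text):
        key, path = m.groups()
        if key == 'fine_tune_checkpoint':
            if not os.path.exists(path + '.index'):
                missing.append(path)
        elif not os.path.exists(path):
            missing.append(path)
    return missing
```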

5. Start training

In the directory /home/models/research/object_detection/tai40/ssd_mobilenet,
create a train.sh script with the following content:

mkdir -p logs/
now=$(date +"%Y%m%d_%H%M%S")
python /home/models/research/object_detection/train.py \
    --logtostderr \
    --pipeline_config_path=ssd_mobilenet_v1_coco.config \
    --train_dir=train_logs 2>&1 | tee logs/train_$now.txt &

To train, enter object_detection/tai40/ssd_mobilenet/ and run ./train.sh.

6. Evaluation

You can evaluate while training is still running, but take care to use a different GPU or to allocate GPU memory sensibly.

In /home/models/research/object_detection/tai40/ssd_mobilenet,
create eval.sh with the following content:

mkdir -p eval_logs
python /home/models/research/object_detection/eval.py \
    --logtostderr \
    --pipeline_config_path=ssd_mobilenet_v1_coco.config \
    --checkpoint_dir=train_logs \
    --eval_dir=eval_logs &

To evaluate, enter object_detection/tai40/ssd_mobilenet/ and run CUDA_VISIBLE_DEVICES="1" ./eval.sh (this pins evaluation to the second GPU).

7. Visualize the logs

To visualize the training and evaluation logs together, run the following in the tensorflow/models/object_detection/tai40/ssd_mobilenet/ directory:

tensorboard --logdir . --port 6006

Then open http://localhost:6006/ to watch the loss and other metrics evolve.

8. Export the model

When training finishes, a set of checkpoint files is left in tensorflow/models/object_detection/tai40/ssd_mobilenet/train_logs/, e.g.:

graph.pbtxt
model.ckpt-200000.data-00000-of-00001
model.ckpt-200000.index
model.ckpt-200000.meta

The .meta file stores the graph and metadata, while the ckpt data files store the network weights.

Inference only needs the model and its weights, not the metadata, so we can use the official script to generate a frozen inference graph.

mkdir -p output

python /home/models/research/object_detection/export_inference_graph.py \
--input_type image_tensor \
--pipeline_config_path ssd_mobilenet_v1_coco.config \
--trained_checkpoint_prefix train_logs/model.ckpt-200000  \
--output_directory output/graph_200000
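The value passed as --trained_checkpoint_prefix can also be read from the train_logs/checkpoint state file, which is a small text proto. A dependency-free sketch (the helper name is ours):

```python
import re

def latest_checkpoint(checkpoint_file):
    """Return the model_checkpoint_path recorded in a TensorFlow
    'checkpoint' state file, or None if it is not present."""
    with open(checkpoint_file) as f:
        text = f.read()
    m = re.search(r'^model_checkpoint_path:\s*"([^"]+)"', text, re.M)
    return m.group(1) if m else None
```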

9. Test on an image

# coding: utf-8
import os,sys,cv2, time
import numpy as np
import tensorflow as tf
sys.path.append("..")
from utils import label_map_util
from utils import visualization_utils as vis_util
from matplotlib import pyplot as plt
from PIL import Image

class Detector(object):
    def __init__(self):
        # Path to frozen detection graph. This is the actual model that is used for the object detection.
        self.PATH_TO_CKPT = '/home/models/research/object_detection/tai40/ssd_mobilenet/output/graph_200000/frozen_inference_graph.pb'
        # List of the strings that is used to add correct label for each box.
        self.PATH_TO_LABELS = '/home/models/research/object_detection/tai40/data/pascal_label_map.pbtxt'
        self.NUM_CLASSES = 1

        self.detection_graph = self._load_model()
        self.category_index = self._load_label_map()

    def _load_model(self):
        print('_load_model')
        detection_graph = tf.Graph()
        with detection_graph.as_default():
            od_graph_def = tf.GraphDef()
            with tf.gfile.GFile(self.PATH_TO_CKPT, 'rb') as fid:
                serialized_graph = fid.read()
                od_graph_def.ParseFromString(serialized_graph)
                tf.import_graph_def(od_graph_def, name='')
        return detection_graph

    def _load_label_map(self):
        print('_load_label_map')
        label_map = label_map_util.load_labelmap(self.PATH_TO_LABELS)
        categories = label_map_util.convert_label_map_to_categories(label_map, max_num_classes=self.NUM_CLASSES,
                                                                    use_display_name=True)
        category_index = label_map_util.create_category_index(categories)
        return category_index

    def run_inference_for_single_image(self, image_np):
        with self.detection_graph.as_default():
            with tf.Session() as sess:
                # Get handles to input and output tensors
                ops = tf.get_default_graph().get_operations()
                all_tensor_names = {output.name for op in ops for output in op.outputs}
                tensor_dict = {}
                for key in [
                    'num_detections', 'detection_boxes', 'detection_scores',
                    'detection_classes'
                ]:
                    tensor_name = key + ':0'
                    if tensor_name in all_tensor_names:
                        tensor_dict[key] = tf.get_default_graph().get_tensor_by_name(
                            tensor_name)

                image_tensor = tf.get_default_graph().get_tensor_by_name('image_tensor:0')
                # Expand dimensions since the model expects images to have shape: [1, None, None, 3]
                image_np_expanded = np.expand_dims(image_np, axis=0)
                # Run inference
                start_time = time.time()
                output_dict = sess.run(tensor_dict,feed_dict={image_tensor: image_np_expanded})
                end_time = time.time()
                # all outputs are float32 numpy arrays, so convert types as appropriate
                output_dict['num_detections'] = int(output_dict['num_detections'][0])
                output_dict['detection_classes'] = output_dict[
                    'detection_classes'][0].astype(np.uint8)
                output_dict['detection_boxes'] = output_dict['detection_boxes'][0]
                output_dict['detection_scores'] = output_dict['detection_scores'][0]
                output_dict['use_time'] = end_time - start_time
        return output_dict

def show_detecte_image(path):
    image = Image.open(path).convert("RGB")
    # the array based representation of the image will be used later in order to prepare the
    # result image with boxes and labels on it.
    (im_width, im_height) = image.size
    image_np = np.array(image.getdata()).reshape((im_height, im_width, 3)).astype(np.uint8)
    # Actual detection.
    detector = Detector()
    output_dict = detector.run_inference_for_single_image(image_np)
    # Visualization of the results of a detection.
    vis_util.visualize_boxes_and_labels_on_image_array(
        image_np,
        output_dict['detection_boxes'],
        output_dict['detection_classes'],
        output_dict['detection_scores'],
        detector.category_index,
        use_normalized_coordinates=True,
        min_score_thresh=.5,
        line_thickness=4)
    plt.figure(figsize=(12, 8))
    plt.imshow(image_np)
    plt.show()
    print('Image:{}  Num: {}  Time: {:.3f}s'.format(path, output_dict['num_detections'], output_dict['use_time']))


image_path = '/home/models/research/object_detection/tai40/data/VOCdevkit2007/VOC2007/JPEGImages/000001.bmp'
show_detecte_image(image_path)

