Yolo3 training notes

Yolo3 custom training notes

To train a model, the YOLO training code expects the following:
* Images
* Labels
* NAMES file
* CFG file
* train.txt file
* test.txt file
* DATA file
* Pretrained weights (optional)

Images and Labels

The images and labels should be located in the same directory. Each image is matched to its label file by filename.
For example, the label file for image 001.jpg should be named 001.txt.
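
The pairing rule above can be checked before training. The following is a minimal sketch, not part of darknet: the function name find_unlabeled_images and the set of image extensions are assumptions made for illustration.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to match the dataset.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def find_unlabeled_images(directory):
    """Return the sorted names of images in `directory` with no matching .txt label."""
    missing = []
    for path in sorted(Path(directory).iterdir()):
        if path.suffix.lower() in IMAGE_EXTENSIONS:
            # 001.jpg is expected to be paired with 001.txt in the same directory.
            if not path.with_suffix(".txt").is_file():
                missing.append(path.name)
    return missing
```

An empty list means every image in the directory has a label file next to it.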

Label format

The label file is a plain text file. Each line describes the bounding box of one object. The columns are separated by spaces, in the following format:

classID x y width height

The x, y, width and height values should be normalized by the image dimensions (x and width by the image width, y and height by the image height), so that each value falls between 0 and 1.
x and y are the coordinates of the center of the bounding box.
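
To make the format concrete, here is a minimal sketch of a parser for one label line. The function name parse_label_line is hypothetical and the range check is an assumption about what counts as a valid label, based only on the description above.

```python
def parse_label_line(line):
    """Parse 'classID x y width height' into (class_id, x, y, w, h)."""
    fields = line.split()
    if len(fields) != 5:
        raise ValueError(f"expected 5 columns, got {len(fields)}: {line!r}")
    class_id = int(fields[0])
    x, y, w, h = (float(v) for v in fields[1:])
    # All four box values are normalized, so they must lie between 0 and 1.
    for name, value in (("x", x), ("y", y), ("width", w), ("height", h)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name}={value} is outside the range 0 to 1")
    return class_id, x, y, w, h
```

For instance, parse_label_line("0 0.5 0.5 0.25 0.4") describes an object of class 0 centered in the image, a quarter of the image wide and 40% of the image tall.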

Yolo includes the following Python helper function to convert a bounding box given in pixels to this format:

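def convert(size, box):
    # size: (image_width, image_height) in pixels
    # box:  (xmin, xmax, ymin, ymax) in pixels
    dw = 1./(size[0])
    dh = 1./(size[1])
    # Center of the box; the -1 presumably accounts for 1-based pixel coordinates
    x = (box[0] + box[1])/2.0 - 1
    y = (box[2] + box[3])/2.0 - 1
    # Width and height of the box in pixels
    w = box[1] - box[0]
    h = box[3] - box[2]
    # Normalize by the image dimensions
    x = x*dw
    w = w*dw
    y = y*dh
    h = h*dh
    return (x,y,w,h)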
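
When inspecting labels it can help to go the other way, from normalized values back to pixels. The sketch below is an illustration and is not included with Yolo; the name yolo_to_pixel_box is hypothetical, and it ignores the 1-pixel offset that convert() applies, so its result can differ from the original box by one pixel.

```python
def yolo_to_pixel_box(size, yolo_box):
    """Map a normalized (x, y, w, h) box back to pixel (xmin, xmax, ymin, ymax).

    `size` is (image_width, image_height), the same convention as convert().
    """
    img_w, img_h = size
    x, y, w, h = yolo_box
    # Scale the normalized center and extent back to pixels.
    cx, cy = x * img_w, y * img_h
    bw, bh = w * img_w, h * img_h
    # Step half the extent from the center in each direction.
    return (cx - bw / 2.0, cx + bw / 2.0, cy - bh / 2.0, cy + bh / 2.0)
```

For a 640x480 image, the label values (0.5, 0.5, 0.25, 0.5) map to a box spanning x from 240 to 400 and y from 120 to 360.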

Names file

This file contains the label string for each class. The first line corresponds to class 0, the second line to class 1, and so on.
For example, given the following contents of classes.names:

classA
classB
classC

This would create the following relationship:

Class ID (labels)   Class identifier
0                   classA
1                   classB
2                   classC
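
The same mapping can be reproduced in code by reading the names file line by line. This is a minimal sketch; the function name load_class_names is hypothetical.

```python
def load_class_names(path):
    """Return the class names from a names file; the list index is the class ID."""
    with open(path, encoding="utf-8") as f:
        # One name per line: line 1 is class 0, line 2 is class 1, and so on.
        return f.read().splitlines()
```

With the classes.names example, load_class_names("classes.names")[1] would be "classB".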

CFG file

This file is a darknet configuration file. To keep the explanation simple, only the settings that must be modified for training are covered here, grouped by what they depend on.

GPU memory available:

[net]
# Training
 batch=64 # Number of images processed in each training iteration; reduce it if the GPU runs out of memory.
 ...

Number of classes

The number of classes should be set in each of the [yolo] sections in the CFG file.

[yolo]
...
classes=NUM_CLASSES # An integer (1, 2, 3, 4, ...) that matches the number of entries in the names file.
...

Number of Filters

In the [convolutional] layer immediately before each [yolo] section, the number of filters should also be updated to match the following formula:

filters=(classes + 5) * 3

For instance, for 3 classes:

[convolutional]
size=1
stride=1
pad=1
filters=24
activation=linear

[yolo]
classes= 3
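
The filter count is easy to get wrong by hand, so a small helper can compute it. This is a sketch: the name yolo_filters and the anchors_per_scale parameter are hypothetical, and the reading of the constants (3 boxes per [yolo] layer, each with 4 coordinates and 1 objectness score) is the usual YOLOv3 interpretation rather than something stated in these notes.

```python
def yolo_filters(num_classes, anchors_per_scale=3):
    """Filters for the [convolutional] layer before each [yolo] section."""
    # Assumed meaning of the 5: 4 box coordinates + 1 objectness score per box.
    return (num_classes + 5) * anchors_per_scale
```

For 3 classes, yolo_filters(3) gives 24, which matches the filters value in the example above.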

train.txt file

This plain text file lists the images that will be used for training. Each line should contain the absolute path to one image.
For example:

/home/test/images/train/001.jpg
/home/test/images/train/002.jpg
/home/test/images/train/003.jpg

test.txt file

Like the train.txt file, this text file contains the paths to the images used for testing, one per line.
For example:

/home/user/dataset/images/test/001.jpg
/home/user/dataset/images/test/002.jpg
/home/user/dataset/images/test/003.jpg
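
Both list files can be generated rather than written by hand. The sketch below makes some assumptions: all .jpg images sit in a single directory and are split at random, whereas the examples above use separate train and test directories. The function name write_image_lists and its parameters are hypothetical.

```python
import random
from pathlib import Path


def write_image_lists(image_dir, train_txt, test_txt, test_fraction=0.1, seed=0):
    """Split the .jpg images in `image_dir` and write one absolute path per line."""
    images = sorted(p.resolve() for p in Path(image_dir).glob("*.jpg"))
    # A fixed seed keeps the split reproducible between runs.
    random.Random(seed).shuffle(images)
    n_test = int(len(images) * test_fraction)
    test, train = images[:n_test], images[n_test:]
    Path(train_txt).write_text("".join(f"{p}\n" for p in train))
    Path(test_txt).write_text("".join(f"{p}\n" for p in test))
    return len(train), len(test)
```

The function returns the number of training and testing images so the split can be checked at a glance.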

DATA file

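This plain text file summarizes the dataset: the number of classes, the paths to the train.txt, test.txt and names files, and the backup directory where darknet saves the weights during training. It uses the following format: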

classes= 20
train  = /home/user/dataset/train.txt
valid  = /home/user/dataset/test.txt
names = /home/user/dataset/classes.names
backup = /home/user/dataset/backup
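
If the dataset is prepared by a script, the DATA file can be written the same way. This is a minimal sketch with a hypothetical function name, write_data_file; it only emits the five keys shown above.

```python
def write_data_file(path, classes, train, valid, names, backup):
    """Write a darknet-style DATA file with the five keys shown above."""
    entries = {
        "classes": classes,  # should match the number of lines in the names file
        "train": train,
        "valid": valid,
        "names": names,
        "backup": backup,
    }
    with open(path, "w", encoding="utf-8") as f:
        for key, value in entries.items():
            f.write(f"{key} = {value}\n")
```

Passing the class count in from the same place that produces the names file helps keep the two consistent.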

Training

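./darknet detector train file.data file.cfg darknet53.conv.74

Here file.data is the DATA file, file.cfg is the CFG file, and darknet53.conv.74 is the optional pretrained weights file.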