How anchors are computed for YOLOv2 - 马育民老师

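# Introduction

The YOLOv2 authors state that the anchors were obtained with k-means clustering: the clustering finds which box widths and heights occur most often in the training set, and using anchors of those shapes speeds up convergence.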

# YOLOv2 vs YOLOv3

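### Anchor size in YOLOv2
The authors define anchor sizes relative to the size of the last feature map.
In other words, the last feature map of the YOLOv2 network is `13x13` (this differs from network to network), so anchor sizes fall in the range (0x0, 13x13].
> For example, a 9x9 anchor corresponds to 288x288 pixels on the input image, because each cell of a 13x13 feature map covers 32x32 pixels of a 416x416 input (9 x 32 = 288).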
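
The conversion in the example above can be sketched as a small helper. This is only an illustration: the function name `anchor_to_pixels` and the default 416x416 input size are assumptions, chosen so that the stride works out to 32 and matches the 9x9 to 288x288 figure.

```python
def anchor_to_pixels(anchor_wh, input_size=416, grid_size=13):
    """Convert an anchor given in feature-map cells to pixels on the network input.

    input_size and grid_size are assumed defaults (416 / 13 = 32 pixels per cell).
    """
    stride = input_size // grid_size  # pixels covered by one feature-map cell
    return anchor_wh[0] * stride, anchor_wh[1] * stride


print(anchor_to_pixels((9, 9)))  # (288, 288)
```

With a different network, pass its own input size and last feature map size instead of the defaults.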

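### Anchor size in YOLOv3

In YOLOv3 the authors switched to defining anchors relative to the **size of the input image**, so anchor sizes fall in the range (0x0, input_w x input_h].

> As a result, the anchor values in the YOLOv2 and YOLOv3 cfg files look clearly different.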
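
To make the difference concrete, the sketch below parses two anchor strings and rescales both to fractions of the input image. The anchor values are quoted as they commonly appear in the darknet `yolov2-voc.cfg` and `yolov3.cfg` files and are meant as an illustration only, so check them against your own cfg. The helper names, the 13x13 grid, and the 416x416 input are assumptions.

```python
def parse_anchors(cfg_value):
    """Turn 'w1,h1, w2,h2, ...' into a list of (w, h) float pairs."""
    nums = [float(v) for v in cfg_value.split(",")]
    return list(zip(nums[0::2], nums[1::2]))


def to_fraction(anchors, scale):
    """Divide every width and height by scale (grid size or input size)."""
    return [(w / scale, h / scale) for w, h in anchors]


# Illustrative values; verify against the cfg file you actually use.
v2_anchors = parse_anchors(
    "1.3221, 1.73145, 3.19275, 4.00944, 5.05587, 8.09892, "
    "9.47112, 4.84053, 11.2364, 10.0071"
)
v3_anchors = parse_anchors(
    "10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326"
)

v2_fraction = to_fraction(v2_anchors, 13)    # YOLOv2: units of feature-map cells
v3_fraction = to_fraction(v3_anchors, 416)   # YOLOv3: units of input pixels

print(v2_fraction[-1])
print(v3_fraction[-1])
```

After rescaling, both sets lie between 0 and 1, which shows that the two cfg files describe the same kind of quantity in different units.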

# Code

The code is adapted from the https://github.com/lars76/kmeans-anchor-boxes project.

### example.py

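##### Change the following constants:
- `ANNOTATIONS_PATH`: path to the directory containing the xml annotation files
- `CLUSTERS`: number of anchors to compute; YOLOv2 uses 5 anchors, so this is 5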

##### Change the factor 13:
```
print("Boxes:\n {}".format(out*13))
```

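- For YOLOv2, anchor sizes are expressed relative to the last feature map. The last feature map of YOLOv2 is 13x13 by default, so the result is multiplied by `13`.
  In practice, set this factor to the size of the last feature map of the network you actually use.

- For YOLOv3, anchor sizes are expressed relative to the input image, so replace `13` with the size of the input image.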
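
The two cases above can be sketched as one helper that scales the normalized k-means output. The function `scale_anchors` and its default sizes (13 and 416) are hypothetical and are not part of the original project; a square input is also assumed.

```python
def scale_anchors(normalized, yolo_version, grid_size=13, input_size=416):
    """Scale normalized (w, h) pairs in [0, 1] to the units each YOLO version expects.

    grid_size and input_size are assumed defaults; adjust them to your network.
    """
    if yolo_version == 2:
        # YOLOv2: units of last-feature-map cells
        return [(w * grid_size, h * grid_size) for w, h in normalized]
    if yolo_version == 3:
        # YOLOv3: units of input-image pixels, rounded to integers
        return [(round(w * input_size), round(h * input_size)) for w, h in normalized]
    raise ValueError("yolo_version must be 2 or 3")


normalized = [(0.25, 0.5), (0.125, 0.125)]
print(scale_anchors(normalized, 2))  # [(3.25, 6.5), (1.625, 1.625)]
print(scale_anchors(normalized, 3))  # [(104, 208), (52, 52)]
```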

```
import glob
import xml.etree.ElementTree as ET

import numpy as np

from kmeans import kmeans, avg_iou

ANNOTATIONS_PATH = "/Users/mym/Desktop/ai/yolov2-tf2/data/train/annotation"
CLUSTERS = 5

def load_dataset(path):
    """Collect the normalized (width, height) of every box in the xml files."""
    dataset = []
    for xml_file in glob.glob("{}/*xml".format(path)):
        tree = ET.parse(xml_file)

        # Image size, used to normalize box coordinates to [0, 1]
        height = int(tree.findtext("./size/height"))
        width = int(tree.findtext("./size/width"))

        for obj in tree.iter("object"):
            xmin = int(obj.findtext("bndbox/xmin")) / width
            ymin = int(obj.findtext("bndbox/ymin")) / height
            xmax = int(obj.findtext("bndbox/xmax")) / width
            ymax = int(obj.findtext("bndbox/ymax")) / height

            # Only the box width and height matter for anchor clustering
            dataset.append([xmax - xmin, ymax - ymin])

    return np.array(dataset)

data = load_dataset(ANNOTATIONS_PATH)
out = kmeans(data, k=CLUSTERS)
print("Accuracy: {:.2f}%".format(avg_iou(data, out) * 100))
print("Boxes:\n {}".format(out*13))

ratios = np.around(out[:, 0] / out[:, 1], decimals=2).tolist()
print("Ratios:\n {}".format(sorted(ratios)))
```
Running the script produces the following output:
```
Accuracy: 62.08%
Boxes:
 [[0.46875  0.40625 ]
 [4.90625  5.109375]
 [2.59375  3.046875]
 [1.25     1.28125 ]
 [0.8125   0.59375 ]]
Ratios:
 [0.85, 0.96, 0.98, 1.15, 1.37]
```
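
The printed boxes still have to be copied into the cfg file, where darknet-style configs list anchors as a flat `anchors = w1,h1, w2,h2, ...` line. The sketch below formats the boxes from the output above that way. The helper `to_cfg_line` is hypothetical, and sorting by area is only a common convention assumed here, not something the original script does.

```python
# Boxes copied from the output above (units of 13x13 feature-map cells)
boxes = [
    (0.46875, 0.40625),
    (4.90625, 5.109375),
    (2.59375, 3.046875),
    (1.25, 1.28125),
    (0.8125, 0.59375),
]


def to_cfg_line(boxes, decimals=2):
    """Format (w, h) pairs as a single cfg-style anchors line, smallest area first."""
    ordered = sorted(boxes, key=lambda wh: wh[0] * wh[1])
    flat = ", ".join(f"{w:.{decimals}f},{h:.{decimals}f}" for w, h in ordered)
    return f"anchors = {flat}"


print(to_cfg_line(boxes))
# anchors = 0.47,0.41, 0.81,0.59, 1.25,1.28, 2.59,3.05, 4.91,5.11
```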

### kmeans.py
kmeans.py contains the algorithm itself and does not need to be changed.
```
import numpy as np

def iou(box, clusters):
    """
    Calculates the Intersection over Union (IoU) between a box and k clusters.
    :param box: tuple or array, shifted to the origin (i. e. width and height)
    :param clusters: numpy array of shape (k, 2) where k is the number of clusters
    :return: numpy array of shape (k,) where k is the number of clusters
    """
    x = np.minimum(clusters[:, 0], box[0])
    y = np.minimum(clusters[:, 1], box[1])
    if np.count_nonzero(x == 0) > 0 or np.count_nonzero(y == 0) > 0:
        raise ValueError("Box has no area")

    intersection = x * y
    box_area = box[0] * box[1]
    cluster_area = clusters[:, 0] * clusters[:, 1]

    iou_ = intersection / (box_area + cluster_area - intersection)

    return iou_

def avg_iou(boxes, clusters):
    """
    Calculates the average Intersection over Union (IoU) between a numpy array of boxes and k clusters.
    :param boxes: numpy array of shape (r, 2), where r is the number of rows
    :param clusters: numpy array of shape (k, 2) where k is the number of clusters
    :return: average IoU as a single float
    """
    return np.mean([np.max(iou(boxes[i], clusters)) for i in range(boxes.shape[0])])

def translate_boxes(boxes):
    """
    Translates all the boxes to the origin.
    :param boxes: numpy array of shape (r, 4)
    :return: numpy array of shape (r, 2)
    """
    new_boxes = boxes.copy()
    for row in range(new_boxes.shape[0]):
        new_boxes[row][2] = np.abs(new_boxes[row][2] - new_boxes[row][0])
        new_boxes[row][3] = np.abs(new_boxes[row][3] - new_boxes[row][1])
    return np.delete(new_boxes, [0, 1], axis=1)

def kmeans(boxes, k, dist=np.median):
    """
    Calculates k-means clustering with the Intersection over Union (IoU) metric.
    :param boxes: numpy array of shape (r, 2), where r is the number of rows
    :param k: number of clusters
    :param dist: distance function
    :return: numpy array of shape (k, 2)
    """
    rows = boxes.shape[0]

    distances = np.empty((rows, k))
    last_clusters = np.zeros((rows,))

    np.random.seed()

    # the Forgy method will fail if the whole array contains the same rows
    clusters = boxes[np.random.choice(rows, k, replace=False)]

    while True:
        for row in range(rows):
            distances[row] = 1 - iou(boxes[row], clusters)

        nearest_clusters = np.argmin(distances, axis=1)

        if (last_clusters == nearest_clusters).all():
            break

        for cluster in range(k):
            clusters[cluster] = dist(boxes[nearest_clusters == cluster], axis=0)

        last_clusters = nearest_clusters

    return clusters

```
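
The distance used in the listing above is `1 - IoU` between boxes aligned at the origin, so only width and height matter. The pure-Python sketch below illustrates that distance on single boxes. The names `iou_wh` and `nearest_anchor` are hypothetical and are not part of kmeans.py.

```python
def iou_wh(a, b):
    """IoU of two boxes given as (width, height), both anchored at the origin."""
    intersection = min(a[0], b[0]) * min(a[1], b[1])
    union = a[0] * a[1] + b[0] * b[1] - intersection
    return intersection / union


def nearest_anchor(box, anchors):
    """Index of the anchor with the smallest 1 - IoU distance to box."""
    distances = [1 - iou_wh(box, anchor) for anchor in anchors]
    return distances.index(min(distances))


print(iou_wh((2, 2), (4, 4)))                    # 0.25
print(nearest_anchor((3, 3), [(1, 1), (4, 4)]))  # 1
```

Because the score depends on overlap rather than absolute size differences, large and small boxes are treated comparably, which a plain Euclidean distance on width and height would not do. Note also that the default `dist=np.median` in the listing updates each cluster with the median of its members rather than the mean.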

Thanks:
https://blog.csdn.net/m_buddy/article/details/82926024

Original source: http://malaoshi.top/show_1EF5DjeeJRas.html