How anchors are computed in YOLOv2

Author: 马育民 • 2020-03-25 10:43

# Introduction

The YOLOv2 authors state that the anchors are computed with `K-MEANS`: clustering the ground-truth boxes shows which width/height combinations occur most often, and using anchors with those dimensions speeds up convergence.

# YOLOv2 vs YOLOv3

### Anchor size in YOLOv2

The authors define anchor sizes relative to the size of the last feature map.

In the YOLOv2 network the last feature map is `13x13` (other networks produce feature maps of other sizes), so anchor sizes fall in the range (0x0, 13x13].

>If an anchor is 9x9, its actual size on the original image is 288x288, because with a 416x416 input each feature-map cell covers 416 / 13 = 32 pixels and 9 x 32 = 288.

### Anchor size in YOLOv3

In YOLOv3 the authors instead define anchors in terms of the **input image size**, so anchor sizes fall in the range (0x0, input_w x input_h].

>This is why the anchor values in the YOLOv2 and YOLOv3 cfg files differ so noticeably.

# Code

Adapted from the https://github.com/lars76/kmeans-anchor-boxes project.

### example.py

##### Modify the following constants:

- `ANNOTATIONS_PATH`: path to the directory containing the XML annotation files
- `CLUSTERS`: number of anchors to compute; YOLOv2 uses 5 anchors, so this is 5

##### Modify the 13:

```
print("Boxes:\n {}".format(out*13))
```

- For YOLOv2, anchor sizes are expressed relative to the last feature map. YOLOv2's last feature map is 13 by default, so the result is multiplied by `13` here. In practice, set this to the size of the last feature map of the network you are using.
- For YOLOv3, anchor sizes are expressed relative to the input image, so replace `13` with the input image size.

```
import glob
import xml.etree.ElementTree as ET

import numpy as np

from kmeans import kmeans, avg_iou

ANNOTATIONS_PATH = "/Users/mym/Desktop/ai/yolov2-tf2/data/train/annotation"
CLUSTERS = 5


def load_dataset(path):
    """Collect the (width, height) of every labelled box, normalized to [0, 1]."""
    dataset = []
    for xml_file in glob.glob("{}/*xml".format(path)):
        tree = ET.parse(xml_file)

        # Image size, used to normalize the box coordinates
        height = int(tree.findtext("./size/height"))
        width = int(tree.findtext("./size/width"))

        for obj in tree.iter("object"):
            xmin = int(obj.findtext("bndbox/xmin")) / width
            ymin = int(obj.findtext("bndbox/ymin")) / height
            xmax = int(obj.findtext("bndbox/xmax")) / width
            ymax = int(obj.findtext("bndbox/ymax")) / height

            # Only the box shape matters for clustering, not its position
            dataset.append([xmax - xmin, ymax - ymin])

    return np.array(dataset)


data = load_dataset(ANNOTATIONS_PATH)
out = kmeans(data, k=CLUSTERS)
print("Accuracy: {:.2f}%".format(avg_iou(data, out) * 100))
# Scale the normalized anchors to the 13x13 feature map (YOLOv2)
print("Boxes:\n {}".format(out*13))

ratios = np.around(out[:, 0] / out[:, 1], decimals=2).tolist()
print("Ratios:\n {}".format(sorted(ratios)))
```

The output looks like this:

```
Accuracy: 62.08%
Boxes:
 [[0.46875  0.40625 ]
 [4.90625  5.109375]
 [2.59375  3.046875]
 [1.25     1.28125 ]
 [0.8125   0.59375 ]]
Ratios:
 [0.85, 0.96, 0.98, 1.15, 1.37]
```

### kmeans.py

kmeans.py implements the algorithm and does not need to be changed.

```
import numpy as np


def iou(box, clusters):
    """
    Calculates the Intersection over Union (IoU) between a box and k clusters.
    :param box: tuple or array, shifted to the origin (i. e. width and height)
    :param clusters: numpy array of shape (k, 2) where k is the number of clusters
    :return: numpy array of shape (k,) where k is the number of clusters
    """
    x = np.minimum(clusters[:, 0], box[0])
    y = np.minimum(clusters[:, 1], box[1])
    if np.count_nonzero(x == 0) > 0 or np.count_nonzero(y == 0) > 0:
        raise ValueError("Box has no area")

    intersection = x * y
    box_area = box[0] * box[1]
    cluster_area = clusters[:, 0] * clusters[:, 1]

    iou_ = intersection / (box_area + cluster_area - intersection)

    return iou_


def avg_iou(boxes, clusters):
    """
    Calculates the average Intersection over Union (IoU) between a numpy array of boxes and k clusters.
    :param boxes: numpy array of shape (r, 2), where r is the number of rows
    :param clusters: numpy array of shape (k, 2) where k is the number of clusters
    :return: average IoU as a single float
    """
    return np.mean([np.max(iou(boxes[i], clusters)) for i in range(boxes.shape[0])])


def translate_boxes(boxes):
    """
    Translates all the boxes to the origin.
    :param boxes: numpy array of shape (r, 4)
    :return: numpy array of shape (r, 2)
    """
    new_boxes = boxes.copy()
    for row in range(new_boxes.shape[0]):
        new_boxes[row][2] = np.abs(new_boxes[row][2] - new_boxes[row][0])
        new_boxes[row][3] = np.abs(new_boxes[row][3] - new_boxes[row][1])
    return np.delete(new_boxes, [0, 1], axis=1)


def kmeans(boxes, k, dist=np.median):
    """
    Calculates k-means clustering with the Intersection over Union (IoU) metric.
    :param boxes: numpy array of shape (r, 2), where r is the number of rows
    :param k: number of clusters
    :param dist: distance function
    :return: numpy array of shape (k, 2)
    """
    rows = boxes.shape[0]

    distances = np.empty((rows, k))
    last_clusters = np.zeros((rows,))

    np.random.seed()

    # the Forgy method will fail if the whole array contains the same rows
    clusters = boxes[np.random.choice(rows, k, replace=False)]

    while True:
        # Distance of every box to every cluster centre is 1 - IoU
        for row in range(rows):
            distances[row] = 1 - iou(boxes[row], clusters)

        nearest_clusters = np.argmin(distances, axis=1)

        # Stop once the assignments no longer change
        if (last_clusters == nearest_clusters).all():
            break

        # Move each centre to the median (by default) of its assigned boxes
        for cluster in range(k):
            clusters[cluster] = dist(boxes[nearest_clusters == cluster], axis=0)

        last_clusters = nearest_clusters

    return clusters
```

Thanks to: https://blog.csdn.net/m_buddy/article/details/82926024

Original source: http://malaoshi.top/show_1EF5DjeeJRas.html
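
As a supplement to the YOLOv2 vs YOLOv3 comparison above, here is a minimal sketch of how the same normalized k-means result could be rescaled for either convention. The helper names `to_yolov2_anchors` and `to_yolov3_anchors` are hypothetical and not part of the kmeans-anchor-boxes project; the 13x13 grid and 416x416 input are assumed defaults and should be replaced with the values of your own network.

```python
import numpy as np


def to_yolov2_anchors(normalized, grid_size=13):
    """Scale normalized (w, h) pairs to feature-map units (YOLOv2 style).

    grid_size is an assumed default; use the size of your last feature map.
    """
    return np.asarray(normalized, dtype=float) * grid_size


def to_yolov3_anchors(normalized, input_w=416, input_h=416):
    """Scale normalized (w, h) pairs to input-image pixels (YOLOv3 style).

    input_w and input_h are assumed defaults; use your network's input size.
    """
    scale = np.array([input_w, input_h], dtype=float)
    return np.asarray(normalized, dtype=float) * scale


# A made-up normalized anchor spanning 9/13 of the image in each direction
normalized = np.array([[9 / 13, 9 / 13]])

v2 = to_yolov2_anchors(normalized)  # roughly 9 x 9 feature-map cells
v3 = to_yolov3_anchors(normalized)  # roughly 288 x 288 pixels
print("YOLOv2 anchors:", v2)
print("YOLOv3 anchors:", v3)
```

With these assumed sizes, the 9x9 anchor from the YOLOv2 discussion maps to about 288x288 pixels. In `example.py`, the equivalent change is to multiply `out` by the feature-map size or by the input size, depending on the target network.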