<Computer Vision> Implementing RTMDet-Ins 2. Making COCO Dataset Reader
Sunday, Mar 30, 2025
COCO is one of the most widely used annotation formats for image datasets.
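To make the format concrete before going through the fields, here is a minimal sketch of a COCO-style annotation dictionary and two small helpers. The file names, ids, and category names are made up for illustration, and the helper names `group_annotations_by_image` and `xywh_to_xyxy` are hypothetical, not part of any library. Only the key names (`images`, `annotations`, `categories`, `image_id`, `category_id`, `bbox`, `segmentation`, `file_name`) follow the COCO format.

```python
import json
from collections import defaultdict

# Hand-written toy example; every value below is invented for illustration.
coco = {
    "images": [
        {"id": 1, "file_name": "images/0001.jpg", "width": 640, "height": 480},
        {"id": 2, "file_name": "images/0002.jpg", "width": 640, "height": 480},
    ],
    "annotations": [
        {
            "id": 10,
            "image_id": 1,       # which image this instance belongs to
            "category_id": 1,    # which label this instance has
            "bbox": [100.0, 120.0, 50.0, 80.0],  # [x_top_left, y_top_left, w, h]
            # polygon as a flat [x1, y1, x2, y2, ...] list
            "segmentation": [[100.0, 120.0, 150.0, 120.0, 150.0, 200.0, 100.0, 200.0]],
        },
        {
            "id": 11,
            "image_id": 1,       # a second instance on the same image
            "category_id": 2,
            "bbox": [300.0, 200.0, 40.0, 60.0],
            "segmentation": [[300.0, 200.0, 340.0, 200.0, 340.0, 260.0, 300.0, 260.0]],
        },
    ],
    "categories": [
        {"id": 1, "name": "cat"},
        {"id": 2, "name": "dog"},
    ],
}


def group_annotations_by_image(coco_dict):
    """Map each image id to the list of annotations (instances) on that image."""
    grouped = defaultdict(list)
    for ann in coco_dict["annotations"]:
        grouped[ann["image_id"]].append(ann)
    return grouped


def xywh_to_xyxy(bbox):
    """Convert a COCO [x, y, w, h] box to [x_min, y_min, x_max, y_max]."""
    x, y, w, h = bbox
    return [x, y, x + w, y + h]


# Round-trip through JSON to mimic loading a file such as train.json.
loaded = json.loads(json.dumps(coco))
id_to_name = {c["id"]: c["name"] for c in loaded["categories"]}
anns_by_image = group_annotations_by_image(loaded)

for image in loaded["images"]:
    for ann in anns_by_image[image["id"]]:
        print(image["file_name"], id_to_name[ann["category_id"]], xywh_to_xyxy(ann["bbox"]))
```

Grouping annotations by `image_id` up front is one simple way to let a dataset reader fetch all instances of an image in a single lookup. An image with no annotations just yields an empty list.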
COCO format: https://cocodataset.org/#home

Label metadata is stored in train/valid/test.json: image metadata in images, annotation data in annotations, and labels in categories.

1. Basic COCO Format Example

images: Contains the image id and the path where the image is stored.

annotations: Contains the image id and the category id that each annotation belongs to. An image can have multiple annotations (= instances). Labels include: category (label_id, for classification), bbox ([x_box_leftupper_corner, y_box_leftupper_corner, w, h], for object detection), and segmentation ([x1 y1 x2 y2 … format]).

categories: Contains the category id and the label’s name.

2.