Dataset Preparation

To train and evaluate the models, you need to prepare the COCO and SAM datasets. All datasets should be placed inside a data/ directory at the root of the project.

. (project root)
├── data/
│   ├── coco/
│   │   └── ...
│   └── sam/
│       └── ...
├── seg/
├── projects/
└── ...
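
If the datasets already live elsewhere on disk, symlinking them into data/ is usually enough. The following is a minimal sketch; the source paths are placeholders you would replace with your own locations.

import os

# Hypothetical source locations; replace with wherever your copies of the
# datasets are actually stored.
sources = {
    "coco": "/path/to/coco",
    "sam": "/path/to/sam",
}

os.makedirs("data", exist_ok=True)
for name, src in sources.items():
    dst = os.path.join("data", name)
    if not os.path.exists(dst):
        os.symlink(src, dst)  # link the existing dataset into data/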

COCO Dataset

The COCO dataset is used for both training and evaluation, particularly for open-vocabulary instance segmentation tasks.

Required Structure

Organize your COCO 2017 dataset files as follows:

data/
├── coco/
│   ├── annotations/
│   │   ├── panoptic_train2017.json
│   │   ├── panoptic_val2017.json
│   │   ├── instances_train2017.json
│   │   └── instances_val2017.json
│   ├── train2017/
│   │   ├── 000...1.jpg
│   │   └── ...
│   ├── val2017/
│   │   ├── 000...2.jpg
│   │   └── ...
│   ├── panoptic_train2017/  # PNG panoptic annotations
│   │   ├── 000...1.png
│   │   └── ...
│   └── panoptic_val2017/   # PNG panoptic annotations
│       ├── 000...2.png
│       └── ...

  • annotations/: Contains the JSON annotation files for instance and panoptic segmentation.
  • train2017/ & val2017/: Contain the raw JPG image files.
  • panoptic_train2017/ & panoptic_val2017/: Contain the PNG-format annotations for panoptic segmentation.
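
Before launching training, a quick sanity check can confirm that the layout matches the tree above. The snippet below is a minimal sketch assuming the standard COCO 2017 file names; it only verifies that the expected folders and annotation files exist and that the panoptic JSON parses.

import json
from pathlib import Path

root = Path("data/coco")

# Folders and annotation files expected by the layout above.
expected_dirs = ["train2017", "val2017", "panoptic_train2017", "panoptic_val2017"]
expected_jsons = [
    "annotations/panoptic_train2017.json",
    "annotations/panoptic_val2017.json",
    "annotations/instances_train2017.json",
    "annotations/instances_val2017.json",
]

for d in expected_dirs:
    assert (root / d).is_dir(), f"missing folder: {root / d}"
for f in expected_jsons:
    assert (root / f).is_file(), f"missing annotation file: {root / f}"

# Spot-check that the panoptic val annotations parse and roughly match the images on disk.
with open(root / "annotations/panoptic_val2017.json") as fp:
    panoptic = json.load(fp)
print(len(panoptic["images"]), "val images listed,",
      len(list((root / "val2017").glob("*.jpg"))), "jpg files on disk")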

SAM Dataset

The SAM dataset, released with the original Segment Anything Model, is used for distillation and training the segmentation backbone.

Required Structure

The dataset is organized into multiple subdirectories (sa_000000, sa_000001, etc.), each containing images and their corresponding JSON annotations.

You must also create train.txt and val.txt files to specify which subdirectories to use for training and validation.

data/
├── sam/
│   ├── train.txt
│   ├── val.txt
│   ├── sa_000020/
│   │   ├── sa_223750.jpg
│   │   ├── sa_223750.json
│   │   └── ...
│   └── ...
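
Each JSON file pairs with the image of the same name and stores the automatically generated masks. As a rough illustration, the sketch below loads one annotation file and decodes a few masks; it assumes the released SA-1B convention of COCO run-length encoded segmentations (keys such as image and annotations come from that release, not from this project) and requires pycocotools.

import json
from pycocotools import mask as mask_utils

# Example file from the tree above; any sa_*.json in a shard works the same way.
with open("data/sam/sa_000020/sa_223750.json") as fp:
    record = json.load(fp)

print(record["image"]["width"], "x", record["image"]["height"],
      "-", len(record["annotations"]), "masks")

# Each mask is stored as a COCO run-length encoding; decode to a binary array.
for ann in record["annotations"][:3]:
    binary_mask = mask_utils.decode(ann["segmentation"])
    print(binary_mask.shape, binary_mask.sum(), "foreground pixels")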

train.txt and val.txt

These files should contain a list of the folder names to be included in the respective splits, with one folder name per line.

Example train.txt:

sa_000020
sa_000021
sa_000022
...

This structure allows for flexible splitting of the large SAM dataset for different training and validation experiments.
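
If you have downloaded a set of sa_* shards and simply want to hold some out for validation, the split files can be generated with a short script. The sketch below is one possible convention rather than a fixed recipe: it sorts the shard folders and reserves the last one for validation.

from pathlib import Path

sam_root = Path("data/sam")
shards = sorted(p.name for p in sam_root.glob("sa_*") if p.is_dir())

# Hold out the last shard for validation; adjust the split to taste.
train_shards, val_shards = shards[:-1], shards[-1:]

(sam_root / "train.txt").write_text("\n".join(train_shards) + "\n")
(sam_root / "val.txt").write_text("\n".join(val_shards) + "\n")
print(f"{len(train_shards)} training shards, {len(val_shards)} validation shard(s)")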