Dataset Configurations
Dataset configurations define how data is loaded, processed, and fed to the model during training and evaluation. They are specified in the train_dataloader, val_dataloader, and test_dataloader dictionaries in the config files.
Key Components of a Dataloader Config
- batch_size: Number of samples per GPU.
- num_workers: Number of subprocesses to use for data loading.
- dataset: A dictionary that defines the dataset itself, including:
  - type: The dataset class name (e.g., CocoDataset, SAMDataset).
  - data_root: The root directory where the dataset is stored.
  - ann_file: Path to the annotation file (relative to data_root).
  - data_prefix: A dictionary of directory prefixes (relative to data_root), such as the image directory under the img key.
  - pipeline: A list of data transformation and augmentation steps.
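Because batch_size counts samples per GPU, the batch size the optimizer actually sees grows with the number of GPUs. The following is a minimal sketch of that arithmetic; the helper names and the sample counts are hypothetical and are not part of the project, and the iteration count is only approximate because distributed samplers may pad the dataset.

```python
import math


def effective_batch_size(batch_size_per_gpu: int, num_gpus: int) -> int:
    """Total samples consumed per optimizer step across all GPUs."""
    return batch_size_per_gpu * num_gpus


def iterations_per_epoch(num_samples: int, batch_size_per_gpu: int,
                         num_gpus: int, drop_last: bool = False) -> int:
    """Approximate number of iterations needed to see every sample once."""
    total = effective_batch_size(batch_size_per_gpu, num_gpus)
    if drop_last:
        # The final incomplete batch is discarded.
        return num_samples // total
    # The final incomplete batch is still processed.
    return math.ceil(num_samples / total)


# Hypothetical setup: batch_size=2 on 8 GPUs, 10,001 training samples.
print(effective_batch_size(2, 8))
print(iterations_per_epoch(10_001, 2, 8))
print(iterations_per_epoch(10_001, 2, 8, drop_last=True))
```

If the number of GPUs changes, the learning rate usually needs to be revisited as well, since the effective batch size changes with it.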
Example: COCO Instance Segmentation Dataset
Here is an example from projects/rwkvsam/configs/_base_/datasets/coco/coco_instance.py:
# dataset settings
data_root = 'data/coco/'
dataset_type = 'CocoDataset'

train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True, with_mask=True),
    dict(type='Resize', scale=(1333, 800), keep_ratio=True),
    dict(type='RandomFlip', prob=0.5),
    dict(type='PackDetInputs')
]

train_dataloader = dict(
    batch_size=2,
    num_workers=2,
    sampler=dict(type='DefaultSampler', shuffle=True),
    dataset=dict(
        type=dataset_type,
        data_root=data_root,
        ann_file='annotations/instances_train2017.json',
        data_prefix=dict(img='train2017/'),
        filter_cfg=dict(filter_empty_gt=True, min_size=32),
        pipeline=train_pipeline
    )
)
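A validation dataloader usually mirrors the training one with augmentation removed and shuffling disabled. The sketch below follows the common MMDetection 3.x convention; it is an assumption about what a matching val_dataloader could look like, not a copy of the project's file, so the actual values in coco_instance.py may differ.

```python
# Sketch of a validation counterpart (assumed, not taken from the project).
data_root = 'data/coco/'
dataset_type = 'CocoDataset'

# No RandomFlip here: evaluation should be deterministic.
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='Resize', scale=(1333, 800), keep_ratio=True),
    dict(type='LoadAnnotations', with_bbox=True, with_mask=True),
    dict(type='PackDetInputs'),
]

val_dataloader = dict(
    batch_size=1,
    num_workers=2,
    # shuffle=False keeps the evaluation order fixed between runs.
    sampler=dict(type='DefaultSampler', shuffle=False),
    dataset=dict(
        type=dataset_type,
        data_root=data_root,
        ann_file='annotations/instances_val2017.json',
        data_prefix=dict(img='val2017/'),
        # test_mode=True skips training-only filtering of samples.
        test_mode=True,
        pipeline=test_pipeline,
    ),
)

# Reusing the validation settings for testing is a common shortcut.
test_dataloader = val_dataloader
```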
The Data Pipeline
The pipeline defines an ordered sequence of operations applied to each data sample, where each step receives the output of the previous one:
- LoadImageFromFile: Loads the image from the file path.
- LoadAnnotations: Loads annotations (bounding boxes, masks) from the annotation file.
- Resize: Resizes the image and its corresponding annotations.
- RandomFlip: Applies random horizontal flipping for data augmentation.
- PackDetInputs: Collects all data into a standardized format (DetDataSample) that the model expects.
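To make the chaining concrete, here is a toy sketch in plain Python of how such a pipeline can be composed. It is not the MMDetection implementation: compose, make_resize, and make_flip are hypothetical stand-ins, samples are simplified to a dict with a size and a list of boxes, and the keep_ratio rescale rule is an assumption about how scale=(1333, 800) is interpreted (fit the long edge within 1333 and the short edge within 800).

```python
import random
from typing import Any, Callable, Dict, List, Optional

Sample = Dict[str, Any]
Transform = Callable[[Sample], Optional[Sample]]


def compose(transforms: List[Transform]) -> Transform:
    """Chain transforms so each one receives the previous one's output."""
    def run(sample: Sample) -> Optional[Sample]:
        for transform in transforms:
            sample = transform(sample)
            if sample is None:  # a transform may reject a sample
                return None
        return sample
    return run


def make_resize(scale=(1333, 800)) -> Transform:
    """Toy Resize with keep_ratio=True: image and boxes scale together."""
    long_edge, short_edge = max(scale), min(scale)

    def resize(sample: Sample) -> Sample:
        w, h = sample['width'], sample['height']
        factor = min(long_edge / max(w, h), short_edge / min(w, h))
        new_w, new_h = int(w * factor + 0.5), int(h * factor + 0.5)
        boxes = [[x1 * new_w / w, y1 * new_h / h,
                  x2 * new_w / w, y2 * new_h / h]
                 for x1, y1, x2, y2 in sample['boxes']]
        return {**sample, 'width': new_w, 'height': new_h, 'boxes': boxes}
    return resize


def make_flip(prob: float, rng: random.Random) -> Transform:
    """Toy RandomFlip: mirror boxes horizontally with probability prob."""
    def flip(sample: Sample) -> Sample:
        if rng.random() >= prob:
            return sample
        w = sample['width']
        boxes = [[w - x2, y1, w - x1, y2]
                 for x1, y1, x2, y2 in sample['boxes']]
        return {**sample, 'boxes': boxes, 'flipped': True}
    return flip


# prob=1.0 forces the flip so the result is reproducible.
pipeline = compose([make_resize((1333, 800)),
                    make_flip(1.0, random.Random(0))])
out = pipeline({'width': 640, 'height': 480, 'boxes': [[0, 0, 320, 240]]})
print(out)
```

The order of the steps matters: the flip uses the width produced by the resize, which is why annotations must be transformed in the same step as the image they belong to.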
Available Dataset Configurations
This project provides base configurations for a variety of datasets, located in seg/configs/_base_/datasets/ and projects/rwkvsam/configs/_base_/datasets/, including:
- COCO: For instance segmentation and open-vocabulary tasks.
- LVIS: For large-vocabulary instance segmentation.
- SAM: For the large-scale SAM dataset used in distillation.
- ADE20k: For semantic segmentation.
- Specialized Datasets: DIS5K, ThinObject5K, and EntitySeg for more specific segmentation tasks.