Code Structure Overview
This project is organized into several key directories, each serving a distinct purpose. Understanding this structure is helpful for navigating the codebase and extending it.
ext/ - External Libraries
This directory contains third-party or foundational libraries that are integrated directly into the project. These are often core components upon which the new models are built.
open_clip/: A copy of the OpenCLIP library, providing the powerful CLIP models used for open-vocabulary recognition.
sam/: Contains the core building blocks from the original Segment Anything Model, such as the image encoder, prompt encoder, and mask decoder architectures.
rwkv/: Code related to the RWKV (Receptance Weighted Key Value) architecture, which is used as an efficient backbone in the RWKV-SAM project.
class_names/: Utility files defining class IDs and names for various datasets like COCO and LVIS.
seg/ - Core OVSAM Implementation
This directory holds the primary source code for the Open-Vocabulary SAM (OVSAM) model.
configs/: Configuration files for training and evaluating OVSAM components, including sam2clip, clip2sam, and the final ovsam model.
models/: Implementation of custom models, including backbones (OpenCLIPBackbone, SAMBackbone), necks (MultiLayerTransformerNeck), heads (OVSAMHead), and detectors that tie everything together (OVSAM, CLIP2SAM).
datasets/: Custom dataset loaders and data processing pipelines.
evaluation/: Custom evaluation metrics.
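The backbone / neck / head / detector split listed under models/ can be illustrated with a minimal sketch in plain Python. This is not the repository's code: ToyBackbone, ToyNeck, ToyHead, and ToyDetector are hypothetical stand-ins that operate on lists of numbers, whereas the real components are neural network modules. The sketch only shows how a detector might compose the three stages.

```python
from dataclasses import dataclass
from typing import Dict, List


class ToyBackbone:
    """Hypothetical stand-in for a backbone: maps an input to multi-level features."""

    def __call__(self, image: List[float]) -> List[List[float]]:
        # Two fake "feature levels": the input itself and a 2x-scaled copy.
        return [list(image), [2.0 * x for x in image]]


class ToyNeck:
    """Hypothetical stand-in for a neck: fuses the feature levels into one."""

    def __call__(self, feats: List[List[float]]) -> List[float]:
        # Element-wise average across levels.
        n = len(feats)
        return [sum(vals) / n for vals in zip(*feats)]


class ToyHead:
    """Hypothetical stand-in for a head: turns fused features into a prediction."""

    def __call__(self, fused: List[float]) -> Dict[str, float]:
        # The "prediction" here is just the mean of the fused features.
        return {"score": sum(fused) / len(fused)}


@dataclass
class ToyDetector:
    """Hypothetical detector that ties backbone, neck, and head together."""

    backbone: ToyBackbone
    neck: ToyNeck
    head: ToyHead

    def predict(self, image: List[float]) -> Dict[str, float]:
        return self.head(self.neck(self.backbone(image)))


detector = ToyDetector(ToyBackbone(), ToyNeck(), ToyHead())
result = detector.predict([1.0, 2.0, 3.0])
print(result)
```

Keeping the stages as separate objects means one stage can be swapped (for example, a different backbone) without touching the others, which is presumably why the directory separates them.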
projects/rwkvsam/ - RWKV-SAM Project
This is a self-contained sub-project focusing on a high-efficiency segmentation model. It follows a similar structure to the seg/ directory but contains code specific to the RWKV-based architecture.
README.md: An introduction to the RWKV-SAM model.
configs/: Configuration files for training and evaluating RWKV-SAM.
models/: Implementations of RWKV-specific models like the VITAMINBackbone.
datasets/ & evaluation/: Dataset and evaluation code specific to experiments run for this project.
tools/ - Scripts and Utilities
This directory contains the main executable scripts for interacting with the models.
train.py: The main script for training models.
test.py: The main script for testing and evaluating models.
gen_cls.py: A utility to pre-compute and cache language embeddings for class names.
dist.sh: A wrapper script to launch the Python scripts in a distributed (multi-GPU) environment.