Where Vibe Coding meets CV data. Convert, visualize & evaluate datasets, built with the flow of Claude Code.
A computer vision dataset processing library for seamless format conversion, visualization, and evaluation between YOLO, LabelMe, and COCO annotation formats. Designed for researchers and developers working with multi-format annotation pipelines.
graph LR
A[YOLO<br/>.txt] -->|convert| D[DataFlow-CV]
B[LabelMe<br/>.json] -->|convert| D
C[COCO<br/>.json] -->|convert| D
D -->|visualize| E[Rendered<br/>Images]
D -->|evaluate| F[mAP / AR<br/>Metrics]
| Feature | Description |
|---|---|
| Format Conversion | Convert between YOLO, LabelMe, and COCO in any direction (6 conversion paths), plus prediction file support (outputs standard list-format COCO predictions) |
| Detection & Segmentation | Handle both object detection (bbox) and instance segmentation (polygon/RLE) annotations |
| Visualization | Render annotations with OpenCV: color-coded classes, semi-transparent masks, display & save modes |
| Evaluation | COCO-standard 12-metric output (mAP, AP50, AP75, AR) via pycocotools, with per-class breakdowns |
| Command-line Interface | Intuitive CLI with `convert`, `visualize`, and `evaluate` subcommands; positional args, rich `--help` |
| Python API | Programmatic access for integration into larger ML pipelines |
| Verbose Logging | File-based debug logging with timestamps; toggle with `--verbose` |
| Headless Mode | Server/Docker-friendly: `--no-display` + `--save` for off-screen rendering |
| Flexible Error Handling | Strict mode (abort on error) or lenient mode (skip & continue with warnings) via `--no-strict` |
# Install from PyPI
pip install dataflow-cv

# Or install from source
git clone https://github.com/zjykzj/DataFlow-CV.git
cd DataFlow-CV
# Regular installation
pip install .
# Editable installation (for development)
pip install -e .

Tip: When installed in editable mode, use `python -m dataflow.cli` instead of the `dataflow-cv` command.
| Dependency | Purpose | Install |
|---|---|---|
| `pycocotools` | COCO RLE segmentation + evaluation | `pip install pycocotools` |
All required parameters (image directories, label directories, class files, output paths) are positional arguments for better usability. Use --help on any subcommand for detailed usage.
# YOLO → COCO
dataflow-cv convert yolo2coco images/ yolo_labels/ classes.txt output.json
# YOLO → COCO (with RLE encoding)
dataflow-cv convert yolo2coco images/ yolo_labels/ classes.txt output.json --do-rle
# YOLO → LabelMe
dataflow-cv convert yolo2labelme images/ yolo_labels/ classes.txt labelme_json/
# LabelMe → YOLO
dataflow-cv convert labelme2yolo labelme_json/ classes.txt yolo_labels/
# LabelMe → COCO
dataflow-cv convert labelme2coco labelme_json/ classes.txt output.json
# COCO → YOLO
dataflow-cv convert coco2yolo input.json yolo_labels/
# COCO → LabelMe
dataflow-cv convert coco2labelme input.json labelme_json/
# YOLO predictions → COCO (output: plain JSON list, the prediction format)
dataflow-cv convert yolo2coco --prediction images/ yolo_preds/ classes.txt pred.json
# Options
dataflow-cv convert yolo2coco --verbose images/ labels/ classes.txt output.json
dataflow-cv convert yolo2coco --no-strict images/ labels/ classes.txt output.json

# Visualize YOLO annotations
dataflow-cv visualize yolo images/ yolo_labels/ classes.txt --save visualized/
# Visualize LabelMe annotations
dataflow-cv visualize labelme images/ labelme_json/ --save visualized/
# Visualize COCO annotations
dataflow-cv visualize coco images/ coco_annotations.json --save visualized/
# Verbose logging + headless mode
dataflow-cv visualize yolo --verbose --no-display images/ yolo_labels/ classes.txt --save visualized/

Evaluate object detection and instance segmentation model outputs using COCO-standard metrics. Two COCO-format JSON files are required:
| File | Role | Format | Source |
|---|---|---|---|
| `anno.json` | Ground Truth (GT): reference annotations | Full COCO dict (`images`, `annotations`, `categories`) | `yolo2coco` (label mode) |
| `pred.json` | Detection (DT): model predictions | Plain JSON list of annotation dicts (with `score`) | `yolo2coco --prediction`, Detectron2, MMDetection |
If your annotations and predictions are in YOLO format, convert them to COCO JSON first:
# Step 1: YOLO ground truth labels → COCO GT (anno.json)
#   Label format: class_id cx cy w h → 5 tokens (detection)
#                 class_id x1 y1 ... xn yn → odd tokens (segmentation)
dataflow-cv convert yolo2coco images/ yolo_labels/ classes.txt anno.json
# Step 2: YOLO predictions → COCO DT (pred.json)
#   Prediction format: class_id cx cy w h confidence → 6 tokens (detection)
#                      class_id x1 y1 ... xn yn confidence → even tokens (segmentation)
dataflow-cv convert yolo2coco --prediction images/ yolo_preds/ classes.txt pred.json
Important: YOLO label files (GT) use odd token counts, while prediction files (DT) use even token counts with a trailing `confidence`. The `--prediction` flag is required for DT; it outputs a plain JSON list of annotation dicts (not a full COCO dict with `images`/`categories`). Mixed label/prediction files in the same directory are not supported.

Note: The `--prediction` flag is only available for `yolo2coco`. `labelme2coco` does not support prediction conversion: LabelMe files (.json) have no label vs prediction format distinction, so there is no equivalent prediction source format to convert from.
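
The token-count rule above can be sketched as a small check. This is only an illustration of the odd/even convention, not DataFlow-CV's parser: `classify_yolo_line` is a hypothetical helper, and the minimum of three polygon points for segmentation lines is an assumption.

```python
def classify_yolo_line(line: str) -> str:
    """Classify one YOLO text line by its token count alone (hypothetical helper)."""
    n = len(line.split())
    if n == 5:
        return "label-detection"          # class_id cx cy w h
    if n == 6:
        return "prediction-detection"     # class_id cx cy w h confidence
    # Assumption: a polygon needs at least 3 (x, y) points,
    # i.e. 1 + 6 = 7 tokens for a label, 8 with a trailing confidence.
    if n >= 7 and n % 2 == 1:
        return "label-segmentation"       # class_id x1 y1 ... xn yn
    if n >= 8 and n % 2 == 0:
        return "prediction-segmentation"  # class_id x1 y1 ... xn yn confidence
    raise ValueError(f"unexpected token count: {n}")


if __name__ == "__main__":
    print(classify_yolo_line("0 0.5 0.5 0.2 0.3"))
    print(classify_yolo_line("0 0.5 0.5 0.2 0.3 0.87"))
```

Because the two formats differ only by the trailing `confidence` token, a check like this cannot tell a mislabeled file from a valid one; that is why the `--prediction` flag has to be passed explicitly.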
| Field | Detection GT | Detection DT | Segmentation GT | Segmentation DT |
|---|---|---|---|---|
| `bbox` | Required | Required | Required (for area) | Required (for area) |
| `score` | N/A | Required | N/A | Required |
| `segmentation` | Not required | Not required | Required | Required |
| `area` | Recommended | Recommended | Required | Required |
| `iscrowd` | Optional | N/A | Optional | N/A |
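
The DT columns of the table above can be checked mechanically before running an evaluation. The following is a minimal sketch, not part of DataFlow-CV: `REQUIRED_DT_FIELDS` and `missing_fields` are hypothetical names, and `image_id`/`category_id` are included on the assumption that predictions follow the COCO results format, which uses them to attach each prediction to an image and a class.

```python
# Hypothetical lookup of required prediction (DT) fields per evaluation type.
REQUIRED_DT_FIELDS = {
    "bbox": {"image_id", "category_id", "bbox", "score"},
    "segm": {"image_id", "category_id", "bbox", "score", "segmentation", "area"},
}


def missing_fields(prediction: dict, iou_type: str) -> list:
    """Return the sorted names of required fields absent from one prediction dict."""
    if iou_type not in REQUIRED_DT_FIELDS:
        raise ValueError(f"iou_type must be 'bbox' or 'segm', got {iou_type!r}")
    return sorted(REQUIRED_DT_FIELDS[iou_type] - set(prediction))


if __name__ == "__main__":
    pred = {"image_id": 1, "category_id": 0, "bbox": [10, 20, 30, 40], "score": 0.9}
    print(missing_fields(pred, "bbox"))  # nothing missing for detection
    print(missing_fields(pred, "segm"))  # segmentation needs more fields
```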
- Object Detection (`iouType='bbox'`): Bounding box overlap evaluation. Only `bbox` + `score` are mandatory in DT.
- Instance Segmentation (`iouType='segm'`): Mask overlap evaluation. GT and DT must include `segmentation` (polygon or RLE), `area`, and `bbox`.
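
For readers who want to see how the two `iouType` values map onto pycocotools, here is a minimal sketch of the standard evaluation loop. It is not DataFlow-CV's evaluator: `evaluate_coco` is a hypothetical wrapper, and it assumes pycocotools is installed and that the files follow the GT/DT formats described above.

```python
def evaluate_coco(gt_path: str, dt_path: str, iou_type: str = "bbox") -> list:
    """Run the standard pycocotools evaluation and return its summary stats."""
    if iou_type not in ("bbox", "segm"):
        raise ValueError(f"iou_type must be 'bbox' or 'segm', got {iou_type!r}")

    # Imported lazily so the function can be defined without pycocotools installed.
    from pycocotools.coco import COCO
    from pycocotools.cocoeval import COCOeval

    coco_gt = COCO(gt_path)             # full COCO dict (images, annotations, categories)
    coco_dt = coco_gt.loadRes(dt_path)  # plain JSON list of predictions with scores

    evaluator = COCOeval(coco_gt, coco_dt, iouType=iou_type)
    evaluator.evaluate()    # per-image, per-category matching
    evaluator.accumulate()  # build precision/recall tables
    evaluator.summarize()   # print the summary and fill evaluator.stats
    return evaluator.stats.tolist()


if __name__ == "__main__":
    stats = evaluate_coco("anno.json", "pred.json", iou_type="bbox")
    print(f"AP: {stats[0]:.3f}, AP50: {stats[1]:.3f}")
```

For `bbox` and `segm`, the first two entries of `stats` are AP averaged over IoU 0.50:0.95 and AP at IoU 0.50.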
# Object detection evaluation (bbox IoU)
dataflow-cv evaluate detection anno.json pred.json
# Verbose per-class breakdown
dataflow-cv evaluate detection --verbose anno.json pred.json
# With P/R/F1 at IoU=0.5
dataflow-cv evaluate detection --prf1 --prf1-iou 0.5 anno.json pred.json
# Instance segmentation evaluation (mask IoU)
dataflow-cv evaluate segmentation anno.json pred.json
# Save results as JSON
dataflow-cv evaluate detection --output results.json anno.json pred.json

# Complete pipeline: YOLO → COCO → Evaluation
dataflow-cv convert yolo2coco images/ yolo_labels/ classes.txt anno.json
dataflow-cv convert yolo2coco --prediction images/ yolo_preds/ classes.txt pred.json
dataflow-cv evaluate detection --verbose --prf1 anno.json pred.json

from dataflow.convert import YoloAndCocoConverter
from dataflow.visualize import YOLOVisualizer
from dataflow.evaluate import DetectionEvaluator, compute_pr_f1
# ── Convert ──────────────────────────────────────────
# YOLO labels → COCO (label mode)
converter = YoloAndCocoConverter(source_to_target=True, verbose=True, strict_mode=True)
result = converter.convert(
source_path="yolo_labels/", target_path="anno.json",
class_file="classes.txt", image_dir="images/",
)
# YOLO predictions → COCO (prediction mode)
converter = YoloAndCocoConverter(source_to_target=True, prediction=True)
result = converter.convert(
source_path="yolo_preds/", target_path="pred.json",
class_file="classes.txt", image_dir="images/",
)
# ── Visualize ────────────────────────────────────────
visualizer = YOLOVisualizer(
label_dir="yolo_labels/", image_dir="images/",
class_file="classes.txt", is_show=True, is_save=True,
output_dir="visualized/", verbose=True, strict_mode=True,
)
result = visualizer.visualize()
# ── Evaluate ─────────────────────────────────────────
evaluator = DetectionEvaluator(verbose=True)
result = evaluator.evaluate("anno.json", "pred.json")
print(f"AP: {result.metrics.ap:.3f}, AP50: {result.metrics.ap50:.3f}")
# Quick P/R/F1 at IoU=0.5
prf1 = compute_pr_f1("anno.json", "pred.json", iou_threshold=0.5)
print(f"F1: {prf1.overall.f1_score:.3f}")

See the `samples/` directory for complete examples: `samples/visualize/` (YOLO, LabelMe, COCO demos), `samples/convert/` (conversion examples).
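
To make concrete what "P/R/F1 at IoU=0.5" measures, here is a minimal single-image, single-class sketch using greedy matching in descending score order. It is an illustration under simplifying assumptions, not the implementation behind `compute_pr_f1`; `box_iou` and `precision_recall_f1` are hypothetical helpers, and boxes are assumed to be COCO-style `[x, y, w, h]` in pixels.

```python
def box_iou(a, b):
    """IoU of two boxes given as [x_min, y_min, width, height]."""
    inter_w = max(0.0, min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def precision_recall_f1(gt_boxes, preds, iou_threshold=0.5):
    """preds is a list of (box, score); each GT box can be matched at most once."""
    matched = set()
    tp = 0
    # Higher-scoring predictions get first pick of the ground-truth boxes.
    for box, _score in sorted(preds, key=lambda p: p[1], reverse=True):
        best_iou, best_idx = 0.0, None
        for idx, gt in enumerate(gt_boxes):
            if idx in matched:
                continue
            iou = box_iou(box, gt)
            if iou > best_iou:
                best_iou, best_idx = iou, idx
        if best_idx is not None and best_iou >= iou_threshold:
            matched.add(best_idx)
            tp += 1
    fp = len(preds) - tp     # predictions with no matching GT box
    fn = len(gt_boxes) - tp  # GT boxes no prediction matched
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    gt = [[0, 0, 10, 10], [20, 20, 10, 10]]
    preds = [([0, 0, 10, 10], 0.9), ([50, 50, 10, 10], 0.8)]
    print(precision_recall_f1(gt, preds))
```

In the usage example, one prediction matches a ground-truth box exactly and the other overlaps nothing, giving one true positive, one false positive, and one false negative.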
| Resource | Description |
|---|---|
| CLAUDE.md | Architecture overview, development guide, and known gotchas |
| CHANGELOG.md | Version history and breaking changes |
| specs/evaluate/ | Evaluation metric contracts: IoU, matching, AP/mAP/AR |
| specs/formats/ | External format contracts: YOLO, LabelMe, COCO, conversion rules |
| specs/modules/ | Internal module architecture, interface contracts, dependency constraints |
- Format-Native Coordinates: Coordinates are stored in each format's native representation: YOLO as normalized [0,1] center-based values, LabelMe/COCO as absolute pixels (top-left based). Check `DatasetAnnotations.format` to determine semantics.
- Explicit Coordinate Transforms: Converters handle all coordinate transformations between formats, with no hidden normalization.
- Strict Mode: Validation errors raise exceptions by default. Disable with `--no-strict` (CLI) or `strict_mode=False` (API).
- Verbose Logging: Detailed debug logs are saved to files when `--verbose` is used. The CLI prints the log file path after each operation.
- Headless Support: Use `--no-display` for servers/Docker; pair with `--save` to output visualization images without a window.
- Keyboard Shortcuts: During visualization, `q`/`ESC` to exit, `Enter`/`Space` to advance, any other key to continue.
- Color Management: Each class ID gets a unique color from an HSV-based palette (up to 1000 classes) for consistent visualization.
- Evaluation Metrics: COCO-standard 12-metric output with optional per-class breakdown and P/R/F1 computation.
- Prediction Files: YOLO prediction files use 6 tokens (detection) or even tokens (segmentation) vs 5/odd for labels. `--prediction` outputs a plain JSON list of annotation dicts, the standard prediction exchange format compatible with pycocotools `loadRes()`.
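
The difference between the two coordinate conventions in the first note can be sketched in a few lines. This is a minimal illustration of the arithmetic only, not the converters' actual code; `yolo_to_coco_bbox` and `coco_to_yolo_bbox` are hypothetical helpers.

```python
def yolo_to_coco_bbox(cx, cy, w, h, img_w, img_h):
    """Normalized center-based (cx, cy, w, h) to absolute [x_min, y_min, width, height]."""
    box_w = w * img_w
    box_h = h * img_h
    x_min = cx * img_w - box_w / 2
    y_min = cy * img_h - box_h / 2
    return [x_min, y_min, box_w, box_h]


def coco_to_yolo_bbox(x_min, y_min, box_w, box_h, img_w, img_h):
    """Absolute [x_min, y_min, width, height] to normalized center-based (cx, cy, w, h)."""
    cx = (x_min + box_w / 2) / img_w
    cy = (y_min + box_h / 2) / img_h
    return [cx, cy, box_w / img_w, box_h / img_h]


if __name__ == "__main__":
    # A box centered in a 640x480 image, half the width and a quarter of the height.
    coco_box = yolo_to_coco_bbox(0.5, 0.5, 0.5, 0.25, img_w=640, img_h=480)
    print(coco_box)
    print(coco_to_yolo_bbox(*coco_box, img_w=640, img_h=480))
```

Both directions need the image size, which is why the YOLO conversion commands take an image directory as an argument.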
For detailed developer guidance including advanced test commands, debugging, and architecture overview, see CLAUDE.md.
370 tests, 75% code coverage (3912 statements).
pytest # All tests
pytest --cov=dataflow --cov-report=term # With coverage
pytest tests/convert/test_yolo_and_coco.py # Single module
pytest tests/evaluate/test_evaluator.py # Single module

Coverage by module
| Module | Coverage | Highlights |
|---|---|---|
| `dataflow/label/` | 68% | models (87%), coco_handler (75%), labelme_handler (70%), yolo_handler (58%) |
| `dataflow/convert/` | 87% | yolo_and_coco (90%), labelme_and_yolo (86%), coco_and_labelme (87%), rle (80%), base (83%), utils (92%) |
| `dataflow/visualize/` | 81% | yolo_vis (100%), labelme_vis (100%), coco_vis (97%), base (74%) |
| `dataflow/evaluate/` | 88% | evaluator (100%), metrics (96%), result (99%), base (91%), utils (69%) |
| `dataflow/cli/` | 59% | main (96%), convert cmd (48%), evaluate cmd (24%), visualize cmd (84%), utils (86%) |
| `dataflow/util/` | 93% | logging (98%), file_util (84%) |
pip install -e .[dev] # Install dev dependencies
black dataflow tests samples # Format
isort dataflow tests samples # Sort imports
mypy dataflow # Type check
flake8 dataflow tests samples # Lint

pip install pre-commit
pre-commit install # Install git hooks (run once)
# After this, every `git commit` auto-runs:
# black → isort → flake8 → whitespace checks
pre-commit run --all-files # Manual run against all files

dataflow/
├── label/ # Annotation handlers + data models
├── convert/ # Format converters + RLE utility
├── visualize/ # OpenCV-based rendering
├── evaluate/ # pycocotools-based metrics
├── util/ # Logging & file utilities
└── cli/ # CLI entry point, commands, validation
tests/ # Unit & integration tests
samples/ # Python API usage examples
assets/ # Test data (det/seg by format)
specs/ # Canonical specifications (evaluate/ + formats/ + modules/)
Contributions are welcome! Please review CLAUDE.md for architecture and development patterns before contributing.
- Fork the repository
- Create a feature branch
- Make your changes
- Add or update tests as needed
- Ensure code passes formatting and linting checks
- Submit a pull request
This project is licensed under the MIT License; see LICENSE for details.
- Thanks to the creators of YOLO, LabelMe, and COCO formats for establishing these annotation standards
- Built with OpenCV, NumPy, Click, and pycocotools
- Inspired by the need for seamless format conversion in multi-tool CV pipelines