A modular multi-agent system for PyTorch, Hugging Face, and AWS, powered by Anthropic's Claude family of models.
Note: This project is compatible with GitHub Copilot through `.github/copilot-instructions.md`, which references the same agent architecture defined in `CLAUDE.md`.
This repository embodies an agent-based architecture for machine learning projects, where specialized AI agents collaborate to deliver comprehensive solutions. Each agent maintains deep expertise in its domain while remaining modality- and task-agnostic.
- Each agent owns a specific technical domain, preventing overlap and ensuring expertise depth.
- Agents adapt to any ML task—vision, NLP, audio, multimodal—without hardcoded assumptions.
- Optimized for PyTorch 2.3+ with distributed training and NVIDIA GPU acceleration.
- Built for AWS EC2 environments with scalable infrastructure patterns.
- Agents work in concert, sharing context and building on each other's outputs.
This template serves as a gateway to two critical ML engineering competencies:
Progress from basic tensor operations to production-ready ML systems through practical, agent-guided development. The prompting-guide/ provides a structured path from prompt dependency to independent PyTorch expertise.
The multi-agent architecture here provides hands-on experience with patterns directly applicable to:
- LangChain: Chain-of-thought reasoning, tool use, and agent orchestration
- AWS Bedrock Agents: Structured prompts, knowledge bases, and action groups
- NVIDIA NeMo Guardrails: Agent safety, structured outputs, and conversation flows
By working with this template's agent team, you're learning:
- Agent coordination patterns (supervisor/worker models)
- Tool use and function calling (ReAct patterns)
- Structured prompting (INVEST+CRPG framework)
- Multi-agent orchestration (parallel and sequential workflows)
These skills transfer directly to building production agent applications, making this template both a PyTorch learning tool and an introduction to the agentic AI ecosystem.
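As a rough illustration of the supervisor/worker coordination pattern named above, the sketch below routes tasks to workers by keyword match and falls back to the supervisor. All names here are hypothetical; in this template the agents are prompt definitions in `.claude/agents/`, not Python classes.

```python
# Minimal supervisor/worker routing sketch (hypothetical names; the
# template's agents are prompt definitions, not Python objects).
from dataclasses import dataclass
from typing import Callable

@dataclass
class Agent:
    name: str
    keywords: tuple[str, ...]
    handle: Callable[[str], str]

def make_agent(name: str, *keywords: str) -> Agent:
    # Each worker simply reports that it is handling the task.
    return Agent(name, keywords, lambda task, n=name: f"{n}: handling '{task}'")

AGENTS = [
    make_agent("DataEngineer", "dataloader", "dataset", "pipeline"),
    make_agent("NetworkArchitect", "attention", "network", "architecture"),
    make_agent("CloudEngineer", "aws", "deploy", "endpoint"),
]

def supervisor(task: str) -> str:
    """Route to the first worker whose keywords match; else coordinate."""
    lowered = task.lower()
    for agent in AGENTS:
        if any(keyword in lowered for keyword in agent.keywords):
            return agent.handle(task)
    return f"Supervisor: decomposing '{task}' across the team"

print(supervisor("implement a custom attention mechanism"))
```

Production frameworks such as LangChain replace the keyword match with an LLM routing call, but the control flow is the same shape.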
This project uses a disciplined approach to requirements specification:
All agent tasks and prompt templates follow the agile INVEST criteria:
- Independent - Each story stands alone
- Negotiable - Flexible implementation details
- Valuable - Clear business or research value
- Estimable - Measurable scope and effort
- Small - Completable in reasonable time
- Testable - Verifiable success criteria
A custom format using reinforcement-learning language guides AI agent optimization:
- Constraints - Technical boundaries and limitations
- Rewards - Success metrics and performance targets
- Penalties - Anti-patterns and quality deductions
- Goal State - Clear deliverables and validation criteria
This structured approach ensures agents understand both the "what" (user story) and the "how" (optimization parameters) of each task.
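As an illustration only (the field names below are assumptions drawn from the four sections described above, not a format the template mandates), a CRPG task specification could be captured as a plain mapping and checked for completeness before dispatch:

```python
# Hypothetical CRPG task specification; the keys mirror the four sections
# described above: Constraints, Rewards, Penalties, Goal State.
CRPG_FIELDS = ("constraints", "rewards", "penalties", "goal_state")

def validate_crpg(spec: dict) -> list[str]:
    """Return the missing CRPG sections (an empty list means valid)."""
    return [field for field in CRPG_FIELDS if not spec.get(field)]

task_spec = {
    "story": "As a researcher, I want a reproducible fine-tuning loop",
    "constraints": ["PyTorch 2.3+", "single GPU", "mixed precision"],
    "rewards": ["validation accuracy >= 0.90", "epoch time < 10 min"],
    "penalties": ["hardcoded paths", "untested training code"],
    "goal_state": "training script passes the TestArchitect suite",
}

missing = validate_crpg(task_spec)
print("valid CRPG spec" if not missing else f"missing sections: {missing}")
```

Pairing a check like this with the INVEST user story gives agents both the "what" and the "how" in machine-readable form.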
- Define Your Project: Consult `CLAUDE.md` to engage the Supervisor
- Select Your Team: Claude routes to appropriate specialist agents
- Iterate and Build: Agents collaborate to implement your solution
```mermaid
graph TB
    %% Strategy Team
    Supervisor["Supervisor - Project Coordination"]
    DomainExpert["DomainExpert - Domain Knowledge"]
    %% Data Pipeline Team
    DatasetCurator["DatasetCurator - HF Datasets"]
    DataEngineer["DataEngineer - DataLoaders"]
    TransformSpecialist["TransformSpecialist - Augmentation"]
    %% Model Architecture Team
    ModelArchitect["ModelArchitect - HF Models"]
    NetworkArchitect["NetworkArchitect - Custom Networks"]
    %% Training & Evaluation Team
    TrainingOrchestrator["TrainingOrchestrator - Training Loops"]
    MetricsArchitect["MetricsArchitect - Evaluation"]
    RunnerOrchestrator["RunnerOrchestrator - Pipelines"]
    %% Infrastructure Team
    CloudEngineer["CloudEngineer - AWS Services"]
    ComputeOrchestrator["ComputeOrchestrator - EC2/GPU"]
    LocalStackEmulator["LocalStackEmulator - Local Testing"]
    %% Quality & Interface Team
    TestArchitect["TestArchitect - TDD"]
    InterfaceDesigner["InterfaceDesigner - Web UI"]
    %% Team Groupings
    subgraph Strategy
        Supervisor
        DomainExpert
    end
    subgraph DataPipeline[Data Pipeline]
        DatasetCurator
        DataEngineer
        TransformSpecialist
    end
    subgraph ModelArchitecture[Model Architecture]
        ModelArchitect
        NetworkArchitect
    end
    subgraph TrainingEvaluation[Training & Evaluation]
        TrainingOrchestrator
        MetricsArchitect
        RunnerOrchestrator
    end
    subgraph Infrastructure
        CloudEngineer
        ComputeOrchestrator
        LocalStackEmulator
    end
    subgraph QualityInterface[Quality & Interface]
        TestArchitect
        InterfaceDesigner
    end
    %% Primary Relationships
    Supervisor --> DatasetCurator
    Supervisor --> ModelArchitect
    Supervisor --> CloudEngineer
    DomainExpert --> DatasetCurator
    DomainExpert --> MetricsArchitect
    DatasetCurator --> DataEngineer
    DataEngineer --> TransformSpecialist
    DataEngineer --> TrainingOrchestrator
    ModelArchitect --> NetworkArchitect
    NetworkArchitect --> TrainingOrchestrator
    TrainingOrchestrator --> MetricsArchitect
    TrainingOrchestrator --> RunnerOrchestrator
    CloudEngineer --> ComputeOrchestrator
    CloudEngineer --> InterfaceDesigner
    LocalStackEmulator --> CloudEngineer
    TestArchitect -.-> DataEngineer
    TestArchitect -.-> NetworkArchitect
    TestArchitect -.-> TrainingOrchestrator
    TestArchitect -.-> CloudEngineer
    RunnerOrchestrator --> ComputeOrchestrator
    %% Styling
    classDef strategyStyle fill:#e1f5fe,stroke:#0277bd,stroke-width:2px,color:#000
    classDef dataStyle fill:#f3e5f5,stroke:#6a1b9a,stroke-width:2px,color:#000
    classDef modelStyle fill:#e8f5e9,stroke:#2e7d32,stroke-width:2px,color:#000
    classDef trainingStyle fill:#fff3e0,stroke:#e65100,stroke-width:2px,color:#000
    classDef infraStyle fill:#fce4ec,stroke:#c2185b,stroke-width:2px,color:#000
    classDef qualityStyle fill:#f1f8e9,stroke:#558b2f,stroke-width:2px,color:#000
    class Supervisor,DomainExpert strategyStyle
    class DatasetCurator,DataEngineer,TransformSpecialist dataStyle
    class ModelArchitect,NetworkArchitect modelStyle
    class TrainingOrchestrator,MetricsArchitect,RunnerOrchestrator trainingStyle
    class CloudEngineer,ComputeOrchestrator,LocalStackEmulator infraStyle
    class TestArchitect,InterfaceDesigner qualityStyle
```
Agents → Specialized Expertise → Collaborative Implementation → Deployed Solution
Each agent operates as an expert consultant, providing:
- Domain-specific knowledge
- Best practice implementations
- Performance optimizations
- Quality assurance
- PyTorch 2.3+: Core deep learning framework
- Hugging Face: Model and dataset ecosystem
- AWS: Cloud infrastructure and services
- Claude Code: AI-powered development assistance
- `.claude/agents/`: Specialized agent definitions
- `CLAUDE.md`: Agent routing and coordination
- `docs/`: Documentation and agile artifacts
  - `adr/`: Architecture Decision Records
  - `sprints/`: Sprint planning and tracking
- `prompt-templates/`: Task-specific prompt examples
- `prompting-guide/`: Comprehensive guide on prompting techniques and MLE learning path
- `src/`: Core Python modules (non-package structure)
  - `data.py`: Data pipeline components
  - `network.py`: Model architectures
  - `trainer.py`: Training orchestration
  - `server.py`: API and serving
  - `runner.py`: CLI entry point
```shell
# Install uv (if not already installed)
curl -LsSf https://astral.sh/uv/install.sh | sh

# Install dependencies
uv pip install -r requirements.txt

# Install dev dependencies
uv pip install -e ".[dev]"

# Setup pre-commit hooks
pre-commit install
```

View all available agents and their capabilities in `.claude/team.md`.
In the Claude Code terminal, you can directly invoke specialized agents using `@agent-[NAME]`, or let Claude automatically route your request to the appropriate expert.
```shell
# Explicitly call a specific agent using @agent-[NAME]
$ "@agent-NetworkArchitect implement a custom attention mechanism for video understanding"

# Agent responds with expertise
NetworkArchitect: I'll design a custom spatio-temporal attention module...
```

```shell
# Describe your task and Claude routes to appropriate agents
$ "I need to fine-tune a BERT model on my custom dataset with limited GPU memory"

# Claude automatically engages relevant agents
Supervisor: Let me establish your constraints...
TestArchitect: Writing tests for your fine-tuning pipeline...
ModelArchitect: Selecting optimal BERT variant for your memory constraints...
DataEngineer: Configuring efficient data loading...
```

```shell
$ "I want to build an image classification system for medical X-rays"

# Supervisor coordinates the team
Supervisor: Analyzing requirements...
DomainExpert: Medical imaging requires specific preprocessing...
DatasetCurator: Searching for relevant medical datasets...
TestArchitect: Writing comprehensive test suite first...
NetworkArchitect: Designing architecture for medical images...
```

```shell
$ "Fine-tune Llama-2-7B on my customer support dataset using QLoRA"

# Specialized agents collaborate
ModelArchitect: Configuring Llama-2-7B with 4-bit quantization...
DataEngineer: Setting up efficient data pipeline...
TrainingOrchestrator: Implementing QLoRA with gradient checkpointing...
MetricsArchitect: Establishing evaluation metrics...
```

```shell
$ "Test my model API locally before deploying to AWS"

# LocalStackEmulator coordinates with CloudEngineer
LocalStackEmulator: Starting local AWS environment...
CloudEngineer: Configuring API endpoints for local testing...
TestArchitect: Running integration tests against LocalStack...
```

```shell
$ "Write tests for a vision transformer training pipeline"

# TestArchitect leads TDD workflow
TestArchitect: Creating tests that will fail initially...
- test_model_initialization()
- test_forward_pass_shapes()
- test_loss_computation()
- test_optimizer_step()
NetworkArchitect: Implementing ViT to pass your tests...
```

```shell
$ "Deploy a real-time object detection API with <50ms latency"

# Watch agents collaborate
Supervisor: Establishing latency requirements...
TestArchitect: Writing performance benchmarks...
ModelArchitect: Selecting YOLOv8n for speed...
ComputeOrchestrator: Recommending g5.xlarge instance...
CloudEngineer: Implementing FastAPI with async inference...
LocalStackEmulator: Testing locally first...
InterfaceDesigner: Creating monitoring dashboard...

# Result: Complete deployment pipeline with tests
```

- Be Specific: Include constraints, metrics, and requirements
- Direct Invocation: Use `@agent-[NAME]` to call specific agents
- Use Templates: Copy prompts from `prompt-templates/` for consistency
- Test First: Let TestArchitect write tests before implementation
- Local First: Use LocalStackEmulator before AWS deployment
- Trust Routing: Claude knows which agents to engage when not specified
```shell
# Iterative workflow
$ "Test → Data → Model → Training → Deploy (continuous iteration)"

# Parallel execution
$ "Run tests AND start LocalStack AND prepare dataset"

# Specific expertise request
$ "@agent-MetricsArchitect design custom metrics for video quality assessment"
```

The `src/` directory contains standalone modules that can be run directly without package installation. This simplifies deployment and reduces complexity while maintaining clear separation of concerns.
- Architecture Decision Records: Documented technical decisions in `docs/adr/`
- Sprint Tracking: Comprehensive sprint planning and retrospectives in `docs/sprints/`
- Test-Driven Development: TestArchitect enforces TDD practices
- Continuous Integration: Built into agent collaboration workflows
- uv: Fast, reliable Python package management
- Ruff: Single tool for linting and formatting
- Pre-commit: Automated code quality checks
- Type hints: Full typing support throughout
This template provides the foundation for any ML project, from research prototypes to production systems.
If you use this project in your research or work, please cite:
```bibtex
@software{claude_code_pytorch,
  author = {jxtngx},
  title = {Claude Code PyTorch: Multi-Agent ML Development Framework},
  year = {2025},
  url = {https://github.com/jxtngx/claude-code-pytorch},
  license = {Apache-2.0}
}
```

Copyright 2025 jxtngx

Licensed under the Apache License, Version 2.0. See LICENSE for details.