Production Agent: Seven Layers

Build an LLM agent the way real teams ship them: as seven composable layers, each with a clear job, a clear contract, and a clear failure mode. Every request flows through the same pipeline and emits a per-layer trace so you can debug production in minutes, not hours.

Start learning at learnwithparam.com. Regional pricing available with discounts of up to 60%.

What You'll Learn

Design an agent as a graph of small, testable layers instead of one giant function
Add transport validation and rate limiting before anything reaches your LLM
Route intents with a tiny state machine and branch to tools, retrieval, or direct reply
Run tools safely with whitelists and argument parsing
Keep per-thread memory with sane eviction limits
Ground answers with ChromaDB semantic search when retrieval beats generation
Enforce guardrails on both input and output (PII scrub, length caps)
Emit structured traces and logs so every request is observable out of the box
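The "graph of small, testable layers" idea can be sketched in a few lines. This is a minimal illustration, not the course implementation: Context, run_pipeline, and the two toy layers are hypothetical names, and the trace shape (layer name plus milliseconds) is an assumption.

```python
import time
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical names throughout: Context, Layer, and run_pipeline are
# illustrative and are not taken from the course repository.


@dataclass
class Context:
    message: str
    thread_id: str
    reply: str = ""
    trace: list[dict] = field(default_factory=list)


Layer = Callable[[Context], Context]


def run_pipeline(ctx: Context, layers: list[tuple[str, Layer]]) -> Context:
    """Run each layer in order and record its name and elapsed time."""
    for name, layer in layers:
        start = time.perf_counter()
        ctx = layer(ctx)
        elapsed_ms = (time.perf_counter() - start) * 1000
        ctx.trace.append({"layer": name, "ms": round(elapsed_ms, 3)})
    return ctx


def transport(ctx: Context) -> Context:
    # Reject empty input before any later layer runs.
    if not ctx.message.strip():
        raise ValueError("message must not be empty")
    return ctx


def reply(ctx: Context) -> Context:
    # Stand-in for the LLM call so the sketch runs offline.
    ctx.reply = f"echo: {ctx.message}"
    return ctx


result = run_pipeline(
    Context(message="hello", thread_id="t1"),
    [("transport", transport), ("reply", reply)],
)
print(result.reply)
print([entry["layer"] for entry in result.trace])
```

Because each layer is a plain function from Context to Context, it can be unit tested on its own and reordered or swapped without touching the others.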

The Seven Layers

Transport - validates the request, enforces size and basic rate limits
Orchestrator - a small state machine that picks tool_use, retrieve, or reply
Tools - get_time and calculator, dispatched through a safe registry
Memory - per-thread bounded conversation history
Retrieval - ChromaDB semantic search over a seed knowledge base (degrades gracefully when disabled)
Guardrails - PII scrubbing and length checks on every input and output
Observability - per-request trace with per-layer timings plus a structlog event
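To make the Orchestrator and Tools layers concrete, here is a rough sketch of a router that picks tool_use, retrieve, or reply, plus a registry that only dispatches allowlisted tools. The routing rules, the TOOLS dictionary, and the dispatch helper are assumptions made for illustration; the course code may route very differently. The calculator walks the parsed expression tree instead of calling eval().

```python
import ast
import operator
from datetime import datetime, timezone

# Assumed routing rules and helper names; the course code may differ.

_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def calculator(expression: str) -> float:
    """Evaluate +, -, *, / on numeric literals without calling eval()."""

    def _eval(node: ast.AST) -> float:
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -_eval(node.operand)
        raise ValueError("unsupported expression")

    return _eval(ast.parse(expression, mode="eval").body)


def get_time() -> str:
    """Return the current UTC time as an ISO 8601 string."""
    return datetime.now(timezone.utc).isoformat()


# Only names listed here can ever be executed.
TOOLS = {"get_time": get_time, "calculator": calculator}


def dispatch(name: str, *args: str):
    if name not in TOOLS:
        raise KeyError(f"tool not allowed: {name}")
    return TOOLS[name](*args)


def route(message: str) -> str:
    """Crude keyword routing that returns one of the three branch names."""
    text = message.lower()
    if "time" in text or any(ch.isdigit() for ch in text):
        return "tool_use"
    if text.rstrip().endswith("?"):
        return "retrieve"
    return "reply"


print(route("what time is it?"))
print(dispatch("calculator", "2 + 3 * 4"))
```

Keyword matching is deliberately simplistic here (for example, "sometimes" contains "time"); a real orchestrator would likely use a classifier or the LLM itself for intent detection.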

Tech Stack

FastAPI - async Python web framework
Pydantic - request and response validation
ChromaDB + Sentence Transformers - embedded vector store and local embeddings
structlog - structured logs for every request
LLM Provider Pattern - supports OpenRouter, Fireworks, Gemini, OpenAI
Docker - containerized development

Getting Started

Prerequisites

Python 3.11+
uv (installed automatically by make setup)
An API key from any supported LLM provider

Quick Start

make dev

# Or step by step:
make setup
# edit .env and add your API key
make run

With Docker

make build
make up
make logs
make down

API Documentation

Once running, open http://localhost:8000/docs for the interactive Swagger UI.

Primary endpoints:

GET /production-agent/health - liveness check plus the list of layers
POST /production-agent/chat - body { "message": "...", "thread_id": "t1" }
GET /production-agent/trace/{thread_id} - the most recent per-layer trace for a thread
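As a quick way to exercise the chat endpoint from Python, the sketch below builds the POST request with the standard library only. The base URL and body fields come from the endpoint list above; build_chat_request and send are hypothetical helper names, and the response schema is not documented here, so the sketch just returns whatever JSON comes back.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000/production-agent"


def build_chat_request(message: str, thread_id: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for the chat endpoint."""
    body = json.dumps({"message": message, "thread_id": thread_id}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/chat",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request; requires the server to be running locally."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


req = build_chat_request("What time is it?", "t1")
print(req.get_method(), req.full_url)
# With the server running, uncomment the next line:
# print(send(req))
```

After a successful chat call, requesting GET /production-agent/trace/t1 should show the per-layer trace for that thread.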

Challenges

Work through these incrementally to build the full system:

The Transport Layer - Validate inputs and add a tiny rate limiter
The Orchestrator - Route between tool use, retrieval, and direct reply
Tools with Guardrails - Register get_time and a safe calculator
Thread Memory - Store bounded per-thread conversation history
Retrieval Layer - Seed a ChromaDB collection and wire up semantic search
Input and Output Guardrails - Scrub emails, phone numbers, and SSN-like tokens
Observability - Build the per-request trace and expose GET /production-agent/trace/{thread_id}
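For the guardrails challenge, a regex-based scrubber is one simple starting point. The patterns, placeholder tokens, and the MAX_CHARS limit below are assumptions chosen for readability; they target US-style phone and SSN formats and will miss many real-world variants.

```python
import re

# Assumed patterns and placeholder tokens; tuned for clarity, not recall.
# Order matters: SSN-like tokens are replaced before phone numbers.
_PATTERNS = [
    ("[EMAIL]", re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")),
    ("[SSN]", re.compile(r"\b\d{3}-\d{2}-\d{4}\b")),
    (
        "[PHONE]",
        re.compile(
            r"(?<!\w)(?:\+?\d{1,3}[\s.-]?)?(?:\(\d{3}\)|\d{3})[\s.-]?\d{3}[\s.-]?\d{4}\b"
        ),
    ),
]

MAX_CHARS = 2000  # hypothetical length cap


def scrub(text: str) -> str:
    """Replace emails, SSN-like tokens, and phone numbers with placeholders."""
    for token, pattern in _PATTERNS:
        text = pattern.sub(token, text)
    return text


def check_length(text: str, limit: int = MAX_CHARS) -> str:
    """Raise if the text exceeds the cap; otherwise return it unchanged."""
    if len(text) > limit:
        raise ValueError(f"text exceeds {limit} characters")
    return text


def guard(text: str) -> str:
    """Apply the same checks to inputs and outputs."""
    return scrub(check_length(text))


print(guard("Mail a@b.com or call 555-123-4567, SSN 123-45-6789."))
# prints: Mail [EMAIL] or call [PHONE], SSN [SSN].
```

Running the same guard function on both the user message and the model reply keeps the input and output rules from drifting apart.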

Makefile Targets

make help           Show all available commands
make setup          Initial setup (create .env, install deps)
make dev            Setup and run (one command!)
make run            Start FastAPI server
make build          Build Docker image
make up             Start container
make down           Stop container
make clean          Remove venv and cache

Learn more

Start the course: learnwithparam.com/courses/layered-production-ai-architecture
AI Bootcamp for Software Engineers: learnwithparam.com/ai-bootcamp
All courses: learnwithparam.com/courses
