Samples Guide

The LayerLens Python SDK ships with 70+ runnable samples covering every API resource, from a single trace evaluation to enterprise compliance pipelines and multi-agent orchestration. All samples live in the samples/ directory and can be run directly after installing the SDK and setting your API key.

Quick Start

pip install layerlens --index-url https://sdk.layerlens.ai/package
export LAYERLENS_STRATIX_API_KEY=your-api-key
python samples/core/quickstart.py

quickstart.py walks through the complete workflow end-to-end: upload a trace, create a judge, run an evaluation, and retrieve results.

Samples by Category

Core SDK Operations (18 samples)

Located in samples/core/. Start here to learn how every LayerLens resource -- traces, judges, evaluations, results, models, and benchmarks -- works individually and together, including async patterns and pagination.

Key samples:

See the Core SDK README for the full list.

Industry Solutions (10 samples)

Located in samples/industry/. Domain-specific evaluation scenarios with judges tuned for regulated and high-stakes verticals including healthcare, financial services, legal, government, insurance, and retail.

Key samples:

See the Industry Solutions README for the full list.

Multi-Agent Evaluation (5 samples)

Located in samples/cowork/. Patterns for Claude Cowork, Agent Teams, or any multi-agent framework where multiple agents collaborate and each agent's output needs independent quality assessment.

Key samples:

See the Multi-Agent README for the full list.

CI/CD Integration (2 samples + workflow)

Located in samples/cicd/. Embed evaluation quality gates into your build and deployment pipelines so regressions are caught before they reach production.
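As a rough illustration of what a quality gate does, the sketch below turns a list of evaluation scores into a process exit code that a CI step can act on. It is not taken from the samples in samples/cicd/: the function name quality_gate, the thresholds, and the assumption that scores are floats in the range 0 to 1 are all hypothetical, and the code does not call the LayerLens SDK.

```python
from typing import Iterable


def quality_gate(
    scores: Iterable[float],
    min_pass_rate: float = 0.9,   # hypothetical: fraction of evaluations that must pass
    pass_score: float = 0.7,      # hypothetical: score at or above which one evaluation passes
) -> int:
    """Return 0 if enough evaluations passed, 1 otherwise (usable as an exit code)."""
    scores = list(scores)
    if not scores:
        # No results means nothing was verified, so the gate fails closed.
        return 1
    passed = sum(1 for s in scores if s >= pass_score)
    pass_rate = passed / len(scores)
    print(f"pass rate {pass_rate:.0%} ({passed}/{len(scores)}), required {min_pass_rate:.0%}")
    return 0 if pass_rate >= min_pass_rate else 1


# Hypothetical scores; in a real pipeline these would come from evaluation results.
exit_code = quality_gate([0.92, 0.81, 0.40, 0.88], min_pass_rate=0.75)
```

A pipeline step would typically pass the returned value to sys.exit so that a failing gate fails the build. Failing on an empty result set is a design choice: it prevents a misconfigured job from passing silently.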

See the CI/CD README for details.

LLM Provider Integrations (2 samples)

Located in samples/integrations/. Trace and evaluate outputs from OpenAI and Anthropic with minimal instrumentation.

Content-Type Evaluations (3 samples)

Located in samples/modalities/. Apply specialized judges to different content types -- text responses, brand assets, and structured documents.

OpenClaw Agent Evaluation (10 demos + skill)

Located in samples/openclaw/. Trace, evaluate, and monitor OpenClaw autonomous AI agents using LayerLens -- including cage match model tournaments, code gating, drift detection, content auditing, honeypot skill auditing, and adversarial red-teaming.

See the OpenClaw README for the full list of integration samples and advanced evaluation patterns.

MCP Server (1 sample)

Located in samples/mcp/. Expose LayerLens capabilities as tools for Claude, Cursor, and any MCP-compatible AI assistant.

  • layerlens_server.py -- MCP server with trace management, judge creation, and evaluation execution

See the MCP README for setup instructions.

CopilotKit Integration

Located in samples/copilotkit/. A full-stack canvas + chat sample built on langchain.agents.create_agent + CopilotKitMiddleware, with a runnable Next.js 16 + Tailwind 4 + shadcn/ui demo app under app/. The pattern mirrors CopilotKit's own coagents-research-canvas reference: state-driven cards on the host page, a chat sidebar with a frontend HITL widget, and out-of-band polling for long-running async work.
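The out-of-band polling idea mentioned above can be sketched in a few lines. This is a generic asyncio illustration, not the sample's actual code (which may well poll from the frontend): poll_until_done, fetch_status, and the status strings "pending", "running", "completed", and "failed" are all assumptions made for the example.

```python
import asyncio
from typing import Awaitable, Callable, FrozenSet


async def poll_until_done(
    fetch_status: Callable[[], Awaitable[str]],   # hypothetical status lookup
    interval_s: float = 1.0,
    timeout_s: float = 60.0,
    terminal: FrozenSet[str] = frozenset({"completed", "failed"}),
) -> str:
    """Call fetch_status repeatedly until it returns a terminal status or time runs out."""
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout_s
    while True:
        status = await fetch_status()
        if status in terminal:
            return status
        if loop.time() >= deadline:
            raise TimeoutError(f"still {status!r} after {timeout_s}s")
        # Sleeping yields control, so other work (such as chat traffic) is not blocked.
        await asyncio.sleep(interval_s)


async def demo() -> str:
    # Stand-in for a long-running job: reports two in-progress states, then finishes.
    states = iter(["pending", "running", "completed"])

    async def fake_fetch() -> str:
        return next(states)

    return await poll_until_done(fake_fetch, interval_s=0.01, timeout_s=1.0)


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Keeping the polling loop separate from the conversation means a slow evaluation does not hold a chat turn open; the explicit timeout keeps a stuck job from polling forever.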

  • agents/evaluator_agent.py -- LangGraph agent with four backend tools (list_recent_traces, list_judges, run_trace_evaluation, get_evaluation_result) and a frontend HITL tool (confirm_judge) for picking which judge to apply. The picker is a real React widget registered via useCopilotAction({ renderAndWaitForResponse }), bridged into the LLM's toolbelt by CopilotKitMiddleware -- no interrupt() call.
  • agents/investigator_agent.py -- Standalone procedural StateGraph for trace investigation (errors / latency / cost hot spots). No HITL, no LLM. Reference for non-conversational agents.
  • components/*.tsx -- Five reusable SDK card components (EvaluationCard, TraceCard, JudgeVerdictCard, MetricCard, ComplianceCard) plus MarkdownLite, re-exported as @layerlens/copilotkit-cards.
  • app/ -- Runnable Next.js + FastAPI demo. It talks to the real LayerLens API only -- a missing LAYERLENS_STRATIX_API_KEY is a hard error at startup.
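The fail-fast behavior described for app/ might look something like the following. This is a minimal sketch, not the demo's actual startup code: only the variable name LAYERLENS_STRATIX_API_KEY comes from this guide, while require_api_key and the error message are hypothetical.

```python
import os
from typing import Mapping, Optional

ENV_VAR = "LAYERLENS_STRATIX_API_KEY"


def require_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the API key, or raise at startup if it is missing or blank."""
    # Accepting a mapping makes the check testable without touching os.environ.
    env = os.environ if env is None else env
    key = env.get(ENV_VAR, "").strip()
    if not key:
        raise RuntimeError(f"{ENV_VAR} is not set; export it before starting the app")
    return key
```

Calling a check like this once during startup surfaces a missing key immediately, rather than on the first request that needs it.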

Checkpointer note: The evaluator graph is compiled with InMemorySaver so ag_ui_langgraph's endpoint can call graph.aget_state(config) per request -- without it the AG-UI handler errors with "No checkpointer set" before any tool runs. The sample ships InMemorySaver for zero-setup local development; production deployments should swap to a durable saver (Postgres / SQLite / Redis / LangGraph Platform). See the sample's README for the full architecture walkthrough.

See the CopilotKit README for the full list.

Claude Code Skills (6 skills)

Located in samples/claude-code/. Slash commands that bring LayerLens workflows directly into the Claude Code CLI -- manage traces, judges, evaluations, optimizations, benchmarks, and investigations without leaving your terminal.

See the Claude Code Skills README for the full list.

Sample Data

Located in samples/data/. Pre-built trace files, test datasets, and 16 industry-specific evaluation datasets so you can run every sample without generating your own data first.

See the Sample Data README for contents.

Full Sample Reference

For the complete table of every sample with descriptions, see the samples README.