Aegis sits on top of your AI pipeline and returns structured control decisions that stabilize behavior at runtime without replacing your model, agent, or retrieval system.
Modern AI systems often fail in subtle but costly ways:
- inconsistent outputs across similar inputs
- unstable multi-step reasoning
- retrieval drift in RAG systems
- fragile workflow and agent execution
Aegis addresses these problems with runtime control, not retraining, fine-tuning, or model swapping.
Aegis is a control layer, not an execution layer.
```python
from aegis import AegisClient

client = AegisClient(api_key="YOUR_API_KEY")
result = client.auto().llm(...)
```

Aegis will:
- detect instability signals
- select minimal corrective actions
- return runtime controls and observability data
Aegis does not execute the downstream LLM call for you. It is not a model, not a full execution engine, and not a replacement for LangChain, LangGraph, or your tool stack.
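Because Aegis only returns control decisions, your code is responsible for folding them into the call it makes. The sketch below is illustrative only: the action names (`cap_retries`, `reduce_temperature`) and their fields are assumptions, not the documented Aegis action contract — inspect `result.actions` in your own runs for the real shapes.

```python
# Hypothetical sketch: applying Aegis-style control decisions to a call
# you are about to make yourself. Action names and fields are assumed.

def apply_controls(call_params: dict, actions: list[dict]) -> dict:
    """Fold control actions into the parameters of your own LLM call."""
    params = dict(call_params)
    for action in actions:
        if action["type"] == "cap_retries":
            # Never raise the retry budget, only lower it.
            params["max_retries"] = min(params.get("max_retries", 3), action["limit"])
        elif action["type"] == "reduce_temperature":
            # Clamp temperature down toward the requested ceiling.
            params["temperature"] = min(params.get("temperature", 1.0), action["ceiling"])
    return params

# Example: controls that might follow an "inconsistent_outputs" symptom.
actions = [
    {"type": "cap_retries", "limit": 1},
    {"type": "reduce_temperature", "ceiling": 0.3},
]
controlled = apply_controls({"model": "my-model", "temperature": 0.9}, actions)
print(controlled)  # {'model': 'my-model', 'temperature': 0.3, 'max_retries': 1}
```

You would then pass `controlled` to your model client of choice; Aegis never makes that call for you.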
```bash
pip install scelabs-aegis
```

Aegis Shell lets you run an existing AI pipeline through Aegis without editing pipeline code.
```bash
aegis attach --cmd "python run_agent.py"
```

Core message:
- attach Aegis to your pipeline and see what it would control before you integrate
- no code changes required
- no SDK wiring required for proof-of-value
Aegis Shell simulation reports show:
- observed behavior
- controls Aegis would have issued
- projected impact
- recommended SDK integration points
Example demo command:
```bash
aegis attach --cmd "python examples/shell_attach/retry_loop_pipeline.py" --no-live --report .aegis/reports/retry-demo.md
```

Example attach report (current renderer style):
```
[Aegis] Pipeline Simulation Report
[Aegis] Observed:
[Aegis] - Runtime: 2 sec
[Aegis] - Exit status: 0
[Aegis] - Repeated retry patterns: 3
[Aegis] - Scope observed: 1 files
[Aegis] - Validation failures observed: 1
[Aegis] Aegis would have:
[Aegis] - Stop retries
[Aegis] - Limit changes to 2-3 files
[Aegis] - Validate changes before next step
[Aegis] Projected impact:
[Aegis] - Estimated AI iterations avoided: 4-8
[Aegis] - Retry loops prevented: 1
[Aegis] - Scope reduced from 1 files to 1
[Aegis] Recommended SDK integration points:
[Aegis] - Before retry loop: call client.auto().step(...) before another retry.
[Aegis] - Before agent step continuation: call client.auto().agent(...) or client.auto().step(...).
```
Attach options:
```bash
aegis attach --cmd "python run_agent.py"
aegis attach --cmd "python run_agent.py" --simulate
aegis attach --cmd "python run_agent.py" --no-simulate
aegis attach --cmd "python run_agent.py" --log pipeline.log
aegis attach --cmd "python run_agent.py" --json
aegis attach --cmd "python run_agent.py" --report .aegis/report.md
aegis attach --cmd "python run_agent.py" --json --report .aegis/report.json
aegis attach --cmd "python run_agent.py" --no-live
```

Runtime files under .aegis/:
- `control.json`: active machine-readable control state
- `session.jsonl`: event history
- `attach_runs.jsonl`: compact run history for same-command comparison
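The `.jsonl` files hold one JSON object per line, so they are easy to inspect with a few lines of standard-library Python. The event fields in the demo data below are invented for illustration; the actual record shapes written by Aegis Shell may differ.

```python
# Sketch: reading a JSON Lines event history such as .aegis/session.jsonl.
# The record fields below are illustrative assumptions, not the real schema.
import json
from pathlib import Path

def read_jsonl(path: Path) -> list[dict]:
    """Parse one JSON object per non-empty line (the .jsonl convention)."""
    return [json.loads(line) for line in path.read_text().splitlines() if line.strip()]

# Simulate a tiny event history of the kind session.jsonl might contain.
demo = Path("session.jsonl")
demo.write_text('{"event": "retry_detected", "count": 3}\n'
                '{"event": "run_end", "exit": 0}\n')

events = read_jsonl(demo)
print(len(events))          # 2
print(events[0]["event"])   # retry_detected
```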
Aegis Shell:
- observes
- simulates
- reports
Aegis SDK:
- enforces controls in production pipelines
See full guide: ./docs/aegis-shell.md
```bash
curl -X POST https://aegis-backend-production-4b47.up.railway.app/v1/onboard \
  -H "Content-Type: application/json" \
  -d '{"email":"you@example.com"}'
```

This returns:
- `api_key`
- auto scope URLs (including `auto_llm_url`, `auto_rag_url`, `auto_step_url`, `auto_context_url`, and `auto_agent_url` on current backends)
- example usage
```bash
export AEGIS_API_KEY=your_key_here
export AEGIS_BASE_URL=https://aegis-backend-production-4b47.up.railway.app
```

```python
from aegis import AegisClient, AegisConfig

client = AegisClient(
    config=AegisConfig(mode="balanced"),
)

result = client.auto().llm(
    base_prompt="You are a careful assistant.",
    input={"user_query": "Explain recursion simply."},
    symptoms=["inconsistent_outputs"],
    severity="medium",
)

print(result.actions)
print(result.explanation)
print(result.scope_data)
```

Aegis uses a scope-first runtime interface:
```python
client.auto().llm(...)
client.auto().rag(...)
client.auto().step(...)
client.auto().context(...)
client.auto().agent(...)
```

These calls map to first-class public backend routes:
- POST /v1/auto/llm
- POST /v1/auto/rag
- POST /v1/auto/step
- POST /v1/auto/context
- POST /v1/auto/agent
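If you call the routes directly instead of through the SDK, the JSON body can mirror the SDK keyword arguments. The payload below is an assumption based on the SDK examples, not a documented wire contract, and the `Authorization: Bearer` scheme is likewise assumed — verify against the example usage returned by `/v1/onboard`.

```python
# Sketch of building a raw request body for POST /v1/auto/llm.
# The body shape mirrors the SDK kwargs and is an assumption here.
import json

payload = {
    "base_prompt": "You are a careful assistant.",
    "input": {"user_query": "Explain recursion simply."},
    "symptoms": ["inconsistent_outputs"],
    "severity": "medium",
}
body = json.dumps(payload)

# Send with your HTTP client of choice, e.g.
#   POST {AEGIS_BASE_URL}/v1/auto/llm
#   Authorization: Bearer {AEGIS_API_KEY}   (header scheme assumed)
#   Content-Type: application/json
print(body)
```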
Use llm when you need stabilization around a direct model call.
```python
result = client.auto().llm(
    base_prompt="You are a careful assistant.",
    input={"user_query": "Explain recursion simply."},
    symptoms=["inconsistent_outputs"],
    severity="medium",
)
```

Use rag when instability appears in retrieval plus generation.
```python
result = client.auto().rag(
    query="What changed in the policy?",
    retrieved_context=[
        "Policy updated last week.",
        "Refund window reduced to 14 days.",
    ],
    symptoms=["retrieval_drift"],
    severity="medium",
)
```

Aegis no longer treats retrieval as a fixed input.
It controls retrieval behavior at runtime.
The RAG scope now:
- enforces typed evidence coverage (source, test, support)
- applies relevant-file protection (never drops critical context)
- performs selective expansion (not always-on)
- removes noise without losing required files
- uses staged retrieval only when ambiguity or gaps are detected
- applies guided retrieval (intent + plan) only when justified
This is not just ranking or filtering.
Aegis:
- evaluates the retrieved set
- diagnoses issues (missing support, ambiguity, distractors)
- applies minimal corrective actions
- returns a controlled context for downstream use
At runtime:

1. You pass the query plus retrieved context.
2. Aegis evaluates:
   - missing required evidence
   - role imbalance (source/test/support)
   - distractor pressure
   - ambiguity / multi-branch cases
3. It decides whether to:
   - keep as-is
   - prune noise
   - expand retrieval
   - run a staged second pass
   - guide retrieval when needed
4. It enforces relevant-file protection before final selection.

Everything is gated and minimal.
No always-on expansion. No blind pruning.
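The gated flow above can be sketched as a small decision function. This is not the actual Aegis policy — the `role`, `score`, and `protected` fields and the thresholds are illustrative assumptions — but it shows the shape of "gated, minimal" control: expansion and pruning fire only when a diagnosis justifies them.

```python
# Illustrative sketch (not the real Aegis policy) of gated retrieval control.

def control_retrieval(chunks: list[dict], required_roles: set[str]) -> str:
    """Pick a minimal corrective action for a retrieved set.
    Each chunk carries an assumed 'role' (source/test/support), a
    relevance 'score' in [0, 1], and an optional 'protected' flag."""
    roles = {c["role"] for c in chunks}
    missing = required_roles - roles
    noisy = [c for c in chunks if c["score"] < 0.2 and not c.get("protected")]
    if missing:
        return "expand"   # required evidence absent: staged second pass
    if noisy:
        return "prune"    # distractors present, no gaps: remove noise only
    return "keep"         # no diagnosis, no intervention

chunks = [
    {"role": "source", "score": 0.9},
    {"role": "support", "score": 0.1},
]
print(control_retrieval(chunks, {"source", "test"}))  # expand (no 'test' evidence)
```

Note that protected chunks never count as noise, mirroring the relevant-file protection described above.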
Yes: Aegis works with agent systems. It sits above your agent framework and stabilizes retrieval behavior.
It can:
- prevent agents from drifting due to poor context
- enforce evidence requirements before execution
- reduce retries and replans
- stabilize multi-step retrieval chains
Aegis does not replace your agents — it makes them more reliable.
Use step when you need stabilization for a workflow or agent step.
```python
result = client.auto().step(
    step_name="coordinator",
    step_input={"task": "resolve ticket"},
    symptoms=["unstable_workflow"],
    severity="medium",
)
```

Use context to control information state before the next model or workflow action.
```python
result = client.auto().context(
    objective="Prepare the next response context.",
    messages=[
        {"role": "user", "content": "Summarize blockers from this thread."},
        {"role": "assistant", "content": "Draft summary goes here."},
    ],
    tool_results=[
        {"tool": "ticket_lookup", "ok": True, "data": {"id": "T-42", "status": "open"}},
    ],
    constraints=["keep it concise", "cite ticket IDs"],
    severity="medium",
)
```

context can clean and prioritize messages and tool results so your downstream call receives better state.
Use agent to control multi-step workflow loops on top of your existing AI pipeline.
```python
result = client.auto().agent(
    goal="Resolve the support ticket safely.",
    steps=[
        {"name": "triage", "input": {"ticket_id": "T-42"}},
        {"name": "propose_resolution", "input": {"channel": "email"}},
    ],
    max_steps=4,
    severity="medium",
)
```

agent can control multi-step execution, tool-result integration, carry-forward context, and stop/retry/escalation decisions.
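The kind of stop/retry control the agent scope describes can be pictured as a bounded step loop. This loop is an illustration of the concept, not the Aegis engine: the `should_stop` predicate stands in for whatever control decision Aegis would return.

```python
# Illustrative loop (not the Aegis engine) showing max_steps bounding and
# stop-decision control over multi-step execution.

def run_steps(steps, max_steps, should_stop):
    """Execute steps in order, halting on the step budget or a stop signal."""
    executed = []
    for i, step in enumerate(steps):
        if i >= max_steps or should_stop(step):
            break
        executed.append(step["name"])
    return executed

steps = [{"name": "triage"}, {"name": "propose_resolution"}, {"name": "escalate"}]
print(run_steps(steps, max_steps=2, should_stop=lambda s: s["name"] == "escalate"))
# ['triage', 'propose_resolution']
```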
Every call returns an AegisResult.
```python
result = client.auto().llm(...)
```

- `actions`: interventions Aegis selected
- `trace`: structured control trace
- `metrics`: runtime signals
- `used_fallback`: whether fallback behavior was used
- `explanation`: concise rationale
- `scope`: llm, rag, step, context, or agent
- `scope_data`: scope-specific runtime data
RAG responses now include richer runtime signals:
Inside scope_data:

- `public_rag_runtime`: high-level runtime info
- `retrieval_intent`: if guided retrieval was used
- `retrieval_plan`: structured retrieval guidance (when triggered)
- `initial_retrieved_chunks`: stage 1 candidates
- `stage2_retrieved_chunks`: staged retrieval results (if used)
- `before_after_metrics`: context quality changes
Inside trace:

- `decision.policy_path` includes:
  - expansion score / threshold
  - staged retrieval activation
  - intent / plan activation
- `changes` includes:
  - protected chunk IDs
  - relevant-file protection indicators
These are optional but useful for debugging pipeline behavior.
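A debugging helper can filter the trace down to just the policy-path signals. The entry shape below (`kind`, `decision` keys) is an assumption for illustration; inspect `result.trace` in your own runs for the real structure.

```python
# Debugging sketch: extracting policy-path signals from a trace.
# The trace entry shape here is assumed, not the documented schema.

def policy_decisions(trace: list[dict]) -> list[str]:
    """Return only the decisions recorded on the policy path."""
    return [e["decision"] for e in trace if e.get("kind") == "policy_path"]

trace = [
    {"kind": "policy_path", "decision": "staged_retrieval_on"},
    {"kind": "change", "protected_ids": ["chunk-7"]},
]
print(policy_decisions(trace))  # ['staged_retrieval_on']
```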
```python
result = client.auto().rag(
    query="Why is retry failing?",
    retrieved_context=raw_context,
    symptoms=["retrieval_drift"],
    severity="medium",
)

controlled_context = result.scope_data.get("retrieved_context")
trace = result.trace

print(controlled_context)
print(result.actions)
print(trace)
```

You apply the returned controlled context in your downstream system.
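Applying the controlled context downstream typically means folding it into whatever prompt your system builds. The template below is your own code, not anything Aegis-specific; the chunk strings are invented sample data.

```python
# Sketch of consuming the controlled context in a downstream prompt.
# The template and sample chunks are illustrative, not Aegis-defined.

def build_prompt(query: str, controlled_context: list[str]) -> str:
    """Join controlled chunks into an evidence-grounded prompt."""
    evidence = "\n".join(f"- {chunk}" for chunk in controlled_context)
    return f"Answer using only this evidence:\n{evidence}\n\nQuestion: {query}"

prompt = build_prompt(
    "Why is retry failing?",
    ["Retry budget is 3.", "Backoff disabled."],
)
print(prompt)
```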
```json
{
  "actions": [...],
  "trace": [...],
  "scope": "rag",
  "scope_data": {
    "retrieved_context": [...],
    "public_rag_runtime": {...},
    "before_after_metrics": {...}
  }
}
```

```python
print(result.summary())
print(result.debug_summary())
print(result.to_dict())
```

summary() is for human-readable inspection.
to_dict() is for raw structured output.
debug_summary() is compact.
Useful fields:
```python
print(result.actions)
print(result.explanation)
print(result.trace)
print(result.scope_data)
```

```python
from aegis import AegisConfig

config = AegisConfig(
    mode="balanced",
    max_interventions=3,
    allow_retries=True,
    allow_retrieval_expansion=True,
    allow_context_reduction=True,
    allow_prompt_shaping=True,
    fallback="baseline",
    explain=False,
    emit_trace=False,
    policy=None,
    timeout_ms=30000,
)
```

For scope calls, severity should be one of: low, medium, high.
Symptoms behavior:
- `llm`, `rag`, and `step` require explicit `symptoms` and `severity`
- `context` and `agent` provide safe defaults for `symptoms` and `severity` when omitted
Example:
```python
result = client.auto().llm(
    base_prompt="You are a careful assistant.",
    symptoms=["inconsistent_outputs"],
    severity="medium",
)
```

- runtime control over training
- minimal intervention
- observable behavior through trace and actions
- model-agnostic integration
Docs in /docs explain:
- architecture
- scopes
- request shapes
- result behavior
- integration guidance
- migration and usage patterns
- Stable SDK surface
- Active scopes: llm, rag, step, context, agent
- Public backend routes aligned to the scope-first contract
MIT