Feature/real agents by sahana-sreeram · Pull Request #5 · taugroup/ThinkTank

sahana-sreeram · 2026-06-27T17:43:38Z

No description provided.

Replaces the round-based scientific meeting loop with a LangGraph policy workflow (Policy Director -> stakeholder research -> synthesis -> recommendation -> red-team -> revise -> forecast) exposed via a single run_policy_analysis(request) -> PolicyRunResult entry point.

Foundation + mocked end-to-end skeleton so four workstreams can develop in parallel against frozen Pydantic contracts and example fixtures:

- models.py: 18 frozen policy schemas (legacy meeting models retained)
- graph.py/orchestrator.py: LangGraph state graph + sequential fallback, bounded red-team revision loop
- context_builder.py: compact BriefingPacket (no transcript-to-every-agent)
- agents/: policy_director, stakeholder_research, implementation, red_team (mock mode; real Ollama path stubbed)
- retrieval.py: retrieve_policy_evidence seam (mock, dedup/rank/filter)
- storage.py (SQLite), source_scoring.py, logger model events
- app.py: general policy Streamlit UI + execute_policy_analysis adapter
- skills/, data/, examples/, evals/ (3 cases), tests/ (22 passing)

General-domain (not transportation-specific): forecasting is domain-gated via a forecasters/ registry, giving numeric scenarios when a deterministic domain module matches (transportation today) and otherwise a qualitative directional outlook with no fabricated numbers. No LLM arithmetic in forecasts.

Local-first by default (MOCK_MODE needs no model); optional frontier fallback off by default. Preserves upstream MIT license + TAU Group attribution.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
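The domain-gated forecasting described in this commit could look roughly like the sketch below. It is an illustration only, not the code in forecasters/: the names Forecast, register, forecast, and the input keys baseline_riders, low_growth, and high_growth are all hypothetical. The idea shown is that numbers come only from a registered deterministic function, and an unregistered domain gets a qualitative outlook with no figures.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass(frozen=True)
class Forecast:
    kind: str  # "numeric" or "qualitative"
    summary: str
    scenarios: Optional[Dict[str, float]] = None  # set only for numeric forecasts


# Hypothetical registry: domain name -> deterministic forecaster function.
Forecaster = Callable[[Dict[str, float]], Dict[str, float]]
_REGISTRY: Dict[str, Forecaster] = {}


def register(domain: str) -> Callable[[Forecaster], Forecaster]:
    """Decorator that adds a deterministic forecaster for one domain."""
    def decorator(fn: Forecaster) -> Forecaster:
        _REGISTRY[domain] = fn
        return fn
    return decorator


@register("transportation")
def transportation_forecast(inputs: Dict[str, float]) -> Dict[str, float]:
    # Plain arithmetic on caller-supplied inputs; no model call is involved.
    base = inputs["baseline_riders"]
    return {
        "low": base * (1 + inputs["low_growth"]),
        "high": base * (1 + inputs["high_growth"]),
    }


def forecast(domain: str, inputs: Dict[str, float]) -> Forecast:
    """Return numeric scenarios if a module matches, else a qualitative outlook."""
    fn = _REGISTRY.get(domain)
    if fn is None:
        # No deterministic module: report direction only, never invented numbers.
        return Forecast(
            kind="qualitative",
            summary=f"No deterministic module for '{domain}'; directional outlook only.",
        )
    return Forecast(
        kind="numeric",
        summary=f"Deterministic scenarios for '{domain}'.",
        scenarios=fn(inputs),
    )
```

Under these assumptions, adding a second domain means registering one more function; the caller of forecast() does not change.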

Real intelligence behind per-component flags (mock stays default + auto-fallback, so the app always runs), plus the new agent roster with orchestrator-decided skills.

Agent framework (Person A):

- config.py: per-component MOCK_DIRECTOR/RESEARCH/ANALYSIS flags.
- agent_builder.py: run_structured() shared wrapper with lazy Ollama, JSON-mode, Pydantic validation, retry, optional frontier fallback, ModelEvent logging; never raises (returns None -> callers fall back to mock). Injectable for tests.

Roster (Director + 3 workers + Red-Team):

- skills_registry.py: reads skills/*/SKILL.md into a catalog.
- Policy Director now assigns each task an agent_type AND a skill set chosen from the registry (skills are orchestrator-decided, not hardcoded).
- New Research agent (objective cited evidence) runs as its own graph phase.
- Stakeholder agent loads the task's assigned skills.
- Data Analyst (canonical name for the analysis/recommendation role).
- Red-Team kept for the revision loop.

Orchestration:

- graph/orchestrator: added the research phase (plan -> research -> stakeholder -> synthesize -> recommend -> red_team -> forecast); fallback executor updated.
- models.py (additive): AgentType, PolicyTask.agent_type + skills, ResearchBrief, PolicyRunResult.research_briefs.

Product:

- forecasters/housing.py: second deterministic domain (registry not transport-only).
- utils.export_policy_brief() (pure python-docx, offline); removed import-time pypandoc network download.
- app.py: research section, per-stakeholder assigned-skills display, run history, DOCX brief download.

Verified here (no Ollama/agno): 33 tests pass; evals 100% in mock and real-flags-fallback modes. Defaults stay mock; flip POLICY_MOCK_*=0 with Ollama to go live.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
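The "never raises, returns None, callers fall back to mock" contract described for run_structured() can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in agent_builder.py: run_structured_sketch, TaskPlan, parse_task_plan, mock_plan, and plan_task are hypothetical names, the model call is an injected callable rather than Ollama, and a hand-written dataclass check stands in for Pydantic validation so the example stays dependency-free.

```python
import json
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class TaskPlan:
    title: str
    agent_type: str


def parse_task_plan(raw: str) -> TaskPlan:
    """Parse and validate model output (stand-in for Pydantic validation)."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    title, agent_type = data["title"], data["agent_type"]  # KeyError if missing
    if not isinstance(title, str) or not isinstance(agent_type, str):
        raise ValueError("fields must be strings")
    return TaskPlan(title=title, agent_type=agent_type)


def run_structured_sketch(
    call_model: Callable[[str], str],
    prompt: str,
    max_attempts: int = 2,
) -> Optional[TaskPlan]:
    """Try the model up to max_attempts times; return None instead of raising."""
    for _ in range(max_attempts):
        try:
            return parse_task_plan(call_model(prompt))
        except Exception:
            # Deliberately broad: connection errors, bad JSON, and failed
            # validation are all treated as "retry, then give up quietly".
            continue
    return None


def mock_plan() -> TaskPlan:
    """Deterministic placeholder used when the real path yields nothing."""
    return TaskPlan(title="mock task", agent_type="research")


def plan_task(call_model: Callable[[str], str], prompt: str) -> TaskPlan:
    """Caller-side fallback: use the real result if present, else the mock."""
    result = run_structured_sketch(call_model, prompt)
    return result if result is not None else mock_plan()
```

Passing the model call in as an argument is what makes a wrapper like this injectable for tests: a test can supply a function that fails on the first call and succeeds on the second, with no model running.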

sahana-sreeram and others added 2 commits June 27, 2026 12:43

sahana-sreeram wants to merge 2 commits into taugroup:main from sahana-sreeram:feature/real-agents
