Feature/real agents #5
Open
sahana-sreeram wants to merge 2 commits into
Replaces the round-based scientific meeting loop with a LangGraph policy workflow (Policy Director -> stakeholder research -> synthesis -> recommendation -> red-team -> revise -> forecast) exposed via a single run_policy_analysis(request) -> PolicyRunResult entry point.

Foundation + mocked end-to-end skeleton so four workstreams can develop in parallel against frozen Pydantic contracts and example fixtures:

- models.py: 18 frozen policy schemas (legacy meeting models retained)
- graph.py/orchestrator.py: LangGraph state graph + sequential fallback, bounded red-team revision loop
- context_builder.py: compact BriefingPacket (no transcript-to-every-agent)
- agents/: policy_director, stakeholder_research, implementation, red_team (mock mode; real Ollama path stubbed)
- retrieval.py: retrieve_policy_evidence seam (mock, dedup/rank/filter)
- storage.py (SQLite), source_scoring.py, logger model events
- app.py: general policy Streamlit UI + execute_policy_analysis adapter
- skills/, data/, examples/, evals/ (3 cases), tests/ (22 passing)

General-domain (not transportation-specific): forecasting is domain-gated via a forecasters/ registry, with numeric scenarios when a deterministic domain module matches (transportation today) and otherwise a qualitative directional outlook with no fabricated numbers. No LLM arithmetic in forecasts.

Local-first by default (MOCK_MODE needs no model); optional frontier fallback off by default. Preserves upstream MIT license + TAU Group attribution.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
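The domain-gated forecasting described in this commit could look roughly like the sketch below. It is an illustration under stated assumptions, not the code in forecasters/: the names `Forecast`, `FORECASTERS`, `register`, and `forecast`, and the input keys, are all hypothetical. The point it shows is that numbers only come from a registered deterministic function, and an unmatched domain yields a qualitative outlook with no numeric fields.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass(frozen=True)
class Forecast:
    kind: str                                   # "numeric" or "qualitative"
    summary: str
    scenarios: Optional[Dict[str, float]] = None


# Hypothetical registry: domain name -> deterministic forecaster function.
FORECASTERS: Dict[str, Callable[[dict], Dict[str, float]]] = {}


def register(domain: str):
    """Decorator that adds a deterministic forecaster for one domain."""
    def decorator(fn):
        FORECASTERS[domain] = fn
        return fn
    return decorator


@register("transportation")
def transportation_forecast(inputs: dict) -> Dict[str, float]:
    # Plain arithmetic in code; no model output is used for these numbers.
    # The input keys are assumptions made for this sketch.
    base = inputs["baseline_riders"]
    return {
        "low": base * (1 + inputs["low_growth"]),
        "high": base * (1 + inputs["high_growth"]),
    }


def forecast(domain: str, inputs: dict, direction: str = "uncertain") -> Forecast:
    """Return numeric scenarios only when a deterministic module matches."""
    fn = FORECASTERS.get(domain)
    if fn is None:
        # No matching module: directional outlook only, no numbers at all.
        return Forecast(kind="qualitative",
                        summary=f"Directional outlook: {direction}")
    return Forecast(kind="numeric",
                    summary=f"Deterministic scenarios for {domain}",
                    scenarios=fn(inputs))
```

Adding a second domain under this design is one more `@register(...)` function; the gating logic in `forecast` does not change.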
Real intelligence behind per-component flags (mock stays default + auto-fallback, so the app always runs), plus the new agent roster with orchestrator-decided skills.

Agent framework (Person A):

- config.py: per-component MOCK_DIRECTOR/RESEARCH/ANALYSIS flags.
- agent_builder.py: run_structured() shared wrapper: lazy Ollama, JSON-mode, Pydantic validation, retry, optional frontier fallback, ModelEvent logging; never raises (returns None -> callers fall back to mock). Injectable for tests.

Roster (Director + 3 workers + Red-Team):

- skills_registry.py: reads skills/*/SKILL.md into a catalog.
- Policy Director now assigns each task an agent_type AND a skill set chosen from the registry (skills are orchestrator-decided, not hardcoded).
- New Research agent (objective cited evidence) runs as its own graph phase.
- Stakeholder agent loads the task's assigned skills.
- Data Analyst (canonical name for the analysis/recommendation role).
- Red-Team kept for the revision loop.

Orchestration:

- graph/orchestrator: added the research phase (plan -> research -> stakeholder -> synthesize -> recommend -> red_team -> forecast); fallback executor updated.
- models.py (additive): AgentType, PolicyTask.agent_type + skills, ResearchBrief, PolicyRunResult.research_briefs.

Product:

- forecasters/housing.py: second deterministic domain (registry not transport-only).
- utils.export_policy_brief() (pure python-docx, offline); removed import-time pypandoc network download.
- app.py: research section, per-stakeholder assigned-skills display, run history, DOCX brief download.

Verified here (no Ollama/agno): 33 tests pass; evals 100% in mock and real-flags-fallback modes. Defaults stay mock; flip POLICY_MOCK_*=0 with Ollama to go live.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
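The "never raises, returns None, caller falls back to mock" contract described for run_structured() can be sketched as follows. This is a stdlib-only illustration, not the wrapper in agent_builder.py: the real one is described as using Ollama and Pydantic, whereas here the model call and the validator are injected callables, and the parameter names `call_model`, `validate`, and `retries` as well as the helpers `validate_plan` and `plan_with_fallback` are assumptions made for the example.

```python
import json
from typing import Any, Callable, Optional


def run_structured(
    prompt: str,
    call_model: Callable[[str], str],      # injected; stands in for the Ollama call
    validate: Callable[[Any], Any],        # injected; stands in for Pydantic validation
    retries: int = 2,
) -> Optional[Any]:
    """Try to get validated structured output; return None instead of raising."""
    for _ in range(retries + 1):
        try:
            raw = call_model(prompt)
            return validate(json.loads(raw))
        except Exception:
            # Deliberately broad: transport errors, bad JSON, and schema
            # failures are all treated as "retry, then give up quietly".
            continue
    return None


def validate_plan(data: Any) -> dict:
    """Hypothetical schema check: a dict with a list under 'tasks'."""
    if not isinstance(data, dict) or not isinstance(data.get("tasks"), list):
        raise ValueError("plan must be an object with a 'tasks' list")
    return data


def plan_with_fallback(prompt: str,
                       call_model: Callable[[str], str],
                       mock_plan: dict) -> dict:
    """Caller-side fallback: use the mock plan when the wrapper returns None."""
    result = run_structured(prompt, call_model, validate_plan)
    return result if result is not None else mock_plan
```

Because the model call is injected, a test can pass a function that fails or returns malformed JSON and check that the mock plan comes back, without a running model.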
No description provided.