NetLogo · JNK234 · Feb 27, 2026
diff --git a/demos/crisis-triage/README.md b/demos/crisis-triage/README.md
@@ -0,0 +1,100 @@
+# Demo 2: Crisis Triage with Ambiguous Incidents
+
+A municipal emergency operations center in which LLM-powered dispatchers assess ambiguous crisis reports. The demo shows that keyword matching fails when incidents are deliberately misleading, whereas an LLM that reads the full impact description can succeed.
+
+Target runtime: NetLogo 7.0.3 (`.nlogox` model format).
+
+## The Story
+
+Three dispatchers — Veteran, Rookie, and Analyst — receive a stream of crisis incidents. Each must assess severity and route to the right response tier. The incident bank includes **misleading cases** where surface keywords don't match reality:
+
+- "Toxic chemical spill at school" → actually spilled vinegar (LOW severity)
+- "Minor water leak in basement" → threatening a neonatal ICU (CRITICAL severity)
+- "Dog loose on highway" → causing a multi-vehicle pileup (HIGH severity)
+
+A naive keyword heuristic over-triggers on "toxic", "fire", "collapse" and fails on these cases. The LLM reads the full impact description and can assess correctly.
+
+## Quick Start
+
+1. Edit `config.txt` with your provider credentials (default: local Ollama).
+2. Open `crisis-triage.nlogox` in NetLogo 7.0.3.
+3. Click **setup** → dispatchers appear with persona labels, responders by tier.
+4. Click **go** → incidents spawn, flow through the pipeline, monitors update.
+5. Watch the output log for `[TRIAGE]`, `[ROUTE]`, and `[REFLECT]` messages.
+
+## How to Use
+
+### Controls
+
+| Control | Type | Purpose |
+|---------|------|---------|
+| `use-llm?` | Switch | Toggle between LLM dispatchers and naive heuristic |
+| `memory-mode` | Chooser | persistent / per-episode / none |
+| `reflection-interval` | Slider | Ticks between dispatcher self-reflection (0 = off) |
+| `incident-rate` | Slider | Probability (%) of new incident per tick |
+| `episode-length` | Slider | Ticks per episode (0 = no episodes) |
+| `add incident` | Button | Manually inject a random incident |
+| `force reflect` | Button | Trigger immediate reflection for all dispatchers |
+
+### What to Observe
+
+- **Misleading%** — The key metric. Accuracy on misleading incidents where keywords don't match reality.
+- **Triage Acc%** / **Route Acc%** — Overall accuracy vs ground truth.
+- **Accuracy Over Time** plot — Watch how accuracy evolves, especially with memory.
+- **Per-persona differences** — Veteran, Rookie, and Analyst may perform differently.
+- **Reflection log** — Dispatchers reason about their own performance.
+
+## The A/B Experiment
+
+1. Run with `use-llm?` ON for 50+ ticks. Note the Misleading% metric.
+2. Click **setup** again, toggle `use-llm?` OFF, and run for 50+ ticks.
+3. Compare:
+   - **Heuristic**: ~30% on misleading cases (keywords mislead it).
+   - **LLM**: expected 70% or higher on misleading cases (reads the actual impact description).
+4. Compare memory modes: Run with "persistent" vs "none" over multiple episodes.
+
+## LLM Primitives Exercised (8)
+
+| Primitive | Where | Paper Concept |
+|-----------|-------|---------------|
+| `llm:load-config` | `setup-llm` | Config management |
+| `llm:set-history` | `setup-dispatchers` — persona injection | Personalization (Ch.2) |
+| `llm:chat-with-template` | `triage-my-incidents` — severity assessment | Environment/Interface (Ch.1) |
+| `llm:choose` | `route-my-incidents` — bounded tier selection | Bounded Rationality |
+| `llm:history` | `dispatcher-reflect` — check history length | Memory (Ch.3) |
+| `llm:chat` | `dispatcher-reflect` — freeform reflection | Reflection (Ch.3) |
+| `llm:clear-history` | `handle-episode-boundary` — configurable reset | Memory ablation |
+| `llm:active` | Monitor widget — show provider/model | Provider awareness |
+
+## Design Rationale
+
+**Why dispatchers, not responders, use the LLM**: Triage and routing are judgment calls where reading context matters. The responders' case processing is mechanical and doesn't benefit from language understanding.
+
+**Why no thinking/reasoning models**: With 3 dispatchers making 2+ LLM calls per tick, thinking models would add minutes of latency per tick. The triage task is classification, not multi-step reasoning. Standard `llm:chat-with-template` and `llm:choose` are the right tools.
+
+**Why `llm:choose` for routing**: Guarantees the output is one of the valid tier names, avoiding parsing failures from freeform text.
+
+**Why misleading incidents**: They make the LLM genuinely necessary. Without them, keyword matching achieves similar accuracy and the LLM adds cost without value.
+
+## Paper Connection
+
+This demo implements concepts from the LLM-ABM survey by Gao et al. (arXiv:2312.11970):
+
+- **Personalization** (Ch.2): Dispatcher personas via `llm:set-history` produce different decisions from the same model.
+- **Bounded Rationality**: `llm:choose` constrains decisions to valid options.
+- **Memory** (Ch.3): Configurable memory modes show how history retention affects performance.
+- **Reflection** (Ch.3): Dispatchers reason about their own accuracy and identify patterns.
+- **Environment/Interface** (Ch.1): Templates structure how agents perceive incidents.
+
+## Files
+
+| File | Purpose |
+|------|---------|
+| `crisis-triage.nlogox` | NetLogo 7 simulation model |
+| `triage-template.yaml` | Severity assessment prompt with anti-keyword-bias guidance |
+| `dispatcher-template.yaml` | Documentation stub (routing uses `llm:choose`) |
+| `config.txt` | LLM provider configuration |
+
+## Provider Configuration
+
+Default is local Ollama (no API key needed). See commented examples in `config.txt` for OpenAI, Claude, and Gemini. Never commit real API keys.
diff --git a/demos/crisis-triage/config.txt b/demos/crisis-triage/config.txt
@@ -0,0 +1,34 @@
+# Crisis Triage Demo LLM configuration
+# This file is loaded by crisis-triage.nlogox via llm:load-config
+
+# Recommended local/default option (no cloud key required)
+provider=ollama
+model=qwen2.5:7b
+base_url=http://localhost:11434
+
+# Runtime behavior
+temperature=0.2
+max_tokens=200
+timeout_seconds=45
+
+# Optional cloud provider examples (commented out)
+# provider=openai
+# api_key=YOUR_OPENAI_API_KEY_HERE
+# model=gpt-4o-mini
+# temperature=0.2
+# max_tokens=200
+# timeout_seconds=45
+
+# provider=claude
+# api_key=YOUR_ANTHROPIC_API_KEY_HERE
+# model=claude-3-5-haiku-latest
+# temperature=0.2
+# max_tokens=200
+# timeout_seconds=45
+
+# provider=gemini
+# api_key=YOUR_GEMINI_API_KEY_HERE
+# model=gemini-2.0-flash
+# temperature=0.2
+# max_tokens=200
+# timeout_seconds=45