Skip to content
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
83 changes: 83 additions & 0 deletions docs/DISCOVERIES.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,89 @@

This file documents non-obvious problems, solutions, and patterns discovered during amplihack development. Review and update this regularly, removing outdated entries or those replaced by better practices, code, or tools. Update entries where best practices have evolved.

## Sync Method Called with asyncio.run() Passes Resolved Value, Not Coroutine (2026-04-25)

### Issue

`agent_subprocess.py` wrapped `agent.answer_question(...)` with `asyncio.run()`.
Every L1–L12 level in the progressive test suite failed with:

```
ValueError: a coroutine was expected, got ('', ReasoningTrace(...))
```

### Root Cause

`AnswerSynthesizerMixin.answer_question` is a **sync** method that internally
calls `_run_async_or_return` and returns the already-resolved value. Passing
the resolved tuple to `asyncio.run()` triggers `ValueError` because
`asyncio.run()` expects a coroutine object, not a plain Python value.

### Solution

**PR #4471**: Call `agent.answer_question(...)` directly. `learn_from_content`
on the same line is genuinely async and remains wrapped with `asyncio.run()`.

### Key Learnings

1. **Verify async status before wrapping**: Check `async def` in the method
signature — not just the module name — before using `asyncio.run()`
2. **`_run_async_or_return` helpers return synchronously**: They are sync
wrappers around async dispatch; they do not return coroutines
3. **Error message is misleading**: The bench logs reported a memory warning,
not the actual `ValueError`, making diagnosis harder

### Prevention

- Read the method signature (`async def` vs `def`) before adding `asyncio.run()`
- Search for sibling calls in the same file to verify the async/sync pattern
used for other methods on the same object

---

## Shadowed Async Error in Progressive Eval Suite Exposed by Upstream Fix (2026-04-25)

### Issue

After PR #4471 fixed the `asyncio.run()` misuse on `answer_question`, a new
failure appeared across all L1–L12 levels in the gym benchmark:

```
RuntimeWarning: coroutine 'grade_metacognition' was never awaited
✗ L1 failed: 'coroutine' object has no attribute 'effort_calibration'
```

### Root Cause

`grade_metacognition` (`metacognition_grader.py:280`) is `async def`, but
`progressive_test_suite.run_single_level` is a **sync** method and called it
without `asyncio.run()`. The coroutine was never awaited; accessing
`.effort_calibration` on the unawaited coroutine object raised `AttributeError`.
The bug was invisible before #4471 because the upstream failure masked it.

### Solution

**PR #4472**: Wrap the call with `asyncio.run(grade_metacognition(...))` and
add `import asyncio` to `progressive_test_suite.py`.

### Key Learnings

1. **Upstream failures can mask downstream async bugs**: Fix root-cause errors
first; expect secondary async/sync mismatches to surface after
2. **`RuntimeWarning: coroutine was never awaited` pinpoints the site**: The
warning names the exact coroutine function — go there first
3. **`AttributeError` on coroutine object is a secondary symptom**: The real
error is the missing `asyncio.run()` one line earlier

### Prevention

- After fixing an async/sync mismatch, re-run the full test suite to catch
downstream bugs now exposed
- Treat `RuntimeWarning: coroutine … was never awaited` as a hard error, not
a warning — it always indicates a broken async call path

---

## Cleanup Agent Gap: Root Directory Organization (2026-01-12)

### Issue
Expand Down
20 changes: 20 additions & 0 deletions docs/howto/agent-learning-eval-harness.md
Original file line number Diff line number Diff line change
Expand Up @@ -106,6 +106,26 @@ The latest accepted Azure scores and the local reproduction commands are tracked

- [Current validation results](../hive_mind/current-validation-results.md)

## Async/Sync Contract for Eval Methods

When adding calls to eval methods in `agent_subprocess.py` or
`progressive_test_suite.py`, check whether the method is `async def` or `def`
before deciding how to call it:

| Method | Type | How to call from sync context |
|--------|------|-------------------------------|
| `agent.answer_question(...)` | sync (`def`) | Call directly — no `asyncio.run()` |
| `agent.learn_from_content(...)` | async (`async def`) | `asyncio.run(agent.learn_from_content(...))` |
| `grade_metacognition(...)` | async (`async def`) | `asyncio.run(grade_metacognition(...))` |

Wrapping a sync method with `asyncio.run()` raises
`ValueError: a coroutine was expected` because the resolved return value is
not a coroutine. Omitting `asyncio.run()` from an async method leaves the
coroutine unawaited and raises `AttributeError` when you access fields on it.

See [April 2026 eval fixes](../recipes/RECENT_FIXES_APRIL_2026.md) for the
full diagnosis (PRs #4471, #4472).

## Related Docs

- [Distributed Hive Evaluation](../hive_mind/EVAL.md)
Expand Down
80 changes: 80 additions & 0 deletions docs/recipes/RECENT_FIXES_APRIL_2026.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,80 @@
# Recent Fixes — April 2026

This document tracks bug fixes and improvements merged in April 2026, following
the [Diátaxis](https://diataxis.fr/) framework (Explanation sections).

---

## April 25 — Eval Async/Sync Contract Fixes

Two sequential fixes in the `amplihack-agent-eval` progressive test suite
resolved a class of bugs where Python's `asyncio.run()` was misapplied to
sync methods, or omitted from genuinely async ones.

### Don't wrap sync `answer_question` with `asyncio.run()` (PR #4471)

**Problem**: `agent_subprocess.py` called `asyncio.run(agent.answer_question(...))`
inside the `testing_phase` function. Every L1–L12 level failed with:

```
ValueError: a coroutine was expected, got ('', ReasoningTrace(...))
```

The bench logs obscured the root cause by reporting
`Testing phase failed: WARNING: amplihack_memory.graph not available`.

**Root cause**: `AnswerSynthesizerMixin.answer_question`
(`src/amplihack/agents/goal_seeking/answer_synthesizer.py:52`) is a **sync**
method. It calls `_run_async_or_return` internally and returns the resolved
value directly. Passing the resolved tuple `('', ReasoningTrace)` to
`asyncio.run()` raised `ValueError` because `asyncio.run()` expects a coroutine,
not an already-resolved value.

**Fix**: Call `agent.answer_question(...)` directly without `asyncio.run()`.
`learn_from_content` (line 120) is genuinely async and remains wrapped.

**Rule**: Before wrapping a call with `asyncio.run()`, verify via the method
signature or docstring that the method is `async def`. Sync wrappers that
internally dispatch to async code (like `_run_async_or_return`) return
synchronously — wrapping them is incorrect.

---

### Await async `grade_metacognition` in progressive suite (PR #4472)

**Problem**: This bug was hidden by the #4471 error above. Once #4471 was
fixed and `answer_question` began returning the correct trace, the downstream
`grade_metacognition` call surfaced a new failure across all L1–L12 levels:

```
RuntimeWarning: coroutine 'grade_metacognition' was never awaited
✗ L1 failed: 'coroutine' object has no attribute 'effort_calibration'
✗ L2 failed: 'coroutine' object has no attribute 'effort_calibration'
... (repeats L1 through L12)
```

**Root cause**: `grade_metacognition` is **async**
(`metacognition_grader.py:280`) but `progressive_test_suite.run_single_level`
is a **sync** method and called it without `asyncio.run()`. The returned
coroutine was never awaited; the subsequent `.effort_calibration` access
raised `AttributeError`.

**Fix**: Wrap the call with `asyncio.run(grade_metacognition(...))` and add
the `asyncio` import to `progressive_test_suite.py`.

**Rule**: When a sync function must call an async function, use
`asyncio.run()`. When an async function calls another async function, use
`await`. The `run_single_level` function is sync, so `asyncio.run()` is the
correct bridge.

---

### Summary table

| PR | File | Bug | Fix |
|----|------|-----|-----|
| #4471 | `eval/agent_subprocess.py` | `asyncio.run()` on sync `answer_question` | Call directly |
| #4472 | `progressive_test_suite.py` | Missing `asyncio.run()` on async `grade_metacognition` | Add `asyncio.run()` + import |

**Dependency**: #4472 was shadowed by #4471. Fix #4471 first to expose the
downstream grader bug.