fix(search_strategy): accept search_phases / phases / dict-wrapped queries#243

Open
minamo817 wants to merge 1 commit into aiming-lab:main from minamo817:fix/search-plan-schema-variants
@minamo817 minamo817 commented Apr 21, 2026

Summary

_execute_search_strategy only recognizes search_strategies as the top-level key of LLM-generated plans, and only accepts queries as a list of strings. When the LLM produces a semantically equivalent but syntactically different schema (e.g. search_phases: with queries like {q1: "..."}), the parser silently returns an empty queries_list and the stage falls back to _build_default_search_queries(), which slices the topic sentence into stopword-filtered n-grams.

Those n-grams then drive Stage 4's real API search, which returns whichever high-citation papers happen to match those generic tokens, and Stage 5's shortlist is populated with off-topic papers.
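The failure mode can be illustrated with a minimal sketch. `extract_queries_strict` is a hypothetical stand-in written from the description above, not the project's actual parser, and the plan is assumed to be already loaded from YAML into a dict:

```python
from typing import Any, Dict, List

# Hypothetical plan using the search_phases / dict-wrapped variant.
plan: Dict[str, Any] = {
    "search_phases": [
        {
            "phase": 1,
            "queries": [
                {"q1": "first domain query"},
                {"q2": "second domain query"},
            ],
        }
    ]
}


def extract_queries_strict(plan: Dict[str, Any]) -> List[str]:
    """Assumed pre-patch behavior: one top-level key, string queries only."""
    queries_list: List[str] = []
    for strategy in plan.get("search_strategies", []):
        for query in strategy.get("queries", []):
            if isinstance(query, str):
                queries_list.append(query)
    return queries_list


# The plan is well-formed, yet nothing is extracted and no error is raised,
# so the caller cannot tell "no plan" apart from "plan in another schema".
strict_result = extract_queries_strict(plan)
```

Because the result is an empty list rather than an exception, the stage has no signal that a usable plan was discarded.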

Repro

The issue reproduces on any run where the LLM produces a plan with a top-level key other than search_strategies, or with queries wrapped as dicts. After Stage 3 completes:

  • stage-03/search_plan.yaml — a well-formed plan
  • stage-03/queries.json — only topic n-grams: <first N words of topic> plus literal " benchmark" / " survey" / " recent advances"
  • stage-04/candidates.jsonl — top hits match generic tokens rather than the intended domain
  • stage-05/shortlist.jsonl — off-topic papers

Fix

  • Accept search_strategies, search_phases, or phases at the top level.
  • In each strategy's queries list, handle items that are either strings or single-key dicts like {"q1": "query text"}.
  • Also treat a top-level queries: list as a last-resort fallback (same shape handling).

Plans that already use the original schema are unaffected.
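A minimal sketch of the tolerant extraction described above. The function names, the choice to use the first non-empty top-level key, and the whitespace stripping are assumptions for illustration, not the code in this patch:

```python
from typing import Any, Dict, List, Optional

# Top-level keys accepted for the list of strategies, in priority order.
TOP_LEVEL_KEYS = ("search_strategies", "search_phases", "phases")


def normalize_query(item: Any) -> Optional[str]:
    """Return the query text for a string or a single-key dict, else None."""
    if isinstance(item, str):
        return item.strip() or None
    if isinstance(item, dict) and len(item) == 1:
        (value,) = item.values()
        if isinstance(value, str):
            return value.strip() or None
    return None


def collect_queries(raw: Any) -> List[str]:
    """Normalize every item of a queries list, dropping unusable entries."""
    if not isinstance(raw, list):
        return []
    return [q for q in map(normalize_query, raw) if q]


def extract_queries(plan: Dict[str, Any]) -> List[str]:
    """Extract queries from any of the accepted plan shapes."""
    queries: List[str] = []
    for key in TOP_LEVEL_KEYS:
        strategies = plan.get(key)
        if isinstance(strategies, list) and strategies:
            for strategy in strategies:
                if isinstance(strategy, dict):
                    queries.extend(collect_queries(strategy.get("queries")))
            break  # assumption: only the first non-empty key is used
    if not queries:
        # Last-resort fallback: a top-level queries list, same shape handling.
        queries = collect_queries(plan.get("queries"))
    return queries
```

In this sketch an empty return value is the only case where a caller would still fall through to `_build_default_search_queries()`.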

Test plan

  • Existing test suite passes unchanged
  • queries.json reflects the LLM's plan rather than topic n-grams

Notes for reviewers

Separately, _build_default_search_queries() is a weak fallback that produces low-quality queries regardless of domain. That is out of scope here — this patch only addresses why the fallback is being hit when the LLM did produce a good plan. Happy to follow up.

…eries

The `_execute_search_strategy` parser only recognized `search_strategies`
as the top-level key for LLM-generated plans, and only accepted queries
as a list of strings. Real LLM output frequently uses different but
semantically equivalent schemas, e.g.:

    search_phases:
      - phase: 1
        queries:
          - q1: "<first domain query>"
          - q2: "<second domain query>"

When this happens, `plan.get("search_strategies", [])` returns `[]`,
`queries_list` stays empty, and the stage falls back to
`_build_default_search_queries()`, which slices the topic sentence into
stopword-filtered n-grams of the form `<topic-first-6-words>` plus the
literal strings `" benchmark"` / `" survey"` / `" recent advances"`.

These n-grams then drive Stage 4's real API search against OpenAlex /
Semantic Scholar / arXiv, which returns whichever high-citation 2020+
papers happen to match those generic tokens — entirely unrelated to the
actual research topic. Stage 5's shortlist is then populated with
completely off-topic papers.

This patch accepts `search_strategies`, `search_phases`, `phases`, and a
top-level `queries` fallback, and handles query items that are either
strings or single-key dicts like `{"q1": "query text"}`. Behavior for
plans that already use the original schema is unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

minamo817 force-pushed the fix/search-plan-schema-variants branch from 5595f30 to 9f15247 on April 21, 2026 08:41