Skip to content

test: fix CI regressions (Vertex/litellmrouter construction, dedup config, trace recorder leak)#2491

Merged
nicoloboschi merged 4 commits into
mainfrom
fix/ci-broken-test-fixtures
Jul 1, 2026
Merged

test: fix CI regressions (Vertex/litellmrouter construction, dedup config, trace recorder leak)#2491
nicoloboschi merged 4 commits into
mainfrom
fix/ci-broken-test-fixtures

Conversation

@nicoloboschi

@nicoloboschi nicoloboschi commented Jul 1, 2026

Copy link
Copy Markdown
Collaborator

Summary

Fixes CI failures on main. Test-only changes.

1. Vertex AI regression — provider built without the Vertex settings

LLMConfig/LLMProvider were refactored to use provider-specific settings as-passed (not resolved from global config). CI runs test-api/Core LLM/acceptance with HINDSIGHT_API_LLM_PROVIDER=vertexai, and two constructors built the provider without forwarding vertexai_project_id/region/service_account_key, raising HINDSIGHT_API_LLM_VERTEXAI_PROJECT_ID is required:

  • test_fact_extraction_analysis.py::llm_config (ERROR in test-api)
  • test_llm_provider.py::_make_llm (FAILED in LLM acceptance (vertexai))

2. litellmrouter regression — same class

test_llm_provider.py::_make_llm didn't forward litellmrouter_config either → LLM acceptance (litellmrouter) failed with Provider 'litellmrouter' requires a config object.

3. test_consolidation_dedup — 2 merge-path tests

Since #2425 the dedup merge path reads config.text_search_extension (+_native_language); the test configs only set consolidation_dedup_threshold. Added the fields (defaults native/english).

4. test_llm_trace::test_disabled_writes_no_rows — leaked-recorder flake (#2229)

MemoryEngine.__init__ registers its LLM-trace recorder in a process-global registry; only close() removes it. Tests that construct an engine directly (e.g. test_per_operation_llm_config) never close it, leaking an enabled recorder. Locally its writes fail (uninitialized backend) so 0 rows; in CI a leaked recorder with a live backend records a later test's LLM calls into the shared DB, so test_disabled_writes_no_rows sees rows for its bank even though its own recorder is disabled (assert N == 0). Reproduced: after test_per_operation_llm_config the registry holds 8 enabled recorders. Added an autouse fixture that drops any recorder a test leaked (registry goes 8 → 1). _teardown_memory_engine already guarded the fixtures; this guards direct constructions.

Verified

  • test_consolidation_dedup.py → 18 passed.
  • Vertex/litellmrouter: LLMConfig/LLMProvider construct with the params passed, still raise when omitted (repros the CI errors). Prior run confirmed test-api (1/3) and both acceptance jobs went green.
  • Trace: 53 trace + leak-risk tests pass together under -n2; leaked-recorder count 8 → 1.
  • Lint clean.

Out of scope (pre-existing, not this PR)

  • test_fact_extraction_quality judge tests — real-LLM model variability (a different one flakes each run).
  • test_migration_observation_search_vector_backfill / test-*-client-oracle — CI infra (embedded-PG / Oracle DB startup).

…onfig

The dedup merge/update path builds a search_vector UPDATE clause from
config.text_search_extension (+ _native_language) since #2425, but the
_dedup_reconcile_create / _dedup_reconcile_update test configs only set
consolidation_dedup_threshold, so the two merge-path tests raised
AttributeError: 'types.SimpleNamespace' object has no attribute
'text_search_extension' on main.

Add the two fields (production defaults native/english) to those configs.
The clause reuses $1, so the existing positional-arg assertions are unchanged.
@nicoloboschi nicoloboschi changed the title test(consolidation): fix dedup merge-path tests missing text-search config test: fix deterministic test-api failures (dedup config + misplaced real-LLM test) Jul 1, 2026
Regression: LLMConfig was refactored to use vertexai_project_id/region/
service_account_key as-passed (the caller resolves the global-config fallback),
but the llm_config fixture never forwarded them. So with the CI provider set to
vertexai, LLMConfig raised "HINDSIGHT_API_LLM_VERTEXAI_PROJECT_ID is required"
even though the env var was set — the test errored in the test-api job.

Forward the three Vertex settings from config (mirroring MemoryEngine's own
LLMConfig construction). Verified: LLMConfig(provider="vertexai", ...) now
constructs with project_id passed, and still raises when it is omitted.
@nicoloboschi nicoloboschi changed the title test: fix deterministic test-api failures (dedup config + misplaced real-LLM test) test: fix CI regressions (Vertex/litellmrouter provider construction, dedup config, trace isolation) Jul 1, 2026
@nicoloboschi nicoloboschi force-pushed the fix/ci-broken-test-fixtures branch from b29bcfd to 31cc360 Compare July 1, 2026 08:03
_make_llm() built an LLMProvider from the env-selected provider without
forwarding provider-specific settings, so the vertexai and litellmrouter
acceptance-matrix jobs failed at construction ("VERTEXAI_PROJECT_ID is required"
/ "litellmrouter requires a config object"). LLMProvider uses these as-passed
(it does not resolve them from global config), so forward
vertexai_project_id/region/service_account_key and litellmrouter_config.
@nicoloboschi nicoloboschi force-pushed the fix/ci-broken-test-fixtures branch from 31cc360 to 4da6a13 Compare July 1, 2026 08:29
@nicoloboschi nicoloboschi changed the title test: fix CI regressions (Vertex/litellmrouter provider construction, dedup config, trace isolation) test: fix CI regressions (Vertex/litellmrouter provider construction, dedup config) Jul 1, 2026
Root cause of the flaky test_llm_trace::test_disabled_writes_no_rows:
MemoryEngine.__init__ registers its LLM-trace recorder in a process-global
registry, and only close() removes it. Tests that construct an engine directly
(test_per_operation_llm_config, test_llm_reasoning_effort_env, etc.) never
close it, leaking an ENABLED recorder. Locally those recorders' writes fail
(uninitialized backend), but in CI a leaked recorder with a live backend
records a later test's LLM calls into the shared DB — so test_disabled_writes_
no_rows sees rows for its bank even though its own recorder is disabled
(assert N == 0). Reproduced: after test_per_operation_llm_config the registry
holds 8 enabled recorders.

Add an autouse fixture that snapshots the registry and removes anything a test
leaked. Verified: the registry drops from 8 leaked recorders back to 1.
_teardown_memory_engine already guards the fixtures; this guards direct
constructions. 53 trace + leak-risk tests pass together under -n2.
@nicoloboschi nicoloboschi changed the title test: fix CI regressions (Vertex/litellmrouter provider construction, dedup config) test: fix CI regressions (Vertex/litellmrouter construction, dedup config, trace recorder leak) Jul 1, 2026
@nicoloboschi nicoloboschi merged commit 33e9db6 into main Jul 1, 2026
196 of 202 checks passed
@nicoloboschi nicoloboschi deleted the fix/ci-broken-test-fixtures branch July 1, 2026 11:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant