YTDB-646 Index-assisted pre-filtering for `both()` / `bothE()` MATCH patterns by sandrawar · Pull Request #982 · JetBrains/youtrackdb

Sandra Adamiec (sandrawar) · 2026-04-17T11:08:02Z

PR Title:

YTDB-646 Index-assisted pre-filtering for both() / bothE() MATCH patterns

Motivation:

Extend the index intersection pre-filter to support bidirectional traversals (both() and bothE() in MATCH patterns). Currently, these patterns silently degrade to unfiltered iteration because VertexEntityImpl returns an Apache Commons ChainedIterable that does not implement PreFilterableLinkBagIterable.
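The gap described above can be illustrated with a minimal sketch. It is not the PR's code: `PreFilterableSketch`, `BagSketch` and `ChainedSketch` are hypothetical stand-ins for `PreFilterableLinkBagIterable`, a single-direction link bag and `PreFilterableChainedIterable`, whose real signatures are not shown in this thread, and RIDs are modeled as plain `Long` values. The point is only that a chained iterable can honor a pre-filter by forwarding it to each part, whereas a generic chained iterable has no such hook.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the pre-filter contract (not the real YouTrackDB interface).
interface PreFilterableSketch extends Iterable<Long> {
  // Returns a view that yields only ids contained in the allowed set.
  PreFilterableSketch withAllowedIds(Set<Long> allowedIds);
}

// One direction of a vertex's edge bag (for example "out"), with ids modeled as Long.
final class BagSketch implements PreFilterableSketch {
  private final List<Long> ids;
  private final Set<Long> allowedIds; // null means no pre-filter is applied

  BagSketch(List<Long> ids) {
    this(ids, null);
  }

  private BagSketch(List<Long> ids, Set<Long> allowedIds) {
    this.ids = ids;
    this.allowedIds = allowedIds;
  }

  @Override
  public PreFilterableSketch withAllowedIds(Set<Long> allowedIds) {
    return new BagSketch(ids, allowedIds);
  }

  @Override
  public Iterator<Long> iterator() {
    if (allowedIds == null) {
      return ids.iterator();
    }
    return ids.stream().filter(allowedIds::contains).iterator();
  }
}

// Chains several bags (for example out + in) and forwards the pre-filter to every part.
final class ChainedSketch implements PreFilterableSketch {
  private final List<PreFilterableSketch> parts;

  ChainedSketch(List<PreFilterableSketch> parts) {
    this.parts = parts;
  }

  @Override
  public PreFilterableSketch withAllowedIds(Set<Long> allowedIds) {
    List<PreFilterableSketch> filtered = new ArrayList<>();
    for (PreFilterableSketch part : parts) {
      filtered.add(part.withAllowedIds(allowedIds));
    }
    return new ChainedSketch(filtered);
  }

  @Override
  public Iterator<Long> iterator() {
    // Materialized eagerly for brevity; a real implementation would iterate lazily.
    List<Long> all = new ArrayList<>();
    for (PreFilterableSketch part : parts) {
      for (Long id : part) {
        all.add(id);
      }
    }
    return all.iterator();
  }
}

final class ChainedSketchDemo {
  public static void main(String[] args) {
    PreFilterableSketch both =
        new ChainedSketch(
            List.of(new BagSketch(List.of(1L, 2L, 3L)), new BagSketch(List.of(4L, 5L))));
    // Only ids present in the (hypothetical) index result set survive the chain.
    for (Long id : both.withAllowedIds(Set.of(2L, 5L))) {
      System.out.println(id);
    }
  }
}
```

In this sketch the filter is pushed down into each bag before iteration, so the chained view never yields ids outside the allowed set.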

gemini-code-assist

Code Review

This pull request introduces the PreFilterableChainedIterable class to enable index-assisted pre-filtering for bidirectional traversals (both() and bothE()) within the MATCH engine. Previously, these traversals bypassed pre-filtering because they were wrapped in a standard chained iterable that did not support the pre-filter interface. The changes include updates to VertexEntityImpl to utilize the new class and enhancements to MatchExecutionPlanner to support class inference for bidirectional steps. Review feedback suggests extending this optimization to multi-label vertex traversals and improving the performance of the pre-filterable check in getEdgesInternal by replacing stream operations with a single loop.
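The second review suggestion (a single loop instead of stream operations) can be sketched as follows. The original `getEdgesInternal` code is not shown in this thread, so both variants are assumptions about its general shape; `AcceptsPreFilter`, `MarkedBag` and `PreFilterCheckSketch` are hypothetical names used only for illustration.

```java
import java.util.Iterator;
import java.util.List;

// Hypothetical marker for iterables that accept a pre-filter.
interface AcceptsPreFilter {}

// Hypothetical pre-filterable bag, used only to exercise the check.
final class MarkedBag implements Iterable<Long>, AcceptsPreFilter {
  private final List<Long> ids;

  MarkedBag(List<Long> ids) {
    this.ids = ids;
  }

  @Override
  public Iterator<Long> iterator() {
    return ids.iterator();
  }
}

final class PreFilterCheckSketch {
  // Stream-based variant: builds a stream pipeline on every call.
  static boolean allPreFilterableStream(List<? extends Iterable<?>> parts) {
    return !parts.isEmpty() && parts.stream().allMatch(p -> p instanceof AcceptsPreFilter);
  }

  // Single-loop variant: one pass with early exit and no stream allocation.
  static boolean allPreFilterableLoop(List<? extends Iterable<?>> parts) {
    if (parts.isEmpty()) {
      return false;
    }
    for (Iterable<?> part : parts) {
      if (!(part instanceof AcceptsPreFilter)) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    List<Iterable<?>> mixed = List.<Iterable<?>>of(new MarkedBag(List.of(1L)), List.of(2L));
    // Both variants agree; the loop avoids per-call stream overhead on a hot path.
    System.out.println(allPreFilterableStream(mixed) == allPreFilterableLoop(mixed));
  }
}
```

Both variants return the same answer; the loop form is simply cheaper when the check runs once per traversed vertex.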

github-actions · 2026-04-17T13:07:19Z

Coverage Gate Results

Thresholds: 85% line, 70% branch

Line Coverage: ✅ 100.0% (65/65 lines)

File	Coverage	Uncovered Lines
`core/src/main/java/com/jetbrains/youtrackdb/internal/core/record/impl/PreFilterableChainedIterable.java`	✅ 100.0% (37/37)	-
`core/src/main/java/com/jetbrains/youtrackdb/internal/core/record/impl/VertexEntityImpl.java`	✅ 100.0% (17/17)	-
`core/src/main/java/com/jetbrains/youtrackdb/internal/core/sql/executor/match/MatchExecutionPlanner.java`	✅ 100.0% (11/11)	-

Branch Coverage: ✅ 100.0% (50/50 branches)

File	Coverage	Lines with Uncovered Branches
`core/src/main/java/com/jetbrains/youtrackdb/internal/core/record/impl/PreFilterableChainedIterable.java`	✅ 100.0% (18/18)	-
`core/src/main/java/com/jetbrains/youtrackdb/internal/core/record/impl/VertexEntityImpl.java`	✅ 100.0% (8/8)	-
`core/src/main/java/com/jetbrains/youtrackdb/internal/core/sql/executor/match/MatchExecutionPlanner.java`	✅ 100.0% (24/24)	-

github-actions · 2026-04-17T13:07:20Z

Test Count Gate Results

Tolerance: 5% drop allowed per module

Overall: ✅ 19954 tests (baseline: 19936, +18)

Module	Baseline	Current	Change	Status
`core`	9388	9403	+15	✅
`docker-tests`	1891	1891	+0	✅
`embedded`	1931	1931	+0	✅
`examples`	6	6	+0	✅
`gremlin-annotations`	30	30	+0	✅
`jmh-ldbc`	39	42	+3	✅
`server`	5504	5504	+0	✅
`tests`	1147	1147	+0	✅

Sandra Adamiec (sandrawar) · 2026-04-20T13:18:23Z

Benchmark results: `both-pre-filter-support` vs `develop`

Ran the three single-thread `bothE` benchmarks on a Hetzner CCX33 (8 dedicated AMD vCPUs) with JDK 21 and LDBC SF1. Both branches used identical canonical curated params, and the database was freshly loaded per branch because the schema differs (the PR adds the KNOWS.creationDate index). JMH params: `-f 3 -wi 3 -w 10s -i 10 -r 30s` (30 measurement iterations across 3 forks).

Summary

Benchmark	develop (ops/s)	PR (ops/s)	Δ	Notes
`bothEKnows_recentConnections` (small-bag, Person→KNOWS ~100 edges, 95p date)	6974.58 ± 229.5	7002.45 ± 110.0	+0.40%	noise, as expected
`bothEHasMember_recentJoiners` (hub, top-100 Forums, `.inV()` + ORDER BY + LIMIT, 95p date)	260.45 ± 23.2	288.96 ± 2.4	+10.95%	real improvement + ~10× tighter error bars
`bothEHasMember_joinerCount` (hub, top-100 Forums, COUNT only, 99p date)	702.8 ± 151.0	855.4 ± 14.2	+21.7% nominal	improvement + ~10× tighter error bars

Per-fork breakdown for `joinerCount`

The headline number understates the benefit: the real story is fork-level variance.

Branch	Fork 1 (ops/s)	Fork 2 (ops/s)	Fork 3 (ops/s)
develop	389.1	874.1	845.2
PR	880.5	854.7	831.1

One of develop's three forks runs at less than half the throughput of the other two (389.1 vs. 874.1 and 845.2 ops/s). Without the pre-filter, the query's speed depends heavily on page-cache residency of the HAS_MEMBER bag; with the pre-filter, work is bounded by the index RID set and stays consistent across forks. In this run the PR shows no slow fork at all, which in production should translate to consistent query latency instead of sporadic cold-cache stalls.
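The per-fork figures can be tied back to the summary row with a short calculation. The sketch below is illustrative only: it uses the six fork values from the table above, and `ForkStatsSketch` and its helper names are hypothetical. It computes the per-branch mean, the relative change, and the ratio of the slowest to the fastest fork as a rough spread indicator.

```java
import java.util.Arrays;
import java.util.Locale;

final class ForkStatsSketch {
  // Arithmetic mean of the per-fork throughputs.
  static double mean(double[] xs) {
    return Arrays.stream(xs).average().orElse(Double.NaN);
  }

  // Relative change of current vs. baseline, in percent.
  static double percentChange(double baseline, double current) {
    return (current - baseline) / baseline * 100.0;
  }

  // Ratio of the slowest fork to the fastest fork; values near 1.0 mean stable forks.
  static double slowestOverFastest(double[] xs) {
    double min = Arrays.stream(xs).min().orElse(Double.NaN);
    double max = Arrays.stream(xs).max().orElse(Double.NaN);
    return min / max;
  }

  public static void main(String[] args) {
    double[] develop = {389.1, 874.1, 845.2}; // fork values from the table
    double[] pr = {880.5, 854.7, 831.1};

    System.out.printf(Locale.ROOT, "develop mean: %.1f ops/s%n", mean(develop));
    System.out.printf(Locale.ROOT, "PR mean:      %.1f ops/s%n", mean(pr));
    System.out.printf(
        Locale.ROOT, "change:       %.1f%%%n", percentChange(mean(develop), mean(pr)));
    System.out.printf(
        Locale.ROOT, "develop slowest/fastest: %.2f%n", slowestOverFastest(develop));
    System.out.printf(Locale.ROOT, "PR slowest/fastest:      %.2f%n", slowestOverFastest(pr));
  }
}
```

The means reproduce the 702.8 and 855.4 ops/s reported in the summary, so the nominal gain on `joinerCount` is driven largely by the single slow develop fork.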

Interpretation

Small-bag (KNOWS, ~100 edges/person): matches the benchmark author's documented prediction that pre-filter overhead balances the savings when bags are small. No regression.
Hub-shape (HAS_MEMBER): +11% on the realistic "recent joiners" pattern and +22% (nominal) on the pure COUNT variant, which is the scenario the optimization is designed for. More importantly, error bars shrink ~10× in both hub benchmarks: the pre-filter not only improves average throughput, it also stabilises latency by making work independent of cache state.

Net: the optimization delivers its intended benefit on hub-shape bothE traversals with no measurable cost on small-bag traversals, plus a significant stability win that is as valuable as the throughput gain for a production query engine.

gemini-code-assist Bot reviewed Apr 17, 2026

Comment thread core/src/main/java/com/jetbrains/youtrackdb/internal/core/record/impl/VertexEntityImpl.java Outdated

Comment thread core/src/main/java/com/jetbrains/youtrackdb/internal/core/record/impl/VertexEntityImpl.java Outdated

Sandra Adamiec (sandrawar) force-pushed the both-pre-filter-support branch from b6f1b22 to b7301a5 · April 18, 2026 20:36

Sandra Adamiec (sandrawar) requested review from Andrii Lomakin (andrii0lomakin) and removed request for Andrii Lomakin (andrii0lomakin) · April 20, 2026 09:58

Sandra Adamiec (sandrawar) force-pushed the both-pre-filter-support branch from 85fbedd to 9bc4ced · April 20, 2026 13:17

Sandra Adamiec (sandrawar) force-pushed the both-pre-filter-support branch from 9bc4ced to d9d3867 · April 20, 2026 13:25

Sandra Adamiec (sandrawar) added 8 commits · April 21, 2026 11:09

YTDB-646 Add LdbcSingleThreadBothEBenchmark + KNOWS.creationDate index · 0f2414a

YTDB-646 Add hub-shape bothE(HAS_MEMBER) benchmark · 7f1ff53

YTDB-646 Add unit tests for PreFilterableChainedIterable uncovered paths · a7d01bd

YTDB-646 Index-assisted pre-filtering for both() / bothE() MATCH patterns · d87b777

YTDB-646 infer target class for both('X') with symmetric edge · 057504c

YTDB-646 address Gemini feedback: extend PreFilterableChainedIterable to getVerticesOptimized, single-pass check · 0a352ba

YTDB-646 update stale bothE planner test: bothE is now a recognized edge-method and sets aliasClasses[e2]=X · d99455a

YTDB-646 add count bothE(HAS_MEMBER) hub benchmark · 13699f4

Sandra Adamiec (sandrawar) force-pushed the both-pre-filter-support branch from d9d3867 to 13699f4 · April 21, 2026 09:10

Sandra Adamiec (sandrawar) requested review from Lev Sivashov (lpld) and removed request for Andrii Lomakin (andrii0lomakin) · May 4, 2026 04:39

Sandra Adamiec (sandrawar) wants to merge 8 commits into develop from both-pre-filter-support
