Skip to content

parser: tag CALLS edges inside dead guards with reachable flag#580

Open
HouMinXi wants to merge 2 commits into
tirth8205:mainfrom
HouMinXi:dead-guard-576
Open

parser: tag CALLS edges inside dead guards with reachable flag#580
HouMinXi wants to merge 2 commits into
tirth8205:mainfrom
HouMinXi:dead-guard-576

Conversation

@HouMinXi

Copy link
Copy Markdown

Linked issue

Closes #576

What & why

Calls inside if False: and if TYPE_CHECKING: blocks are statically
unreachable, but the parser emits CALLS edges for them anyway. This
causes false positives in dead-code detection (refactor_tool mode
dead_code) and inflates impact-radius results.

This PR adds a lightweight ancestor-walk helper
_is_in_static_dead_guard() that detects these two common Python
idioms and sets extra["reachable"] = False on the affected CALLS
edges. Live edges omit the key entirely, so:

  • Existing graph.db files remain forward-compatible (absent key =
    live).
  • Downstream consumers can filter with:
    json_extract(extra, '$.reachable') IS NULL OR json_extract(extra, '$.reachable') != 0

Scope

  • Covered (v1): All Tree-sitter languages routed through
    _extract_calls (Python, JS/TS, Go, Rust, C, C++, Java, etc.).
  • Not covered: Bespoke handlers (ReScript, Dart regex path,
    Elixir), C preprocessor guards (#if 0). These are left for
    follow-up.

Design choice: ancestor walk vs threaded flag

Chose ancestor walk (approach b from #576 discussion) over threading
a flag through _extract_from_tree because it requires zero signature
changes, keeps the diff minimal, and is O(scope-depth) per call node
-- negligible overhead.

How it was tested

uv run pytest tests/test_parser.py -v          # 104 passed
uv run pytest tests/ --tb=short -q             # 1385 passed, 0 failed
uv run ruff check code_review_graph/parser.py  # All checks passed

Bug-inject verification: neutralized the guard check (if False and ...) -- test failed on dead_false_call edge missing the flag.
Restored -- test passed.

Real-path smoke test: parsed a Python file with if False: and
if TYPE_CHECKING: calls outside the test suite; confirmed live calls
show reachable=ABSENT and dead calls show reachable=False.

Checklist

  • Tests added for new functionality
  • All tests pass: uv run pytest tests/ --tb=short -q
  • Linting passes: uv run ruff check code_review_graph/
  • Lines are at most 100 characters
  • Type checking passes (mypy not run locally; CI will verify)
  • Docs updated where behavior changed (no user-facing API change)

Calls inside `if False:` and `if TYPE_CHECKING:` blocks are
statically unreachable but the parser still emits CALLS edges
for them, causing false positives in dead-code detection and
impact analysis.

Add a lightweight ancestor-walk helper that detects these two
common Python idioms and sets extra["reachable"] = False on the
affected CALLS edges.  Live edges omit the key entirely so
existing graph.db files remain forward-compatible (absent key
means live).

The detection covers all Tree-sitter languages routed through
_extract_calls.  Bespoke handlers (ReScript, Dart, Elixir) and
C preprocessor guards (#if 0) are left for follow-up work.

Closes tirth8205#576

Signed-off-by: Minxi Hou <houminxi@gmail.com>
@itxaiohanglover

Copy link
Copy Markdown

Nice implementation! I reported #576 and this is exactly what I had in mind. The scope boundary check (stopping at function/class/module) is the right call — a dead guard in an outer scope shouldn't propagate to nested definitions. Tagging with reachable: False rather than removing edges is smart too, keeps the graph complete for downstream consumers.

One small thing: might be worth handling if 0: as well? It's less common but I've seen it in C-extension bindings. Not a blocker though.

Extend _is_in_static_dead_guard to recognize the integer literal 0
as a dead condition, alongside False and TYPE_CHECKING.  The if 0:
pattern appears in C-extension bindings and some legacy codebases.

In tree-sitter-python, 0 parses as an "integer" node with text b"0",
so the check mirrors the existing "false" node check.

Suggested-by: tirth8205 (PR tirth8205#580 review)
Signed-off-by: Minxi Hou <houminxi@gmail.com>
@HouMinXi

Copy link
Copy Markdown
Author

Good catch -- added if 0: support in ded4913. Tree-sitter parses 0 as an integer node with text == b"0", so the check mirrors the existing false node pattern. Test fixture and assertions updated accordingly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

CALLS edges are emitted for calls inside statically-dead code (if False:, if TYPE_CHECKING:, #if 0)

2 participants