Design doc for #2158: Flag-Gated Code Coverage for Bosatsu Test Execution by johnynek · Pull Request #2159 · johnynek/bosatsu

johnynek · 2026-03-13T20:44:30Z

Design doc.

gemini-code-assist · 2026-03-13T20:44:55Z

Summary of Changes

Hello, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a comprehensive design document outlining the implementation of optional, flag-gated code coverage for Bosatsu's test execution. The proposal details how to instrument code during lowering, collect execution hits at runtime, and generate CI-friendly reports in LCOV and Cobertura formats, all while minimizing performance impact and ensuring compatibility with existing workflows.

Highlights

Feature Introduction: Introduced a design for flag-gated code coverage functionality for the Bosatsu tool test command, allowing optional collection of execution data.
Instrumentation Strategy: Proposed a method to record source-linked probes during the Matchless lowering phase, avoiding invasive changes to the language semantics.
Runtime Hit Collection: Designed an extension to the MatchlessToValue evaluator to increment counters for executed probes, ensuring minimal overhead in default operation.
Reporting Formats: Planned for the generation of industry-standard LCOV and Cobertura reports, with an internal data model compatible with JaCoCo for future expansion.
Performance and Fidelity: Addressed concerns about runtime overhead and source fidelity by making coverage optional and suggesting the use of NoOptimize during coverage runs.

Changelog

docs/design/2158-codecoverage-for-bosatsu.md
- Added a new design document for flag-gated code coverage in Bosatsu test execution.

Activity

The pull request currently contains a design document for the proposed feature, serving as an initial blueprint for discussion and implementation.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature	Command	Description
Code Review	`/gemini review`	Performs a code review for the current pull request in its current state.
Pull Request Summary	`/gemini summary`	Provides a summary of the current pull request in its current state.
Comment	@gemini-code-assist	Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help	`/gemini help`	Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩

gemini-code-assist

Code Review

This pull request introduces a design document for adding flag-gated code coverage to Bosatsu tests. The design is comprehensive, covering the architecture, implementation plan, testing strategy, and risks. It proposes instrumenting code during the Matchless lowering phase, collecting hit counts in the interpreter, and generating reports in LCOV and Cobertura formats. My main feedback concerns the interaction between probe registration and Matchless.Expr optimizations, which could affect the stability of probe identifiers.

gemini-code-assist · 2026-03-13T20:46:58Z

+### 2) Static Probe Registration During Matchless Lowering
+Instrument at lowering time, not by changing Bosatsu language semantics:
+1. Extend `Matchless.fromLet` with optional coverage recorder callbacks.
+2. For each compiled typed-expression root that has usable source region metadata, register a probe id against the produced Matchless expression object identity.
+3. For each typed `Match`, additionally register a branch decision group with one alternative per branch body.
+
+This avoids invasive AST-wide payload changes while still mapping execution to source regions.


The proposal to register probes against Matchless.Expr object identity is a key part of the design. However, there's a potential issue with Matchless.Expr optimizations.

The Matchless.fromLet function, where instrumentation is proposed, applies several optimizations at the end of its execution (e.g., reuseConstructors, hoistInvariantLoopLets). These optimizations rewrite the Matchless.Expr tree, which will invalidate any object identities registered before they run. This could lead to probes being associated with stale objects and result in incorrect coverage data.

To address this, the design should clarify the ordering. Either:

Probe registration should occur after all Matchless.Expr optimizations have been applied.

These Matchless.Expr-level optimizations should be disabled when coverage collection is enabled, similar to how NoOptimize is used for TypedExpr optimizations.

Please address the comment.

One idea could be to aggregrate a RegionSet for Expr values, so when we optimize two different source positions into a single bit of code, we can take that single bit with counting for executing two regions. Although, that's a hand waving argument.

Addressed. I adopted the RegionSet direction in Static Probe Registration During Matchless Lowering: coverage metadata is now (sourceHash, RegionSet), and rewrites/merges explicitly union RegionSets so one optimized expression can count for multiple source regions.

codecov · 2026-03-13T21:16:02Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 85.39%. Comparing base (0f3bea7) to head (521d33e).

⚠️ Current head 521d33e differs from pull request most recent head 6be5139

Please upload reports for the commit 6be5139 to get more accurate results.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #2159      +/-   ##
==========================================
+ Coverage   85.03%   85.39%   +0.35%     
==========================================
  Files         190      184       -6     
  Lines       46891    38710    -8181     
  Branches    11805     9731    -2074     
==========================================
- Hits        39876    33057    -6819     
+ Misses       7015     5653    -1362

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

johnynek · 2026-03-14T02:35:50Z

+For each executed probe:
+1. Source package name.
+2. Source region (`start`, `end` offsets).
+3. Probe kind (`line` probe, `branch alternative` probe).


why isn't this implied by keeping track of the source regions executed? For instance if you execute the branch expr for a match then you have taken that branch. How is it different from the source region?

Good point. I updated Coverage Data We Must Retain to make this explicit: executed regions do imply a branch was taken, and the extra decision-group metadata is only to preserve the denominator/untaken alternatives required by branch-report formats.

johnynek · 2026-03-14T02:37:35Z

+### 2) Static Probe Registration During Matchless Lowering
+Instrument at lowering time, not by changing Bosatsu language semantics:
+1. Extend `Matchless.fromLet` with optional coverage recorder callbacks.
+2. For each compiled typed-expression root that has usable source region metadata, register a probe id against the produced Matchless expression object identity.
+3. For each typed `Match`, additionally register a branch decision group with one alternative per branch body.
+
+This avoids invasive AST-wide payload changes while still mapping execution to source regions.


Please address the comment.

One idea could be to aggregrate a RegionSet for Expr values, so when we optimize two different source positions into a single bit of code, we can take that single bit with counting for executing two regions. Although, that's a hand waving argument.

johnynek · 2026-03-14T02:38:16Z

+
+This avoids invasive AST-wide payload changes while still mapping execution to source regions.
+
+### 3) Runtime Hit Collection In Evaluator


we should not slow down the normal MatchlessToValue to do this. It should be purely opt-in. So, when we aren't running coverage, there is no hit on performance.

Updated. The Runtime Hit Collection In Evaluator section now states a strict zero-cost normal path: non-coverage runs keep the existing MatchlessToValue path unchanged, and coverage uses a separate evaluator path only when --coverage is enabled.

johnynek · 2026-03-14T02:39:48Z

+
+### 4) Source Mapping And Filtering
+Coverage should only emit files we can map back to source paths:
+1. Use `(Path, PackageName)` mapping already returned by compilation (`RuntimeCommandSupport.packMap`) to map package to source file path.


I think hashing the source with blake3 and keeping that as the key in Matchless source information could be a good idea. So HashValue[Algo.Blake3] and the RegionSet should be enough.

Integrated this suggestion. The doc now uses HashValue[Algo.Blake3] as the source key and describes hash -> (path, LocationMap) resolution for reporting.

johnynek · 2026-03-14T02:40:31Z

+2. Update `RuntimeCommandSupport.packMap` to accept compile options from caller.
+3. Add coverage model classes (`CoverageModel`, `CoverageAggregator`) for static probes and runtime counters.
+4. Extend `Matchless.fromLet` and `MatchlessFromTypedExpr.compile` to register probes and branch groups during lowering.
+5. Extend `MatchlessToValue` with optional expression-entry callback.


again, I think we must be clear that we should have a zero cost impact on the normal path.

Clarified again in both architecture and plan/acceptance sections: normal runs must not pay coverage overhead, and coverage logic is isolated to the explicit opt-in path.

johnynek · 2026-03-14T02:53:53Z

Updated docs/design/2158-codecoverage-for-bosatsu.md to address all review comments (branch-vs-region clarification, RegionSet merge model, Blake3 source keying, and explicit zero-cost non-coverage path). Ran required pre-push check: scripts/test_basic.sh passed (Passed: Total 64, Failed 0, Errors 0).

…ecoverage-for-bosatsu

johnynek · 2026-04-19T19:12:29Z

Merged origin/main into the PR branch. Post-merge verification passed with scripts/test_basic.sh (Passed: Total 67, Failed 0, Errors 0).

johnynek · 2026-04-19T20:03:52Z

MergeXO feedback automation is blocked because the agent returned commit_message but no new staged changes or local commits were detected.

Action: request concrete file edits (or explicit git_ops), then reset blocked feedback state.

docs: add design for issue #2158

20c3ca7

johnynek mentioned this pull request Mar 13, 2026

codecoverage for bosatsu #2158

Open

gemini-code-assist Bot reviewed Mar 13, 2026

View reviewed changes

johnynek commented Mar 14, 2026

View reviewed changes

docs: address review feedback on coverage design

521d33e

Merge remote-tracking branch 'origin/main' into agent/design/2158-cod…

6be5139

…ecoverage-for-bosatsu


		This avoids invasive AST-wide payload changes while still mapping execution to source regions.

		### 3) Runtime Hit Collection In Evaluator

Conversation

johnynek commented Mar 13, 2026

Uh oh!

gemini-code-assist Bot commented Mar 13, 2026

Summary of Changes

Highlights

Footnotes

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot Mar 13, 2026

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

codecov Bot commented Mar 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

johnynek commented Mar 14, 2026

Uh oh!

johnynek commented Apr 19, 2026

Uh oh!

johnynek commented Apr 19, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

codecov Bot commented Mar 13, 2026 •

edited

Loading