Credit collapsed-wrapper equivalence in static-parity coverage (honest score) #347
Merged
The deterministic static-style parity comparator scored coverage as matched/source, where source elements include the presentational wrapper divs the transformer intentionally collapses. A collapsed wrapper owns no 1:1 candidate, so it fell to misaligned structural matches or counted as an outright drop. That deflated coverage (15-saas 0.752, 38-medical 0.639) with false "no candidate" loss even when the wrapper's styling and content were faithfully preserved on the merged element.

Add a non-consuming collapsed-wrapper equivalence pass. When a source element finds no 1:1 candidate, it earns coverage credit only if some candidate is a style superset (every declared, non-empty tracked-style value reproduced) AND subsumes its content (candidate text contains the source text, or the source has no text). This credits faithfully-absorbed wrappers while keeping genuine divergence as loss: a dropped or restyled element whose style is absent from every candidate, or whose content has no home, finds no absorbing candidate and stays counted, so the score still falls for real regressions. Property comparison is untouched, so no property regression is masked (property_parity unchanged: 0.9744 / 0.9707).

Coverage = (matched + absorbed) / source. Content subsumption is directional so a short candidate (e.g. a one-letter icon) cannot spuriously absorb a text-bearing wrapper.

Results: 15-saas 0.7328 -> 0.8113 (cov 0.7521 -> 0.8326), 38-medical 0.6201 -> 0.7752 (cov 0.6389 -> 0.7986); the compare-mismatch fixture still fails (coverage already 1.0, dropped hero bg / button radius still surface).

Deterministic: same inputs -> byte-identical report. Report adds absorbed_source plus absorbed_source_total / covered_total.

New fixture locks collapsed-wrapper equivalence (preserved wrapper is credited, dropped styled element still counts); match/mismatch fixtures assert absorption never fires spuriously.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
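The absorption rule described in the commit message can be sketched roughly as follows. This is an illustrative Python sketch, not the PHP code in src/VisualParity/StaticStyleParityComparator.php: the names `Element`, `is_style_superset`, `subsumes_content`, and `coverage_with_absorption` are hypothetical, and the exact-substring text check and the handling of an empty source list are assumptions made for the example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Element:
    """Hypothetical stand-in for a parsed element: tracked styles plus text."""
    styles: dict = field(default_factory=dict)
    text: str = ""


def is_style_superset(source: Element, candidate: Element) -> bool:
    # Every declared, non-empty tracked-style value on the source
    # must be reproduced with the same value on the candidate.
    return all(
        candidate.styles.get(prop) == value
        for prop, value in source.styles.items()
        if value != ""
    )


def subsumes_content(source: Element, candidate: Element) -> bool:
    # Directional: the candidate text must contain the source text,
    # or the source must have no text. The reverse does not count.
    # Assumption: plain substring matching with no normalization.
    return source.text == "" or source.text in candidate.text


def coverage_with_absorption(sources, candidates, matched):
    """Return (coverage, absorbed_count).

    `matched` is the set of source indices that already own a 1:1 candidate.
    The pass is non-consuming: one candidate may absorb several sources,
    and a candidate used in a 1:1 match can still absorb a wrapper.
    """
    if not sources:
        return 1.0, 0  # assumption: nothing to cover counts as full coverage
    absorbed = 0
    for index, source in enumerate(sources):
        if index in matched:
            continue
        if any(
            is_style_superset(source, c) and subsumes_content(source, c)
            for c in candidates
        ):
            absorbed += 1
    return (len(matched) + absorbed) / len(sources), absorbed
```

In this sketch a wrapper whose padding survives on the merged element is credited, while an element whose background appears on no candidate stays counted as loss, which mirrors the behaviour the new fixture is described as locking in.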
What

Makes the deterministic static-parity score honest: stop counting collapsed presentational wrapper <div>s as coverage loss. The transformer intentionally merges wrappers, so a source wrapper has no 1:1 candidate and was counted as a drop, deflating coverage with alignment noise rather than real divergence.

Fix (src/VisualParity/StaticStyleParityComparator.php)

coverage = (matched + absorbed) / source_total. Report gains absorbed_source / covered_total for audit.

Verification

- --json 2× on both fixtures → byte-identical; full 15-saas report 2× → byte-identical (353,291 bytes).
- Absorbed: div.col → p.lead, nav <li>s into their containing sections, decorative empty dots; real divergences stayed counted (nav-link typography change, dropped logo, display:none menus). The mismatch fixture still fails (0.5455).
- composer test + composer parity: 183 fixtures green (+1 collapsed-wrapper fixture; strengthened match/mismatch to assert absorbed_source_total: 0).

Honest limit

Left the structural-tier 1:1 matching artifact (~40 false display:flex → '' deltas, affects property_parity which is already 0.97) unchanged as a defensible scope cut; coverage was the dominant clean lever.

AI assistance