Harden corpus-diagnostics harness: severity-ranked, defect-faithful worklist #337
Merged
…orklist
Close four blind spots in the php-transformer corpus-diagnostics harness so its numbers reflect real, editor-visible defects instead of structural proxies and working behavior. Reporting/detector-only: no transformer conversion logic is touched.
1. RichText invalidity is now the headline signal. The structural wp_block_validity round-trip reports invalid_blocks=0 even when the editor would mark content invalid, because it does not model RichText stripping class/style off inline <span>/<a> in paragraph/heading/list-item content. The classed-span detector is promoted to the authoritative editor-invalid-risk signal (richtext_invalid_content_risk, HIGH), extended to cover <a> and list-item content, surfaced via a richtext_invalid_risk_count metric, and the summary no longer presents structural invalid_blocks=0 as "no invalid content".
2. Layout-direction faithfulness. New layout_direction_misrecognition detector flags a core/columns emitted from a display:flex;flex-direction:column source (a vertical stack rendered as horizontal columns). Conservative: only inline column-direction flex on container elements with 2+ children, confirmed by a verifier that the fragment actually converts to core/columns. Horizontal flex and grid are never flagged.
3. SVG-loss is now HIGH-severity and surfaced. svg_content_lost routes inline-svg fallback diagnostics plus empty/comment-only core/html that bears an SVG remnant into one lane, while preserving the distinction from svg kept as core/html with real shape elements (acceptable, not flagged).
4. CSS var() density is informational. var() references are materialized downstream by SSI, so resolved var density is relabeled informational_var_density (severity=info) and down-ranked below all actionable clusters instead of inflating the worklist with 233 css clusters.
Clusters now rank by severity tier first, then count, so real defects lead the worklist. Adds/extends tests/unit/corpus-detectors.php for all four cases.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Summary
Hardens the php-transformer corpus-diagnostics harness (added in #327) so its numbers are trustworthy. This closes four known blind spots where the harness either under-reported real defects or inflated the worklist with working behavior. Reporting/detector-only: no transformer conversion logic is touched. All changes stay within php-transformer/src/CorpusDiagnostics/ and its tests.
Clusters now rank by severity tier first, then occurrence count, so the actionable worklist leads with real, editor-visible defects.
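The severity-first ordering can be illustrated with a minimal sketch. It is written in Python for brevity and is not the harness's PHP code. The `Cluster` type, the tier names other than HIGH and info, the tier assigned to `runtime_script`, the count used for the var-density cluster, and the tie-break on the cluster key are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Assumed tier table: a lower number ranks earlier. Only "high" and "info"
# are named in this PR; "medium" and "low" are hypothetical placeholders.
SEVERITY_ORDER = {"high": 0, "medium": 1, "low": 2, "info": 3}


@dataclass(frozen=True)
class Cluster:
    key: str       # e.g. "svg_content_lost :: inline_svg_dropped"
    severity: str  # one of the SEVERITY_ORDER keys
    count: int     # occurrences across the corpus


def rank_clusters(clusters):
    """Order by severity tier first, then by descending occurrence count.

    The final tie-break on the key is an assumption that keeps the
    ordering deterministic when tier and count are equal.
    """
    return sorted(
        clusters,
        key=lambda c: (SEVERITY_ORDER[c.severity], -c.count, c.key),
    )


clusters = [
    # The count here is made up; the point is that a large info-tier
    # cluster still sorts last.
    Cluster("informational_var_density :: --*", "info", 5000),
    # The "medium" tier for this cluster is assumed, not taken from the PR.
    Cluster("preserve_runtime_island :: runtime_script", "medium", 1224),
    Cluster("svg_content_lost :: inline_svg_dropped", "high", 16),
    Cluster(
        "layout_direction_misrecognition :: columns_from_vertical_flex",
        "high",
        21,
    ),
]

ranked = rank_clusters(clusters)
for position, cluster in enumerate(ranked, start=1):
    print(position, cluster.severity, cluster.count, cluster.key)
```

Under a count-only sort the info-tier cluster would lead this list; with the tier as the primary sort key it falls to the bottom regardless of its count.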
Blind spots closed
1. Validity headline no longer lies (RichText invalidity is now the headline)
The structural wp_block_validity proxy reports invalid_blocks=0 even when the editor would flag content invalid, because it does not model RichText stripping class/style off inline <span>/<a> in paragraph/heading/list-item content. The classed-span detector is promoted to the authoritative editor-invalid-risk signal (richtext_invalid_content_risk, HIGH severity), extended to also cover <a> and core/list-item content, surfaced via a new richtext_invalid_risk_count metric, and the summary stops presenting structural invalid_blocks=0 as "no invalid content."
2. Layout-direction faithfulness
New layout_direction_misrecognition :: columns_from_vertical_flex detector: a core/columns emitted from a display:flex; flex-direction:column source (a vertical stack rendered as horizontal columns) is a misrecognition. Conservative: only inline column-direction flex on container elements with 2+ children, confirmed by a verifier that the fragment actually converts to core/columns. Genuine horizontal flex / grid is never flagged.
3. SVG-loss surfaced as HIGH severity
New svg_content_lost lane routes the transformer's inline-SVG fallback diagnostics and empty/comment-only core/html blocks that carry an SVG remnant into one signal, while keeping the distinction from SVG preserved as core/html with real shape elements (acceptable, not flagged). Previously this hid at rank ~61 under generic asset findings.
4. CSS var() false-positive down-ranked to informational
var() references are materialized downstream by SSI (verified end-to-end), so resolved var density is not a repair gap. Relabeled informational_var_density (severity info) and ranked below all actionable clusters, instead of flooding the top of the worklist with 233 css clusters.
Before / after ranking
Full corpus: 368 documents / 77 fixtures, 54,123 blocks.
Before (count-only ranking):
1. preserve_runtime_island :: runtime_script: 1224
2. native_block_recognition :: <svg>: 696
3. richtext_inline_span_normalization :: core/paragraph: 308
4. preserve_runtime_island :: interactive_form: 206
5. semantic_structure_parity_restoration :: navigation_menu: 95
6–30. all 25 remaining slots are css_custom_property_materialization :: --* (working behavior)
- materialize_static_asset :: inline_svg (the SVG-loss signal): rank 61
- invalid_blocks=0 (the lie)
After (severity-first ranking), new actionable top 15:
1. richtext_invalid_content_risk :: core/paragraph
2. richtext_invalid_content_risk :: core/list-item
3. richtext_invalid_content_risk :: core/heading
4. layout_direction_misrecognition :: columns_from_vertical_flex
5. svg_content_lost :: inline_svg_dropped
6. preserve_runtime_island :: runtime_script
7. native_block_recognition :: <svg> (svg preserved, acceptable)
8. preserve_runtime_island :: interactive_form
9. semantic_structure_parity_restoration :: navigation_menu
10. restore_interactive_behavior :: interactive_control
11. semantic_structure_parity_restoration :: semantic_landmark
12. typography_parity_restoration :: typography
13. preserve_runtime_island :: html_template
14. preserve_runtime_island :: runtime_template
15. materialize_commerce_products :: commerce_product_grid
The 233 informational_var_density clusters now begin at rank 18 (severity info), out of the actionable worklist. New headline surfaces richtext_invalid_risk=1627, svg_content_lost=16, columns_from_vertical_flex=21, and labels var density as informational.
Tests
tests/unit/corpus-detectors.php extended for all four cases (classed <span>/<a> in paragraph/list-item = richtext invalid risk; vertical-flex→columns flags layout misrecognition while horizontal flex does not; <svg>→empty/comment core/html flags svg_content_lost while a shape-bearing svg core/html does not; var density is informational, not top-ranked). 23 assertions pass. composer test green (canonical + 171 parity fixtures + packaging); php -l clean.
AI assistance