Skip to content

port: hook + validator fixes from ECC#68

Merged
Jamkris merged 3 commits into
mainfrom
port/hook-validator-fixes
May 12, 2026
Merged

port: hook + validator fixes from ECC#68
Jamkris merged 3 commits into
mainfrom
port/hook-validator-fixes

Conversation

@Jamkris
Copy link
Copy Markdown
Owner

@Jamkris Jamkris commented May 12, 2026

Summary

Second port PR from the 2026-05-12 upstream sync round (audit log: upstream/sync-rounds/2026-05-12.md).

Lands two real correctness fixes plus the matching CI guard:

  • scripts/hooks/block-no-verify.js (ECC 0dcde13) — close two bypass holes in the git-hook protection hook.
  • scripts/ci/validate-skills.js (ECC e196f8a) — flag SKILL.md frontmatter defects in CI.
  • Data fixes to openclaw-persona-forge + skill-stocktake SKILL.md so the new validator passes under --strict.
  • Enable --strict in CI so future SKILL.md additions can't regress the same defects.

Bypass holes closed in block-no-verify.js

EGC's previous detection ran stripQuotedStrings (regex) before the flag regexes. Two real holes:

  1. git -c "core.hooksPath=/dev/null" commit -m "msg" — the quoted -c value got stripped to "", so the override slipped through.
  2. git commit "--no-verify" -m "msg" — quoted --no-verify flag stripped to "".

The upstream rewrite tokenizes by shell words instead, so quoted args occupying the flag position are still detected as flags while flags mentioned inside a commit message body stay false-positive-safe.

Additional fixes that come along:

  • Combined short options like git commit -am parse correctly (-am with a message arg does not consume the next token as a -n value).
  • Chained commands like git commit -m 'msg' && git push --no-verify detect the second-segment bypass instead of stopping at the first git invocation.
  • (Round 2 fix in this PR) -an / -na / -abcn style combined short options are now detected — CodeRabbit caught that the upstream regex ^-n[a-zA-Z] only matched when n was the first character.

Validator extension

validate-skills.js now inspects SKILL.md frontmatter:

  • Requires a non-empty name: field. Quoted-empty ("", '') and YAML null (~, null) are normalised to empty before the check (round 2 fix from CodeRabbit).
  • Rejects literal block-scalar description: indicators (|, |-, |+, |2, |-2, |- # note, etc. per YAML 1.2). Inline strings and folded > scalars are fine.

Frontmatter findings default to WARN; --strict or CI_STRICT_SKILLS=1 promotes them to ERROR. Structural findings (missing/empty SKILL.md) remain hard errors as before.

Data fixes

The validator surfaced exactly the two defects ECC's PR #1664 fixed upstream:

  • skills/openclaw-persona-forge/SKILL.md had description: |- (multi-line Chinese). Inlined to a single quoted string preserving the original text verbatim.
  • skills/skill-stocktake/SKILL.md was missing name: entirely. Added name: skill-stocktake (kept the "Gemini skills" wording — not reverted to ECC's "Claude skills").

After the data fixes, the validator reports zero warnings under --strict, so the CI workflow now runs it in strict mode.

Files (6, +580 / -65)

File Source Change
scripts/hooks/block-no-verify.js 0dcde13 + round-2 fix Full rewrite: shell-words tokenizer + tightened flag detection; -an/-na now caught
tests/hooks/block-no-verify.test.js 0dcde13 + round-2 fix +9 new test cases (17 → 26)
scripts/ci/validate-skills.js e196f8a + round-2 fix Extended with frontmatter inspection + --strict mode + YAML scalar normalisation
skills/openclaw-persona-forge/SKILL.md data fix Block scalar → quoted inline string
skills/skill-stocktake/SKILL.md data fix Add missing name: field
.github/workflows/reusable-validate.yml this PR Pass --strict to validate-skills

ECC commits evaluated and not ported (architecture mismatch)

  • d26d66f / b1456bd / 03108be — ECC's session-start hook emits an additional_context block carrying learned-instinct content. EGC's session-start.js only manages session/learned-skills directories and command shims; there is no additional_context surface to inject into. These are session-start design changes, not portable fixes.
  • 3fadc37 — ECC moves continuous-learning observe from a bash hook to a Node observe-runner.js registered in top-level hooks/hooks.json. EGC keeps the observe hook inside the skill bundle at skills/continuous-learning-v2/hooks/observe.sh. Porting would mean restructuring EGC's skill-local hook layout.

These skips are also recorded against the audit log for future-round provenance.

Test plan

  • npm run lint clean
  • npm test 273/273
  • node scripts/ci/validate-skills.js 0 warnings against the current tree
  • node scripts/ci/validate-skills.js --strict exits 0
  • Bypass holes verified blocked: quoted core.hooksPath, quoted --no-verify, chained git push --no-verify, combined -an / -na
  • False positives still allowed: --no-verify / -n / core.hooksPath= mentioned inside commit messages; -mn (n is the -m message body, not a flag)

Summary by CodeRabbit

  • Tests

    • Expanded test coverage for shell-command parsing edge cases in git hooks.
  • Chores

    • Skills validation made stricter in CI (strict mode enabled), enforcing SKILL.md structure and frontmatter rules.
    • Normalized frontmatter and minor formatting updates for skill documents.
    • Improved git-hook parsing to more reliably detect bypass flags in complex shell inputs.

Review Change Stack

Second port PR from the 2026-05-12 upstream sync round (audit log:
upstream/sync-rounds/2026-05-12.md).

scripts/hooks/block-no-verify.js  (0dcde13)
  Rewrite flag detection to use shell-words tokenization instead of
  string-stripping regex. EGC's previous stripQuotedStrings approach
  had two real bypass holes: (1) "git -c \"core.hooksPath=/dev/null\"
  commit -m \"msg\"" stripped the quoted -c value to "" and silently
  allowed the hook override; (2) "git commit \"--no-verify\" -m
  \"msg\"" similarly stripped the quoted flag to "". The tokenizer
  preserves shell quoting structure, so quoted args occupying the
  flag position are still detected as flags while flags MENTIONED
  inside a commit message body remain false-positive-safe.

  Also gains: combined short options like `git commit -am` are now
  parsed correctly (-am with a message arg does not consume the
  next token as a value for -n), and chained git commands like
  "git commit -m 'msg' && git push --no-verify" detect the push
  bypass instead of stopping at the first git invocation.

tests/hooks/block-no-verify.test.js  (0dcde13)
  +6 new tests: quoted core.hooksPath override; -am combined short
  opt + quoted --no-verify / -n message; git bypass phrase in
  quoted message; real quoted --no-verify flag still blocked; later
  chained git commands still blocked. 23/23 pass.

scripts/ci/validate-skills.js  (e196f8a)
  Extend the validator to inspect SKILL.md frontmatter:
    - require name: field (non-empty after trim + comment-strip)
    - reject literal block-scalar description: (|, |-, |+, |2, |-2,
      |-  # note, etc.) — preserves internal newlines and breaks
      flat-table renderers keyed off description.
  Frontmatter findings default to WARN (exit 0) and promote to ERROR
  with --strict or CI_STRICT_SKILLS=1. Structural findings (missing
  or empty SKILL.md) remain hard errors.

skills/openclaw-persona-forge/SKILL.md  (data fix, paired with e196f8a)
  description: was a literal block scalar (|-). Inlined to a quoted
  string preserving the original Chinese text verbatim.

skills/skill-stocktake/SKILL.md  (data fix, paired with e196f8a)
  Frontmatter was missing name: entirely. Added "name: skill-stocktake".

.github/workflows/reusable-validate.yml
  Enable --strict on validate-skills now that both data defects are
  fixed. Future SKILL.md additions cannot reintroduce the same
  class of frontmatter defect without failing CI.

Four ECC commits evaluated and not ported (architecture mismatch):
- d26d66f / b1456bd / 03108be: ECC's session-start hook emits an
  additional_context block carrying learned-instinct content. EGC's
  session-start.js only manages session/learned-skills directories
  and command shims — there is no additional_context surface to
  inject into. These are session-start design changes, not bug
  fixes.
- 3fadc37: ECC moves continuous-learning observe from a bash hook
  to a Node observe-runner registered in hooks/hooks.json. EGC keeps
  the observe hook inside the skill bundle at
  skills/continuous-learning-v2/hooks/observe.sh. Porting would
  require restructuring EGC's skill-local hook layout.

Lint clean. 270/264 -> 270 total tests pass (+6 new in
block-no-verify). validate-skills reports 0 warnings against the
current skills/ tree under --strict.
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 12, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: d44b2b5f-4322-4ed3-aee4-bbb8f33e47ed

📥 Commits

Reviewing files that changed from the base of the PR and between e2d7388 and fbf7908.

📒 Files selected for processing (2)
  • scripts/hooks/block-no-verify.js
  • tests/hooks/block-no-verify.test.js

Walkthrough

Adds a strict-mode SKILL.md validator (parses frontmatter, enforces name, detects block-scalar description) and rewrites the git hook to use shell-aware tokenization for robust detection of --no-verify and core.hooksPath overrides, with matching tests and CI invocation.

Changes

Skill Directory Validation System

Layer / File(s) Summary
Validation script implementation
scripts/ci/validate-skills.js
Implements frontmatter extraction, inspection, YAML scalar normalization, per-skill validation (SKILL.md existence/non-empty, name presence), WARN vs ERROR handling, summary reporting, and strict-mode behavior.
Workflow configuration update
.github/workflows/reusable-validate.yml
Runs node scripts/ci/validate-skills.js --strict in the “Validate skills” step.
Skill metadata compliance updates
skills/openclaw-persona-forge/SKILL.md, skills/skill-stocktake/SKILL.md
Reformats description presentation and adds/normalizes name frontmatter to satisfy the new validator.

Git Hook Security Hardening

Layer / File(s) Summary
Shell-aware parsing infrastructure
scripts/hooks/block-no-verify.js
Replaces regex-based detection with a shell-aware tokenizer and command-scoped checks that correctly handle quoted/escaped tokens, commit options that consume values, combined short options, and -c core.hooksPath= forms.
Shell-words parsing test coverage
tests/hooks/block-no-verify.test.js
Adds edge-case tests for quoted overrides, commit message content containing bypass-like phrases, combined short options, chained git commands, and case-insensitive core.hooksPath variants to verify correct blocking behavior.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested labels

security, skills

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 47.06% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly summarizes the main change: porting hook and validator fixes from upstream ECC.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch port/hook-validator-fixes

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@scripts/ci/validate-skills.js`:
- Around line 151-154: The current empty-name check (else if (values.name ===
'')) misses quoted empty strings and YAML nulls; update the conditional after
the hasOwnProperty check to treat null/undefined and any string that's empty
after trimming as invalid (e.g. if (values.name == null ||
String(values.name).trim() === '')), and call reportFrontmatterFinding the same
way (referencing values, reportFrontmatterFinding and the SKILL.md message) so
"" / '' / ~ / null all fail CI.

In `@scripts/hooks/block-no-verify.js`:
- Around line 240-242: The isCommitNoVerifyShortFlag function fails to detect
combined short flags where 'n' is not the first char (e.g., '-an'); update its
detection to match any short-option token that starts with a single dash
followed by letters and contains 'n' anywhere (but not long flags starting with
'--'). Modify isCommitNoVerifyShortFlag to return true for tokens like '-n' or
any /^-[a-zA-Z]*n[a-zA-Z]*$/ so combined flags such as '-an', '-na', '-abcn' are
caught while preserving rejection of long flags.

In `@tests/hooks/block-no-verify.test.js`:
- Around line 144-177: Add a test case that ensures combined short options with
-n not in first position are blocked: use the existing test harness/runHook and
add a test named "blocks -n in combined short options like -an" which runs 'git
commit -an -m "msg"' and asserts exit code 2; then fix the parsing bug in
isCommitNoVerifyShortFlag so it detects 'n' anywhere inside a combined
short-flag token (e.g., treat '-an' and '-na' equivalently) by scanning the
characters of combined flags rather than only checking first position.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 5813702e-cbe9-4190-9ae5-fb20b88b9678

📥 Commits

Reviewing files that changed from the base of the PR and between bae261a and 021dae3.

📒 Files selected for processing (6)
  • .github/workflows/reusable-validate.yml
  • scripts/ci/validate-skills.js
  • scripts/hooks/block-no-verify.js
  • skills/openclaw-persona-forge/SKILL.md
  • skills/skill-stocktake/SKILL.md
  • tests/hooks/block-no-verify.test.js

Comment thread scripts/ci/validate-skills.js
Comment thread scripts/hooks/block-no-verify.js
Comment thread tests/hooks/block-no-verify.test.js
Copy link
Copy Markdown

@cubic-dev-ai cubic-dev-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No issues found across 6 files

P0 — combined short option bypass in block-no-verify (CodeRabbit):
  isCommitNoVerifyShortFlag only matched when 'n' was the first
  character after the dash. Tokens like '-an' (== '-a -n' in git
  semantics) or '-na' slipped through because the regex was
  /^-n[a-zA-Z]/. Rewritten to scan every character of a combined
  short-option token until either an 'n' is found (bypass) or a
  value-consuming flag like -m / -F / -C / -c is hit (which ends
  the option chain, the rest becomes the inline value). So '-mn'
  still allows ('n' is part of the -m message), '-am' still allows
  (no 'n' before the message-consuming 'm'), but '-an' / '-na' /
  '-abcn' are now caught.

P1 — quoted-empty / YAML-null name bypass (CodeRabbit):
  inspectFrontmatter leaves raw scalar values as literal strings,
  so `name: ""` produced values.name === '""' (length 2, not 0)
  and slipped past the `values.name === ''` check. Same for `''`,
  `null`, and `~`. Added normalizeYamlScalar() that strips matched
  outer quotes and treats `null` / `~` as empty before the check.

Tests (+3 in block-no-verify, 26 total):
  - blocks -n hidden inside combined short option (-an)
  - blocks -n hidden inside combined short option (-na)
  - still allows -mn (n is inside the -m message, not a flag)

Lint clean. 273/270 -> 273 total tests pass.
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
scripts/hooks/block-no-verify.js (1)

333-352: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Use tokenizeShellWords() to properly handle quoted arguments before the subcommand.

The current code splits on whitespace: gap.trim().split(/\s+/) is not quote-aware. A quoted global-flag value with spaces can hide the real subcommand. For example, git --git-dir "/tmp/repo dir/.git" commit --no-verify causes the parser to incorrectly tokenize the gap into ['--git-dir', '"/tmp/repo', 'dir/.git"'] and fail to identify commit as the subcommand, allowing the --no-verify bypass.

Suggested fix
-        const gap = input.slice(git.idx + git.len, cmdIdx);
-        const tokens = gap.trim().split(/\s+/).filter(Boolean);
+        const tokens = tokenizeShellWords(input, git.idx + git.len, cmdIdx)
+          .map(token => token.value);

Add a test case with a quoted path containing spaces to prevent regression.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@scripts/hooks/block-no-verify.js` around lines 333 - 352, Replace the naive
split (gap.trim().split(/\s+/)) with a quote-aware tokenizer by calling
tokenizeShellWords on the gap (e.g., tokens = tokenizeShellWords(gap.trim())),
then keep the existing loop logic that uses expectFlagArg/onlyFlagsAndArgs and
the same git global-flag list so quoted flag values (like "--git-dir \"/tmp/repo
dir/.git\"") are treated as single tokens; ensure any whitespace-trimming still
occurs before tokenization and leave the fallback that advances searchPos
unchanged. Also add a unit/test case that runs the affected parsing path with a
quoted path containing spaces (for example a --git-dir value with spaces) to
prevent regression.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@scripts/hooks/block-no-verify.js`:
- Line 70: The COMMIT_SHORT_OPTIONS_WITH_VALUE set is missing the short option
't', causing isCommitNoVerifyShortFlag() to mis-parse combined flags like -tn;
update the COMMIT_SHORT_OPTIONS_WITH_VALUE constant (the Set declared as
COMMIT_SHORT_OPTIONS_WITH_VALUE) to include 't' so the parser treats -t as
consuming the next character as its value and stops scanning before checking for
'n'.
- Line 38: The guard compares git config keys case-sensitively using
GIT_CONFIG_KEY_PREFIX ('core.hooksPath=') which allows bypass via case
variations; update the logic to normalize to lowercase before matching: either
set GIT_CONFIG_KEY_PREFIX to all-lowercase ('core.hookspath=') and call
.toLowerCase() on the incoming config key before doing startsWith, or always
compare configKey.toLowerCase().startsWith(GIT_CONFIG_KEY_PREFIX.toLowerCase());
apply the same normalization to the other matching block that checks git config
keys (the checks referencing GIT_CONFIG_KEY_PREFIX and the config key parsing in
the later hook-checking logic).

---

Outside diff comments:
In `@scripts/hooks/block-no-verify.js`:
- Around line 333-352: Replace the naive split (gap.trim().split(/\s+/)) with a
quote-aware tokenizer by calling tokenizeShellWords on the gap (e.g., tokens =
tokenizeShellWords(gap.trim())), then keep the existing loop logic that uses
expectFlagArg/onlyFlagsAndArgs and the same git global-flag list so quoted flag
values (like "--git-dir \"/tmp/repo dir/.git\"") are treated as single tokens;
ensure any whitespace-trimming still occurs before tokenization and leave the
fallback that advances searchPos unchanged. Also add a unit/test case that runs
the affected parsing path with a quoted path containing spaces (for example a
--git-dir value with spaces) to prevent regression.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 2a96321a-7580-4ddf-8627-bb11c55a088f

📥 Commits

Reviewing files that changed from the base of the PR and between 021dae3 and e2d7388.

📒 Files selected for processing (3)
  • scripts/ci/validate-skills.js
  • scripts/hooks/block-no-verify.js
  • tests/hooks/block-no-verify.test.js

Comment thread scripts/hooks/block-no-verify.js Outdated
Comment thread scripts/hooks/block-no-verify.js Outdated
P0 — core.hooksPath case-insensitivity bypass (CodeRabbit):
  Git config section and variable names are case-insensitive per
  git-scm.com/docs/git-config, so 'git -c core.hookspath=/dev/null
  commit ...' or 'core.HOOKSPATH=...' bypassed the previous guard
  which compared against the literal 'core.hooksPath=' constant.
  Switched GIT_CONFIG_KEY_PREFIX to the lowercase form and
  lowercase the candidate tokens in hasHooksPathOverride before
  matching, in both the '-c <key>' and '-c<key>' code paths.

P1 — -t template short option missing (CodeRabbit):
  COMMIT_SHORT_OPTIONS_WITH_VALUE was missing 't'. Git parses
  'git commit -tn' as -t with template path 'n' (stuck-arg form),
  not '-t -n'. Without 't' in the set, the scanner reached the 'n'
  check and falsely flagged the command as a --no-verify bypass.
  Added 't' to the set with a comment explaining the stuck-arg
  semantics.

Tests (+3, 29 total in this file):
  - still allows -tn (n is the -t template path, not a flag)
  - blocks case-variant core.hooksPath (lowercase)
  - blocks case-variant core.hooksPath (uppercase)

Lint clean. 276/276 across the suite.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant