
docs(adr): ADR-155 - nightly self-learning security harness #2417

Open
ruvnet wants to merge 1 commit into main from docs/adr-155-nightly-security-harness

@ruvnet (Owner) commented Jun 18, 2026

Summary

  • Proposes a single nightly composite `nightly-security-harness.yml` GitHub Actions workflow that fans out into 5 orthogonal security dimensions (deps CVE, MCP static, MCP active pentest, CodeQL, differential drift) and converges into a learned triage step that ranks, dedupes, and routes findings
  • Scopes 3 self-learning loops (KRR-trained per-dimension confidence, isotonic CVSS calibration, auto-fix bid) each guarded by the ADR-150 triple-gate pattern
  • Triggered by today's CWE-78 incident (GHSA-vcv2-r9jh-99m5): we caught it via a reporter, not our own gates
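
To make the "KRR-trained per-dimension confidence" loop concrete, here is a minimal sketch of kernel ridge regression with an RBF kernel. The ADR does not specify the feature encoding, kernel, or hyperparameters, so the feature vectors, `gamma`, `lambda`, and every function name below are assumptions for illustration only.

```typescript
// RBF kernel between two equal-length feature vectors.
function rbf(a: number[], b: number[], gamma: number): number {
  let sq = 0;
  for (let i = 0; i < a.length; i++) sq += (a[i] - b[i]) * (a[i] - b[i]);
  return Math.exp(-gamma * sq);
}

// Solve A x = b by Gaussian elimination with partial pivoting (A is n x n).
function solve(A: number[][], b: number[]): number[] {
  const n = b.length;
  const M = A.map((row, i) => row.concat([b[i]])); // augmented matrix [A | b]
  for (let col = 0; col < n; col++) {
    let pivot = col;
    for (let r = col + 1; r < n; r++) {
      if (Math.abs(M[r][col]) > Math.abs(M[pivot][col])) pivot = r;
    }
    const tmp = M[col];
    M[col] = M[pivot];
    M[pivot] = tmp;
    for (let r = col + 1; r < n; r++) {
      const factor = M[r][col] / M[col][col];
      for (let c = col; c <= n; c++) M[r][c] -= factor * M[col][c];
    }
  }
  const x: number[] = new Array(n);
  for (let r = n - 1; r >= 0; r--) {
    let acc = M[r][n];
    for (let c = r + 1; c < n; c++) acc -= M[r][c] * x[c];
    x[r] = acc / M[r][r];
  }
  return x;
}

interface KrrModel {
  xs: number[][];
  alpha: number[];
  gamma: number;
}

// Fit: alpha = (K + lambda * I)^-1 y, where K[i][j] = rbf(x_i, x_j).
// ys is a hypothetical human outcome label (1 = confirmed, 0 = dismissed).
function fitKrr(xs: number[][], ys: number[], lambda: number, gamma: number): KrrModel {
  const K = xs.map((xi, i) =>
    xs.map((xj, j) => rbf(xi, xj, gamma) + (i === j ? lambda : 0))
  );
  return { xs: xs, alpha: solve(K, ys), gamma: gamma };
}

// Predict: f(x) = sum_i alpha_i * rbf(x_i, x).
function predict(model: KrrModel, x: number[]): number {
  let sum = 0;
  for (let i = 0; i < model.xs.length; i++) {
    sum += model.alpha[i] * rbf(model.xs[i], x, model.gamma);
  }
  return sum;
}
```

A larger `lambda` smooths the fit and keeps the model from memorizing a handful of early triage outcomes, which matters while outcome data is still sparse.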

Decision drivers

  • Closes the CWE-78-class regression with Phase 1 alone (dumb `npm audit` + `gh advisory list` would have caught `agentic-flow ≤ 2.0.13` on day one)
  • Composes existing primitives (AgentDB persistence, KRR learning, ADR-026 routing for cost bounding, ADR-097 budget circuit breaker, ADR-150 oia-audit static surface, ADR-152 drift detection); no greenfield invention
  • Load-bearing invariant: harness is ADVISORY by default; learning layer learns what to surface vs suppress, NOT what to ship

Test plan

This is a proposal ADR; no code lands in this PR. Phase 1 implementation will be its own PR after this is merged. The ADR itself:

  • References every prior ADR it inherits constraints from (150, 151, 152, 026, 097, 074-078)
  • Names every failure mode + mitigation
  • Reserves ADR-156 for Phase 4 (auto-fix) so this PR doesn't pre-commit that decision
  • Honors the four ADR-150 architectural constraints (removable, optional, graceful degradation, CI-absent-path coverage)

🤖 Generated with RuFlo

Proposes a single nightly composite GitHub Actions workflow that fans
out into five orthogonal security scan dimensions (deps CVE, MCP static,
MCP active pentest, CodeQL, differential drift) and converges into a
learned triage step that ranks, dedupes, and routes findings.

Three learning loops are scoped (each guarded by its own triple-gate
env flag per the ADR-150 pattern):

  A. Per-dimension confidence weighting (KRR over `(finding, dimension,
     human_outcome)` tuples)
  B. Severity calibration (isotonic regression of CVSS → realized impact
     in our stack)
  C. Auto-fix bid (out of scope here; reserved for ADR-156)
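
Loop B can be illustrated with the pool-adjacent-violators algorithm, a standard way to compute a least-squares isotonic fit. This is a sketch under assumptions: the ADR does not say which algorithm it uses, and the `impact` score (a hypothetical realized-impact value in [0, 1]) and all names below are invented for illustration.

```typescript
interface Sample {
  cvss: number;   // published CVSS base score
  impact: number; // hypothetical realized-impact score in [0, 1]
}

// One pooled block of adjacent samples sharing a single fitted value.
interface Block {
  sum: number;
  count: number;
  minX: number;
  maxX: number;
}

// Pool-adjacent-violators: non-decreasing least-squares fit of impact vs cvss.
// Ties in cvss are processed in sorted order, not pre-pooled.
function fitIsotonic(samples: Sample[]): Block[] {
  const sorted = samples.slice().sort((a, b) => a.cvss - b.cvss);
  const blocks: Block[] = [];
  for (const s of sorted) {
    blocks.push({ sum: s.impact, count: 1, minX: s.cvss, maxX: s.cvss });
    // Merge backwards while the last two block means violate monotonicity.
    while (blocks.length >= 2) {
      const last = blocks[blocks.length - 1];
      const prev = blocks[blocks.length - 2];
      if (prev.sum / prev.count <= last.sum / last.count) break;
      blocks.splice(blocks.length - 2, 2, {
        sum: prev.sum + last.sum,
        count: prev.count + last.count,
        minX: prev.minX,
        maxX: last.maxX,
      });
    }
  }
  return blocks;
}

// Step-function lookup: value of the last block starting at or below cvss,
// clamped to the first block for scores below the training range.
function calibrate(blocks: Block[], cvss: number): number {
  if (blocks.length === 0) throw new Error("model has no blocks");
  let value = blocks[0].sum / blocks[0].count;
  for (const b of blocks) {
    if (b.minX <= cvss) value = b.sum / b.count;
  }
  return value;
}
```

The monotonicity constraint is the point of choosing isotonic regression here: a higher CVSS score can never map to a lower calibrated impact, so calibration can compress or stretch the scale but never invert it.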

Triggered by today's CWE-78 incident in agentic-flow ≤ 2.0.13
(GHSA-vcv2-r9jh-99m5): we caught the vulnerability via a reporter, not
via our own gates. This ADR closes that class without inventing new
primitives: it composes existing AgentDB persistence, KRR learning,
ADR-026 routing for cost-bounding, ADR-097 budget circuit breaker
pattern, ADR-150 oia-audit static surface, ADR-152 drift detection.

Load-bearing invariant: the harness is ADVISORY by default. The
learning layer learns what to surface vs suppress, NOT what to ship.
HIGH+ findings always alert regardless of suppression; auto-fix PRs
always go through human review; suppression entries have TTL.
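
The invariant above could be enforced with a check along these lines. This is a hedged sketch: the `fingerprint` field, the epoch-millisecond `expiresAt` encoding of the TTL, and the four-level severity scale are assumptions, not details taken from the ADR.

```typescript
type Severity = "LOW" | "MEDIUM" | "HIGH" | "CRITICAL";

const SEVERITY_RANK: Record<Severity, number> = { LOW: 0, MEDIUM: 1, HIGH: 2, CRITICAL: 3 };

interface Finding {
  fingerprint: string; // hypothetical stable identifier for a finding
  severity: Severity;
}

interface Suppression {
  fingerprint: string;
  expiresAt: number; // TTL expressed as an epoch-millisecond deadline (assumption)
}

// Returns true when the finding must be surfaced to a human.
function shouldAlert(finding: Finding, suppressions: Suppression[], now: number): boolean {
  // HIGH and above always alert, regardless of any suppression entry.
  if (SEVERITY_RANK[finding.severity] >= SEVERITY_RANK.HIGH) return true;
  // Below HIGH, only an unexpired suppression can silence the finding.
  const suppressed = suppressions.some(
    (s) => s.fingerprint === finding.fingerprint && s.expiresAt > now
  );
  return !suppressed;
}
```

Checking severity before consulting the suppression list means a learned or stale suppression can never mask a HIGH+ finding, and an expired entry silently stops applying without any cleanup job.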

Phased: Phase 1 lands the workflow with dumb max-severity ranking
(closes the CWE-78-class regression alone); Phase 2 adds loop A behind
a feature flag once 30 days of outcome data accumulate; Phase 3 adds
loop B; Phase 4 (auto-fix) requires its own ADR.
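
A rough sketch of what the Phase 1 "dumb max-severity ranking" might look like once the scan dimensions converge: merge findings that share a key, keep the highest severity seen, then sort. The `key` field, the dimension labels, and the severity scale are hypothetical; the ADR does not define the finding schema here.

```typescript
type Severity = "LOW" | "MEDIUM" | "HIGH" | "CRITICAL";

const SEVERITY_RANK: Record<Severity, number> = { LOW: 0, MEDIUM: 1, HIGH: 2, CRITICAL: 3 };

interface RawFinding {
  key: string;       // hypothetical dedupe key, e.g. an advisory ID or rule+location
  dimension: string; // which scan dimension reported it
  severity: Severity;
}

interface TriagedFinding {
  key: string;
  dimensions: string[];
  severity: Severity;
}

function triage(findings: RawFinding[]): TriagedFinding[] {
  const merged: { [key: string]: TriagedFinding } = {};
  for (const f of findings) {
    const seen = merged[f.key];
    if (seen === undefined) {
      merged[f.key] = { key: f.key, dimensions: [f.dimension], severity: f.severity };
      continue;
    }
    // Same finding reported again: record the extra dimension, keep max severity.
    if (seen.dimensions.indexOf(f.dimension) === -1) seen.dimensions.push(f.dimension);
    if (SEVERITY_RANK[f.severity] > SEVERITY_RANK[seen.severity]) seen.severity = f.severity;
  }
  // Highest severity first; ties broken by key for a stable report order.
  return Object.keys(merged)
    .map((k) => merged[k])
    .sort(
      (a, b) => SEVERITY_RANK[b.severity] - SEVERITY_RANK[a.severity] || a.key.localeCompare(b.key)
    );
}
```

No learning is involved at this stage, which is why Phase 1 can ship before any outcome data exists.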

Co-Authored-By: RuFlo <ruv@ruv.net>
meefs pushed a commit to meefs/claude-code-flow that referenced this pull request Jun 22, 2026
…a to 0.2.6

Implements ADR-153 (Darwin Mode integration). Adds the WRITE layer that
closes the loop ADR-150's READ layer opens: score/genome describe a
harness; evolve changes one.

Three new surfaces, all honoring the four ADR-150 architectural constraints
(removable / optional / graceful degradation / CI-absent-path coverage):

  - harness-evolve         mutate 1 of 7 policy surfaces, sandbox-score, promote
  - harness-security-bench upstream's "Darwin Shield" (their own ADR-155)
  - harness-bench          create/verify bench suites for evolve --bench

Wiring:
  - plugins/ruflo-metaharness/scripts/_darwin.mjs        - shared subprocess helper
  - plugins/ruflo-metaharness/scripts/evolve.mjs         - main verb (--confirm gate)
  - plugins/ruflo-metaharness/scripts/security-bench.mjs - Darwin Shield wrapper
  - plugins/ruflo-metaharness/scripts/bench.mjs          - supporting verb
  - plugins/ruflo-metaharness/skills/harness-{evolve,security-bench,bench}/SKILL.md
  - v3/@claude-flow/cli/src/mcp-tools/metaharness-tools.ts β€” 3 new MCP tools:
      metaharness_evolve, metaharness_security_bench, metaharness_bench
  - v3/@claude-flow/cli/package.json optionalDependencies:
      + @metaharness/darwin ~0.3.1
      bump metaharness ~0.1.11 -> ~0.2.6
  - ruflo/package.json optionalDependencies: mirrored (per the ruvnet#2112 lesson:
      wrapper does NOT inherit root overrides; required for transitive pin)

Safety posture (matches mint.mjs convention):
  - evolve requires --confirm; without it, returns a dry-run plan
  - Ruflo caps: --generations 1..50, --children 1..20, --concurrency 1..8,
    --population 1..20, --cycles 1..100 (upstream supports more)
  - Upstream exit code 99 (safety-disqualified) propagates VERBATIM
    (not remapped) so CI can distinguish "evolution failed" from
    "evolution surfaced a safety-tripping mutation"
  - All three scripts emit {degraded: true, reason: 'metaharness-darwin-not-available'}
    and exit 0 when the optional dep is absent
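
The confirm gate, the caps, and the exit-code handling can be sketched as follows. The cap ranges and the meaning of exit code 99 come from the list above; whether out-of-range values are rejected or clamped is not stated, so rejection is an assumption here, as are all function and type names.

```typescript
// Cap ranges as listed in the commit message; everything else is hypothetical.
const CAPS = {
  generations: [1, 50],
  children: [1, 20],
  concurrency: [1, 8],
  population: [1, 20],
  cycles: [1, 100],
} as const;

type CapName = keyof typeof CAPS;
type EvolveOptions = Partial<Record<CapName, number>>;

interface EvolvePlan {
  mode: "dry-run" | "execute" | "rejected";
  violations: string[];
}

// Validate options against the caps, then apply the --confirm gate.
function planEvolve(options: EvolveOptions, confirm: boolean): EvolvePlan {
  const violations: string[] = [];
  for (const name of Object.keys(CAPS) as CapName[]) {
    const value = options[name];
    if (value === undefined) continue;
    const range = CAPS[name];
    if (Math.floor(value) !== value || value < range[0] || value > range[1]) {
      violations.push(`--${name} must be an integer in ${range[0]}..${range[1]}`);
    }
  }
  if (violations.length > 0) return { mode: "rejected", violations: violations };
  // Without --confirm, only a dry-run plan is produced.
  return { mode: confirm ? "execute" : "dry-run", violations: violations };
}

// How a CI consumer might interpret the propagated upstream exit code.
function classifyExit(code: number): "ok" | "safety-disqualified" | "failed" {
  if (code === 0) return "ok";
  if (code === 99) return "safety-disqualified";
  return "failed";
}
```

Because 99 is passed through rather than folded into a generic failure code, a CI job can branch on it directly, for example by paging a human for a safety-tripping mutation while merely retrying an ordinary failure.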

Connection to ruflo's own ADR-155 (nightly self-learning security harness,
PR ruvnet#2417, issue ruvnet#2418): harness-security-bench is the closest reference
implementation for our Loop A reward-signal sanity check. Running it
periodically gives us the empirical floor: if Darwin Shield's champion
reaches TPR=1/FPR=0 on the seeded corpus, our loop A's gradient signal
is sound.
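
For reference, TPR and FPR over a labeled corpus are computed as TP / (TP + FN) and FP / (FP + TN). The sketch below assumes each seeded corpus entry carries a ground-truth label and the detector's verdict; the `BenchResult` shape is hypothetical and is not Darwin Shield's actual output format.

```typescript
interface BenchResult {
  isAttack: boolean; // ground-truth label from the seeded corpus
  flagged: boolean;  // whether the detector raised a finding
}

interface Rates {
  tpr: number; // true positive rate: TP / (TP + FN)
  fpr: number; // false positive rate: FP / (FP + TN)
}

// Rates are NaN when the corpus has no attacks (tpr) or no benign cases (fpr).
function detectionRates(results: BenchResult[]): Rates {
  let tp = 0;
  let fn = 0;
  let fp = 0;
  let tn = 0;
  for (const r of results) {
    if (r.isAttack) {
      if (r.flagged) tp++;
      else fn++;
    } else {
      if (r.flagged) fp++;
      else tn++;
    }
  }
  return { tpr: tp / (tp + fn), fpr: fp / (fp + tn) };
}
```

TPR=1/FPR=0 therefore means every seeded attack was flagged and no benign case was.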

Tests:
  - plugins/ruflo-metaharness/scripts/test-graceful-degradation.mjs extended
    with the 3 new scripts; all 16 assertions pass (8 skills × 2 contracts:
    exit 0 AND emit "degraded": true)
  - npx tsc --noEmit clean on v3/@claude-flow/cli
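
The two contracts per skill (exit 0 and `"degraded": true`) can be expressed as a single predicate over a captured run. This is only a sketch: it assumes the script prints one JSON object on stdout, and `RunResult` and `honorsDegradedContract` are illustrative names, not the contents of test-graceful-degradation.mjs.

```typescript
interface RunResult {
  exitCode: number;
  stdout: string; // assumed to be a single JSON document
}

// True only when both contracts hold: exit code 0 AND payload.degraded === true.
function honorsDegradedContract(run: RunResult): boolean {
  if (run.exitCode !== 0) return false;
  try {
    const payload = JSON.parse(run.stdout);
    return payload !== null && typeof payload === "object" && payload.degraded === true;
  } catch (e) {
    // Non-JSON output cannot satisfy the "degraded": true contract.
    return false;
  }
}
```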

Co-Authored-By: RuFlo <ruv@ruv.net>
meefs pushed a commit to meefs/claude-code-flow that referenced this pull request Jun 22, 2026
MINOR per semver (backward-compatible additions):
- 3 new MCP tools: metaharness_evolve, metaharness_security_bench, metaharness_bench
- 3 new skills: harness-evolve, harness-security-bench, harness-bench
- New optional dep: @metaharness/darwin ~0.3.1
- Umbrella bump: metaharness ~0.1.11 → ~0.2.6

Changes:
- Root `package.json`              3.12.4 → 3.13.0
- `v3/@claude-flow/cli`            3.12.4 → 3.13.0
- `ruflo/` wrapper                 3.12.4 → 3.13.0

Published to npm with all three legacy dist-tags pointing at 3.13.0:

  @claude-flow/cli@3.13.0  →  latest, alpha, v3alpha
  claude-flow@3.13.0       →  latest, alpha, v3alpha
  ruflo@3.13.0             →  latest, alpha, v3alpha

Implements ADR-153 (Darwin Mode integration). All 4 ADR-150 architectural
constraints honored: removable, optional, graceful, CI-gated. Connects to
ADR-155 (ruvnet#2417, ruvnet#2418) by surfacing the upstream Darwin Shield as an
empirical floor for Loop A reward-signal soundness.

Verified post-publish: every package × every dist-tag = 3.13.0.

Co-Authored-By: RuFlo <ruv@ruv.net>