fix(compliance): correct HIPAA Safe-Harbor citations + drop anonymisation overclaim (CF-18/CF-17) + verify_output tests (CF-14) by Ces107 · Pull Request #1 · Ces107/dcm-anon

Ces107 · 2026-06-01T07:21:54Z

Post-v0.6.0 correctness polish on the compliance citations and the independent verifier tests. No public API change; the only manifest-payload change is one added eu-ai-act disclosure.

CF-18 — HIPAA Safe-Harbor citation errors

Catch-all mislabelled (Q) (full-face photos) instead of (R) (any other unique identifying number/characteristic/code). Relabelled 11 tags + the burned-in-pixel finding.
DeviceSerialNumber mislabelled (N) (URLs) instead of (M) (device identifiers and serial numbers). Fixed in verify_output.py and the regulatory_mapping.py example mapping.
ENS (RD 311/2022) citation qualified: tool evidences op.exp.8 + mp.info.6; mp.info.3 (cifrado) is NOT implemented (controller responsibility).
EU AI Act Art. 10 gated on applicability via a new AI_ACT_APPLICABILITY_NOTE disclosure (binds only the high-risk Annex III provider).
Added CITATIONS_VERIFIED_ON (2026-06-01) carried in the DISCLAIMER.
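
The relabelling above can be checked against the lettering of 45 CFR 164.514(b)(2)(i). The following is a minimal sketch, not the project's actual tables: `SAFE_HARBOR_CATEGORIES`, `TAG_CATEGORY`, and `cite` are hypothetical names, the category descriptions are paraphrased, and `ImageComments` is only an assumed example of a free-text tag that falls into the catch-all.

```python
# Safe-Harbor identifier categories, 45 CFR 164.514(b)(2)(i)(A)-(R), paraphrased.
SAFE_HARBOR_CATEGORIES = {
    "A": "Names",
    "B": "Geographic subdivisions smaller than a state",
    "C": "Date elements (except year) directly related to an individual",
    "D": "Telephone numbers",
    "E": "Fax numbers",
    "F": "Email addresses",
    "G": "Social security numbers",
    "H": "Medical record numbers",
    "I": "Health plan beneficiary numbers",
    "J": "Account numbers",
    "K": "Certificate/license numbers",
    "L": "Vehicle identifiers and serial numbers",
    "M": "Device identifiers and serial numbers",
    "N": "Web URLs",
    "O": "IP address numbers",
    "P": "Biometric identifiers",
    "Q": "Full-face photographs and comparable images",
    "R": "Any other unique identifying number, characteristic, or code",
}

# Hypothetical tag -> category mapping (assumption, not the repo's real table).
TAG_CATEGORY = {
    "DeviceSerialNumber": "M",  # was (N), which is URLs
    "ImageComments": "R",       # free-text catch-all; was (Q), which is photos
}


def cite(tag: str) -> str:
    """Build a citation string for a tag from the two tables above."""
    letter = TAG_CATEGORY[tag]
    return f"45 CFR 164.514(b)(2)(i)({letter}): {SAFE_HARBOR_CATEGORIES[letter]}"


print(cite("DeviceSerialNumber"))
print(cite("ImageComments"))
```

Keeping the letter and its description in one table means a wrong letter shows up as a visibly wrong description, which is the kind of mismatch a regression guard can assert on.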

CF-17 — drop Recital-26 anonymisation overclaim

Action.D GDPR clause claimed a dummy renders data "no longer personal data". That field-level anonymisation claim contradicts the global PSEUDONYMOUS classification (salted-hash remapped UIDs stay reversible with the withheld salt) and is the exact overclaim CNIL SAN-2024-013 (Cegedim) sanctioned. Narrowed to Art. 32(1)(a).

CF-14 — verify_output tests

New tests/test_verify_output.py (27 tests): metadata residual path, SQ recursion, cleanliness helpers, happy pixel-OCR path via fake pytesseract, multiframe mid-slice, no-pixel objects, VerificationResult logic, sampling, empty/garbage dirs, plus regression guards locking CF-18/CF-17. Coverage of verify_output.py ~74 -> 91 percent.
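
The fake-pytesseract approach can be sketched as follows. This is an illustration of the general technique, not the PR's test code: `install_fake_pytesseract` and `ocr_pixels` are hypothetical names, and `ocr_pixels` stands in for whatever code under test calls `pytesseract.image_to_string`.

```python
import sys
import types


def install_fake_pytesseract(text: str) -> types.ModuleType:
    """Register a stub module so `import pytesseract` needs no system binary."""
    fake = types.ModuleType("pytesseract")
    # Mimic only the one call the code under test is assumed to use.
    fake.image_to_string = lambda image, **kwargs: text
    sys.modules["pytesseract"] = fake
    return fake


def ocr_pixels(image: object) -> str:
    """Hypothetical stand-in for the verifier's OCR step (lazy import)."""
    import pytesseract  # resolved from sys.modules, so the stub is picked up

    return pytesseract.image_to_string(image).strip()


install_fake_pytesseract("DOE^JOHN 1970-01-01\n")
residual = ocr_pixels(object())  # any object works; the stub ignores it
print(residual)
```

Inside a pytest suite, `monkeypatch.setitem(sys.modules, "pytesseract", fake)` would achieve the same injection and undo it automatically after each test.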

Verification

Full suite 226 passed, ruff + mypy --strict clean, examples/verify_golden.py exits 0.

…tion overclaim (CF-18/CF-17) + verify_output tests (CF-14)

CF-18: two factual HIPAA 164.514(b)(2)(i) category errors in the manifest citations, the kind a reviewing radiologist or DPO would catch:

- verify_output.py: the free-text / quasi-identifier catch-all was mislabelled "(Q) Any other unique characteristic". (Q) is full-face photographs; the catch-all is (R) "Any other unique identifying number, characteristic, or code". Relabelled 11 tags plus the burned-in-pixel finding to (R).
- DeviceSerialNumber was "(N) Device identifiers". (N) is URLs; device identifiers and serial numbers are (M). Fixed in verify_output.py and the regulatory_mapping.py Safe-Harbor example mapping (->M, was ->R).
- Qualified the ENS (RD 311/2022) citation: the tool directly evidences op.exp.8 (audit log) and mp.info.6 (limpieza de documentos); mp.info.3 (cifrado) is NOT implemented and remains the controller responsibility.
- Gated EU AI Act Art. 10 on applicability: added AI_ACT_APPLICABILITY_NOTE surfaced as an eu-ai-act manifest disclosure. Art. 10 binds only the provider of a high-risk Annex III system, not anyone who de-identifies.
- Added CITATIONS_VERIFIED_ON constant; DISCLAIMER now carries the 2026-06-01 re-verification date.

CF-17: dropped the Recital-26 anonymisation overclaim from the Action.D GDPR clause. It claimed a schema-preserving dummy renders the substituted data "no longer personal data". That field-level anonymisation claim contradicts the global PSEUDONYMOUS (not anonymous) classification: salted-hash remapped UIDs stay reversible with the separately-held salt, so the dataset never becomes anonymous. Asserting otherwise is exactly the false-anonymisation overclaim CNIL SAN-2024-013 (Cegedim) sanctioned. Citation narrowed to Art. 32(1)(a); summary now states the dummy neutralises the field WITHOUT anonymising the dataset.

CF-14: new tests/test_verify_output.py (27 tests): metadata residual path, SQ recursion, value-cleanliness helpers, the happy pixel-OCR path via a fake pytesseract (no system tesseract binary needed), multiframe mid-slice OCR, no-pixel objects, VerificationResult status/conclusive/coverage logic, sampling, empty/garbage dirs, plus regression guards locking the CF-18/CF-17 corrections. verify_output.py coverage ~74 -> 91 percent.

Full suite 226 passed, ruff + mypy --strict clean, golden completeness proof exits 0. No public API or manifest-hash-affecting change beyond the added eu-ai-act disclosure.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: fba2054536

chatgpt-codex-connector · 2026-06-01T07:24:32Z

+    "in this manifest BIND ONLY the provider of a high-risk AI system within "
+    "the meaning of AI Act Art. 6(2) + Annex III (e.g. AI intended as a "
+    "medical device safety component, or otherwise listed in Annex III). If "


Include the Art. 6(1) high-risk route

For providers of AI that is itself a medical device or a safety component covered by Annex I product legislation, high-risk classification comes from AI Act Art. 6(1), not Art. 6(2) + Annex III. This new disclosure says Art. 10 binds only Art. 6(2)/Annex III providers and then tells non-Annex-III users Art. 10 does not apply, so an EU medical-device AI provider can receive a manifest that incorrectly disclaims mandatory data-governance obligations. Please include the Art. 6(1)/Annex I path or remove the medical-device example from the Annex III-only gate.
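
The reviewer's point can be sketched as a gate that recognises both high-risk routes. This is only an illustration of the suggested logic, under the assumption that applicability is decided from a provider flag and a classification route; `HighRiskRoute` and `art10_binds` are hypothetical names and do not appear in the diff.

```python
from enum import Enum


class HighRiskRoute(Enum):
    """How a system became high-risk under the AI Act (illustrative labels)."""

    ART_6_1_ANNEX_I = "Art. 6(1) + Annex I product legislation"
    ART_6_2_ANNEX_III = "Art. 6(2) + Annex III"
    NOT_HIGH_RISK = "not high-risk"


def art10_binds(is_provider: bool, route: HighRiskRoute) -> bool:
    """Art. 10 applies to providers of high-risk systems via either route."""
    return is_provider and route is not HighRiskRoute.NOT_HIGH_RISK


# A medical-device AI provider is high-risk via Art. 6(1), not Annex III.
print(art10_binds(True, HighRiskRoute.ART_6_1_ANNEX_I))
# Someone who merely de-identifies data is not bound.
print(art10_binds(False, HighRiskRoute.NOT_HIGH_RISK))
```

An Annex III-only check would return `False` for the first case, which is the incorrect disclaimer the review comment warns about.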

chatgpt-codex-connector Bot reviewed Jun 1, 2026

Ces107 merged commit 7e02c52 into main Jun 1, 2026
4 checks passed

Ces107 deleted the fix/cf-18-cf-14-citation-tests branch June 1, 2026 07:24
