v3.5: canonical reference data layer — role-tiers + first-party-appids + transport-actions + pp-connectors #388

Description

@Daren9m

Background

Across the v3.4.0 audit umbrella (#326), four separate audits independently proposed introducing curated reference data files. The pattern is consistent enough to warrant coordinated treatment as a v3.5 release theme: a canonical reference data layer for cross-consumer M365 governance.

The four proposed files

| File | Surfaced in | Purpose |
| --- | --- | --- |
| data/role-tiers.json | #328 (PIM, #373) | Tier-0 / Tier-1 / Tier-2 role inventory referenced by all PIM detection logic |
| data/microsoft-first-party-appids.json | #361 (M365-Assess feedback) | Canonical Microsoft owner-tenant + AppId allowlist for service principal classification |
| data/transport-rule-actions.json | #339 (mail flow, #381) | ~50 transport rule action types classified as benign / suspicious / hostile |
| data/power-platform-connectors.json | #336 (Power Platform, #384) | Power Platform connectors with recommended classifications (Business / Non-Business / Blocked) |

Why coordinate as a single release theme

These files share strong design characteristics:

  • Curated reference data, not crosswalk data. They classify or enumerate Microsoft-controlled artifacts, not framework mappings.
  • Consumer-side authoritative lookups. Each consumer (M365-Assess, Az-Assess, EZ-CMMC) needs the same data. If CheckID owns the files, consumers don't have to reimplement them.
  • Update cadence is product-driven. When Microsoft adds a Power Platform connector or rebrands a role, the file needs a refresh in one place; that should not be a per-consumer concern.
  • Schema design opportunity. Establishing a common JSON shape ({ id, displayName, classification, source, lastReviewed }) across all four files makes consumer ingestion uniform.
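
To make the "uniform ingestion" point concrete, here is a minimal sketch of a consumer-side loader for the proposed common shape. It assumes each file is a top-level JSON array of records, which this issue does not specify; `ReferenceEntry`, `load_entries`, and the sample record are hypothetical names and data used only for illustration.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class ReferenceEntry:
    """One record in the proposed common shape (field names from this issue)."""
    id: str
    displayName: str
    classification: str
    source: str
    lastReviewed: str  # ISO date string


def load_entries(text: str) -> dict[str, ReferenceEntry]:
    """Parse a JSON array of records and index them by id.

    Assumption: the file is a top-level JSON array. File-specific
    extra fields are ignored by this common loader.
    """
    index: dict[str, ReferenceEntry] = {}
    for raw in json.loads(text):
        entry = ReferenceEntry(
            id=raw["id"],
            displayName=raw["displayName"],
            classification=raw["classification"],
            source=raw["source"],
            lastReviewed=raw["lastReviewed"],
        )
        index[entry.id] = entry
    return index


# Placeholder record, not real curated data.
SAMPLE = """[
  {"id": "example-connector", "displayName": "Example Connector",
   "classification": "Business", "source": "msft-1p",
   "lastReviewed": "2025-01-31", "publisher": "Example"}
]"""

entries = load_entries(SAMPLE)
print(entries["example-connector"].classification)
```

The same loader would work for all four files, since only the allowed `classification` values and the file-specific extras differ.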

Scope of work

Schema design

  • Define a canonical reference data file shape in data/registry.schema.json (or sibling data/reference-data.schema.json)
  • Common fields: id, displayName, classification (enum per file), source (URL or "msft-1p"), lastReviewed (ISO date)
  • File-specific fields per data type
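
The common-field rules above could be checked along these lines. This is a sketch under stated assumptions, not the planned schema: `validate_entry` and `COMMON_FIELDS` are hypothetical names, "URL" is interpreted as an http(s) URL, and the per-file enum is passed in by the caller.

```python
from datetime import date
from urllib.parse import urlparse

# Common fields listed in this issue.
COMMON_FIELDS = ("id", "displayName", "classification", "source", "lastReviewed")


def validate_entry(entry: dict, allowed: set[str]) -> list[str]:
    """Return a list of problems with one record; an empty list means valid."""
    errors = []
    for field in COMMON_FIELDS:
        if not isinstance(entry.get(field), str) or not entry[field]:
            errors.append(f"{field}: missing or not a non-empty string")
    if errors:
        return errors  # later checks need all fields present

    # classification must be one of the values allowed for this file.
    if entry["classification"] not in allowed:
        errors.append(f"classification: {entry['classification']!r} not allowed")

    # source is either the literal "msft-1p" or a URL
    # (assumption: http/https only).
    source = entry["source"]
    if source != "msft-1p":
        parsed = urlparse(source)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            errors.append("source: expected a URL or 'msft-1p'")

    # lastReviewed must parse as an ISO date.
    try:
        date.fromisoformat(entry["lastReviewed"])
    except ValueError:
        errors.append("lastReviewed: not an ISO date")
    return errors
```

For example, with the Power Platform enum `{"Business", "Non-Business", "Blocked"}`, a record whose classification is `"Maybe"` would be reported rather than silently accepted.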

Per-file authoring

Build pipeline integration

  • Each file has a Build-*.py or curation script if its data is sourced externally (similar to Build-CisM365Crosswalk.py)
  • Validation gate in tests/registry-integrity.Tests.ps1
  • Schema docs in docs/SCHEMA-MIGRATION-3.x.md
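
One property a Build-*.py script might aim for is deterministic output, so that a refresh produces a small, reviewable diff. The sketch below illustrates that idea only; `render_reference_file` is a hypothetical helper, and the choice to sort by lowercased id and reject duplicate ids is an assumption, not something this issue prescribes.

```python
import json


def render_reference_file(entries: list[dict]) -> str:
    """Serialize records in a stable order so reruns give identical bytes."""
    seen = set()
    for entry in entries:
        # Duplicate ids would make consumer lookups ambiguous, so fail early.
        if entry["id"] in seen:
            raise ValueError(f"duplicate id: {entry['id']}")
        seen.add(entry["id"])

    # Sort records by id (case-insensitive) and keys alphabetically.
    ordered = sorted(entries, key=lambda e: e["id"].lower())
    return json.dumps(ordered, indent=2, sort_keys=True, ensure_ascii=False) + "\n"
```

With this approach the input order from the external source does not affect the committed file.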

Consumer-facing documentation

  • docs/REFERENCE-DATA.md — explain what these files are, how consumers should ingest them, schema contract
  • Each detection method appendix (per audit doc) updated with reference-data lookup pattern
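
A reference-data lookup pattern for detection logic might look like the following sketch. The names `build_index` and `classify`, the `"unknown"` fallback, and the case-insensitive key normalisation are all assumptions for illustration; the sample ids are placeholders, not entries from any real file.

```python
def build_index(entries: list[dict]) -> dict[str, dict]:
    """Index records by normalised id (assumption: ids compare case-insensitively)."""
    return {e["id"].strip().lower(): e for e in entries}


def classify(index: dict[str, dict], key: str, default: str = "unknown") -> str:
    """Return the curated classification, or an explicit fallback.

    An id missing from the reference data is reported as `default`
    instead of being treated as if it had a benign classification.
    """
    entry = index.get(key.strip().lower())
    return entry["classification"] if entry else default


# Placeholder records using the transport-rule classification values.
index = build_index([
    {"id": "ExampleActionA", "classification": "benign"},
    {"id": "ExampleActionB", "classification": "hostile"},
])

print(classify(index, "exampleactionb"))
print(classify(index, "NotInTheFile"))
```

Keeping the fallback explicit lets each consumer decide how to treat artifacts that Microsoft has added since the file's last review.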

Acceptance criteria

  • All 4 files created and populated
  • Common schema shape applied
  • Build/validation gates in place
  • Consumer documentation published
  • CHANGELOG documents the new layer
  • M365-Assess #887 (microsoft-first-party-appids consumer integration) coordinated

Source

v3.4.0 audit umbrella #326. Files independently proposed in #328 / #336 / #339 / #361.
