docs(security): explicit security model, role/capability matrix, AGENTS.md linkage#3
Open
hbrooks wants to merge 7 commits into
Open
docs(security): explicit security model, role/capability matrix, AGENTS.md linkage#3hbrooks wants to merge 7 commits into
hbrooks wants to merge 7 commits into
Conversation
…TS.md linkage Adds a Security Model section to .github/SECURITY.md so the file is now the canonical, principle-based source of truth for what Apache Superset considers a vulnerability. The section is written in terms of trust boundaries (Admin / Operator / Codebase) and a role-and-capability matrix rather than enumerating specific files or libraries, so the model stays valid as the codebase evolves and is unambiguous for both human reporters and automated scanners. The Vulnerability Scope section is reframed around a single test: "Does this finding let a principal perform an action the role and capability matrix does not entitle them to?" In-scope and out-of-scope class lists are illustrative applications of that test, not exhaustive enumerations. AGENTS.md gains a top-level "Security and Threat Model" section that directs LLM agents and automated tooling to .github/SECURITY.md first, inlines the trust-boundary summary and the canonical authorization pattern, and cross-references the matrix. The pre-existing "Architecture Patterns > Security & Features" block is left in place with a one-line pointer to the new section.
Applies eight of ten review suggestions:
- Trust Boundaries: the Admin claim now lists concrete capabilities
(database registration, arbitrary SQL through SQL Lab, Jinja
rendering, configuration override) that make the OS-trust-equivalent
framing unavoidable, so the CNA Operational Rules 4.1 citation
supports the reasoning rather than replacing it.
- Trust Boundaries: adds the fail-closed / fail-open asymmetry to the
operator boundary so the boundary cannot be read as a blanket
disclaimer for insecure defaults shipped by the codebase.
- Roles and Capabilities: explicit note that the Public role follows
the same custom-role rule (operator grants shift the boundary for
that deployment, do not redefine the model).
- In Scope: SSRF bullet now describes a concrete test ("Apache
Superset itself, not the operator, controls the outbound destination
set") rather than the circular "codebase is responsible for
restricting".
- In Scope: new bullet for insecure-by-default behavior, pairing with
the codebase-fail-closed responsibility above.
- Out of Scope: third-party dependency clause split into two cases
(transitive / operator-selected: out; direct-pinned-vulnerable or
misused-by-codebase: in).
- AGENTS.md: authorization-pattern paragraph qualifies "not a finding
by itself" with "on authorization grounds", and notes that
injection, SSRF, XSS, or other classes are evaluated separately.
- AGENTS.md: new requirements block for automated tooling, requiring
each finding to name the matrix row violated and the assumed
principal; findings that cannot identify both are filed as questions,
not vulnerabilities.
Two suggestions intentionally not applied:
- "Theoretical attack vectors" wording kept strict on purpose; softening
to "deprioritized but reopenable on PoC" invites every speculative
submission and undermines the triage filter.
- Outcome of Reports section reviewed for consistency with the new
model; no rewording needed.
The previous wording "Theoretical attack vectors without a demonstrable, reproducible exploit" was the one place in the document that used "demonstrable" as a load-bearing word without distinguishing demonstrability from exploitability. That conflated two different filter outcomes (the reporter did not bother to demonstrate vs. the finding is not exploitable) into a single permanent closure, which is the same kind of policy laundering the rest of the model is structured to avoid. The replacement is no less strict in practice (the burden still rests with the reporter; findings without a PoC are still closed), it just names what the filter is filtering for and removes the permanence claim by allowing a refile if a proof of concept is later produced.
The Alpha and sql_lab rows did not match either superset/security/manager.py
or docs/admin_docs/security/security.mdx. Fixed:
- Alpha read column: now "all data sources", matching
ALPHA_ONLY_PERMISSIONS = {"all_database_access", "all_datasource_access"}
in superset/security/manager.py and the public docs ("Alpha users have
access to all data sources").
- Alpha SQL column: now "no by default (requires the sql_lab role)".
SQLLAB_ONLY_PERMISSIONS are gated to the sql_lab role
(superset/security/manager.py: set_role("sql_lab", ...) at line 1515)
and the docs explicitly state that Alpha needs sql_lab on a per
database basis.
- Alpha manage-databases column: footnoted as "data upload to existing
databases only" to capture ALPHA_ONLY_PMVS = {("can_upload", "Database")}.
- Gamma SQL column: same correction as Alpha; needs sql_lab role.
- Gamma write column: "own and explicitly shared" replaced with the
doc-accurate "own charts and dashboards on granted datasets" ("shared"
was not a real Superset concept).
- Removed the standalone "sql_lab" row, which presented the additive role
as a peer principal. Added an explicit note below the table naming it
additive and tying its scope to the base role's database grants.
- Embedded guest token: tightened "datasets bound to the embedded dashboard"
to "data sources bound to the dashboards in the token's resources claim",
matching the JWT structure in superset/security/guest_token.py.
The model itself is unchanged; only its description in the matrix.
… enumeration Two changes: 1. Removed the parenthetical "(REST API, web UI, SQL Lab, embedded SDK, notification system, server-side workers)" from the codebase trust boundary. The phrase "across its product surface" carries the principle on its own; enumerating components reintroduces the blacklist pattern the rest of the document was structured to avoid, and dates the doc as new surfaces are added. 2. Added documented-control enforcement to Trust Boundary 3 and a matching bullet in the in-scope list. The previous model covered matrix bypass, injection, authentication bypass, SSRF, XSS, and insecure defaults, but had no explicit principle for the class of findings where an operator has enabled a Superset-documented control (denylists, allowlists, sanitizers, validators) and a user evades it via a parser quirk, dialect trick, encoding, or comment splice. The new bullet makes that class in scope and explicitly applies it to all principals including Admin, since the operator's configuration of the control is the policy the codebase failed to enforce.
… boundaries The previous commit overcommitted by putting Apache Superset on the hook for enforcing any operator-configurable control as if it were a security boundary. That positioning is wrong for most of the configurable hardening Apache Superset exposes (SQL function or table denylists, URI restrictions on already-authorized database connectors, sanitization passes layered on top of an already-authorized SQL Lab session): these are belt-and-braces controls operators can layer on top of the role and capability matrix, not firewall-grade guarantees the codebase commits to. Two related changes: 1. Trust Boundary 3: drop the broad documented-control responsibility added in the previous commit. The codebase commits to enforcing the matrix; configurable hardening is treated separately under Vulnerability Scope. 2. In-Scope bullet rewritten to cover only controls Apache Superset documents specifically as security boundaries (row-level security, access checks tied to the matrix, documented sandboxes, and similar features whose documentation positions them as security-relevant). 3. New Out-of-Scope bullet making the carve-out explicit: bypasses of configurable defense-in-depth hardening that the documentation does not position as a security boundary are hardening improvements, not vulnerabilities. This restores the intended boundary: the matrix is the line, hardening is opt-in layering on top of it.
…n wording, lift headings Six small fixes from a re-read of the full document. None of these change the security model; they remove an internal contradiction, sharpen ambiguous wording, and make the heading hierarchy match the substantive structure. 1. Remove the legacy third-party-dependency paragraph in "Outcome of Reports" that contradicted the case (a)/(b) paragraph added in the previous commit. The new paragraph stays; the old one is gone. 2. Tighten "Outcome of Reports" itself. "may be converted into public GitHub issues" is now "are typically converted into public GitHub issues, where the community can contribute fixes alongside the maintainers", and explicitly notes that the triage decision and reasoning are communicated back to the reporter in either case. 3. Trust Boundary 3: the standalone "Apache Superset is not, by default, a SQL or database firewall" sentence has been folded into the Out-of-Scope hardening bullet under Vulnerability Scope, where it belongs alongside the explicit carve-out. In its place, Trust Boundary 3 now scopes the codebase's commitments to "the role and capability matrix and to controls Apache Superset's own documentation explicitly positions as security boundaries", which makes the test for the in-scope "documented security boundary" bullet checkable instead of judgement-call. 4. Promote Trust Boundaries, Roles and Capabilities, and Vulnerability Scope from bold subsections to `###` headings. They are the substantive body of the Security Model and should carry section weight. Trailing sections (Outcome of Reports, Vulnerability Aggregation & CVE Attribution) stay as `**bold**` since they are lighter-weight footers and unchanged from before this PR. 5. Embedded guest token row: "data sources bound to the dashboards in the token's resources claim" tightened to "data sources reachable through the embedded dashboards the token authorizes" — same meaning, less JWT-claim jargon for non-technical reporters. 6. Drop "by default" from the firewall disclaimer. Apache Superset is not a SQL or database firewall; the qualifier invited the question "what if I configure it to be one?", which it cannot. MITRE CNA citations (Operational Rules 4.1 in the Admin trust boundary and Out-of-Scope, and 4.1.10/4.1.11/4.2.13 in Vulnerability Aggregation) are unchanged and remain in their original locations.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Originally PR apache#40503 in apache/superset by @sha174n