feat(gdpr): IP/privacy audit (PR2 of #6701)#7547
Conversation
Review Summary by QodoGDPR IP/privacy audit: tri-state ipLogging setting and leak fixes
WalkthroughsDescription• Replace boolean disableIPlogging with tri-state ipLogging setting - Supports "anonymous" (default), "truncated", "full" modes - Deprecation shim maps legacy boolean for one release cycle • Fix four IP logging leaks where disableIPlogging was silently ignored - Rate-limit warnings in PadMessageHandler and import/export - Authentication failure/success logs in webaccess • Remove dead clientVars.clientIp placeholder (always '127.0.0.1', never read) • Add comprehensive privacy documentation and operator-facing IP handling guide Diagramflowchart LR
A["Legacy disableIPlogging<br/>boolean"] -->|"Deprecation shim<br/>at load time"| B["New ipLogging<br/>tri-state"]
B -->|"'anonymous'"| C["ANONYMOUS<br/>in logs"]
B -->|"'truncated'"| D["Last octet zeroed<br/>IPv4/IPv6 /48"]
B -->|"'full'"| E["Full IP<br/>in logs"]
F["5 logging sites<br/>PadMessageHandler<br/>SocketIORouter<br/>webaccess<br/>importexport"] -->|"Route through<br/>anonymizeIp helper"| B
G["clientVars.clientIp<br/>placeholder"] -->|"Remove<br/>dead code"| H["Cleaner type<br/>surface"]
I["doc/privacy.md"] -->|"Operator<br/>documentation"| J["Clear IP handling<br/>statement"]
File Changes1. src/node/utils/anonymizeIp.ts
|
Code Review by Qodo
1. IP log-site fixes untested
|
| httpLogger.info( | ||
| `Failed authentication from IP ${anonymizeIp(req.ip, settings.ipLogging)}`); | ||
| if (await aCallFirst0('authnFailure', {req, res})) return; | ||
| if (await aCallFirst0('authFailure', {req, res, next})) return; | ||
| // No plugin handled the authentication failure. Fall back to basic authentication. |
There was a problem hiding this comment.
1. Ip log-site fixes untested 📘 Rule violation ☼ Reliability
The PR fixes multiple IP logging leaks, but the added tests only validate anonymizeIp() and shim mapping, not that the updated log call sites actually use it. This risks regressions where raw IPs could reappear in logs without failing CI.
Agent Prompt
## Issue description
Bug fixes for IP logging leaks are not covered by a regression test that would fail if a log site reverted to logging raw IPs.
## Issue Context
The PR updates multiple log call sites (auth logs, rate-limit warnings, socket connect logs) to route through `anonymizeIp(...)`. The newly added tests validate the helper and shim behavior but do not exercise the actual log-producing code paths.
## Fix Focus Areas
- src/node/hooks/express/webaccess.ts[182-212]
- src/node/hooks/express/importexport.ts[20-24]
- src/node/handler/PadMessageHandler.ts[279-285]
- src/node/handler/SocketIORouter.ts[62-66]
- src/tests/backend/specs/ipLoggingSetting.ts[1-70]
ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
| */ | ||
| disableIPlogging: false, | ||
| ipLogging: 'anonymous', |
There was a problem hiding this comment.
2. iplogging changes default behavior 📘 Rule violation ⚙ Maintainability
ipLogging defaults to anonymous, changing the default logging behavior rather than keeping prior behavior unless explicitly enabled. This introduces a behavior change without a feature flag gating it off by default.
Agent Prompt
## Issue description
The new `ipLogging` functionality changes default behavior (`anonymous`) rather than preserving the prior default behavior unless explicitly enabled.
## Issue Context
Compliance requires new behavior to be off by default (behavior unchanged) unless explicitly enabled via a flag/config.
## Fix Focus Areas
- src/node/utils/Settings.ts[499-503]
- settings.json.template[475-493]
- settings.json.docker[478-489]
ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
| export const anonymizeIp = (ip: string | null | undefined, mode: IpLogging): string => { | ||
| if (ip == null || ip === '') return 'ANONYMOUS'; | ||
| if (mode === 'anonymous') return 'ANONYMOUS'; | ||
| if (mode === 'full') return ip; | ||
| // truncated | ||
| const mapped = IPV4_MAPPED.exec(ip); | ||
| if (mapped != null) return `::ffff:${mapped[1].replace(/\.\d+$/, '.0')}`; | ||
| switch (isIP(ip)) { | ||
| case 4: return ip.replace(/\.\d+$/, '.0'); | ||
| case 6: return truncateIpv6(ip); | ||
| default: return 'ANONYMOUS'; | ||
| } |
There was a problem hiding this comment.
3. Iplogging not validated 🐞 Bug ≡ Correctness
If settings.ipLogging is set to an unsupported value (including null), anonymizeIp() will treat it as "truncated" instead of failing or falling back to the documented default "anonymous". This can silently change privacy behavior and produce unexpected partially-redacted IPs in logs.
Agent Prompt
## Issue description
`settings.ipLogging` is consumed at runtime as a trusted union, but config loading does not validate it. Any invalid value silently causes `anonymizeIp()` to behave as if mode is `truncated`.
## Issue Context
- `storeSettings()` copies parsed config values into `settings` with no schema validation.
- `anonymizeIp()` treats all non-`anonymous`/`full` values as `truncated`.
## Fix Focus Areas
- src/node/utils/Settings.ts[961-996]
- src/node/utils/anonymizeIp.ts[23-34]
## What to change
1. In `reloadSettings()` (after `storeSettings(...)`), validate `settings.ipLogging`.
- Allowed values: `anonymous`, `truncated`, `full`.
- If invalid (including null/undefined), `logger.error` or `logger.warn` and set `settings.ipLogging = 'anonymous'`.
2. Consider making `anonymizeIp()` defensive:
- Use an explicit `switch (mode)` with a `default` that returns `'ANONYMOUS'` (privacy-safe) or throws in dev/test.
3. Add/adjust a backend test asserting invalid `ipLogging` values are rejected or normalized to `'anonymous'`.
ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
Second of five GDPR PRs (#6701). Audit identifies four log-sites that leak IPs despite disableIPlogging=true, proposes a tri-state ipLogging setting with a back-compat shim, and specifies a doc/privacy.md that documents Etherpad's actual IP handling. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
7 TDD-structured tasks: anonymizeIp helper + unit tests, tri-state ipLogging setting with disableIPlogging deprecation shim, wiring through 5 leaking log sites, clientVars.clientIp removal, access-log integration test, doc/privacy.md, and PR handoff. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes four leaks where disableIPlogging was silently ignored (rate-limit warn, both auth-log calls in webaccess, import/export rate-limit warn) and normalises the four that did honour the flag onto the new ipLogging tri-state via the shared helper.
Server side: remove the literal '127.0.0.1' assignments from both clientVars and collab_client_vars. Type side: drop clientIp from ClientVarPayload and ServerVar. pad.getClientIp now returns the same '127.0.0.1' literal as a plugin-compat shim (pad_utils.uniqueId still uses it as a prefix).
14d3218 to
7fb5c05
Compare
Qodo review: - settings.ipLogging is loaded as a trusted union but nothing enforced the shape. An unknown value (e.g. a typo or null) silently fell through to anonymizeIp's "truncated" branch and emitted partially redacted IPs. Fall back to "anonymous" with a WARN at load time. - New regression test scans the four known log-sites for raw req.ip / socket.request.ip / request.ip inside logger calls that don't wrap through anonymizeIp / logIp, so a future edit that re-introduces a raw IP fails CI. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
disableIPlogging=true: thePadMessageHandlerrate-limit warn, both auth-log calls inwebaccess, and the import/export rate-limit warn.ipLogging: "full" | "truncated" | "anonymous"(default"anonymous"). Every IP log site now routes through a singleanonymizeIp()helper.disableIPloggingworking for one release via a load-time shim that mapstrue→"anonymous",false→"full"and emits a WARN when only the legacy setting is set.clientVars.clientIpplaceholder (always'127.0.0.1'; no client code read it).pad.getClientIp()is retained as a plugin-compat shim returning the same literal.doc/privacy.mddocuments exactly what Etherpad logs, where, and how to configure it.Part of the GDPR work tracked in #6701. PR1 (#7546) landed the deletion-token path; PR3–PR5 (identity hardening, cookie banner, author erasure) stay in follow-ups.
Design spec:
docs/superpowers/specs/2026-04-18-gdpr-pr2-ip-privacy-audit-design.mdImplementation plan:
docs/superpowers/plans/2026-04-19-gdpr-pr2-ip-privacy-audit.mdTest plan
pnpm --filter ep_etherpad-lite run ts-check