diff --git a/.planning/codebase/ARCHITECTURE.md b/.planning/codebase/ARCHITECTURE.md new file mode 100644 index 0000000000..4014b7341a --- /dev/null +++ b/.planning/codebase/ARCHITECTURE.md @@ -0,0 +1,161 @@ +# Architecture + +**Analysis Date:** 2026-01-21 + +## Pattern Overview + +**Overall:** Layered Event-Driven Architecture with Domain-Driven Design + +**Key Characteristics:** +- Observable-based event system using pub/sub pattern (LifeCycle) +- Clear dependency layering: core → rum-core → rum/logs/rum-react +- Assembly pattern for event enrichment and transformation +- Modular domain-driven structure with collection modules per concern +- Batch-based transport with compression support + +## Layers + +**Core Layer:** +- Purpose: Foundation library providing shared utilities and primitives +- Location: `packages/core/src` +- Contains: Browser APIs, observables, transport, session management, configuration, error handling, telemetry +- Depends on: None (base layer) +- Used by: All other packages (rum-core, logs, rum-react, worker, flagging) + +**RUM Core Layer:** +- Purpose: Real User Monitoring business logic without UI-specific features +- Location: `packages/rum-core/src` +- Contains: Event collection (views, actions, errors, resources, long tasks, vitals), RUM assembly, contexts, tracing +- Depends on: @datadog/browser-core +- Used by: rum, rum-slim, rum-react packages + +**Product Layer:** +- Purpose: User-facing SDK packages with specific feature sets +- Location: `packages/rum`, `packages/rum-slim`, `packages/logs`, `packages/rum-react` +- Contains: Public APIs, entry points, product-specific features (session replay, profiling, React integration) +- Depends on: core and/or rum-core +- Used by: End-user applications + +**Transport Layer:** +- Purpose: Data transmission to Datadog backend +- Location: `packages/core/src/transport`, `packages/rum-core/src/transport` +- Contains: Batching, compression, HTTP requests, event bridge, flush control +- Depends 
on: Core utilities +- Used by: Collection modules + +**Browser Integration Layer:** +- Purpose: Browser API observation and instrumentation +- Location: `packages/core/src/browser`, `packages/rum-core/src/browser` +- Contains: XHR/Fetch observables, performance observables, DOM mutation tracking, location change tracking +- Depends on: Core utilities +- Used by: Collection modules + +## Data Flow + +**RUM Event Collection Flow:** + +1. Browser event occurs (click, XHR, error, etc.) +2. Observable captures raw event (xhrObservable, performanceObservable, etc.) +3. Collection module processes event (actionCollection, resourceCollection, errorCollection) +4. Collection emits RAW_RUM_EVENT_COLLECTED to LifeCycle with domain context +5. Assembly module enriches event with contexts (view, session, user, global) +6. Assembly emits RUM_EVENT_COLLECTED with fully assembled event +7. Batch collects events and manages buffer +8. FlushController triggers flush based on size/time/page exit +9. Encoder compresses batch (optional, via worker) +10. HttpRequest sends to Datadog intake + +**Context Enrichment:** + +1. Context managers maintain state (viewHistory, sessionContext, userContext, globalContext) +2. Assembly reads current context values at event time +3. Hooks allow custom transformation via beforeSend callbacks +4. Rate limiters and telemetry track event volumes + +**State Management:** +- Session state persisted in cookies/localStorage +- View history maintained in memory with expiration +- Context managers use Observable pattern for updates +- ValueHistory tracks time-based context changes + +## Key Abstractions + +**LifeCycle:** +- Purpose: Central event bus for SDK-internal communication +- Examples: `packages/rum-core/src/domain/lifeCycle.ts` +- Pattern: Type-safe pub/sub with enum-based event types (AUTO_ACTION_COMPLETED, RAW_RUM_EVENT_COLLECTED, RUM_EVENT_COLLECTED, etc.) 
+ +**Observable:** +- Purpose: Reactive data streams for browser events +- Examples: `packages/core/src/tools/observable.ts`, `packages/core/src/browser/xhrObservable.ts`, `packages/core/src/browser/fetchObservable.ts` +- Pattern: Subscribe/unsubscribe with typed callbacks, buffering support + +**Collection Modules:** +- Purpose: Domain-specific event capture and processing +- Examples: `packages/rum-core/src/domain/action/actionCollection.ts`, `packages/rum-core/src/domain/resource/resourceCollection.ts`, `packages/rum-core/src/domain/error/errorCollection.ts` +- Pattern: Subscribe to observables, emit to LifeCycle, manage domain state + +**Assembly:** +- Purpose: Event enrichment and transformation pipeline +- Examples: `packages/rum-core/src/domain/assembly.ts` +- Pattern: Combine raw events with contexts, apply hooks, validate modifications, emit assembled events + +**Context Managers:** +- Purpose: Stateful context tracking with customer data +- Examples: `packages/core/src/domain/contexts/userContext.ts`, `packages/rum-core/src/domain/contexts/viewHistory.ts` +- Pattern: ContextManager interface with set/get/remove, validation, storage sync + +**Batch:** +- Purpose: Event buffering and transmission +- Examples: `packages/core/src/transport/batch.ts` +- Pattern: Add messages to buffer, upsert by key, flush on trigger, encode before send + +## Entry Points + +**RUM Full (with Session Replay):** +- Location: `packages/rum/src/entries/main.ts` +- Triggers: Application calls `datadogRum.init()` +- Responsibilities: Initialize RUM core, start recorder API, start profiler API, create deflate encoder, expose DD_RUM global + +**RUM Slim (without Session Replay):** +- Location: `packages/rum-slim/src/entries/main.ts` +- Triggers: Application calls `datadogRum.init()` +- Responsibilities: Initialize RUM core with stub recorder, lighter bundle size + +**RUM Core Bootstrap:** +- Location: `packages/rum-core/src/boot/startRum.ts` +- Triggers: Called by product packages 
(rum, rum-slim) +- Responsibilities: Start all collection modules, initialize contexts, start transport, wire up LifeCycle subscriptions + +**Logs:** +- Location: `packages/logs/src/entries/main.ts` +- Triggers: Application calls `datadogLogs.init()` +- Responsibilities: Initialize log collection, start transport, expose DD_LOGS global + +**React Plugin:** +- Location: `packages/rum-react/src/entries/main.ts` +- Triggers: Application passes plugin to `datadogRum.init()` +- Responsibilities: React error boundary, performance tracking, React Router integration + +## Error Handling + +**Strategy:** Monitored execution with fallback to ensure SDK never breaks host application + +**Patterns:** +- `monitor()` and `monitored()` decorators wrap SDK functions to catch and report internal errors +- `catchUserErrors()` wraps user-provided callbacks to isolate user code failures +- `trackRuntimeError()` reports SDK errors to telemetry without throwing +- `computeRawError()` normalizes error objects with stack traces +- Multiple error sources tracked: console.error, window.onerror, unhandledrejection, CSP violations, ReportingObserver + +## Cross-Cutting Concerns + +**Logging:** Console display utilities with debug mode, warning/error helpers in `packages/core/src/tools/display.ts` + +**Validation:** Type checking and sanitization in context managers, JSON schema validation for remote config, field modification limits in assembly + +**Authentication:** Customer API keys in configuration, session tokens in cookies, synthetics test detection, tracking consent state management + +--- + +*Architecture analysis: 2026-01-21* diff --git a/.planning/codebase/CONCERNS.md b/.planning/codebase/CONCERNS.md new file mode 100644 index 0000000000..579d94c8ad --- /dev/null +++ b/.planning/codebase/CONCERNS.md @@ -0,0 +1,269 @@ +# Codebase Concerns + +**Analysis Date:** 2026-02-16 + +## Tech Debt + +**Next-Major Deferred Cleanup (10 items):** +- Issue: Ten `TODO next major` or 
`TODO(next-major)` comments defer breaking changes to the next major version. These accumulate complexity and make each option harder to remove. +- Files: + - `packages/rum-core/src/domain/configuration/configuration.ts:186` - `enablePrivacyForActionName` should become the default + - `packages/rum-core/src/boot/preStartRum.ts:68` - Remove `globalContextManager`, `userContextManager`, `accountContextManager` from pre-start strategy + - `packages/rum-core/src/boot/rumPublicApi.ts:614` - Remove `strategy` from plugin `onRumStart` callback + - `packages/rum-core/src/boot/rumPublicApi.ts:724` - Decide on relative time support for `addTiming` + - `packages/rum-core/src/browser/performanceObservable.ts:241` - Remove performance entry fallback + - `packages/core/src/domain/configuration/configuration.ts:255` - Remove `internalAnalyticsSubdomain` option, replace with `proxyFn` + - `packages/core/src/browser/fetchObservable.ts:46` - Remove "WAIT" action when `trackEarlyRequests` is removed + - `packages/core/src/tools/readBytesFromStream.ts:6` - Always collect stream body when `trackEarlyRequests` is removed + - `packages/logs/src/boot/preStartLogs.ts:42` - Same context manager cleanup as rum-core + - `packages/rum-react/test/reactOldBrowsersSupport.ts:5` - Bump browser targets for `measureOptions` +- Impact: Increasing configuration surface area with deprecated options that cannot be safely removed until the next major. Makes the init path harder to reason about. +- Fix approach: Track all items in a v7 migration document. Group them for a single coordinated major version bump. 
+ +**Generated Type Files Checked Into Source:** +- Issue: Large auto-generated type files (2000+ lines) from JSON schemas committed to version control with `/* eslint-disable */` +- Files: `packages/rum-core/src/rumEvent.types.ts` (2184 lines), `packages/core/src/domain/telemetry/telemetryEvent.types.ts` (942 lines), `packages/rum/src/types/sessionReplay.ts` (991 lines) +- Impact: Massive diffs on schema changes, no linting on generated code, IDE performance degradation when these files are open +- Fix approach: Generate at build time rather than checking in, or split schemas into smaller logical units + +**Flagging Package is a Stub:** +- Issue: `packages/flagging` is a placeholder with zero implementation and a hardcoded `console.log` +- Files: `packages/flagging/src/hello.ts` (entire file is a TODO), `packages/flagging/src/entries/main.ts` +- Impact: Workspace overhead for a non-functional package. Dependency version (`@datadog/browser-core: 6.22.0`) is stale vs. the monorepo version (`6.27.1`), indicating it is not maintained alongside other packages. +- Fix approach: Either implement the feature or remove the package from the workspace until it is ready + +**Old Cookie Migration Code:** +- Issue: Migration code for legacy cookie format (`_dd`, `_dd_r`, `_dd_l`) permanently kept with comment: "This migration should remain in the codebase as long as older versions are available/live" +- Files: `packages/core/src/domain/session/oldCookiesMigration.ts` (42 lines), called from `packages/core/src/domain/session/storeStrategies/sessionInCookie.ts` +- Impact: Indefinite technical debt with no expiration date. Runs on every cookie-based session initialization. +- Fix approach: Define a migration cutoff version (e.g., only support migrations from v5+). Add a `monitor-until` deadline. 
+ +**Deprecated APIs Still Maintained:** +- Issue: Multiple deprecated types and methods remain exported +- Files: + - `packages/rum/src/entries/main.ts:35` - `RumGlobal` deprecated for `DatadogRum` + - `packages/rum-slim/src/entries/main.ts:24` - Same deprecation + - `packages/logs/src/entries/main.ts:29` - `LogsGlobal` deprecated for `DatadogLogs` + - `packages/core/src/tools/boundedBuffer.ts:6-19` - `BoundedBuffer` deprecated for `BufferedObservable` + - `packages/core/src/domain/configuration/configuration.ts:106` - `allowFallbackToLocalStorage` deprecated for `sessionPersistence: 'local-storage'` + - `packages/rum-core/src/boot/rumPublicApi.ts:280-283` - `setUser` without required `id` deprecated +- Impact: Increases API surface, documentation burden, and test maintenance +- Fix approach: Batch-remove all deprecated APIs in the v7 major release + +**Module-Level Mutable State (Global Singletons):** +- Issue: At least 16 `let` declarations at module scope in `packages/core/src/` create shared mutable state +- Files: + - `packages/core/src/tools/experimentalFeatures.ts:28` - `enabledExperimentalFeatures` singleton shared between RUM and Logs + - `packages/core/src/domain/telemetry/telemetry.ts:79` - `telemetryObservable` module-level variable + - `packages/core/src/tools/monitor.ts:3-4` - `onMonitorErrorCollected` and `debugMode` globals + - `packages/core/src/tools/valueHistory.ts:33-35` - `cleanupHistoriesInterval` and `cleanupTasks` globals + - `packages/core/src/tools/utils/timeUtils.ts:102` - `navigationStart` cache +- Impact: Shared state between RUM and Logs products when used via NPM can cause unexpected cross-product side effects. Makes testing harder (requires explicit reset functions). Comment in `experimentalFeatures.ts` acknowledges: "an experimental flag set on the RUM product will be set on the Logs product." +- Fix approach: Move shared state into an initialization context object passed through the call chain. 
Short-term: document all singletons and their cross-product implications. + +## Known Bugs + +**Session Replay Serialization Privacy Bugs (3 documented):** +- Symptoms: Incorrect masking/unmasking behavior for certain form input types +- Files: `packages/rum/src/domain/record/serialization/serializeAttributes.spec.ts` + - Line 101: `value` treated as `maskable-image` instead of `maskable` ("TODO: This is a bug!") + - Line 137: `` has inconsistent `maskUnlessAllowlisted` behavior ("TODO: This is almost certainly a bug") + - Line 173: `` falls back to DOM attribute value when `HTMLInputElement#value` is falsy ("TODO: This is a bug!") +- Trigger: Elements with the above types are serialized for Session Replay under various privacy levels +- Workaround: Tests document the incorrect behavior and treat it as expected for now + +**Firefox Worker Error Handling:** +- Symptoms: Deflate worker errors handled differently across browsers +- Files: `packages/rum/src/domain/deflate/deflateWorker.ts` +- Trigger: Worker initialization failure - Chromium throws exception, Firefox fires error event +- Workaround: Code handles both patterns. Comment references https://bugzilla.mozilla.org/show_bug.cgi?id=1736865#c2 + +**Session Replay Race Condition on Page Unload:** +- Symptoms: Last segment may not be sent before page unload when using Web Worker +- Files: `test/e2e/lib/framework/flushEvents.ts` (lines 10-23) +- Trigger: Fast page navigation where Worker async communication completes after beforeunload event +- Workaround: E2E tests use a delayed endpoint (`/ok?duration=200`) to allow Worker time to send requests + +**Privacy CSS Element Order Not Preserved:** +- Symptoms: When ignoring `