diff --git a/crates/mdk-core/CHANGELOG.md b/crates/mdk-core/CHANGELOG.md index fd8e036a..a72adb1b 100644 --- a/crates/mdk-core/CHANGELOG.md +++ b/crates/mdk-core/CHANGELOG.md @@ -25,6 +25,7 @@ ### Breaking changes +- **kind:445 group message and Welcome rumor wire formats now pad the encrypted plaintext to power-of-two buckets.** Senders pad the serialized MLS payload to a minimum of 512 bytes for kind:445 events and 1024 bytes for Welcome rumors, then round up to the next power of two for larger payloads. Receivers in this release accept both padded and legacy unpadded inputs, but older receivers using `MlsMessageIn::tls_deserialize_exact` will reject padded messages — all clients must upgrade. Closes [marmot-protocol/marmot-security#33](https://github.com/marmot-protocol/marmot-security/issues/33) (Welcome size leaks group member count) and [marmot-protocol/marmot-security#37](https://github.com/marmot-protocol/marmot-security/issues/37) (SelfRemove size leaks departure intent). ([#308](https://github.com/marmot-protocol/mdk/pull/308)) - **MIP-04 media upload/reference structs now include optional audio display metadata.** `EncryptedMediaUpload` and `MediaReference` both add `duration_ms: Option` and `waveform: Option>`; callers constructing these structs directly must provide the new fields. The IMETA parser reads optional NIP-A0 `duration` and `waveform` entries on a best-effort basis, and the tag builder emits them when provided. ([#300](https://github.com/marmot-protocol/mdk/pull/300)) - **Extension version bumped from 2 to 3**: `NostrGroupDataExtension::CURRENT_VERSION` is now `3`. The new version adds a `disappearing_message_secs` field to the TLS-serialized group data extension. Existing v1/v2 groups are forward-compatible (the field deserializes as `None`). `NostrGroupConfigData::new` now takes an additional `disappearing_message_secs: Option` parameter. ([#253](https://github.com/marmot-protocol/mdk/pull/253)) - **`NostrGroupDataExtension::migrate_to_v2` removed from public API.** The 0.8.0 method was used only to construct test fixtures; production code never migrates extensions in-place (deserialization upgrades v1/v2 → v3 through `into_v3`, and new groups author v3 directly). Mirrors the existing convention for `migrate_group_image_v1_to_v2`, which was already documented as internal-only. ([#253](https://github.com/marmot-protocol/mdk/pull/253)) @@ -41,6 +42,8 @@ ### Fixed +- Padded kind:445 message plaintexts and Welcome rumor payloads to fixed power-of-two buckets so a relay observer cannot fingerprint SelfRemove proposals by size or estimate group member count from gift-wrap event length. Empty SelfRemove `PublicMessage`s now share a bucket with short text messages, and small-group Welcomes share a bucket with each other regardless of exact member count. Closes [marmot-protocol/marmot-security#33](https://github.com/marmot-protocol/marmot-security/issues/33) and [marmot-protocol/marmot-security#37](https://github.com/marmot-protocol/marmot-security/issues/37). ([#308](https://github.com/marmot-protocol/mdk/pull/308)) +- Relaxed `process_mls_message` from `MlsMessageIn::tls_deserialize_exact` to non-strict `tls_deserialize` so the trailing zero padding bytes added by the new sender path are ignored. Both `process_mls_message` (kind:445) and `parse_serialized_welcome` (gift-wrapped Welcomes) now also reject non-zero trailing bytes, pinning the padding region to "zero only" so a malicious sender (including a group admin authoring a Welcome) cannot use the padding as a covert channel to the receiver. Legacy unpadded messages continue to deserialize successfully. ([#308](https://github.com/marmot-protocol/mdk/pull/308)) - Fixed admin auto-commit of legacy `Remove(self)` leaves in mixed/legacy groups so departing members are actually removed instead of silently remaining in the MLS group. ([#288](https://github.com/marmot-protocol/mdk/pull/288)) - Accepted `NostrGroupDataExtension` payloads from future versions with unknown trailing fields, per MIP-01's forward-compatibility requirement. Previously, any v(N+1) extension on the wire was rejected by `deserialize_bytes`, which would have bricked every group operation (commit/proposal/welcome/admin checks) the moment any peer authored a newer-version extension. ([#88](https://github.com/marmot-protocol/marmot-security/issues/88)) - Recorded rollback snapshots for locally merged admin-list updates so clients can converge on the MIP-03 winner when competing admins concurrently mutate `admin_pubkeys`. ([#289](https://github.com/marmot-protocol/mdk/pull/289)) diff --git a/crates/mdk-core/src/groups.rs b/crates/mdk-core/src/groups.rs index cee4bedd..2152b509 100644 --- a/crates/mdk-core/src/groups.rs +++ b/crates/mdk-core/src/groups.rs @@ -34,6 +34,7 @@ use crate::constant::{GROUP_CONTEXT_REQUIRED_EXTENSIONS, SUPPORTED_PROPOSALS}; use crate::error::Error; use crate::messages::EventTag; use crate::messages::crypto::encrypt_message_with_exporter_secret; +use crate::padding::{MESSAGE_PADDING_FLOOR, WELCOME_PADDING_FLOOR, pad_to_bucket}; use crate::util::{ContentEncoding, encode_content}; /// Result of creating a new MLS group @@ -2427,6 +2428,12 @@ where /// Per MIP-03, the encryption key is derived via `MLS-Exporter("marmot", "group-event", 32)`, /// a random 12-byte nonce is generated per event. No AAD is used per MIP-03. /// The content format is `base64(nonce || ciphertext)`. + /// + /// Before encryption the serialized MLS message is padded to a power-of-two + /// bucket (see `crate::padding`) so a relay cannot fingerprint specific + /// message types (notably empty SelfRemove proposals) by their on-the-wire + /// size. The receiver's MLS deserializer consumes only the framed message + /// bytes and ignores the trailing zero padding. pub(crate) fn build_message_event( &self, group_id: &GroupId, @@ -2440,7 +2447,8 @@ where // Derive the encryption key via MLS exporter (stable per epoch) let secret: group_types::GroupExporterSecret = self.exporter_secret(group_id)?; - let encrypted_content = encrypt_message_with_exporter_secret(&secret, &serialized_content)?; + let padded_content = pad_to_bucket(&serialized_content, MESSAGE_PADDING_FLOOR); + let encrypted_content = encrypt_message_with_exporter_secret(&secret, &padded_content)?; // Generate ephemeral key for signing (MUST NOT be reused across events) let ephemeral_nostr_keys: Keys = Keys::generate(); @@ -2506,12 +2514,20 @@ where let committer_pubkey = self.get_own_pubkey(group)?; let mut welcome_rumors_vec = Vec::new(); + // Pad once before the per-recipient loop: the MLS Welcome bytes are + // identical for every recipient in this commit, and padding to a + // power-of-two bucket prevents a relay from inferring group size from + // the resulting NIP-59 gift-wrap event length. The recipient's + // `Welcome::tls_deserialize` consumes only the framed welcome and + // ignores the trailing zero padding. + let padded_welcome = pad_to_bucket(&serialized_welcome, WELCOME_PADDING_FLOOR); + for event in key_package_events { // SECURITY: Always use base64 encoding with explicit encoding tag per MIP-00/MIP-02. // This prevents downgrade attacks and parsing ambiguity across clients. let encoding = ContentEncoding::Base64; - let encoded_welcome = encode_content(&serialized_welcome, encoding); + let encoded_welcome = encode_content(&padded_welcome, encoding); tracing::debug!( target: "mdk_core::groups", @@ -8776,4 +8792,610 @@ mod tests { "expiration must equal created_at + duration" ); } + + /// Wire-format padding tests guarding marmot-security issues #33 + /// (Welcome size leaks group size) and #37 (SelfRemove size leaks + /// departure intent). Each test exercises a sender path and asserts that + /// the observable on-the-wire content size is bucketed instead of leaking + /// the underlying plaintext length. + mod padding_tests { + use base64::Engine; + use base64::engine::general_purpose::STANDARD as BASE64; + use nostr::{EventBuilder, Kind}; + + use super::*; + use crate::MDK; + use crate::messages::crypto::encrypt_message_with_exporter_secret; + use crate::padding::{MESSAGE_PADDING_FLOOR, WELCOME_PADDING_FLOOR}; + + /// Decoded length of a kind:445 wrapper event's content, stripping + /// the 12-byte nonce and 16-byte Poly1305 tag added by + /// ChaCha20-Poly1305 — what's left is the padded MLS plaintext. + fn decoded_plaintext_len(content: &str) -> usize { + let decoded = BASE64 + .decode(content) + .expect("kind:445 content must be valid base64"); + assert!( + decoded.len() >= 28, + "decoded content must have nonce + tag (≥28 bytes)" + ); + // 12-byte nonce + 16-byte Poly1305 tag overhead. + decoded.len() - 28 + } + + /// Two text messages with very different short lengths must produce + /// kind:445 events with identical plaintext sizes — proof that the + /// floor bucket flattens their fingerprints. Guards issue #37. + #[test] + fn test_kind_445_short_messages_share_padding_bucket() { + let mdk = create_test_mdk(); + let (creator, members, admins) = create_test_group_members(); + let group_id = create_test_group(&mdk, &creator, &members, &admins); + + let short = mdk + .create_message(&group_id, create_test_rumor(&creator, "hi"), None) + .expect("short message creates"); + let longer = mdk + .create_message( + &group_id, + create_test_rumor(&creator, &"x".repeat(50)), + None, + ) + .expect("longer message creates"); + + let short_len = decoded_plaintext_len(&short.content); + let longer_len = decoded_plaintext_len(&longer.content); + + assert_eq!( + short_len, longer_len, + "small messages in the same bucket must produce identical sizes \ + (short={short_len}, longer={longer_len})" + ); + assert!( + short_len >= MESSAGE_PADDING_FLOOR, + "padded plaintext must be at least the bucket floor ({} bytes), got {}", + MESSAGE_PADDING_FLOOR, + short_len + ); + assert!( + short_len.is_power_of_two(), + "padded plaintext length must be a power-of-two bucket, got {short_len}" + ); + } + + /// A SelfRemove proposal (issue #37's specific concern) must not be + /// distinguishable by size from a routine small text message — both + /// land in the same bucket. + #[test] + fn test_self_remove_event_size_matches_text_message() { + let alice_keys = Keys::generate(); + let bob_keys = Keys::generate(); + + let alice_mdk = create_test_mdk(); + let bob_mdk = create_test_mdk(); + + let bob_kp = create_key_package_event(&bob_mdk, &bob_keys); + + let create_result = alice_mdk + .create_group( + &alice_keys.public_key(), + vec![bob_kp], + create_nostr_group_config_data(vec![alice_keys.public_key()]), + ) + .expect("alice creates group"); + let group_id = create_result.group.mls_group_id.clone(); + alice_mdk + .merge_pending_commit(&group_id) + .expect("alice merges"); + + let bob_welcome = &create_result.welcome_rumors[0]; + let bob_preview = bob_mdk + .process_welcome(&nostr::EventId::all_zeros(), bob_welcome) + .expect("bob processes welcome"); + bob_mdk.accept_welcome(&bob_preview).expect("bob accepts"); + + let small_text = bob_mdk + .create_message(&group_id, create_test_rumor(&bob_keys, "hi"), None) + .expect("small text creates"); + let leave = bob_mdk.leave_group(&group_id).expect("bob leaves"); + + let text_len = decoded_plaintext_len(&small_text.content); + let leave_len = decoded_plaintext_len(&leave.evolution_event.content); + + assert_eq!( + text_len, leave_len, + "SelfRemove event must bucket to the same size as a small text \ + message (text={text_len}, leave={leave_len}) — otherwise a \ + relay can fingerprint departures by size" + ); + } + + /// The receiver must accept legacy (unpadded) kind:445 plaintexts so + /// patched clients can continue to read messages produced by older + /// senders. This pins the forward-compatibility direction of the + /// `tls_deserialize_exact` → `tls_deserialize` relaxation in + /// `process_mls_message`. + #[test] + fn test_receiver_accepts_legacy_unpadded_kind_445() { + let alice_keys = Keys::generate(); + let bob_keys = Keys::generate(); + + let alice_mdk = create_test_mdk(); + let bob_mdk = create_test_mdk(); + + let bob_kp = create_key_package_event(&bob_mdk, &bob_keys); + let create_result = alice_mdk + .create_group( + &alice_keys.public_key(), + vec![bob_kp], + create_nostr_group_config_data(vec![alice_keys.public_key()]), + ) + .expect("alice creates group"); + let group_id = create_result.group.mls_group_id.clone(); + alice_mdk + .merge_pending_commit(&group_id) + .expect("alice merges"); + + let bob_welcome = &create_result.welcome_rumors[0]; + let bob_preview = bob_mdk + .process_welcome(&nostr::EventId::all_zeros(), bob_welcome) + .expect("bob processes welcome"); + bob_mdk.accept_welcome(&bob_preview).expect("bob accepts"); + + // Hand-craft an unpadded kind:445 event the way pre-padding + // senders did: encrypt the raw serialized MLS payload directly, + // with no padding pass. + let mut mls_group = bob_mdk + .load_mls_group(&group_id) + .expect("load ok") + .expect("group exists"); + let mut rumor = create_test_rumor(&bob_keys, "legacy unpadded hello"); + let serialized = bob_mdk + .create_mls_message_payload(&mut mls_group, &mut rumor) + .expect("serialize ok"); + let secret = bob_mdk + .exporter_secret(&group_id) + .expect("exporter secret available"); + let encrypted_content = encrypt_message_with_exporter_secret(&secret, &serialized) + .expect("legacy encrypt ok"); + + let bob_group = bob_mdk.get_group(&group_id).expect("group").expect("ok"); + let h_tag = + nostr::Tag::custom(nostr::TagKind::h(), [hex::encode(bob_group.nostr_group_id)]); + let encoding_tag = + nostr::Tag::custom(nostr::TagKind::Custom("encoding".into()), ["base64"]); + let ephemeral_keys = Keys::generate(); + let legacy_event = EventBuilder::new(Kind::MlsGroupMessage, encrypted_content) + .tag(h_tag) + .tag(encoding_tag) + .sign_with_keys(&ephemeral_keys) + .expect("sign ok"); + + // The patched receiver must successfully process the legacy + // unpadded event. + alice_mdk + .process_message(&legacy_event) + .expect("patched receiver must accept legacy unpadded kind:445"); + } + + /// Trailing bytes after the MLS framing MUST be zero. Anything else + /// would form a covert channel inside the padding region — reject it + /// so the wire format is pinned to "zero-byte padding only". + /// + /// The test runs the same construction twice with the only difference + /// being one tail byte: zero (must succeed) vs non-zero (must yield + /// `MessageProcessingResult::Unprocessable`). Any other failure mode + /// would affect both branches identically, so a divergent outcome + /// isolates the padding check. + #[test] + fn test_receiver_rejects_kind_445_with_non_zero_padding() { + use mdk_storage_traits::{GroupId, MdkStorageProvider}; + use nostr::Event; + + fn craft_event( + sender_mdk: &MDK, + sender_keys: &Keys, + group_id: &GroupId, + tail_byte: u8, + ) -> Event { + let mut mls_group = sender_mdk + .load_mls_group(group_id) + .expect("load ok") + .expect("group exists"); + let mut rumor = create_test_rumor(sender_keys, "padding-tail-probe"); + let mut serialized = sender_mdk + .create_mls_message_payload(&mut mls_group, &mut rumor) + .expect("serialize ok"); + // Pad past the MLS frame so the receiver-side check inspects + // a real trailing region. `next_power_of_two` is the identity + // function on a value that already IS a power of two, so when + // the serialized payload happens to land exactly on a bucket + // boundary the resize would be a no-op and the tail mutation + // would clobber real MLS frame bytes (yielding a deserialize + // failure instead of the padding-check failure under test). + // Bump the target one bucket up in that case to guarantee at + // least one freshly-added zero byte to mutate. + let mut target = serialized.len().next_power_of_two().max(512); + if target == serialized.len() { + target = target.checked_mul(2).expect("bucket overflow"); + } + serialized.resize(target, 0); + *serialized.last_mut().expect("padded buffer is non-empty") = tail_byte; + + let secret = sender_mdk + .exporter_secret(group_id) + .expect("exporter secret available"); + let encrypted_content = + encrypt_message_with_exporter_secret(&secret, &serialized).expect("encrypt ok"); + + let group = sender_mdk.get_group(group_id).expect("group").expect("ok"); + let h_tag = + nostr::Tag::custom(nostr::TagKind::h(), [hex::encode(group.nostr_group_id)]); + let encoding_tag = + nostr::Tag::custom(nostr::TagKind::Custom("encoding".into()), ["base64"]); + let ephemeral_keys = Keys::generate(); + EventBuilder::new(Kind::MlsGroupMessage, encrypted_content) + .tag(h_tag) + .tag(encoding_tag) + .sign_with_keys(&ephemeral_keys) + .expect("sign ok") + } + + // Two independent groups so each branch runs against a clean + // receiver state (process_message records failure metadata that + // would otherwise cross-contaminate the second call via the + // deduplication path). + fn setup_pair() -> (MDK, MDK, Keys, GroupId) { + let alice_keys = Keys::generate(); + let bob_keys = Keys::generate(); + let alice_mdk = create_test_mdk(); + let bob_mdk = create_test_mdk(); + + let bob_kp = create_key_package_event(&bob_mdk, &bob_keys); + let create_result = alice_mdk + .create_group( + &alice_keys.public_key(), + vec![bob_kp], + create_nostr_group_config_data(vec![alice_keys.public_key()]), + ) + .expect("alice creates group"); + let group_id = create_result.group.mls_group_id.clone(); + alice_mdk + .merge_pending_commit(&group_id) + .expect("alice merges"); + + let bob_welcome = &create_result.welcome_rumors[0]; + let bob_preview = bob_mdk + .process_welcome(&nostr::EventId::all_zeros(), bob_welcome) + .expect("bob processes welcome"); + bob_mdk.accept_welcome(&bob_preview).expect("bob accepts"); + + (alice_mdk, bob_mdk, bob_keys, group_id) + } + + // Zero-tail control: identical construction except the last byte + // is zero. Must successfully decode. + let (alice_ok, bob_ok, bob_keys_ok, group_id_ok) = setup_pair(); + let zero_tail_event = craft_event(&bob_ok, &bob_keys_ok, &group_id_ok, 0); + let ok_result = alice_ok + .process_message(&zero_tail_event) + .expect("zero-tail control must process"); + assert!( + matches!(ok_result, MessageProcessingResult::ApplicationMessage(_)), + "zero-tail control must succeed (got {:?})", + ok_result + ); + + // Mutated tail: flip the final padding byte to a non-zero value. + // Must be rejected as Unprocessable. + let (alice_bad, bob_bad, bob_keys_bad, group_id_bad) = setup_pair(); + let bad_tail_event = craft_event(&bob_bad, &bob_keys_bad, &group_id_bad, 0xFF); + let bad_result = alice_bad + .process_message(&bad_tail_event) + .expect("process_message returns Ok wrapping the rejection variant"); + assert!( + matches!(bad_result, MessageProcessingResult::Unprocessable { .. }), + "non-zero padding must be classified as Unprocessable (got {:?})", + bad_result + ); + } + + /// Welcome rumors for groups within the floor's coverage must share a + /// padding bucket so a relay cannot estimate member count from the + /// kind:1059 gift-wrap size. Guards issue #33. + /// + /// The floor (`WELCOME_PADDING_FLOOR = 8192`) was chosen from a + /// measurement sweep over MDK's only supported ciphersuite + /// (`MLS_128_DHKEMX25519_AES128GCM_SHA256_Ed25519`): raw welcome size + /// is ~1085 bytes at 1 added member and grows ~365 bytes per added + /// member, with the 8192-byte bucket boundary crossed between 19 and + /// 20 added members. This test pins both halves of that boundary: + /// + /// * 1, 5, 10, 15, 19 added members must all share the 8192 bucket + /// (anti-fingerprinting property). + /// * 20 added members must jump to the 16384 bucket — this is the + /// loud-failure tripwire: if OpenMLS bloats the welcome (e.g. wider + /// credentials, extra extensions, a different ciphersuite), the + /// boundary slides down and this assertion fires, prompting a + /// re-measurement and a floor re-decision rather than a silent + /// privacy regression. + /// + /// Re-run the measurement diagnostic with: + /// `cargo test -p mdk-core measure_welcome_rumor_sizes -- --ignored --nocapture` + #[test] + fn test_welcome_rumor_size_is_bucketed_for_small_groups() { + fn welcome_decoded_len(member_count: usize) -> usize { + // Fresh MDK per call: signer/key-package state from earlier + // groups must not perturb the welcome serialization. + let mdk = create_test_mdk(); + let creator = Keys::generate(); + let mut key_packages = Vec::new(); + for _ in 0..member_count { + let member = Keys::generate(); + key_packages.push(create_key_package_event(&mdk, &member)); + } + let result = mdk + .create_group( + &creator.public_key(), + key_packages, + create_nostr_group_config_data(vec![creator.public_key()]), + ) + .expect("group creates"); + let welcome = &result.welcome_rumors[0]; + BASE64 + .decode(&welcome.content) + .expect("welcome content is base64") + .len() + } + + // In-floor cohort: every welcome here MUST land in the 8192 bucket. + let in_floor_sizes = [1usize, 5, 10, 15, 19]; + let in_floor_lens: Vec<(usize, usize)> = in_floor_sizes + .iter() + .map(|&n| (n, welcome_decoded_len(n))) + .collect(); + + for &(n, len) in &in_floor_lens { + assert_eq!( + len, + WELCOME_PADDING_FLOOR, + "welcome for N={n} added members must occupy the floor bucket \ + ({floor} bytes), got {len}. Either OpenMLS welcome size shifted \ + or the floor needs re-measurement.", + floor = WELCOME_PADDING_FLOOR + ); + } + + // All in-floor welcomes share a single bucket → indistinguishable. + let unique_in_floor: std::collections::BTreeSet = + in_floor_lens.iter().map(|&(_, l)| l).collect(); + assert_eq!( + unique_in_floor.len(), + 1, + "welcomes for in-floor group sizes must all share one bucket; \ + saw distinct sizes {:?}. Relay observers could distinguish these.", + in_floor_lens + ); + + // Boundary tripwire: one member past the in-floor cohort must + // promote to the next bucket. If this fires *upward* (welcome + // shrinks), great — bump the in-floor cohort. If it fires + // *downward* (boundary slid in), the floor is leaking privacy + // for groups it used to cover; re-measure and revisit. + let breakpoint_len = welcome_decoded_len(20); + assert_eq!( + breakpoint_len, + WELCOME_PADDING_FLOOR * 2, + "welcome at N=20 must occupy the next bucket ({} bytes), got {}. \ + The 8192 floor's coverage window has shifted — re-run the \ + measurement diagnostic and adjust WELCOME_PADDING_FLOOR or the \ + in-floor cohort accordingly.", + WELCOME_PADDING_FLOOR * 2, + breakpoint_len + ); + } + + /// Measurement (not a property test): sweep group size and report the + /// *unpadded* serialized Welcome length so we can pick + /// `WELCOME_PADDING_FLOOR` from data instead of guessing. + /// + /// The padded buffer contains exactly one `MlsMessageIn` followed by + /// zero padding; `tls_deserialize` advances the cursor by exactly the + /// framed length, so `padded.len() - remaining.len()` is the raw + /// welcome size. Run with: + /// + /// cargo test -p mdk-core measure_welcome_rumor_sizes -- --nocapture + #[test] + #[ignore = "diagnostic: prints a size table, run manually with --nocapture"] + fn measure_welcome_rumor_sizes_across_group_sizes() { + use openmls::prelude::MlsMessageIn; + use tls_codec::Deserialize as TlsDeserialize; + + fn welcome_raw_len(member_count: usize) -> usize { + // Fresh MDK per iteration so signer/key-package storage from + // earlier rounds doesn't perturb the serialized welcome size. + let mdk = create_test_mdk(); + let creator = Keys::generate(); + let mut key_packages = Vec::new(); + for _ in 0..member_count { + let member = Keys::generate(); + key_packages.push(create_key_package_event(&mdk, &member)); + } + let result = mdk + .create_group( + &creator.public_key(), + key_packages, + create_nostr_group_config_data(vec![creator.public_key()]), + ) + .expect("group creates"); + let welcome = &result.welcome_rumors[0]; + let padded = BASE64 + .decode(&welcome.content) + .expect("welcome content is base64"); + let total = padded.len(); + let mut cursor: &[u8] = &padded; + MlsMessageIn::tls_deserialize(&mut cursor) + .expect("padded welcome must contain a deserializable MlsMessage prefix"); + total - cursor.len() + } + + fn bucket_for(raw: usize, floor: usize) -> usize { + if raw <= floor { + floor + } else { + raw.next_power_of_two() + } + } + + let sizes: &[usize] = &[ + 1, 2, 3, 4, 5, 6, 7, 8, 10, 12, 15, 16, 17, 18, 19, 20, 30, 50, 75, 100, + ]; + + println!(); + println!( + "=== Welcome rumor size by group member count (added members, +1 creator) ===" + ); + println!( + "{:>5} | {:>7} | {:>8} | {:>8} | {:>8} | {:>8}", + "N", "raw", "Δ/N-1", "floor=1k", "floor=4k", "floor=8k" + ); + println!("{}", "-".repeat(62)); + + let mut prev_raw: Option = None; + let mut samples: Vec<(usize, usize)> = Vec::new(); + for &n in sizes { + let raw = welcome_raw_len(n); + let b1k = bucket_for(raw, 1024); + let b4k = bucket_for(raw, 4096); + let b8k = bucket_for(raw, 8192); + let per_member = match prev_raw { + Some(p) if n > 1 => format!("{:+}", raw as i64 - p as i64), + _ => "—".to_string(), + }; + println!( + "{:>5} | {:>7} | {:>8} | {:>8} | {:>8} | {:>8}", + n, raw, per_member, b1k, b4k, b8k + ); + prev_raw = Some(raw); + samples.push((n, raw)); + } + + // Boundary report: for each candidate floor, find the largest N + // whose welcome still fits in that floor's first bucket. + println!(); + println!("=== First bucket bleed-over by candidate floor ==="); + for floor in [1024usize, 2048, 4096, 8192, 16384] { + let mut last_in_floor = 0usize; + for &(n, raw) in &samples { + if raw <= floor { + last_in_floor = n; + } + } + println!( + "floor={:>5} bytes: groups with N ≤ {} share the first bucket", + floor, last_in_floor + ); + } + println!(); + } + + /// Trailing bytes after the MLS Welcome framing MUST be zero. A + /// malicious admin authoring a Welcome must not be able to embed + /// arbitrary data in the padding region as a covert channel to the + /// new joiner. Mirrors the kind:445 enforcement in + /// `process_mls_message` and closes the asymmetry flagged in the + /// PR #308 review. + /// + /// Same side-by-side strategy as the kind:445 negative test: vary + /// only the trailing tail byte (0 vs 0xFF) and assert divergent + /// outcomes — success vs `InvalidWelcomeMessage`. + #[test] + fn test_receiver_rejects_welcome_with_non_zero_padding() { + use nostr::EventBuilder; + + use crate::Error; + + fn craft_tampered_rumor( + rumor: &nostr::UnsignedEvent, + tail_byte: u8, + ) -> nostr::UnsignedEvent { + let mut decoded = BASE64 + .decode(&rumor.content) + .expect("welcome content is base64"); + assert!(decoded.len() >= WELCOME_PADDING_FLOOR); + // The sender's padder may produce a buffer whose length is + // exactly the serialized welcome's length (when that length + // already equals a power-of-two bucket boundary), leaving no + // trailing padding region. Append a known-zero byte first so + // the subsequent tail mutation is guaranteed to land on a + // padding byte rather than an MLS frame byte. + decoded.push(0); + *decoded.last_mut().expect("decoded buffer non-empty") = tail_byte; + let re_encoded = BASE64.encode(&decoded); + + let mut tags: Vec = rumor.tags.iter().cloned().collect(); + // EventBuilder appends its own tags; rebuilding from scratch + // with the original tags preserves the relays/e/encoding tags + // the receiver validates before reaching parse_serialized_welcome. + let mut builder = EventBuilder::new(rumor.kind, re_encoded); + builder = builder.tags(tags.drain(..)); + builder.build(rumor.pubkey) + } + + fn fresh_group_with_welcome() -> ( + MDK, + MDK, + nostr::UnsignedEvent, + ) { + let admin_keys = Keys::generate(); + let joiner_keys = Keys::generate(); + let admin_mdk = create_test_mdk(); + let joiner_mdk = create_test_mdk(); + let joiner_kp = create_key_package_event(&joiner_mdk, &joiner_keys); + let create_result = admin_mdk + .create_group( + &admin_keys.public_key(), + vec![joiner_kp], + create_nostr_group_config_data(vec![admin_keys.public_key()]), + ) + .expect("admin creates group"); + let rumor = create_result.welcome_rumors[0].clone(); + (admin_mdk, joiner_mdk, rumor) + } + + // Zero-tail control: the same construction with the last byte + // explicitly set to zero must succeed. + let (_admin_ok, joiner_ok, rumor_ok) = fresh_group_with_welcome(); + let zero_rumor = craft_tampered_rumor(&rumor_ok, 0); + joiner_ok + .process_welcome(&nostr::EventId::all_zeros(), &zero_rumor) + .expect("zero-tail control must process"); + + // Non-zero tail: must be rejected as InvalidWelcomeMessage. + let (_admin_bad, joiner_bad, rumor_bad) = fresh_group_with_welcome(); + let bad_rumor = craft_tampered_rumor(&rumor_bad, 0xFF); + let err = joiner_bad + .process_welcome(&nostr::EventId::all_zeros(), &bad_rumor) + .expect_err("non-zero padding must be rejected"); + // process_welcome wraps preview errors in Error::Welcome(String). + // Accept any of: + // * unwrapped Error::InvalidWelcomeMessage (if no sanitizer runs) + // * Welcome-wrapped Debug form ("InvalidWelcomeMessage" in msg) + // * Sanitized welcome-failure category ("welcome_parse_failed"), + // introduced by the security hardening on master in #307. The + // branch may be tested standalone or merged with master in CI, + // so we accept both forms. + let surfaces_invalid = match &err { + Error::InvalidWelcomeMessage => true, + Error::Welcome(msg) => { + msg.contains("InvalidWelcomeMessage") || msg.contains("welcome_parse_failed") + } + _ => false, + }; + assert!( + surfaces_invalid, + "expected InvalidWelcomeMessage (or Welcome-wrapped / sanitized form), got: {err:?}" + ); + } + } } diff --git a/crates/mdk-core/src/lib.rs b/crates/mdk-core/src/lib.rs index cfd8918b..651d6a94 100644 --- a/crates/mdk-core/src/lib.rs +++ b/crates/mdk-core/src/lib.rs @@ -31,6 +31,7 @@ pub mod messages; #[cfg(feature = "mip05")] #[cfg_attr(docsrs, doc(cfg(feature = "mip05")))] pub mod mip05; +mod padding; pub mod prelude; mod state_validation; #[cfg(any(test, feature = "test-utils"))] diff --git a/crates/mdk-core/src/messages/process.rs b/crates/mdk-core/src/messages/process.rs index 40efd0dd..23bfc912 100644 --- a/crates/mdk-core/src/messages/process.rs +++ b/crates/mdk-core/src/messages/process.rs @@ -44,7 +44,19 @@ where group: &mut MlsGroup, message_bytes: &[u8], ) -> Result { - let mls_message = MlsMessageIn::tls_deserialize_exact(message_bytes)?; + // Use non-strict deserialization so trailing zero-padding bytes added + // by the sender (see `crate::padding`) are silently ignored. The TLS + // length-prefixed framing means the decoder reads exactly the MLS + // message and leaves the cursor at the start of any padding. Enforce + // that the trailing bytes are zero-only so the padding region cannot + // be used as a covert channel for unencrypted data. + let mut cursor = message_bytes; + let mls_message = MlsMessageIn::tls_deserialize(&mut cursor)?; + if cursor.iter().any(|&b| b != 0) { + return Err(Error::Message( + "Malformed message content: non-zero bytes in padding region".to_string(), + )); + } let protocol_message = mls_message.try_into_protocol_message()?; // Return error if group ID doesn't match diff --git a/crates/mdk-core/src/padding.rs b/crates/mdk-core/src/padding.rs new file mode 100644 index 00000000..542bd6d1 --- /dev/null +++ b/crates/mdk-core/src/padding.rs @@ -0,0 +1,159 @@ +//! Bucket-based length padding for encrypted plaintexts. +//! +//! Both kind:445 group messages (MIP-03) and gift-wrapped Welcome events +//! (MIP-02) historically leaked information through their on-the-wire size: +//! +//! - SelfRemove proposals are empty `MlsMessage::PublicMessage`s and stand out +//! as the smallest possible kind:445 events (issue #37). +//! - Welcomes carry the full MLS ratchet tree, so the encrypted gift-wrap size +//! scales with the group's member count (issue #33). +//! +//! Padding the plaintext to a power-of-two bucket before encryption maps every +//! plaintext into one of a small number of length classes, so a relay observer +//! cannot distinguish messages within the same bucket. +//! +//! Padding is appended as zero bytes. The on-the-wire MLS framing uses TLS +//! length-prefixed encoding, so the decoder consumes exactly the framed +//! message and the trailing zero bytes are ignored. The deserialization sites +//! that depend on this property are `process_mls_message` in +//! `crate::messages::process` and `process_welcome` in [`crate::welcomes`]. + +/// Minimum bucket size for kind:445 group message plaintexts. +/// +/// Chosen so that the smallest possible MLS message (an empty SelfRemove +/// `PublicMessage`) and a typical short text message land in the same bucket. +/// A bare text rumor — pubkey, kind, created_at, id, sig, plus a few-byte +/// `content` — already serializes to ~280 bytes of MLS `PrivateMessage` +/// plaintext, so a 256-byte floor would leave SelfRemove proposals +/// distinguishable. 512 bytes absorbs both classes. +/// +/// Guards marmot-security issue #37 (SelfRemove size fingerprinting). +pub(crate) const MESSAGE_PADDING_FLOOR: usize = 512; + +/// Minimum bucket size for the Welcome rumor payload (raw MLS bytes, before +/// base64 encoding and NIP-59 gift wrapping). +/// +/// Welcomes carry the full ratchet tree + per-recipient HPKE-encrypted +/// GroupSecrets, so their serialized size scales with member count. For the +/// only ciphersuite MDK supports today +/// (`MLS_128_DHKEMX25519_AES128GCM_SHA256_Ed25519`), the raw welcome grows by +/// roughly 365 bytes per added member, with small +68 bumps when the ratchet +/// tree depth grows (added member counts 2, 4, 8, 16, …). At 8192 bytes the +/// floor absorbs every welcome up to and including a 20-member group +/// (19 added members + 1 creator); the next padded bucket (16384) covers +/// roughly 21–38 member groups. Anything below this floor — most importantly +/// 1-on-1 DMs and small group chats — is bucketed identically, so a relay +/// observing kind:1059 gift-wrap sizes cannot distinguish a DM from a small +/// team chat. +/// +/// To re-measure (e.g. after an OpenMLS bump), run: +/// `cargo test -p mdk-core measure_welcome_rumor_sizes -- --ignored --nocapture` +/// +/// Guards marmot-security issue #33 (Welcome size leaks group size). +pub(crate) const WELCOME_PADDING_FLOOR: usize = 8192; + +/// Returns the smallest power-of-two bucket that fits `len` bytes and is at +/// least `floor` bytes wide. +/// +/// `floor` MUST itself be a power of two; this is a programmer-controlled +/// constant in this crate, so we debug-assert it rather than returning a +/// runtime error. +fn bucket_for_len(len: usize, floor: usize) -> usize { + debug_assert!( + floor.is_power_of_two(), + "padding floor must be a power of two" + ); + + if len <= floor { + return floor; + } + + len.checked_next_power_of_two() + .expect("padded payload length overflowed usize") +} + +/// Returns a padded copy of `plaintext` whose length is rounded up to the next +/// power-of-two bucket of at least `floor` bytes. The bytes added beyond the +/// original length are all zero. +/// +/// The receiver-side MLS deserializers (`MlsMessageIn::tls_deserialize` and +/// `Welcome::tls_deserialize`) consume exactly the framed message bytes and +/// ignore trailing padding. +pub(crate) fn pad_to_bucket(plaintext: &[u8], floor: usize) -> Vec { + let target = bucket_for_len(plaintext.len(), floor); + let mut padded = Vec::with_capacity(target); + padded.extend_from_slice(plaintext); + padded.resize(target, 0); + padded +} + +#[cfg(test)] +mod tests { + use super::*; + + #[test] + fn bucket_floors_small_lengths() { + assert_eq!(bucket_for_len(0, 256), 256); + assert_eq!(bucket_for_len(1, 256), 256); + assert_eq!(bucket_for_len(255, 256), 256); + assert_eq!(bucket_for_len(256, 256), 256); + } + + #[test] + fn bucket_rounds_up_to_next_power_of_two() { + assert_eq!(bucket_for_len(257, 256), 512); + assert_eq!(bucket_for_len(512, 256), 512); + assert_eq!(bucket_for_len(513, 256), 1024); + assert_eq!(bucket_for_len(1023, 256), 1024); + assert_eq!(bucket_for_len(1024, 256), 1024); + assert_eq!(bucket_for_len(1025, 256), 2048); + } + + #[test] + fn welcome_floor_buckets() { + assert_eq!(bucket_for_len(0, WELCOME_PADDING_FLOOR), 8192); + assert_eq!(bucket_for_len(1024, WELCOME_PADDING_FLOOR), 8192); + assert_eq!(bucket_for_len(8192, WELCOME_PADDING_FLOOR), 8192); + assert_eq!(bucket_for_len(8193, WELCOME_PADDING_FLOOR), 16384); + assert_eq!(bucket_for_len(16385, WELCOME_PADDING_FLOOR), 32768); + } + + #[test] + fn pad_appends_zero_bytes_to_bucket_size() { + let payload = vec![0xAAu8; 100]; + let padded = pad_to_bucket(&payload, 256); + + assert_eq!(padded.len(), 256); + assert_eq!(&padded[..100], &payload[..]); + assert!(padded[100..].iter().all(|&b| b == 0)); + } + + #[test] + fn pad_no_op_when_exact_bucket() { + let payload = vec![0x55u8; 256]; + let padded = pad_to_bucket(&payload, 256); + + assert_eq!(padded.len(), 256); + assert_eq!(padded, payload); + } + + #[test] + fn pad_different_inputs_share_bucket() { + let small_a = pad_to_bucket(&[0xAA; 10], MESSAGE_PADDING_FLOOR); + let small_b = pad_to_bucket(&[0xBB; 200], MESSAGE_PADDING_FLOOR); + let empty = pad_to_bucket(&[], MESSAGE_PADDING_FLOOR); + + assert_eq!(small_a.len(), small_b.len()); + assert_eq!(small_b.len(), empty.len()); + assert_eq!(empty.len(), MESSAGE_PADDING_FLOOR); + } + + #[test] + fn pad_promotes_to_next_bucket() { + let medium = pad_to_bucket(&[0xCC; 600], MESSAGE_PADDING_FLOOR); + let larger = pad_to_bucket(&[0xDD; 1000], MESSAGE_PADDING_FLOOR); + + assert_eq!(medium.len(), 1024); + assert_eq!(larger.len(), 1024); + } +} diff --git a/crates/mdk-core/src/welcomes.rs b/crates/mdk-core/src/welcomes.rs index b89b1f6f..15606ab4 100644 --- a/crates/mdk-core/src/welcomes.rs +++ b/crates/mdk-core/src/welcomes.rs @@ -411,8 +411,15 @@ where &self, mut welcome_message: &[u8], ) -> Result<(StagedWelcome, NostrGroupDataExtension), Error> { - // Parse welcome message + // Parse welcome message. Non-strict deserialization permits trailing + // bytes so the sender-side bucket padding (see `crate::padding`) is + // ignored, but we then enforce that those trailing bytes are zero + // only: an admin authoring a Welcome must not be able to use the + // padding region as a covert channel to the new joiner. let welcome_message_in = MlsMessageIn::tls_deserialize(&mut welcome_message)?; + if welcome_message.iter().any(|&b| b != 0) { + return Err(Error::InvalidWelcomeMessage); + } let welcome: Welcome = match welcome_message_in.extract() { MlsMessageBodyIn::Welcome(welcome) => welcome,