Defines the on-disk record format for the two-phase commit write-ahead log: length + crc32c + tag + rmp-serde framing. Adds WAL config options to General and crc32c as a dependency.
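A minimal sketch of what such framing could look like, using only the standard library. The byte layout (`[len: u32 LE][crc32c: u32 LE][tag: u8][payload]`, with the length and CRC both covering tag plus payload) is an assumption for illustration, as are the names `encode_record`, `decode_record`, and `DecodeError`. The payload is treated as opaque bytes where the PR would put rmp-serde output, and the bitwise `crc32c` below stands in for the `crc32c` crate.

```rust
/// Bitwise CRC-32C (Castagnoli), reflected polynomial 0x82F63B78.
/// Slow but dependency-free; a stand-in for the `crc32c` crate.
fn crc32c(data: &[u8]) -> u32 {
    let mut crc: u32 = !0;
    for &byte in data {
        crc ^= byte as u32;
        for _ in 0..8 {
            // All-ones mask when the low bit is set, zero otherwise.
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (0x82F6_3B78 & mask);
        }
    }
    !crc
}

/// Hypothetical header: 4 bytes of length followed by 4 bytes of CRC.
const HEADER_LEN: usize = 8;

#[derive(Debug, PartialEq)]
enum DecodeError {
    Truncated,
    Empty,
    BadChecksum,
}

/// Appends one framed record to `out`.
/// Assumed layout: [len: u32 LE][crc32c: u32 LE][tag: u8][payload].
fn encode_record(tag: u8, payload: &[u8], out: &mut Vec<u8>) {
    let mut body = Vec::with_capacity(1 + payload.len());
    body.push(tag);
    body.extend_from_slice(payload);
    out.extend_from_slice(&(body.len() as u32).to_le_bytes());
    out.extend_from_slice(&crc32c(&body).to_le_bytes());
    out.extend_from_slice(&body);
}

/// Decodes the record at the start of `buf`.
/// Returns (tag, payload, total bytes consumed).
fn decode_record(buf: &[u8]) -> Result<(u8, &[u8], usize), DecodeError> {
    if buf.len() < HEADER_LEN {
        return Err(DecodeError::Truncated);
    }
    let len = u32::from_le_bytes([buf[0], buf[1], buf[2], buf[3]]) as usize;
    let crc = u32::from_le_bytes([buf[4], buf[5], buf[6], buf[7]]);
    if len == 0 {
        return Err(DecodeError::Empty);
    }
    let end = HEADER_LEN.saturating_add(len);
    let body = buf.get(HEADER_LEN..end).ok_or(DecodeError::Truncated)?;
    if crc32c(body) != crc {
        return Err(DecodeError::BadChecksum);
    }
    Ok((body[0], &body[1..], end))
}
```

Putting the length first lets a reader skip to the next record without parsing the payload, and separating `Truncated` from `BadChecksum` gives the caller a way to tell a short read at the end of a segment from damaged bytes.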
Adds SegmentReader for iterating records out of an existing segment and Segment for appending. SegmentReader::into_writable converts the reader into a writable segment, truncating any torn tail.
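The torn-tail handling can be sketched as a scan for the longest prefix of whole, checksum-valid records, followed by a truncation to that length. This is an illustration under assumptions, not the PR's `SegmentReader`: it works on an in-memory buffer, uses a hypothetical `[len: u32 LE][checksum: u32 LE][body]` frame, takes the checksum function as a parameter (the real format uses crc32c), and treats the first bad record as the end of the segment. The names `scan_valid_prefix`, `truncate_torn_tail`, and `push_frame` are made up for the example.

```rust
/// Reads a little-endian u32 at `at`; the caller guarantees 4 bytes exist.
fn read_u32(buf: &[u8], at: usize) -> u32 {
    u32::from_le_bytes([buf[at], buf[at + 1], buf[at + 2], buf[at + 3]])
}

/// Test helper: appends one frame in the assumed layout.
fn push_frame(out: &mut Vec<u8>, body: &[u8], checksum: impl Fn(&[u8]) -> u32) {
    out.extend_from_slice(&(body.len() as u32).to_le_bytes());
    out.extend_from_slice(&checksum(body).to_le_bytes());
    out.extend_from_slice(body);
}

/// Walks frames from the start of `buf` and stops at the first one that is
/// incomplete, empty, or fails its checksum.
/// Returns (number of valid records, length in bytes of the valid prefix).
fn scan_valid_prefix(buf: &[u8], checksum: impl Fn(&[u8]) -> u32) -> (usize, usize) {
    let mut offset = 0;
    let mut records = 0;
    while buf.len() - offset >= 8 {
        let len = read_u32(buf, offset) as usize;
        let expected = read_u32(buf, offset + 4);
        let start = offset + 8;
        let body = match buf.get(start..start.saturating_add(len)) {
            Some(body) => body,
            None => break, // body runs past the end of the buffer: torn write
        };
        if len == 0 || checksum(body) != expected {
            break;
        }
        records += 1;
        offset = start + len;
    }
    (records, offset)
}

/// Drops everything after the last valid record so appends can resume there.
/// Returns the number of bytes discarded.
fn truncate_torn_tail(buf: &mut Vec<u8>, checksum: impl Fn(&[u8]) -> u32) -> usize {
    let (_, valid) = scan_valid_prefix(buf, checksum);
    let dropped = buf.len() - valid;
    buf.truncate(valid);
    dropped
}
```

On a real file the truncation step would correspond to `std::fs::File::set_len` with the valid prefix length, followed by seeking to the end before appending.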
Adds Wal: a cloneable handle around an mpsc channel into a single writer task that owns the active Segment, batches concurrent appends behind one fsync per batch, and rotates segments at the configured size limit. Shutdown via AtomicBool + Notify; the writer body is wrapped in catch_unwind so a panic doesn't hang shutdown. Switches Segment to a batch-only append API (append_batch takes pre-encoded bytes plus record count) so the writer can encode the whole batch into one buffer and issue a single write_all.
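The batching behaviour described above can be sketched with standard-library threads and channels. This is a swapped-in, simplified model and not the PR's code: the PR describes an async writer task with `AtomicBool` + `Notify` shutdown, whereas this sketch shuts down when every handle is dropped, omits rotation and `catch_unwind`, and replaces the file with a hypothetical `FakeSegment` that counts syncs. `Append`, `spawn_wal`, and `max_batch` are illustrative names.

```rust
use std::sync::mpsc::{self, Receiver, Sender};
use std::thread;

/// One append request: pre-encoded bytes plus a channel to signal durability.
struct Append {
    bytes: Vec<u8>,
    ack: Sender<()>,
}

/// Stand-in for the on-disk segment; counts writes and syncs instead of doing I/O.
#[derive(Default)]
struct FakeSegment {
    data: Vec<u8>,
    records: usize,
    write_calls: usize,
    syncs: usize,
}

impl FakeSegment {
    /// Mirrors the batch-only API: one buffer, one write, one sync.
    fn append_batch(&mut self, encoded: &[u8], records: usize) {
        self.data.extend_from_slice(encoded); // real code: write_all
        self.write_calls += 1;
        self.syncs += 1; // real code: fsync
        self.records += records;
    }
}

/// Single writer: block for the first request, then drain whatever else is
/// already queued so the whole batch shares one sync.
fn writer_loop(rx: Receiver<Append>, max_batch: usize) -> FakeSegment {
    let mut segment = FakeSegment::default();
    // recv() fails once every Wal handle is dropped, which ends the loop.
    while let Ok(first) = rx.recv() {
        let mut batch = vec![first];
        while batch.len() < max_batch {
            match rx.try_recv() {
                Ok(next) => batch.push(next),
                Err(_) => break,
            }
        }
        let mut buf = Vec::new();
        for req in &batch {
            buf.extend_from_slice(&req.bytes);
        }
        segment.append_batch(&buf, batch.len());
        // Acknowledge only after the batch has been synced.
        for req in batch {
            let _ = req.ack.send(());
        }
    }
    segment
}

/// Cloneable handle; every clone feeds the same writer.
#[derive(Clone)]
struct Wal {
    tx: Sender<Append>,
}

impl Wal {
    /// Blocks until the writer has synced the batch containing this record.
    /// Returns false if the writer is gone.
    fn append(&self, bytes: Vec<u8>) -> bool {
        let (ack, done) = mpsc::channel();
        if self.tx.send(Append { bytes, ack }).is_err() {
            return false;
        }
        done.recv().is_ok()
    }
}

fn spawn_wal(max_batch: usize) -> (Wal, thread::JoinHandle<FakeSegment>) {
    let (tx, rx) = mpsc::channel();
    let handle = thread::spawn(move || writer_loop(rx, max_batch));
    (Wal { tx }, handle)
}
```

The number of syncs depends on how many requests happen to be queued when the writer wakes up, so it falls anywhere between one and the number of appends. The invariant that matters is that no caller is acknowledged before the sync covering its record.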
Adds recovery::recover_transactions which scans every segment and hands each in-flight transaction to Manager::restore_transaction. Wraps probe + recovery + writer spawn behind Wal::open, with distinct error variants for directory access failures.
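As a rough model of the scan, in-flight transactions are those with a `Begin` or `Committing` record and no later `End`. The sketch below assumes that reading; the `Record` and `Phase` types and the `in_flight` function are hypothetical, string IDs stand in for whatever identifier the PR uses, and Checkpoint records are left out.

```rust
use std::collections::BTreeMap;

/// Simplified log records keyed by a transaction ID (hypothetical shape).
#[derive(Debug, Clone, PartialEq)]
enum Record {
    Begin(String),
    Committing(String),
    End(String),
}

/// Last phase seen in the log for a transaction that has not ended.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Phase {
    Begun,
    Committing,
}

/// Replays records in log order and returns every transaction without an End,
/// together with the latest phase recorded for it.
fn in_flight(records: &[Record]) -> BTreeMap<String, Phase> {
    let mut open = BTreeMap::new();
    for record in records {
        match record {
            Record::Begin(id) => {
                open.insert(id.clone(), Phase::Begun);
            }
            // A Committing record always wins over an earlier Begin.
            Record::Committing(id) => {
                open.insert(id.clone(), Phase::Committing);
            }
            Record::End(id) => {
                open.remove(id);
            }
        }
    }
    open
}
```

Each entry of the returned map is what would be handed to something like `Manager::restore_transaction`, which then decides whether to finish the commit or roll back based on the phase.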
WAL initialization replays in-flight 2PC transactions back into the manager so they can be driven to a terminal state. If the WAL can't be opened, 2PC continues in-memory only without durability rather than failing pgdog startup. Recovery distinguishes corruption from IO failures: corrupt segments are renamed to .broken and skipped, with restore skipped entirely on any corruption so a missing Committing record can't silently invert a committed transaction. IO failures abort recovery and disable the WAL.
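The policy in this commit can be summarised as a small decision function. The sketch below is an interpretation rather than the PR's code: `ScanOutcome`, `RecoveryPlan`, and `plan_recovery` are hypothetical names, the segment file names in the usage are made up, and appending `.broken` to the existing name is an assumption about how the rename works. The reason for skipping restore on corruption follows from the description: if the segment holding a `Committing` record is lost, the surviving `Begin` would make a committed transaction look like one that should be rolled back.

```rust
/// Result of scanning one segment (hypothetical classification).
#[derive(Debug, PartialEq)]
enum ScanOutcome {
    Clean,
    Corrupt,
    IoError,
}

/// What startup should do after all segments have been examined.
#[derive(Debug, PartialEq)]
enum RecoveryPlan {
    /// Every segment was readable: restore in-flight transactions.
    Restore,
    /// Some segments were corrupt: quarantine them and restore nothing,
    /// because the surviving records may tell an incomplete story.
    SkipRestore { quarantine: Vec<String> },
    /// An I/O failure occurred: abort recovery and run 2PC in memory only.
    DisableWal,
}

fn plan_recovery(segments: &[(&str, ScanOutcome)]) -> RecoveryPlan {
    let mut quarantine = Vec::new();
    for (name, outcome) in segments {
        match outcome {
            // I/O failures take priority over everything else.
            ScanOutcome::IoError => return RecoveryPlan::DisableWal,
            // Assumed rename target: the original name with ".broken" appended.
            ScanOutcome::Corrupt => quarantine.push(format!("{}.broken", name)),
            ScanOutcome::Clean => {}
        }
    }
    if quarantine.is_empty() {
        RecoveryPlan::Restore
    } else {
        RecoveryPlan::SkipRestore { quarantine }
    }
}
```

For example, `plan_recovery(&[("0001.wal", ScanOutcome::Clean), ("0002.wal", ScanOutcome::Corrupt)])` yields a `SkipRestore` plan naming `0002.wal.broken`, while any `IoError` entry yields `DisableWal` regardless of the other segments.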
Manager logs Begin and Committing on phase transitions and End on cleanup, before mutating in-memory state. WAL write failures are logged and the transaction proceeds without durability rather than blocking 2PC. Shutdown drains the cleanup queue and then the WAL so final End records make it to disk. Drops the unused participant-shards field from Begin and Checkpoint records since cleanup fans out to all current shards via the existing 42704-tolerant path.
A second pgdog pointed at the same WAL dir now fails fast with the prior holder's PID and start time instead of silently corrupting the log.
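One way to get this fail-fast behaviour is an exclusive lock file that records the holder's identity. The sketch below is only an illustration of the idea and makes several assumptions: the PR does not say which locking mechanism it uses, the file name `wal.lock` and the names `DirLock`, `LockError`, and `acquire` are hypothetical, and a `create_new` lock file (unlike an OS-level advisory lock) is left behind if the process crashes.

```rust
use std::fs::{self, OpenOptions};
use std::io::{self, Write};
use std::path::{Path, PathBuf};
use std::time::{SystemTime, UNIX_EPOCH};

/// Held for the lifetime of the process; removing the file releases the lock.
struct DirLock {
    path: PathBuf,
}

#[derive(Debug)]
enum LockError {
    /// Another process holds the directory; `holder` is the lock file's content.
    Held { holder: String },
    Io(io::Error),
}

/// Tries to claim `dir` by creating a lock file that must not already exist.
fn acquire(dir: &Path) -> Result<DirLock, LockError> {
    let path = dir.join("wal.lock"); // hypothetical file name
    match OpenOptions::new().write(true).create_new(true).open(&path) {
        Ok(mut file) => {
            let started = SystemTime::now()
                .duration_since(UNIX_EPOCH)
                .map(|d| d.as_secs())
                .unwrap_or(0);
            // Record who holds the lock so a second process can report it.
            writeln!(file, "pid={} started={}", std::process::id(), started)
                .map_err(LockError::Io)?;
            Ok(DirLock { path })
        }
        Err(e) if e.kind() == io::ErrorKind::AlreadyExists => {
            let holder = fs::read_to_string(&path).unwrap_or_default();
            Err(LockError::Held { holder: holder.trim().to_string() })
        }
        Err(e) => Err(LockError::Io(e)),
    }
}

impl Drop for DirLock {
    fn drop(&mut self) {
        // Best-effort release; a crash leaves the file behind in this sketch.
        let _ = fs::remove_file(&self.path);
    }
}
```

A second call to `acquire` on the same directory returns `LockError::Held` with the first holder's PID and start time until the first `DirLock` is dropped.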
Adds a write-ahead log (WAL) with recovery for 2PC transaction state, for crash safety.
Still needs integration tests.
Closes #911