[WIP] nvme_driver: WORKAROUND: don't defer queue create to after restore #3149

mattkur wants to merge 2 commits into microsoft:release/1.7.2511
Conversation
Pull request overview
Adjusts NVMe driver save/restore behavior to avoid deferring I/O queue creation until after restore (workaround for restore-time queue handling issues).
Changes:
- Restores all saved I/O queues eagerly (the previous eager-restore filter is disabled).
- Changes the drain-after-restore state assigned to restored queues (empty queues now get SelfDrained).
- Disables the "prototype IO queue" (proto_io) restore path by commenting it out.
```rust
// .filter(|q| {
//     q.queue_data.qid == 1 || !q.queue_data.handler_data.pending_cmds.commands.is_empty()
// })
.flat_map(|q| -> Result<IoQueue<D>, anyhow::Error> {
```
The queue-restore selection filter has been commented out, leaving dead/commented code in the restore path. Please remove the commented-out code and update the surrounding restore-step comments to reflect the new behavior (restoring all I/O queues eagerly), or gate the behavior behind an explicit flag/config so this doesn’t ship as commented-out logic.
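One way to avoid shipping commented-out logic, as the review suggests, is to gate the restore selection behind an explicit policy value. The sketch below is illustrative only: `SavedQueue`, `RestorePolicy`, and `queues_to_restore` are hypothetical stand-ins, not the real nvme_driver types.

```rust
// Hedged sketch: make eager vs. filtered restore an explicit, named choice
// instead of a commented-out filter. All names here are illustrative.

#[derive(Clone)]
struct SavedQueue {
    qid: u16,
    pending_cmds: usize,
}

#[derive(Clone, Copy)]
enum RestorePolicy {
    /// Restore every saved I/O queue immediately (this PR's workaround).
    Eager,
    /// Restore only QID 1 and queues with pending commands; defer the rest.
    Deferred,
}

fn queues_to_restore(saved: &[SavedQueue], policy: RestorePolicy) -> Vec<SavedQueue> {
    saved
        .iter()
        .filter(|q| match policy {
            RestorePolicy::Eager => true,
            RestorePolicy::Deferred => q.qid == 1 || q.pending_cmds > 0,
        })
        .cloned()
        .collect()
}

fn main() {
    let saved = vec![
        SavedQueue { qid: 1, pending_cmds: 0 },
        SavedQueue { qid: 2, pending_cmds: 0 },
        SavedQueue { qid: 3, pending_cmds: 4 },
    ];
    // Eager restore keeps all three; deferred keeps QID 1 and the queue
    // with pending commands.
    assert_eq!(queues_to_restore(&saved, RestorePolicy::Eager).len(), 3);
    assert_eq!(queues_to_restore(&saved, RestorePolicy::Deferred).len(), 2);
    println!("ok");
}
```

With a policy enum, the workaround reads as a deliberate branch that can later be removed in one place, rather than dead code the next reader has to interpret.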
```rust
// );
// max_seen_qid = max_seen_qid.max(q.queue_data.qid);
// let mem_block = restored_memory
//     .iter()
//     .find(|mem| {
//         mem.len() == q.queue_data.mem_len && q.queue_data.base_pfn == mem.pfns()[0]
//     })
//     .expect("unable to find restored mem block")
//     .to_owned();
// (
//     q.cpu,
//     ProtoIoQueue {
//         save_state: q.clone(),
//         mem: mem_block,
//         drain_after_restore,
//     },
// )
// })
// .collect();
```
The prototype-queue restore block is entirely commented out. As a result, drain_after_restore_for_proto_queues (created above) becomes unused and the restore logic for proto_io is effectively disabled, but the surrounding code still carries the proto-queue scaffolding. Please either fully remove the proto-queue mechanism (and delete the now-unused precreation logic/variables), or re-enable it with the intended behavior—leaving it commented risks warnings and makes the restore semantics hard to reason about.
```diff
 worker.io = saved_state
     .worker_data
     .io
     .iter()
-    .filter(|q| {
-        q.queue_data.qid == 1 || !q.queue_data.handler_data.pending_cmds.commands.is_empty()
-    })
+    // .filter(|q| {
+    //     q.queue_data.qid == 1 || !q.queue_data.handler_data.pending_cmds.commands.is_empty()
+    // })
     .flat_map(|q| -> Result<IoQueue<D>, anyhow::Error> {
```
This change alters restore semantics (empty queues are now restored eagerly instead of being deferred via proto_io), but there’s no unit/integration test covering save/restore behavior to catch regressions like a queue getting stuck in drain-after-restore. Please add a test in nvme_driver/src/tests.rs that saves state with a mix of empty/non-empty IO queues and verifies restore produces usable issuers/queues (and does not hang waiting for drain completion).
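The requested test could take roughly the following shape. This is a sketch only: the real test belongs in nvme_driver/src/tests.rs and would drive the actual driver's save/restore harness, while `MockQueueState`, `RestoredQueue`, and `restore_queues` below are hypothetical stand-ins modeling the intended semantics.

```rust
// Sketch of a save/restore test over a mix of empty and non-empty I/O
// queues. All types and functions here are illustrative mocks.

#[derive(Clone, Debug, PartialEq)]
enum DrainState {
    Draining,
    SelfDrained,
}

#[derive(Clone)]
struct MockQueueState {
    qid: u16,
    pending_cmds: usize,
}

struct RestoredQueue {
    qid: u16,
    drain: DrainState,
}

/// Eager restore: every saved queue comes back; queues with pending
/// commands drain, empty queues are marked self-drained.
fn restore_queues(saved: &[MockQueueState]) -> Vec<RestoredQueue> {
    saved
        .iter()
        .map(|q| RestoredQueue {
            qid: q.qid,
            drain: if q.pending_cmds > 0 {
                DrainState::Draining
            } else {
                DrainState::SelfDrained
            },
        })
        .collect()
}

fn main() {
    // Save state with a mix of empty and non-empty I/O queues.
    let saved = vec![
        MockQueueState { qid: 1, pending_cmds: 0 },
        MockQueueState { qid: 2, pending_cmds: 3 },
        MockQueueState { qid: 3, pending_cmds: 0 },
    ];
    let restored = restore_queues(&saved);

    // Every saved queue must come back (eager restore), and empty queues
    // must not be left waiting on a drain that will never complete.
    assert_eq!(restored.len(), saved.len());
    assert_eq!(restored[0].drain, DrainState::SelfDrained);
    assert_eq!(restored[1].drain, DrainState::Draining);
    assert_eq!(restored[2].drain, DrainState::SelfDrained);
    println!("ok");
}
```

A driver-level version of this would additionally issue I/O on each restored queue and assert it completes, which is what catches a queue stuck in drain-after-restore.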
```diff
 if !q.queue_data.handler_data.pending_cmds.commands.is_empty() {
     drain_after_restore_template.new_draining()
 } else if q.queue_data.qid == 1 {
     drain_after_restore_for_qid1
         .take()
         .expect("only QID 1 should be empty in eager restore")
         .unwrap_or_else(|| drain_after_restore_template.new_self_drained())
 } else {
-    drain_after_restore_template.new_draining()
+    drain_after_restore_template.new_self_drained()
 },
```
DrainAfterRestoreBuilder::new_self_drained() is used for empty restored queues. If the last Draining queue finishes and signals before a given empty queue is restored, that queue’s waiter will be created after the uncached signal and can remain stuck in SelfDrained forever (no new guest IO processed on that queue). Use DrainAfterRestoreBuilder::new_for_new_queue() (or pre-create/register all SelfDrained waiters before starting any draining queues) for empty restored queues to avoid missing the drain-complete signal when draining finishes early.
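The hazard the reviewer describes is an edge-triggered signal being missed by a late-registered waiter. A level-triggered design (waiters check shared state under a lock rather than waiting for a one-shot notification) avoids it, which is also why pre-registering all waiters before any drain starts works. The sketch below is illustrative, not the real `DrainAfterRestoreBuilder`: `DrainTracker` is a hypothetical stand-in built on `std::sync` primitives.

```rust
// Hedged sketch of the missed-signal hazard and a level-triggered fix.
// DrainTracker is illustrative; the real driver uses its own builder types.
use std::sync::{Arc, Condvar, Mutex};

struct DrainTracker {
    outstanding: Mutex<usize>, // draining queues not yet finished
    cv: Condvar,
}

impl DrainTracker {
    fn new(draining: usize) -> Arc<Self> {
        Arc::new(Self {
            outstanding: Mutex::new(draining),
            cv: Condvar::new(),
        })
    }

    /// Called when a draining queue finishes.
    fn drain_done(&self) {
        let mut n = self.outstanding.lock().unwrap();
        *n -= 1;
        if *n == 0 {
            self.cv.notify_all();
        }
    }

    /// Level-triggered wait: even a waiter created *after* the last
    /// drain_done() observes outstanding == 0 and returns immediately,
    /// so it cannot get stuck the way an edge-triggered waiter can.
    fn wait_drained(&self) {
        let mut n = self.outstanding.lock().unwrap();
        while *n > 0 {
            n = self.cv.wait(n).unwrap();
        }
    }
}

fn main() {
    let tracker = DrainTracker::new(1);
    tracker.drain_done(); // last draining queue finishes before the waiter exists
    tracker.wait_drained(); // late waiter still unblocks: state is checked, not an edge
    println!("ok");
}
```

Either fix in the review comment achieves this ordering guarantee: `new_for_new_queue()` presumably avoids waiting on the drain signal at all, while pre-registering all SelfDrained waiters ensures none is created after the signal fires.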