fix: correct NVML P2P WRITE status warning to say INDEX_WRITE #2248

Open
EylonKrause wants to merge 1 commit into NVIDIA:master from EylonKrause:fix/nvml-p2p-write-warning
@EylonKrause

Description

In ncclNvmlEnsureInitialized (src/misc/nvmlwrap.cc), the P2P-status probe queries NVML_P2P_CAPS_INDEX_READ and then NVML_P2P_CAPS_INDEX_WRITE. The failure WARN for the WRITE query incorrectly prints NVML_P2P_CAPS_INDEX_READ — a copy-paste from the READ branch just above it.

The effect is purely diagnostic but misleading: an operator debugging a P2P write-capability failure is told the read query failed, pointing them at the wrong thing.

Related Issues

None.

Changes & Impact

  • src/misc/nvmlwrap.cc: one diagnostic string literal corrected (INDEX_READ → INDEX_WRITE) in the WRITE branch's WARN. No change to the NVML call, control flow, stored status, or return value. Non-breaking.
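For reviewers who want the shape of the change without opening the file, here is a minimal, self-contained sketch of the two-branch probe. It is not the NCCL source: `CapsIndex`, `stubGetP2PStatus`, `warnText`, `probeP2P`, and the message layout are all hypothetical stand-ins, and the stub always fails so that both warning paths run.

```cpp
#include <cassert>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical stand-ins for the NVML capability indices.
enum CapsIndex { CAPS_INDEX_READ, CAPS_INDEX_WRITE };

// Hypothetical stub for the NVML P2P status query. It always reports failure
// (non-zero) so that both warning paths below are exercised.
static int stubGetP2PStatus(int /*a*/, int /*b*/, CapsIndex /*idx*/, int* status) {
  *status = 0;
  return 1;
}

// Formats one warning; the message layout is an assumption, not NCCL's.
static std::string warnText(int a, int b, const char* capsName) {
  char buf[128];
  std::snprintf(buf, sizeof(buf), "P2P status query (%d,%d,%s) failed", a, b, capsName);
  return buf;
}

// Probes READ, then WRITE, and collects a warning for each failed query.
static std::vector<std::string> probeP2P(int a, int b) {
  std::vector<std::string> warnings;
  int status = 0;
  if (stubGetP2PStatus(a, b, CAPS_INDEX_READ, &status) != 0) {
    warnings.push_back(warnText(a, b, "NVML_P2P_CAPS_INDEX_READ"));
  }
  if (stubGetP2PStatus(a, b, CAPS_INDEX_WRITE, &status) != 0) {
    // Before the fix, this literal also read NVML_P2P_CAPS_INDEX_READ.
    warnings.push_back(warnText(a, b, "NVML_P2P_CAPS_INDEX_WRITE"));
  }
  return warnings;
}
```

Calling `probeP2P(0, 1)` in this sketch yields one READ warning followed by one WRITE warning. With the old literal, both entries would have named READ, which is the misdirection this PR removes.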

Performance Impact

None — diagnostic message text only.

Testing

  • Builds clean with make src.build (no new warnings).
  • Verifiable by inspection: the READ branch's WARN correctly says READ; the WRITE branch's WARN previously also said READ, and now says WRITE to match its query.

The failure WARN for the NVML_P2P_CAPS_INDEX_WRITE query printed
NVML_P2P_CAPS_INDEX_READ, misdirecting anyone debugging a P2P-write
capability failure. Match the message to the query.

Signed-off-by: EylonKrause <eylon1909@gmail.com>