fix: close XML file on parse-error paths to avoid FILE* leak #2247
Open
EylonKrause wants to merge 1 commit into
ncclTopoGetXmlFromFile, ncclTopoGetXmlGraphFromFile and ncclTopoDumpXmlToFile only called fclose() on the success path; any error from xmlLoadSub/ncclTopoDumpXmlRec returned via NCCLCHECK before the fclose, leaking the FILE* (and its fd) on every malformed NCCL_TOPO_FILE/NCCL_GRAPH_FILE. Close the file first, then propagate.
Signed-off-by: EylonKrause <eylon1909@gmail.com>
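A minimal sketch of the "close the file first, then propagate" shape this commit message describes. It is not the actual diff: `Result`, `SKETCH_CHECK`, `parseStub`, and `loadFixed` are hypothetical stand-ins for `ncclResult_t`, `NCCLCHECK`, `xmlLoadSub`, and the affected loaders, and the real macro and parser do more than shown here.

```cpp
#include <cstdio>

// Hypothetical result codes standing in for ncclResult_t.
enum Result { kSuccess = 0, kOpenError = 1, kParseError = 2 };

// Hypothetical early-return macro in the spirit of NCCLCHECK:
// it returns from the enclosing function as soon as a call fails.
#define SKETCH_CHECK(call)               \
  do {                                   \
    Result r_ = (call);                  \
    if (r_ != kSuccess) return r_;       \
  } while (0)

// Hypothetical parser stub standing in for xmlLoadSub():
// it only accepts input whose first byte is '<'.
static Result parseStub(std::FILE* f) {
  int c = std::fgetc(f);
  return c == '<' ? kSuccess : kParseError;
}

// Leaky shape (for contrast, not compiled here):
//   SKETCH_CHECK(parseStub(f));  // returns on failure ...
//   std::fclose(f);              // ... so this line is skipped
//
// Fixed shape: capture the result, close unconditionally, then propagate.
Result loadFixed(const char* path) {
  std::FILE* f = std::fopen(path, "r");
  if (f == nullptr) return kOpenError;
  Result res = parseStub(f);  // capture instead of returning early
  std::fclose(f);             // runs on both success and error paths
  SKETCH_CHECK(res);          // propagate the parse error, if any
  return kSuccess;
}
```

Calling `loadFixed` repeatedly on a file that fails to parse should keep returning the parse error without ever exhausting the process's descriptor limit, since each call closes the handle it opened.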
Description
ncclTopoGetXmlFromFile, ncclTopoGetXmlGraphFromFile, and ncclTopoDumpXmlToFile (src/graph/xml.cc) only call fclose() on the success path. Because NCCLCHECK(...) returns immediately on a non-success result, an error from xmlLoadSub() / ncclTopoDumpXmlRec() returns from the function before the fclose(), leaking the FILE* (and its underlying file descriptor).
This triggers whenever NCCL_TOPO_FILE or NCCL_GRAPH_FILE points at a file that fails to parse, e.g. a wrong version attribute, an unterminated/mismatched tag, or an over-long value/name. Each affected ncclCommInitRank then leaks one descriptor.
Fix: capture the result, fclose() unconditionally, then propagate via NCCLCHECK, mirroring the exit:-label cleanup discipline already used elsewhere in this file (e.g. ncclTopoGetXmlFromSys).
Related Issues
None.
Changes & Impact
src/graph/xml.cc: three call sites (two read paths + the dump path). The only behavioral change is that the file handle is released on the error path. No API change; non-breaking.
Performance Impact
None — this is error-path-only resource cleanup.
Testing
make src.build (no new warnings).
Reproduction with a topology file that fails to parse (<system version="2"></system>, supplied via NCCL_TOPO_FILE), looping ncclCommInitRank and counting open fds pointing at that file:
Without the fix: 1, 2, 3, 4, 5, 6 across six failed inits (leak).
With the fix: 0 on every iteration (closed).
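The PR does not say how the open descriptors were counted. One possible way to do it from inside the test process, assuming Linux and its /proc/self/fd directory, is sketched below; `countOpenFdsFor` is a hypothetical helper, not part of NCCL.

```cpp
#include <dirent.h>
#include <limits.h>
#include <stdlib.h>
#include <unistd.h>
#include <string>

// Hypothetical helper (Linux-only assumption: relies on /proc/self/fd).
// Counts this process's open descriptors whose symlink target resolves
// to `path`. Returns -1 if `path` cannot be resolved or /proc/self/fd
// cannot be opened.
int countOpenFdsFor(const char* path) {
  char resolved[PATH_MAX];
  if (realpath(path, resolved) == nullptr) return -1;

  DIR* dir = opendir("/proc/self/fd");
  if (dir == nullptr) return -1;

  int count = 0;
  while (struct dirent* ent = readdir(dir)) {
    if (ent->d_name[0] == '.') continue;  // skip "." and ".."
    std::string link = std::string("/proc/self/fd/") + ent->d_name;
    char target[PATH_MAX];
    ssize_t n = readlink(link.c_str(), target, sizeof(target) - 1);
    if (n < 0) continue;                  // descriptor vanished or unreadable
    target[n] = '\0';                     // readlink does not NUL-terminate
    if (std::string(target) == resolved) ++count;
  }
  closedir(dir);
  return count;
}
```

A reproduction loop could call this helper with the NCCL_TOPO_FILE path after each failed ncclCommInitRank and compare the count across iterations.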