From @jgallagher59701, copied out of NGAP ticket:
The first issue was the root cause of the bug we found on test.o.o where some accesses just spun forever. Look at and plan to fix the remaining issues. HYRAX-2130 addresses the first problem.
Note: The following analysis is not perfect, but it sheds light on some long standing problems. This is from Codex using GPT 5.5 set to 'Extra High' (FWIW):
I did an analysis-only pass. No files changed, no configure/build/test run. I used the current tree under /Users/jimg/src/opendap/hyrax/bes; no dependency paths or configure flags were involved.
Your suspicion is plausible, but the biggest issue may be adjacent to the decompressor rather than inside zlib.
Main Findings
DMRPP.MaxComputeThreads is read as a boolean, not an integer.
In DmrppRequestHandler.cc, d_max_compute_threads uses TheBESKeys::read_bool_key(). That means:
unset default 8 becomes true, then 1
configured DMRPP.MaxComputeThreads = 8 becomes false, then 0
with compute threads enabled and max set to 0, process_chunks_*_concurrent() can spin forever because no future is started and the chunk queue never drains.
This is the most concrete bug I found. It can also hide real parallel decompression bugs because the default effective parallelism is probably only one worker.
Chunk read/decompress state is mutable and shared through copied arrays.
DmrppArray and DmrppCommon use default copy behavior, while DmrppCommon stores std::shared_ptr<Chunk> in d_chunks. SuperChunk and Chunk::filter_chunk() mutate those Chunk objects during reads.
So if duplicated/cached DMR++ arrays are read concurrently, two requests can race on the same Chunk’s d_is_read, d_is_inflated, d_read_buffer, and ownership flag. That is a real concurrent decompression hazard.
SuperChunk leaves child chunks pointing into ephemeral aggregate storage.
SuperChunk::map_chunks_to_buffer() maps each child chunk into d_read_buffer at SuperChunk.cc. Later retrieve_data() marks each child chunk read at SuperChunk.cc. But SuperChunk frees d_read_buffer in its destructor SuperChunk.h.
For deflate/shuffle chunks, filter_chunk() usually replaces the buffer with owned decompressed memory. For unfiltered chunks, and likely fletcher-only chunks, the child chunk can be left with d_is_read=true and a dangling non-owned buffer after the SuperChunk dies. That becomes dangerous with shallow-copied chunks or any later re-read path.
-
Exception handling corrupts global compute-thread accounting.
get_next_future() clears all futures and sets the passed counter to zero on any exception at DmrppArray.cc. That counter is the global chunk_processing_thread_counter from SuperChunk.cc. One failed request can therefore reset accounting for unrelated in-flight requests.
-
Inside one normal, uncopied array read, the per-chunk decompression path mostly looks structurally safe.
Each worker gets one chunk, zlib state is local, the recursive address vectors are local, and destination writes should be disjoint for well-formed chunk metadata. So I would not start by blaming zlib or memcpy in the single-request path.
Recommended Fix Order
- Fix
DMRPP.MaxComputeThreads to use read_ulong_key() and reject/coerce zero when compute threads are enabled.
- Stop sharing mutable
Chunk read/decompression state across DmrppArray copies. Deep-copy chunks or move per-read buffers/state out of Chunk.
- Add cleanup/reset after
SuperChunk processing so child chunks do not retain pointers into freed aggregate buffers.
- Replace the global counter mutation with an RAII-style scoped token/semaphore; never reset the global counter to zero on one future’s exception.
- Add focused tests for numeric
DMRPP.MaxComputeThreads, duplicated-array concurrent reads, and unfiltered SuperChunk reuse.
From @jgallagher59701, copied out of NGAP ticket:
The first issue was the root cause of the bug we found on test.o.o where some accesses just spun forever. Look at and plan to fix the remaining issues. HYRAX-2130 addresses the first problem.
Note: The following analysis is not perfect, but it sheds light on some long standing problems. This is from Codex using GPT 5.5 set to 'Extra High' (FWIW):
I did an analysis-only pass. No files changed, no configure/build/test run. I used the current tree under
/Users/jimg/src/opendap/hyrax/bes; no dependency paths or configure flags were involved.Your suspicion is plausible, but the biggest issue may be adjacent to the decompressor rather than inside zlib.
Main Findings
DMRPP.MaxComputeThreadsis read as a boolean, not an integer.In DmrppRequestHandler.cc,
d_max_compute_threadsusesTheBESKeys::read_bool_key(). That means:unset default
8becomestrue, then1configured
DMRPP.MaxComputeThreads = 8becomesfalse, then0with compute threads enabled and max set to
0,process_chunks_*_concurrent()can spin forever because no future is started and the chunk queue never drains.This is the most concrete bug I found. It can also hide real parallel decompression bugs because the default effective parallelism is probably only one worker.
Chunkread/decompress state is mutable and shared through copied arrays.DmrppArray and DmrppCommon use default copy behavior, while
DmrppCommonstoresstd::shared_ptr<Chunk>in d_chunks.SuperChunkandChunk::filter_chunk()mutate thoseChunkobjects during reads.So if duplicated/cached DMR++ arrays are read concurrently, two requests can race on the same
Chunk’sd_is_read,d_is_inflated,d_read_buffer, and ownership flag. That is a real concurrent decompression hazard.SuperChunkleaves child chunks pointing into ephemeral aggregate storage.SuperChunk::map_chunks_to_buffer()maps each child chunk intod_read_bufferat SuperChunk.cc. Laterretrieve_data()marks each child chunk read at SuperChunk.cc. ButSuperChunkfreesd_read_bufferin its destructor SuperChunk.h.For deflate/shuffle chunks,
filter_chunk()usually replaces the buffer with owned decompressed memory. For unfiltered chunks, and likely fletcher-only chunks, the child chunk can be left withd_is_read=trueand a dangling non-owned buffer after theSuperChunkdies. That becomes dangerous with shallow-copied chunks or any later re-read path.Exception handling corrupts global compute-thread accounting.
get_next_future()clears all futures and sets the passed counter to zero on any exception at DmrppArray.cc. That counter is the globalchunk_processing_thread_counterfrom SuperChunk.cc. One failed request can therefore reset accounting for unrelated in-flight requests.Inside one normal, uncopied array read, the per-chunk decompression path mostly looks structurally safe.
Each worker gets one chunk, zlib state is local, the recursive address vectors are local, and destination writes should be disjoint for well-formed chunk metadata. So I would not start by blaming zlib or
memcpyin the single-request path.Recommended Fix Order
DMRPP.MaxComputeThreadsto useread_ulong_key()and reject/coerce zero when compute threads are enabled.Chunkread/decompression state acrossDmrppArraycopies. Deep-copy chunks or move per-read buffers/state out ofChunk.SuperChunkprocessing so child chunks do not retain pointers into freed aggregate buffers.DMRPP.MaxComputeThreads, duplicated-array concurrent reads, and unfiltered SuperChunk reuse.