Skip to content

Fix memory safety vulnerabilities in high-level and VFD code#6140

Open
brtnfld wants to merge 9 commits intoHDFGroup:developfrom
brtnfld:snyk
Open

Fix memory safety vulnerabilities in high-level and VFD code#6140
brtnfld wants to merge 9 commits intoHDFGroup:developfrom
brtnfld:snyk

Conversation

@brtnfld
Copy link
Collaborator

@brtnfld brtnfld commented Jan 2, 2026

Address multiple CWE-415 (double-free), CWE-416 (use-after-free), and CWE-122 (buffer overflow) vulnerabilities identified by static analysis:

  • hl/src/H5DS.c: Fix double-free in H5DSis_scale() by setting buf to NULL after free and adding NULL check in cleanup path

  • hl/src/H5LT.c: Fix multiple memory issues:

    • Set myinput to NULL after free in H5LTtext_to_dtype()
    • Add NULL check in realloc_and_append() to prevent use-after-free
    • Refactor duplicated stmp handling by creating H5LT_append_dtype_super_text() helper function, eliminating ~50 lines of repeated code across 4 case blocks
  • hl/src/H5TB.c: Replace unsafe strcpy() with strncpy() in H5TBget_field_info() using HLTB_MAX_FIELD_LEN constant to prevent buffer overflow

  • hl/src/H5TBpublic.h: Document buffer size requirements for field_names parameter

  • src/H5FDstdio.c: Fix inconsistent resource cleanup in H5FD_stdio_open() by using file->fp instead of f throughout error paths

  • src/H5VLnative.c: Add assert checks for obj and file parameters in H5VL_native_get_file_struct() following internal API conventions

SAFE project work.


Important

Fixes memory safety vulnerabilities in HDF5 codebase, addressing double-free, use-after-free, and buffer overflow issues across multiple files.

  • Memory Safety Fixes:
    • H5DS.c: Fix double-free in H5DSis_scale() by setting buf to NULL after free and adding NULL check in cleanup.
    • H5LT.c: Set myinput to NULL after free in H5LTtext_to_dtype(), add NULL check in realloc_and_append(), refactor H5LT_append_dtype_super_text() to reduce code duplication.
    • H5TB.c: Replace strcpy() with strncpy() in H5TBget_field_info() to prevent buffer overflow.
  • Documentation:
    • H5TBpublic.h: Document buffer size requirements for field_names parameter.
  • Resource Management:
    • H5FDstdio.c: Use file->fp consistently in H5FD_stdio_open() for error paths.
  • Assertions:
    • H5VLnative.c: Add assert checks for obj and file parameters in H5VL_native_get_file_struct().

This description was created by Ellipsis for 7b22833. You can customize this summary. It will automatically update as commits are pushed.

Address multiple CWE-415 (double-free), CWE-416 (use-after-free),
and CWE-122 (buffer overflow) vulnerabilities identified by static analysis:

- hl/src/H5DS.c: Fix double-free in H5DSis_scale() by setting buf to NULL
  after free and adding NULL check in cleanup path

- hl/src/H5LT.c: Fix multiple memory issues:
  * Set myinput to NULL after free in H5LTtext_to_dtype()
  * Add NULL check in realloc_and_append() to prevent use-after-free
  * Refactor duplicated stmp handling by creating H5LT_append_dtype_super_text()
    helper function, eliminating ~50 lines of repeated code across 4 case blocks

- hl/src/H5TB.c: Replace unsafe strcpy() with strncpy() in H5TBget_field_info()
  using HLTB_MAX_FIELD_LEN constant to prevent buffer overflow

- hl/src/H5TBpublic.h: Document buffer size requirements for field_names parameter

- src/H5FDstdio.c: Fix inconsistent resource cleanup in H5FD_stdio_open() by
  using file->fp instead of f throughout error paths

- src/H5VLnative.c: Add assert checks for obj and file parameters in
  H5VL_native_get_file_struct() following internal API conventions
@brtnfld brtnfld added the Component - C Library Core C library issues (usually in the src directory) label Jan 2, 2026
@brtnfld brtnfld added the HDFG-internal Internally coded for use by the HDF Group label Jan 2, 2026
@github-project-automation github-project-automation bot moved this to To be triaged in HDF5 - TRIAGE & TRACK Jan 2, 2026

/* Use the value in the property list */
if (H5Pget_file_locking(fapl_id, &unused, &file->ignore_disabled_file_locks) < 0) {
fclose(file->fp);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What was the issue with these close calls? file->fp should be the same as f at this point

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Snyk flags it for clarity on resource ownership. The concern is that it's not explicit which pointer "owns" the resource after the assignment. Once you've assigned file->fp = f, the FILE* is conceptually owned by the file structure, and using file->fp in cleanup makes this ownership clear.

hl/src/H5LT.c Outdated
buf = tmp_realloc;
}

if (!buf)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The intent of the _no_user_buf parameter isn't really obvious, but it seems like this check overlaps with the same check inside that block, which seems like it would imply buf being allowed to be passed in as NULL in the false case. But I'm guessing this check was added due to the strlen(buf) below. This seems like we should determine whether it was ever intended for buf to be allowed as NULL.

Copy link
Collaborator Author

@brtnfld brtnfld Feb 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added comments, side-stepped the question if we should assert/error more clearly.

#define HLTB_MAX_FIELD_LEN 255
#define TABLE_CLASS "TABLE"
#define TABLE_VERSION "3.0"
/* HLTB_MAX_FIELD_LEN is now defined in H5TBpublic.h */
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Harmless, but it's probably unnecessary to document that a macro used to be in this file

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

Address a breaking API change introduced in commit 7b22833 where
H5TBget_field_info unconditionally wrote a null terminator at byte 254
(HLTB_MAX_FIELD_LEN - 1), requiring all callers to allocate 255-byte
buffers regardless of actual field name length.

Changes:
- hl/src/H5TB.c: Implement smart truncation that only enforces the
  255-byte limit when field names actually exceed it. For typical
  short field names, only the actual string length plus null terminator
  is written, preserving backward compatibility with existing code
  using smaller buffers.

- hl/src/H5TBpublic.h: Update documentation for HLTB_MAX_FIELD_LEN
  and H5TBget_field_info to clarify that 255-byte buffers are only
  required for exceptionally long field names. Short names are copied
  exactly without padding.

- hl/test/test_table.c: Add new test case "field info with small buffers
  (backward compatibility)" that verifies the function works correctly
  with 32-byte buffers for typical field names, ensuring no buffer
  overflow occurs.

This fix maintains the security improvement (preventing unbounded writes
from the original strcpy) while avoiding the compatibility hazard of
requiring all existing code to be updated.

Fixes: CWE-122 (Heap-based Buffer Overflow) - user-side compatibility issue
Maintains: Security fix from 7b22833
Add comprehensive documentation and assertions to address review feedback
about the ambiguous intent of the _no_user_buf parameter and redundant
NULL checks in realloc_and_append (H5LT.c).

Changes:
- Enhanced function header comment to document:
  * Two operating modes (library-managed vs user-provided buffer)
  * Explicit parameter descriptions
  * Preconditions that buf must never be NULL

- Added inline comments explaining:
  * Mode 1 (library-managed): buf initialized via calloc, can reallocate
  * Mode 2 (user-provided): fixed-size buffer, no reallocation
  * Why there are two NULL checks (defensive programming)

- Added assertion before the second NULL check to:
  * Document the API contract (buf must be valid)
  * Aid debugging in development builds
  * Make it clear this check is for defensive programming

The second NULL check (line ~1978) is intentionally redundant:
- In library-managed mode: already checked at line ~1945
- In user-provided mode: catches caller errors
- Prevents strlen(buf) crash regardless of mode

This addresses the review comment about unclear intent and overlapping
checks, making it explicit that buf=NULL is never a valid input, and
the checks are defensive programming against logic/caller errors.

Addresses: Review feedback from jhendersonHDF on commit 7b22833
The comment '/* HLTB_MAX_FIELD_LEN is now defined in H5TBpublic.h */'
documents a past refactoring but provides no useful information for
current development. The constant is properly defined in H5TBpublic.h
and available through the include at line 20.

Removes code archaeology that doesn't aid understanding.
… API

Changes to hl/src/H5LT.c:
- Simplify defensive redundancy in realloc_and_append()
- Consolidate triple NULL check (lines 1945, 1972, 1979) into single
  assertion + runtime check at function start
- Improves code clarity while maintaining identical safety guarantees
- No functional change: both debug (assertion) and production (runtime)
  safety preserved

Changes to hl/test/test_table.c:
- Add comprehensive HLTB_MAX_FIELD_LEN boundary testing
- Tests field name truncation at exact boundaries:
  * 253 chars: no truncation (253 + null = 254 < 255)
  * 254 chars: no truncation (254 + null = 255 = limit)
  * 255 chars: truncates to 254 (255 + null = 256 > limit)
  * 1000 chars: truncates to 254 (extreme case)
- Complements existing small-buffer backward compatibility test
- Verifies truncation logic in H5TB.c:3037-3040 works correctly
- Fix compiler warnings: remove unused boundary_field_sizes variable,
  initialize boundary_names_out to NULL
Document the following changes in release_docs/CHANGELOG.md:

Library section:
- Fixed file descriptor leaks in stdio VFD error paths (H5FDstdio.c)
- Added defensive NULL pointer checks in native VOL connector (H5VLnative.c)

High-Level Library section:
- Fixed critical buffer overflow vulnerability in H5TBget_field_info() (CWE-120)
  * SECURITY FIX: Replaced unbounded strcpy() with bounds-checked memcpy()
  * Field names exceeding 255 chars are now safely truncated
  * Backward compatibility preserved for small buffers
- Made HLTB_MAX_FIELD_LEN constant public (moved to H5TBpublic.h)
- Fixed memory leaks and improved safety in H5LT functions (H5LT.c)
  * Added NULL check after strdup() in H5LTtext_to_dtype()
  * Enhanced documentation for realloc_and_append()
- Eliminated code duplication in H5LT datatype conversion
  * New helper function H5LT_append_dtype_super_text() reduces ~80 lines
  * Improves maintainability across ENUM, VLEN, ARRAY, COMPLEX handlers
- Fixed use-after-free risk in H5DSis_scale() (H5DS.c)

These entries correspond to commits 7b22833 through 31edfe0.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Component - C Library Core C library issues (usually in the src directory) HDFG-internal Internally coded for use by the HDF Group

Projects

Status: To be triaged

Development

Successfully merging this pull request may close these issues.

4 participants