Fix NULL file pointer dereferences in vlen disk operations #6385

Closed
tbeu wants to merge 1 commit into HDFGroup:develop from tbeu:fix/vlen-disk-write-null-check

tbeu (Contributor) commented Apr 28, 2026

When reading corrupted HDF5 files, the file pointer passed to the disk-based vlen type operations (H5T__vlen_disk_read, H5T__vlen_disk_write, H5T__vlen_disk_isnull, H5T__vlen_disk_setnull, H5T__vlen_disk_delete) and H5T__vlen_set_loc can be NULL. The existing assert(file) causes a crash in debug builds, while release builds proceed to dereference the NULL pointer in H5VL_blob_put / H5VL_blob_get / H5VL_blob_specific, causing a SEGV.

Changes

Replace assert(file) with proper NULL checks and HGOTO_ERROR so that the error is propagated through the normal HDF5 error handling mechanism.
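
The difference between the two approaches can be illustrated with a minimal, self-contained sketch. This is not the actual patch: `file_t`, `GOTO_ERROR`, and both function names are hypothetical stand-ins for HDF5's internal types and its HGOTO_ERROR macro, chosen only to show why an assert() offers no protection once NDEBUG is defined.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for an HDF5 file handle (not the real VOL object). */
typedef struct {
    int id;
} file_t;

/* Hypothetical stand-in for HGOTO_ERROR: report, set the return value,
 * and jump to the function's single exit label. */
#define GOTO_ERROR(msg)                                                        \
    do {                                                                       \
        fprintf(stderr, "error: %s\n", (msg));                                 \
        ret_value = -1;                                                        \
        goto done;                                                             \
    } while (0)

/* Original pattern: the assert() compiles away under NDEBUG, so a NULL
 * file is dereferenced in release builds. */
int
vlen_disk_write_asserting(const file_t *file, const void *buf, size_t len)
{
    assert(file);
    (void)buf;
    (void)len;
    return file->id >= 0 ? 0 : -1;
}

/* Proposed pattern: a runtime check that survives NDEBUG and turns the
 * bad pointer into an ordinary error return. */
int
vlen_disk_write_checked(const file_t *file, const void *buf, size_t len)
{
    int ret_value = 0;

    if (NULL == file)
        GOTO_ERROR("file pointer is NULL");
    if (NULL == buf && len > 0)
        GOTO_ERROR("buffer is NULL");
    if (file->id < 0)
        GOTO_ERROR("invalid file handle");

done:
    return ret_value;
}
```

In this sketch, calling `vlen_disk_write_checked(NULL, buf, len)` returns -1 in both debug and release builds, whereas the asserting variant either aborts or dereferences NULL depending on how it was compiled.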

ASAN report (from OSS-Fuzz)

https://issues.oss-fuzz.com/issues?q=5366895365914624

=================================================================
==1150==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000008 (pc 0x5d09af33f82f bp 0x7ffd030797b0 sp 0x7ffd03079770 T0)
==1150==The signal is caused by a READ memory access.
==1150==Hint: address points to the zero page.
    #0 H5VL_blob_put hdf5/src/H5VLcallback.c:7453:48
    #1 H5T__vlen_disk_write hdf5/src/H5Tvlen.c:879:9
    #2 H5T__conv_vlen hdf5/src/H5Tconv_vlen.c:474:29
    #3 H5T_convert_with_ctx hdf5/src/H5T.c:6545:14
    #4 H5T_convert hdf5/src/H5T.c:6474:9
    #5 H5A__read hdf5/src/H5Aint.c:787:21

Found by OSS-Fuzz via the matio project fuzzer (ClusterFuzz testcase
5366895365914624).

bmribler (Collaborator) commented May 4, 2026

Please do not merge this yet. The comment in #6377 also applies here.

tbeu (Contributor, Author) commented May 4, 2026

The issue occurs with corrupted HDF5 files during fuzzing. It is not possible to detect or prevent the error outside HDF5, i.e. as a consumer of the HDF5 library.

jhendersonHDF (Collaborator) commented

Hi @tbeu,

I just wanted to clarify a bit. What @bmribler is saying is that we currently believe it would be more appropriate to fix this at a higher level rather than converting these assert()s to regular error checks. That is, we expect that these deeper internal interfaces should always be getting a non-NULL pointer (the reason for the assert()s being there) and checking with a regular error check elsewhere would be better.

Comment thread src/H5Tvlen.c
H5VL_file_cont_info_t cont_info = {H5VL_CONTAINER_INFO_VERSION, 0, 0, 0};
H5VL_file_get_args_t vol_cb_args; /* Arguments to VOL callback */

assert(file);

jhendersonHDF (Collaborator) commented May 4, 2026

For example, it may make sense to leave this as an assert() and instead add a regular error check to H5T_set_loc() in H5T.c when the loc param is H5T_LOC_DISK. Even better would be to figure out where the file pointer initially gets set to NULL and fix the issue there.
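
The layering suggested here could look roughly like the following sketch. It is only an illustration under stated assumptions: `loc_t`, `file_t`, `vlen_set_loc`, and `set_loc` are hypothetical stand-ins for H5T_loc_t, the file object, H5T__vlen_set_loc(), and H5T_set_loc(), and the real functions take different arguments.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the location enum and the file object. */
typedef enum { LOC_MEMORY, LOC_DISK } loc_t;
typedef struct {
    int id;
} file_t;

/* Low-level helper: keeps its assert() because every caller is expected
 * to pass a non-NULL file for disk locations. */
static int
vlen_set_loc(const file_t *file, loc_t loc)
{
    if (LOC_DISK == loc)
        assert(file);
    (void)file;
    return 0;
}

/* Higher-level entry point: performs the regular error check, so the
 * invariant asserted below it holds even for corrupted input. */
int
set_loc(const file_t *file, loc_t loc)
{
    if (LOC_DISK == loc && NULL == file)
        return -1; /* report failure instead of tripping the assert */
    return vlen_set_loc(file, loc);
}
```

With this split, `set_loc(NULL, LOC_DISK)` fails cleanly with -1, while a memory location with no file is still accepted and the internal assert() continues to document the contract of the low-level routine.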

tbeu (Contributor, Author) commented May 4, 2026

Closing in favor of a fix at the decode level (validating vlen.type in H5O__dtype_decode_helper). See replacement PR.

tbeu closed this May 4, 2026
github-project-automation (bot) moved this from To be triaged to Done in HDF5 - TRIAGE & TRACK May 4, 2026
tbeu deleted the fix/vlen-disk-write-null-check branch May 4, 2026 19:18
tbeu added a commit to tbeu/hdf5_repo that referenced this pull request May 6, 2026
…_set_loc

H5O__dtype_decode_helper() reads vlen.type from the file without
validation. With corrupted HDF5 files (e.g. from fuzzing), this field
can have an invalid value that is neither H5T_VLEN_SEQUENCE nor
H5T_VLEN_STRING, which later triggers assert(0) in H5T__vlen_set_loc()
(debug builds) or a NULL pointer dereference / SEGV in release builds.

Fix by:
1. Adding a validation check in H5O__dtype_decode_helper() immediately
   after reading the vlen.type field, returning an error if the value
   is invalid.
2. Adding a NULL file pointer check in H5T_set_loc() before calling
   H5T__vlen_set_loc() when loc == H5T_LOC_DISK, so the low-level
   assert(file) invariant is never violated.

This fixes the root cause at the decode level where the bad value
enters the system, as requested in review of HDFGroup#6378 and HDFGroup#6385.

Closes HDFGroup#6378
Closes HDFGroup#6385

Found by OSS-Fuzz via the matio fuzzer (ClusterFuzz testcase
5366895365914624).
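
The decode-level validation described in this commit message can be sketched as follows. This is a simplified illustration, not the merged code: the enum and function names are hypothetical, and the assumption that the vlen kind occupies the low two bits of a flags byte is made only so that the raw field can hold values outside the two valid kinds.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mirror of the two valid vlen kinds plus an error value. */
typedef enum {
    VLEN_BADTYPE  = -1,
    VLEN_SEQUENCE = 0,
    VLEN_STRING   = 1
} vlen_type_t;

/* Decode step. Assumption: the kind is stored in the low two bits, so the
 * raw field can be 0..3 while only 0 and 1 are meaningful. Anything else
 * is rejected here, where the untrusted value enters the system. */
vlen_type_t
decode_vlen_type(unsigned flags)
{
    unsigned raw = flags & 0x03u;

    if (raw != (unsigned)VLEN_SEQUENCE && raw != (unsigned)VLEN_STRING)
        return VLEN_BADTYPE;
    return (vlen_type_t)raw;
}

/* Downstream consumer: treats any other value as a programming error,
 * which is safe only because the decoder never lets one through. */
const char *
vlen_type_name(vlen_type_t t)
{
    switch (t) {
        case VLEN_SEQUENCE:
            return "sequence";
        case VLEN_STRING:
            return "string";
        default:
            assert(0 && "invalid vlen type");
            return NULL;
    }
}
```

A caller would check the result of `decode_vlen_type()` and return an error for `VLEN_BADTYPE` before any later stage sees the datatype, so the `default` branch stays unreachable for file-derived input.
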
tbeu added a commit to tbeu/hdf5_repo that referenced this pull request May 7, 2026
…_set_loc

tbeu added a commit to tbeu/hdf5_repo that referenced this pull request May 14, 2026
…_set_loc

tbeu added a commit to tbeu/hdf5_repo that referenced this pull request May 18, 2026
…_set_loc

jhendersonHDF pushed a commit that referenced this pull request May 28, 2026
…_set_loc (#6395)

hyoklee pushed a commit to hyoklee/hdf5 that referenced this pull request May 29, 2026
…_set_loc (HDFGroup#6395)
