mtmd: several bug fixes #24784

Merged
ngxson merged 6 commits into ggml-org:master from ngxson:xsn/mtmd_vuln_0
Jun 19, 2026
ngxson commented Jun 18, 2026

Collaborator

Overview

A batch of 5 bug fixes:

  • Validate n_mel_bins (1–256) and other audio hparams (sample_rate, n_fft, window_len, hop_len > 0) at GGUF load time in clip.cpp, plus a defense-in-depth check before warmup allocation
  • Change mtmd_audio_mel / mtmd_audio_mel_filters fields from int/int32_t to int64_t, fix all n_mel * n_len index calculations to use (size_t) casts, and replace the broken overflow check in log_mel_spectrogram with proper size_t-based validation
  • Add hard limit image_size <= 65536 at GGUF load time in clip.cpp to prevent width * height overflow in clip_image_size::area(); kept int for width/height fields
  • Change offset from int to int64_t in log_mel_spectrogram_worker_thread to prevent overflow when i * frame_step > INT_MAX on long audio, and use safe valid_len calculation for Hann window indexing
  • Fix an int overflow in get_u32() that could potentially lead to a negative vector resize
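
As a rough illustration of the checks described in the list above, here is a minimal sketch. It is not the code from this PR: the struct and the function names (audio_hparams_sketch, validate_audio_hparams, checked_mel_size, frame_offset) are hypothetical, and only the ranges (n_mel_bins in 1–256, the other audio hparams > 0) are taken from the description.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>

// Hypothetical stand-in for the audio hparams named in the list above.
struct audio_hparams_sketch {
    int32_t n_mel_bins;
    int32_t sample_rate;
    int32_t n_fft;
    int32_t window_len;
    int32_t hop_len;
};

// Load-time validation: n_mel_bins must be in [1, 256], the rest must be > 0.
static bool validate_audio_hparams(const audio_hparams_sketch & hp) {
    if (hp.n_mel_bins < 1 || hp.n_mel_bins > 256) {
        return false;
    }
    return hp.sample_rate > 0 && hp.n_fft > 0 && hp.window_len > 0 && hp.hop_len > 0;
}

// Computes n_mel * n_len in size_t and reports failure instead of wrapping around.
static bool checked_mel_size(int64_t n_mel, int64_t n_len, size_t & out) {
    if (n_mel <= 0 || n_len <= 0) {
        return false;
    }
    const size_t a = (size_t) n_mel;
    const size_t b = (size_t) n_len;
    if (a > std::numeric_limits<size_t>::max() / b) {
        return false; // the product would not fit in size_t
    }
    out = a * b;
    return true;
}

// Frame offset in 64-bit arithmetic, so i * frame_step can exceed INT_MAX safely.
static int64_t frame_offset(int64_t i, int64_t frame_step) {
    return i * frame_step;
}
```

With illustrative values, frame_offset(100000, 160000) yields 16000000000, which would not fit in a 32-bit int.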

Requirements

  • I have read and agree with the contributing guidelines
  • AI usage disclosure: yes, pi-agent with glm-5.1 + deepseek-v4-pro, running with a special-crafted memory-update loop

ngxson requested a review from ServeurpersoCom June 18, 2026 22:49
ngxson requested a review from a team as a code owner June 18, 2026 22:49

ServeurpersoCom left a comment

Contributor

LGTM, it's hardening without regression.

One more pre-existing issue spotted with Opus 4.8:
area() in clip.h is still int * int, so the overflow it mentions isn't really closed. Could make it return int64_t:

int64_t area() const {
    return (int64_t) width * height;
}
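
To make the concern concrete, here is a small self-contained sketch of the suggested widening. The struct name image_size_sketch is hypothetical and only mirrors the width/height/area() shape discussed here. Casting one operand to int64_t before multiplying makes the whole product 64-bit, so a 65536 × 65536 image (the limit mentioned in the overview) yields 2^32 instead of overflowing a 32-bit int.

```cpp
#include <cstdint>

// Hypothetical stand-in for an image size type with int width/height fields.
struct image_size_sketch {
    int width;
    int height;

    // Widen before multiplying: the cast applies to width, so the
    // multiplication itself is carried out in 64-bit arithmetic.
    int64_t area() const {
        return (int64_t) width * height;
    }
};
```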

ngxson commented Jun 19, 2026

Collaborator Author

Nice catch! I decided to keep it as int32_t and just add an upper limit to make things a bit easier. The max area can never exceed 46000 × 46000 = 2 116 000 000 (~2 116 megapixels, that's already an insane image size!)
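
The arithmetic behind this bound can be checked at compile time. The sketch below is only an illustration: the helper area_fits_int32 and the constant kInt32Max are hypothetical names, and nothing here claims which limit the merged code actually enforces. It shows that a square image with side 46000 has an area that fits in int32_t, that 46340 is the largest such side, and that a side of 65536 would not fit.

```cpp
#include <cstdint>
#include <limits>

constexpr int64_t kInt32Max = std::numeric_limits<int32_t>::max(); // 2147483647

// True when side * side is representable as int32_t (computed in 64-bit).
constexpr bool area_fits_int32(int64_t side) {
    return side * side <= kInt32Max;
}

static_assert(area_fits_int32(46000),  "46000^2 = 2116000000 fits in int32_t");
static_assert(area_fits_int32(46340),  "46340^2 = 2147395600 fits in int32_t");
static_assert(!area_fits_int32(46341), "46341^2 = 2147488281 does not fit");
static_assert(!area_fits_int32(65536), "65536^2 = 2^32 does not fit");
```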

ngxson merged commit e2e7a9b into ggml-org:master on Jun 19, 2026
25 checks passed
2 participants