ROCm regression: can't load ds4 after commit d881f2a #410

@99z

Prior to d881f2a I was able to load the q2-q4 imatrix model both with and without MTP on a Framework Desktop:

nsm@neuro ~/src/ds4 $ ./ds4-server --mtp /home/nsm/src/ds4/gguf/DeepSeek-V4-Flash-MTP-Q4K-Q8_0-F32.gguf --mtp-draft 2 --ctx 100000 --kv-disk-dir /tmp/ds4-kv --kv-disk-space-mb 8192 --host 0.0.0.0
ds4: MTP support model loaded: /home/nsm/src/ds4/gguf/DeepSeek-V4-Flash-MTP-Q4K-Q8_0-F32.gguf (draft=2)
ds4: ROCm backend initialized on Radeon 8060S Graphics (sm_115)
ds4: ROCm chunk-copying 90.89 GiB model image
ds4: ROCm loading model tensors into device cache: 90.06 GiB
ds4: ROCm model chunk copy complete in 18.247s (90.88 GiB tensors)
ds4: ROCm chunk-copying 3.55 GiB model image
ds4: ROCm loading model tensors into device cache: 2.06 GiB
ds4: ROCm model chunk copy complete in 0.701s (3.55 GiB tensors)
ds4: ROCm preparing model tensor mappings: 90.36 GiB
ds4: ROCm startup model preparation covered 90.88 GiB of tensor spans in 0.787s
ds4: ROCm preparing model tensor mappings: 2.38 GiB
ds4: ROCm startup model preparation covered 3.55 GiB of tensor spans in 0.018s
ds4: rocm backend initialized for graph diagnostics
0614 14:21:27 ds4-server: context buffers 3567.06 MiB (ctx=100000, backend=rocm, prefill_chunk=8192, raw_kv_rows=8192, compressed_kv_rows=25002)
0614 14:21:27 ds4-server: KV disk cache /tmp/ds4-kv (budget=8192 MiB, cross-quant=accept, min=512, cold_max=30000, continued=10000, trim=32, align=2048, hit_half_life=21600s)
0614 14:21:27 ds4-server: listening on http://0.0.0.0:8000

After that commit, I can't load the q2-q4 model with or without MTP. Both attempts fail with out-of-memory errors:

nsm@neuro ~/src/ds4 $ ./ds4-server --mtp /home/nsm/src/ds4/gguf/DeepSeek-V4-Flash-MTP-Q4K-Q8_0-F32.gguf --mtp-draft 2 --ctx 100000 --kv-disk-dir /tmp/ds4-kv --kv-disk-space-mb 8192 --host 0.0.0.0
ds4: MTP support model loaded: /home/nsm/src/ds4/gguf/DeepSeek-V4-Flash-MTP-Q4K-Q8_0-F32.gguf (draft=2)
ds4: ROCm backend initialized on Radeon 8060S Graphics (sm_115)
ds4: ROCm preparing model tensor mappings: 60.76 GiB
ds4: ROCm model range alloc failed for tensor-span:108 (528.00 MiB): out of memory

ds4: accelerator failed to prepare model tensor span 108 at offset 65244457792
ds4: rocm failed to prepare optional model cache
nsm@neuro ~/src/ds4 $ ./ds4-server --ctx 100000 --kv-disk-dir /tmp/ds4-kv --kv-disk-space-mb 8192 --host 0.0.0.0
ds4: ROCm backend initialized on Radeon 8060S Graphics (sm_115)
ds4: ROCm preparing model tensor mappings: 90.36 GiB
ds4: ROCm startup model preparation covered 90.88 GiB of tensor spans in 27.024s
ds4: rocm backend initialized for graph diagnostics
0614 14:16:14 ds4-server: context buffers 3567.06 MiB (ctx=100000, backend=rocm, prefill_chunk=8192, raw_kv_rows=8192, compressed_kv_rows=25002)
ds4: ROCm tensor alloc failed: out of memory
ds4: ROCm tensor fill f32 launch failed: out of memory
ds4: ROCm tensor alloc failed: out of memory
[... the preceding line repeated 205 more times ...]
0614 14:16:15 ds4-server: failed to create rocm session
