Fix uring teardown by hbirth · Pull Request #143 · DDNStorage/linux

hbirth · 2026-04-10T08:36:27Z

cherry pick from redfs-ubuntu-noble

Signed-off-by: Horst Birthelmer <[email protected]> (cherry picked from commit ad21e5a)

Fix uninterruptible sleep (D state) hangs during FUSE filesystem teardown when using io_uring. The issue manifests as processes stuck waiting for requests that are never completed. It particularly affects force requests such as FUSE_FLUSH, and requests created after fuse_abort_conn() has already finished.

If, on daemon exit, io_uring_try_cancel_requests() runs before fuse_abort_conn(), it calls fuse_uring_cancel(), which tears down the entries by calling fuse_uring_entry_teardown(). We then end up in fuse_uring_abort() with queue_refs == 0, and the queues are never stopped. Stopped queues would reject all new requests, but since the stop never happens, all new calls get stuck.

Signed-off-by: Horst Birthelmer <[email protected]> (cherry picked from commit 9550b4d)
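The ordering problem described in the commit message can be illustrated with a small userspace toy model. This is only a sketch under assumptions: every `toy_*` name is hypothetical and none of it is the kernel code or the actual patch. It assumes that an abort path which stops the queues only while `queue_refs > 0` is what lets later requests slip through, and it contrasts that with an abort that stops unconditionally.

```c
#include <stdbool.h>

/* Toy userspace model; all toy_* names are hypothetical, not kernel symbols. */
struct toy_ring {
	int queue_refs;  /* ring entries that are still alive */
	bool stopped;    /* once set, new requests are rejected */
	int pending;     /* requests queued and waiting for a reply */
};

/* Models the cancel path tearing down one entry before abort runs. */
static void toy_cancel_entry(struct toy_ring *r)
{
	if (r->queue_refs > 0)
		r->queue_refs--;
}

/* Abort that stops the queues only while entries are still referenced. */
static void toy_abort_guarded(struct toy_ring *r)
{
	if (r->queue_refs > 0)
		r->stopped = true;
}

/* Abort that stops the queues regardless of the reference count. */
static void toy_abort_always(struct toy_ring *r)
{
	r->stopped = true;
}

/* Returns 0 if the request was queued, -1 if it was rejected. */
static int toy_submit(struct toy_ring *r)
{
	if (r->stopped)
		return -1;
	r->pending++; /* nobody is left to answer: this models the hang */
	return 0;
}
```

Running cancel first and then `toy_abort_guarded()` leaves `stopped` unset, so a later `toy_submit()` queues a request that nothing will ever complete. With `toy_abort_always()` the same sequence rejects the request instead.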

bsbernd · 2026-04-10T17:23:51Z

@hbirth I need to look into the 9.4 branch tomorrow, because 9.4 does not have IO_URING_F_CANCEL and I therefore added a workaround to that kernel version. I need to verify it for races (we have a Jira for a related issue on 9.4).

hbirth · 2026-04-10T18:36:39Z

So the case where queue_refs == 0 cannot occur? Then I have to check, too.

hbirth · 2026-04-11T08:27:21Z

To me this looks like a slightly better case for the problem at hand.
If IO_URING_F_CANCEL is not defined, we never call fuse_uring_cancel(), so queue_refs never goes to 0.
This part of the fix is technically not needed, since the case will not occur, but it does no harm either.

bsbernd · 2026-04-11T12:25:56Z

@hbirth I had to look up what I had done (commit 14ba960). The issue with it is in-flight registration SQEs: whichever way fuse_abort_conn() is called, these registration SQEs will not be released. There would be a way to handle it with locks and another ref count, but given that 9.4 is deprecated, we had better leave it in the current state.

hbirth added 2 commits April 10, 2026 09:38

fuse: debug print requests when we hang in fuse_wait_aborted()

2e18ba5

Signed-off-by: Horst Birthelmer <[email protected]> (cherry picked from commit ad21e5a)

hbirth requested a review from bsbernd April 10, 2026 09:20

bsbernd approved these changes Apr 10, 2026

hbirth merged commit d6048c3 into DDNStorage:redfs-rhel9_4-427.42.1 Apr 13, 2026
2 checks passed

hbirth merged 2 commits into DDNStorage:redfs-rhel9_4-427.42.1 from hbirth:redfs-rhel9_4-427.42.1