Skip to content

Handle result from PyObject_VisitManagedDict#6032

Merged
rwgk merged 6 commits intopybind:masterfrom
maxbachmann:patch-1
Apr 12, 2026
Merged

Handle result from PyObject_VisitManagedDict#6032
rwgk merged 6 commits intopybind:masterfrom
maxbachmann:patch-1

Conversation

@maxbachmann
Copy link
Copy Markdown
Contributor

@maxbachmann maxbachmann commented Apr 8, 2026

Cursor-generated:

Summary

This PR fixes pybind11_traverse() for py::dynamic_attr() instances on Python 3.13+ by propagating the return value from PyObject_VisitManagedDict().

Without this, pybind11 could silently swallow a non-zero return from the GC/referrer visitor callback. In practice, that means the relationship between an instance and its managed __dict__ can be missed during traversal-based operations. A visible symptom is that gc.get_referrers(instance.__dict__) can fail to report the owning pybind11 instance.

This PR also adds a regression test for that behavior, together with version/runtime skips so the test only runs where the interpreter behavior is known to be reliable.

Background

Types created with py::dynamic_attr() store per-instance attributes in a Python __dict__. For garbage collection and related traversal-based introspection to work correctly, pybind11's tp_traverse implementation must expose that dictionary to the visitor callback.

Historically, pybind11 did this by obtaining the instance dict pointer directly and visiting it with Py_VISIT(dict). That matters because Py_VISIT does not just call the visitor: it also immediately propagates a non-zero return value.

On Python 3.13+, CPython moved dynamic instance dictionaries to the managed-dict APIs, so pybind11 uses PyObject_VisitManagedDict(self, visit, arg) instead. However, pybind11 did not check or return that function's integer result.

That changed the semantics of pybind11_traverse() on the managed-dict path:

  • before: visiting the instance dict behaved like Py_VISIT(dict), including early return on non-zero visitor results
  • after the switch to PyObject_VisitManagedDict(): pybind11 always continued and eventually returned 0, even if the visitor had requested an early exit

This is subtle, but it changes the contract of tp_traverse. The concrete user-visible failure that motivated this PR is:

instance = m.DynamicAttr()
instance.a = "test"
assert instance in gc.get_referrers(instance.__dict__)

Without this PR, that assertion can fail because the traversal machinery does not correctly report the instance as a referrer of its managed __dict__.

Why this fix is correct

The implementation change is intentionally small:

int ret = PyObject_VisitManagedDict(self, visit, arg);
if (ret) {
    return ret;
}

This is the same behavior pybind11 already had on the pre-3.13 path via Py_VISIT(dict): if the visitor returns non-zero, propagate that result immediately instead of discarding it.

That makes the Python 3.13+ managed-dict path match the long-standing non-managed-dict path and restores the expected tp_traverse behavior rather than introducing new semantics.

In other words, this PR does not change what pybind11 traverses. It only fixes how pybind11 handles the visitor result while traversing managed dicts.

Test coverage

This PR adds a focused regression test for the exact externally observable behavior:

  • a minimal DynamicAttr test type is added with py::dynamic_attr()
  • the Python test creates an instance, writes an attribute into __dict__, and checks that gc.get_referrers(instance.__dict__) includes the owning instance

This test is intentionally high-level. Instead of asserting on implementation details, it checks the behavior that users actually care about: whether the runtime can discover the instance-to-dict relationship through traversal.

Why the new test is version- and runtime-gated

While working on the regression test, it became clear that CPython itself has version-specific managed-dict behavior in this area.

Relevant upstream context:

  • python/cpython issue: #130327
  • python/cpython follow-up/fix discussion: #148275

Based on local testing and the PR discussion, the new regression test is currently treated as reliable only on:

  • CPython >= 3.13.13
  • CPython >= 3.14.4

Accordingly, the test is skipped on:

  • CPython < 3.13.13
  • CPython 3.14.0 through 3.14.3
  • non-CPython interpreters for now

The non-CPython skip is deliberate. PyPy and GraalPy already differ from CPython in several GC- and referrer-sensitive cases, and this PR is about restoring correct CPython managed-dict traversal semantics. Skipping other interpreters avoids overstating what has been verified there.

This is also why the skip is expressed in terms of interpreter behavior, not platform behavior. Android, iOS, and Emscripten are handled elsewhere in CI as platform targets; the uncertainty here is specifically about interpreter traversal semantics.

Validation and debugging history

This PR had an awkward testing window because CI was still installing CPython 3.14.3 for a while even after 3.14.4 had been released. That made the new regression test appear to fail in CI even though the underlying pybind11 fix looked correct.

The version split was later confirmed locally:

  • on CPython 3.14.2, the regression still fails due to the interpreter-side issue
  • on CPython 3.14.4, the regression passes

With the version/runtime gating in place:

  • pre-commit passes
  • local testing passes on CPython 3.14.2
  • local testing passes on CPython 3.14.4

Net effect

After this PR:

  • pybind11 correctly propagates PyObject_VisitManagedDict() return values
  • the Python 3.13+ managed-dict path behaves consistently with the older Py_VISIT(dict) path
  • the bug is covered by a targeted regression test
  • that test only runs where the interpreter result is known to be trustworthy

@rwgk
Copy link
Copy Markdown
Collaborator

rwgk commented Apr 8, 2026

How much trouble is it to add a test?

If adding a test is too much trouble: could you please explain in the PR description what the problem was, and why this is the correct fix?

I'm totally willing to believe this fix is needed and correct, but without any test or explanation it's very opaque.

@maxbachmann
Copy link
Copy Markdown
Contributor Author

maxbachmann commented Apr 8, 2026

This is the same handling as PY_VISIT.

#define Py_VISIT(op)                                                    \
    do {                                                                \
        if (op) {                                                       \
            int vret = visit(_PyObject_CAST(op), arg);                  \
            if (vret)                                                   \
                return vret;                                            \
        }                                                               \
    } while (0)

I had the same bug in the Python wrapper I am implementing at my company and so checked whether others had this problem as well.

For us this fixes the following case:

    instance = m.DynamicAttr()
    instance.a = "test"
    assert instance in gc.get_referrers(instance.__dict__)

For pybind11 this unit test still fails due to python/cpython#130327. It should presumably work once we upgrade to 3.14.4 which was released yesterday.

@maxbachmann
Copy link
Copy Markdown
Contributor Author

Worth noting that cpython doesn't always properly handle the result either (python/cpython#148275).

@maxbachmann
Copy link
Copy Markdown
Contributor Author

I was hoping that since 3.14.4 is released in https://github.com/actions/python-versions it would pick it up now, but apparently that isn't the case

@rwgk
Copy link
Copy Markdown
Collaborator

rwgk commented Apr 9, 2026

I was hoping that since 3.14.4 is released in https://github.com/actions/python-versions it would pick it up now, but apparently that isn't the case

In the past I've seen it taking a day or so. Do you see the option to "rerun failed tests"?

(I'll try to look at this PR carefully on the weekend.)

@maxbachmann
Copy link
Copy Markdown
Contributor Author

No I don't have that option.

@rwgk
Copy link
Copy Markdown
Collaborator

rwgk commented Apr 9, 2026

I just clicked the button, but it's still picking up: Successfully set up CPython (3.14.3)

Under

https://github.com/pybind/pybind11/actions/runs/24187486767/job/70595813802?pr=6032

I see:

  =================================== FAILURES ===================================
  ______________________________ test_get_referrers ______________________________
   
      def test_get_referrers():
          instance = m.DynamicAttr()
          instance.a = "test"
  >       assert instance in gc.get_referrers(instance.__dict__)
  E       AssertionError: assert <pybind11_tests.class_.DynamicAttr object at 0x7195a1b995e0> in []
  E        +  where [] = <built-in function get_referrers>({'a': 'test'})
  E        +    where <built-in function get_referrers> = gc.get_referrers
  E        +    and   {'a': 'test'} = <pybind11_tests.class_.DynamicAttr object at 0x7195a1b995e0>.__dict__
   
  instance   = <pybind11_tests.class_.DynamicAttr object at 0x7195a1b995e0>
   
  tests/test_class.py:52: AssertionError

I think it'll be best if you skip that test for Python versions between 3.14.0 and 3.14.3 (or similar; I'm not sure).

@rwgk
Copy link
Copy Markdown
Collaborator

rwgk commented Apr 9, 2026

If you need more reruns, tag me here with "another rerun needed" or similar. I cannot focus on this PR before the weekend, but I can do quick button clicks.

This avoids a warning on msvc about Py_Visit shadowing the vret variable.
@maxbachmann
Copy link
Copy Markdown
Contributor Author

I think it'll be best if you skip that test for Python versions between 3.14.0 and 3.14.3 (or similar; I'm not sure).

Yes once we have confirmed this is indeed working under 3.14.4 we should skip this test on most targets:

  • < 3.13.13
  • 3.14.0-3.14.3
  • maybe non cpython interpreters (I don't know how they behave)

I didn't have Cpython 3.14.4 for local testing either and so only tested by manually hacking in the way Cpython 3.14.4 handles managed dict visits.

I just clicked the button, but it's still picking up: Successfully set up CPython (3.14.3)

It's still giving me 3.14.3 after my variable name fix to avoid shadowing warnings on msvc

@rwgk
Copy link
Copy Markdown
Collaborator

rwgk commented Apr 11, 2026

@maxbachmann Based on local testing, your fix seems to work.

First with Python 3.14.2 installed locally from sources:

=========================================================== test session starts ============================================================
platform linux -- Python 3.14.2, pytest-9.0.2, pluggy-1.6.0
installed packages of interest: build==1.4.2 numpy==2.4.3 scipy==1.17.1
C++ Info: Ubuntu Clang 18.1.3 (1ubuntu1) C++20 __pybind11_internals_v12_system_libstdcpp_gxx_abi_1xxx_use_cxx11_abi_1__ PYBIND11_SIMPLE_GIL_MANAGEMENT=False
rootdir: /wrk/forked/pybind11/tests
configfile: pytest.ini
plugins: timeout-2.4.0, xdist-3.8.0
24 workers [1320 items]
.............................................................................................................................F...... [ 10%]
.................................................................................................................................... [ 20%]
.................................................................................................................................... [ 30%]
.................................................................................................................................... [ 40%]
.................................................................................................................................... [ 50%]
.................................................................................................................................... [ 60%]
.................................................................................................................................... [ 70%]
............................................................................................................................s..s.... [ 80%]
....................................................................................s............................................... [ 90%]
.............................X.......s...................x.x........................................................................ [100%]
================================================================= FAILURES =================================================================
____________________________________________________________ test_get_referrers ____________________________________________________________
[gw10] linux -- Python 3.14.2 /wrk/bld/pybind11_gcc_v3.14.2_df793163d58_default/TestVenv/bin/python3

    def test_get_referrers():
        instance = m.DynamicAttr()
        instance.a = "test"
>       assert instance in gc.get_referrers(instance.__dict__)
E       AssertionError: assert <pybind11_tests.class_.DynamicAttr object at 0x79af681bf6f0> in []
E        +  where [] = <built-in function get_referrers>({'a': 'test'})
E        +    where <built-in function get_referrers> = gc.get_referrers
E        +    and   {'a': 'test'} = <pybind11_tests.class_.DynamicAttr object at 0x79af681bf6f0>.__dict__

instance   = <pybind11_tests.class_.DynamicAttr object at 0x79af681bf6f0>

test_class.py:52: AssertionError
================================================================= XPASSES ==================================================================
========================================================= short test summary info ==========================================================
SKIPPED [1] test_pytypes.py:1052: C++20 non-type template args feature not available.
SKIPPED [1] test_pytypes.py:1096: C++20 non-type template args feature not available.
SKIPPED [1] test_stl.py:168: no <experimental/optional>
SKIPPED [1] test_thread.py:72: Deadlock with the GIL
XFAIL test_unnamed_namespace_a.py::test_have_class_any_struct[None] - Known issues: https://github.com/pybind/pybind11/pull/4319
XFAIL test_unnamed_namespace_a.py::test_have_both_class_any_struct - Known issues: https://github.com/pybind/pybind11/pull/4319
XPASS test_unnamed_namespace_a.py::test_have_class_any_struct[unnamed_namespace_a_any_struct] - Known issues: https://github.com/pybind/pybind11/pull/4319
FAILED test_class.py::test_get_referrers - AssertionError: assert <pybind11_tests.class_.DynamicAttr object at 0x79af681bf6f0> in []
===================================== 1 failed, 1312 passed, 4 skipped, 2 xfailed, 1 xpassed in 12.98s =====================================

ERROR: completed_process.returncode=1

Then with Python 3.14.4 installed locally from sources, everything else exactly the same:

=========================================================== test session starts ============================================================
platform linux -- Python 3.14.4, pytest-9.0.3, pluggy-1.6.0
installed packages of interest: build==1.4.3 numpy==2.4.4
C++ Info: Ubuntu Clang 18.1.3 (1ubuntu1) C++20 __pybind11_internals_v12_system_libstdcpp_gxx_abi_1xxx_use_cxx11_abi_1__ PYBIND11_SIMPLE_GIL_MANAGEMENT=False
rootdir: /wrk/forked/pybind11/tests
configfile: pytest.ini
plugins: timeout-2.4.0, xdist-3.8.0
24 workers [1320 items]
.................................................................................................................................... [ 10%]
.................................................................................................................................... [ 20%]
.................................................................................................................................... [ 30%]
.ss................................................................................................................................. [ 40%]
.................................................................................................................................... [ 50%]
.................................................................................................................................... [ 60%]
.................................................................................................................................... [ 70%]
.................................................................................................................................s.s [ 80%]
...........................................................................................s........................................ [ 90%]
........s....................X................xx.................................................................................... [100%]
================================================================= XPASSES ==================================================================
========================================================= short test summary info ==========================================================
SKIPPED [1] test_eigen_matrix.py:754: could not import 'scipy': No module named 'scipy'
SKIPPED [1] test_eigen_matrix.py:764: could not import 'scipy': No module named 'scipy'
SKIPPED [1] test_pytypes.py:1052: C++20 non-type template args feature not available.
SKIPPED [1] test_pytypes.py:1096: C++20 non-type template args feature not available.
SKIPPED [1] test_stl.py:168: no <experimental/optional>
SKIPPED [1] test_thread.py:72: Deadlock with the GIL
XFAIL test_unnamed_namespace_a.py::test_have_class_any_struct[None] - Known issues: https://github.com/pybind/pybind11/pull/4319
XFAIL test_unnamed_namespace_a.py::test_have_both_class_any_struct - Known issues: https://github.com/pybind/pybind11/pull/4319
XPASS test_unnamed_namespace_a.py::test_have_class_any_struct[unnamed_namespace_a_any_struct] - Known issues: https://github.com/pybind/pybind11/pull/4319
========================================== 1311 passed, 6 skipped, 2 xfailed, 1 xpassed in 12.91s ==========================================

@rwgk
Copy link
Copy Markdown
Collaborator

rwgk commented Apr 11, 2026

FYI — Cursor generated the skips already (one minute). It works locally with 3.14.2 and 3.14.4. I'll push them when the CI for #6029 is done.

rwgk added 2 commits April 11, 2026 13:37
The managed-dict referrer check is only known to work on CPython 3.13.13+ and 3.14.4+, while earlier releases and non-CPython interpreters can report different traversal behavior.

Made-with: Cursor
@rwgk
Copy link
Copy Markdown
Collaborator

rwgk commented Apr 11, 2026

@maxbachmann I pushed commit e9f51a1 and pasted a Cursor-generated full PR description; I only glanced through quickly. Could you please review everything and fix as needed?

@rwgk
Copy link
Copy Markdown
Collaborator

rwgk commented Apr 11, 2026

(sorry copy-paste mishap before, I had the wrong commit hash. corrected)

@maxbachmann
Copy link
Copy Markdown
Contributor Author

Looks good to me

@rwgk rwgk merged commit 0db7f72 into pybind:master Apr 12, 2026
155 of 156 checks passed
@github-actions github-actions bot added the needs changelog Possibly needs a changelog entry label Apr 12, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

needs changelog Possibly needs a changelog entry

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants