Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-60074: add new stable API function PyType_FromMetaModuleAndSpec #93012

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

wjakob
Copy link

@wjakob wjakob commented May 20, 2022

TL;DR: Numerous flagship Python packages internally rely on pybind11 to bridge C++ and Python. This includes SciPy, PyTorch, Tensorflow, JAX, and many others. This PR adds critical functionality needed to eventually move more of them onto the Py_LIMITED_API and thereby simplify deployment of binary wheels.

Context: C++ <-> Python binding tools like pybind11 and nanobind require the ability to instantiate types using a custom metaclass. pybind11 uses them to install custom handling for some type-related operations, and nanobind goes even further by storing binding-related data structures in an enlarged PyHeapTypeObject allocated by a metaclass.

Unfortunately is not currently possible to dynamically create suitable heap types using the Py_LIMITED_API, which means that each extension library using these tools must be compiled many times for redistribution. I believe that it would be useful to the Python community to at least optionally support Py_LIMITED_API in these tools, but this requires a small change in CPython.

This pull request is another attempt at addressing the issue pointed out in issue #60074. It adds a new stable API function PyType_FromMetaModuleAndSpec that mirrors the behavior of PyType_FromModuleAndSpec except that it takes an additional metaclass argument.

I used this PR to create a proof-of-concept version of nanobind (limited_api branch) that dynamically create types using the limited API. It works 🎉 .

I have a somewhat audacious request, which is that this change is considered for Python 3.11 despite having recently entered feature freeze. I believe that this change is small enough and of significant utility for the wider Python community (potentially even for other groups working on bindings like SWIG or Cython)

@wjakob
Copy link
Author

@wjakob wjakob commented May 20, 2022

Hi @vstinner 👋 -- here my attempt at a minimal CPython patch that is needed to make nanobind work with Py_LIMITED_API.

Also pinging @abalkin @loewis @seberg @encukou who commented on the previous PR #60074.

@wjakob
Copy link
Author

@wjakob wjakob commented May 20, 2022

If these changes are acceptable, it would be great if this feature could still be included in Python 3.11.

@erlend-aasland
Copy link
Contributor

@erlend-aasland erlend-aasland commented May 20, 2022

If these changes are acceptable, it would be great if this feature could still be included in Python 3.11.

That is not possible; we passed the feature freeze with several weeks. You will need to target 3.12.

@timfel
Copy link

@timfel timfel commented May 20, 2022

Isn't that naming a bit off? PyMetaType_FromModuleSpec sounds to me like you're creating a new meta type, but you're just creating a type with a custom meta-type. iirc the patch on bpo just called it PyType_FromModuleAndSpec which I think is the better name

@wjakob
Copy link
Author

@wjakob wjakob commented May 20, 2022

@erlend-aasland This means that usable stable API interface for libraries like nanobind/pybind11 would be available starting in 2023-10, i.e., in more than a year! 😢

@wjakob
Copy link
Author

@wjakob wjakob commented May 20, 2022

@timfel  1�7 PyType_FromModuleAndSpec is already taken. I was considering PyType_FromModuleAndSpecAndMeta but this seemed getting silly/long. Thoughts?

(Edit: ultimately converged towards the name PyType_FromMetaModuleAndSpec)

@timfel
Copy link

@timfel timfel commented May 20, 2022

I was considering PyType_FromModuleAndSpecAndMeta

Ah, right, that was the naming proposal in the old patch. Yes, this does seem pretty long 🤔

@erlend-aasland
Copy link
Contributor

@erlend-aasland erlend-aasland commented May 20, 2022

@erlend-aasland This means that usable stable API interface for nanobind would be available in 2023-10, i.e. in more than a year! 😢

You can of course try your case with the 3.11 release manager (Pablo IIRC) :)

Quoting PEP-602:

The following three months are spent on four versioned beta releases where no new features can be added but bug fixes are still included.

We are now in the 3.11 beta phase. Note that "no new features" are emphasised in bold type.

@seberg
Copy link
Contributor

@seberg seberg commented May 20, 2022

I would hope @encukou can make a quick call (or give an opinion). The new API probably comes with no compatibility concerns the only thing would be getting the API itself wrong (and even that can be deprecated).

For NumPy, it would also be nice to have something like this, but it did not seem particularly pressing (static types work and our DTypes are de-facto immortal anyway).

I would still be 👍 with pushing this forward quickly. As per API, I can think of two things that may need quick input:

  1. Petr had brought up the idea of making this one InitFromSpec (and do not include the alloc part). This was in gh-89546.
  2. In principle bases should be scanned for metatypes as per my old PR. That is a bug-fix, but coupled with the tp_new check might have a tiny compat concern: It could probably be just ignored for now.

But, things don't need to be perfect and besides those two calls to make, I don't think there is any reason to not just do this. We clearly need this problem to be fixed.

@wjakob
Copy link
Author

@wjakob wjakob commented May 20, 2022

I have written a separate email to appeal to @pablogsal.

@pablogsal
Copy link
Member

@pablogsal pablogsal commented May 20, 2022

Could you please add a bunch of tests for this API in Modules/_testcapimodule.c?

@wjakob
Copy link
Author

@wjakob wjakob commented May 20, 2022

Could you please add a bunch of tests for this API in Modules/_testcapimodule.c?

Yes, I will do this ASAP.

Given the valid concern by @timfel that this function sounds like it creates a metatype, I will also change the name of the function to the somewhat long but more descriptive PyType_FromMetaModuleAndSpec.

@pablogsal
Copy link
Member

@pablogsal pablogsal commented May 20, 2022

(I will answer your email as soon as I can)

wjakob added 2 commits May 20, 2022
Added a new stable API function ``PyType_FromMetaModuleAndSpec``, which
mirrors the behavior of ``PyType_FromModuleAndSpec`` except that it
takes an additional metaclass argument. This is, e.g., useful for
language binding tools that need to store additional information in the
type object.
@wjakob
Copy link
Author

@wjakob wjakob commented May 20, 2022

@pablogsal - Finished renaming, and test added.

@wjakob
Copy link
Author

@wjakob wjakob commented May 20, 2022

(Could somebody push the button to approve the workflows?)

@pablogsal
Copy link
Member

@pablogsal pablogsal commented May 20, 2022

Please, add a test that checks that we raise if you provide a tp_new. We likely also need a bunch of tests around this case if the downstream class has a custom allocation schema but I need to think about this in more detail.

@wjakob
Copy link
Author

@wjakob wjakob commented May 20, 2022

@pablogsal done.

Modules/_testcapimodule.c Outdated Show resolved Hide resolved
@bedevere-bot
Copy link

@bedevere-bot bedevere-bot commented May 20, 2022

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

@pablogsal pablogsal requested review from vstinner and encukou May 20, 2022
@wjakob
Copy link
Author

@wjakob wjakob commented May 20, 2022

@pablogsal I have made the requested changes; please review again.

@bedevere-bot
Copy link

@bedevere-bot bedevere-bot commented May 20, 2022

Thanks for making the requested changes!

@pablogsal: please review the changes made to this pull request.

@bedevere-bot bedevere-bot requested a review from pablogsal May 20, 2022
@wjakob wjakob changed the title gh-60074: add new stable API function PyMetaType_FromModuleAndSpec gh-60074: add new stable API function PyType_FromMetaModuleAndSpec May 20, 2022
@gvanrossum
Copy link
Member

@gvanrossum gvanrossum commented May 20, 2022

I would find it odd if this new feature was allowed to be added during beta while we have categorically said no to new features, no matter how small. What's special about this? (Asking @pablogsal, not the OP or other fans of the feature.)

Doc/c-api/type.rst Outdated Show resolved Hide resolved
Lib/test/test_capi.py Outdated Show resolved Hide resolved
@wlav
Copy link

@wlav wlav commented May 20, 2022

Timeline aside, I'd like to add my support for PyType_FromMetaModuleAndSpec.

Project is cppyy, also a C++ - Python binder, but a run-time one, for both CPython and PyPy, used mostly in academia (high energy physics, chemistry, math, biology ...), HPC, and systems.

Use of metaclasses is especially important for cppyy when doing cross-language (multiple) inheritance at run-time, which is a very common use case: it allows implementation of C++ interface classes in Python. Right now, it mostly uses a couple of hacks based on observed CPython behavior rather than accessing internals directly (although that is not completely avoided). A clean API would be so much more preferred.

Copy link
Member

@AlexWaygood AlexWaygood left a comment

These will give better error messages if the tests fail

Lib/test/test_capi.py Outdated Show resolved Hide resolved
Lib/test/test_capi.py Outdated Show resolved Hide resolved
@gvanrossum
Copy link
Member

@gvanrossum gvanrossum commented May 20, 2022

I'd like to add my support for PyType_FromMetaModuleAndSpec.

Oh, definitely! I've needed this myself.

@pablogsal
Copy link
Member

@pablogsal pablogsal commented May 20, 2022

What's special about this? (Asking @pablogsal, not the OP or other fans of the feature.)

Apparently if I understand correctly this small change is needed by hpy, numpy Devs, pybind11, PyO3 and other binder implementors and I have been told that delaying 1 year the API would make things difficult for all of them (although I still don't fully understand currently what problems this creates in every case).

As I mentioned @wjakob by email, feature freeze is quite a strict deadline. All core developers are subjected to it and I mentioned this several times in python-dev and several other forums (including discourse). Additionally, our devguide states the restriction very clearly.

Given that this feature seems to have an impact in a lot of packages and the ecosystem as a whole (assuming this is confirmed by the different maintainers) I would like to ask the Steering Council for advice on how to proceed here and if everyone agrees and this is as relevant as I have been told I am willing to consider make an exception.

In any case, even if the SC agrees, the maintainers confirm the necessity of this API and there is no other opposition I will not allow this to be added after beta 2, so we have 10 days to collectively make a decision.

@gvanrossum
Copy link
Member

@gvanrossum gvanrossum commented May 20, 2022

Okay, sounds like a plan. We should go full speed ahead with getting the PR merged in main then, so if the SC agrees the backport can be merged right away.

Doc/c-api/type.rst Outdated Show resolved Hide resolved
Doc/c-api/typeobj.rst Outdated Show resolved Hide resolved
Lib/test/test_capi.py Outdated Show resolved Hide resolved
Modules/_testcapimodule.c Outdated Show resolved Hide resolved
Objects/typeobject.c Outdated Show resolved Hide resolved
@@ -3384,6 +3390,12 @@ PyType_FromModuleAndSpec(PyObject *module, PyType_Spec *spec, PyObject *bases)
char *res_start;
short slot_offset, subslot_offset;

if (meta->tp_new != PyType_Type.tp_new) {
Copy link

@timfel timfel May 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

n.b.: If numpy were to use this to migrate to heap types for the dtypes, their DTypeMeta class currently does define a custom tp_new:

dtypemeta_new(PyTypeObject *NPY_UNUSED(type),
        PyObject *NPY_UNUSED(args), PyObject *NPY_UNUSED(kwds))
{
    PyErr_SetString(PyExc_TypeError,
            "Preliminary-API: Cannot subclass DType.");
    return NULL;
}

I'm not sure how important it would be for them to keep this error message, maybe someone from NumPy can comment?

See: https://github.com/numpy/numpy/blob/ae8b9ce90dc2d5922def5db8cf7d4410c60b3e13/numpy/core/src/multiarray/dtypemeta.c#L45

Copy link
Contributor

@erlend-aasland erlend-aasland May 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Disallowing subclassing can already be achieved by clearing the Py_TPFLAGS_BASETYPE flag:

https://docs.python.org/3/c-api/typeobj.html#Py_TPFLAGS_BASETYPE

cpython/Objects/typeobject.c

Lines 3502 to 3508 in 079f0dd

if (!_PyType_HasFeature(base, Py_TPFLAGS_BASETYPE)) {
PyErr_Format(PyExc_TypeError,
"type '%.100s' is not an acceptable base type",
base->tp_name);
Py_DECREF(bases);
goto fail;
}

Copy link
Contributor

@seberg seberg May 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, disallowing subclassing might work (of course the dtypes are subclasses, but I guess we don't disallow subclasses in C with that).
In practice, I don't expect NumPy needs a tp_new. I would love to see a dataclass-like API to create a new DType at some point, but that would work differently anyway. So this should be OK.

I would love if @encukou would comment on this, because I think he has a clearer picture than me.

From my current understanding, there were two scenarios:

  1. This checks makes sense for the current API where a metaclass may exist in the bases! (a current bug that this is ignored, though!). Since we may otherwise forget to initialize the metaclass properly.
  2. For the new function, it might make sense to tell users that it is their responsibility to make sure the metaclass is properly initialized, and just not add this check at all!

Now, Petr had suggested ApplySpec for the second use-case. I am not sure if that is as convenient in practice (why not call alloc for the user?). But the point is, the new function is for metaclass providers who know what additional initialization is needed for their metaclass (so the check isn't really needed).

@fs-nv
Copy link

@fs-nv fs-nv commented May 22, 2022

Adding this functionality would be very useful for Nvidia's Nsight Compute CUDA kernel profiling tool, which relies on the python limited API for its python report interface. The report interface module, which is used by numerous internal and external users to automate access to collected profiling data, is fairly large, and shipping a dedicated module for each potential local python version is not feasible. We therefore currently rely on swig's limited API support, but would like to switch to a more modern alternative such as pybind11, if that had a comparable feature set.

As such, I would like to state the support of the Nsight Compute team (and likely of other teams at Nvidia, too), for this change.

@gvanrossum
Copy link
Member

@gvanrossum gvanrossum commented May 22, 2022

OT: @wjakob I sent you off-list email. Did you get it? If not can you email me at guido@python.org?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet