Support setting tp_vectorcall for heap types #100554

Closed
davidhewitt opened this issue Dec 27, 2022 · 2 comments
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) topic-C-API type-feature A feature request or enhancement

Comments

@davidhewitt
Contributor

davidhewitt commented Dec 27, 2022

Feature or enhancement

The tp_vectorcall slot can be used with static types to define a more efficient implementation of __new__ / __init__. This slot does not have a slot ID defined in typeslots.h, so it currently cannot be set via PyType_FromSpec or read using PyType_GetSlot.

In 3.12 other vectorcall functionality looks set to stabilise, so please consider adding a Py_tp_vectorcall slot ID and allowing heap types to set / get this member.

Pitch

Making tp_vectorcall usable in the limited API would let extension types take advantage of the vectorcall optimisation in more places.

Previous discussion

I see in #85784 that tp_vectorcall is deliberately inaccessible with PyType_GetSlot because it is not part of the limited API.

If there is support for this proposal, I am happy to have a first stab at implementation.

@davidhewitt davidhewitt added the type-feature A feature request or enhancement label Dec 27, 2022
@arhadthedev arhadthedev added interpreter-core (Objects, Python, Grammar, and Parser dirs) topic-C-API labels Feb 3, 2023
@encukou
Member

encukou commented Feb 3, 2023

Just so everyone reading this is on the same page, this is about more efficient calling of the type, not of instances.
For calling instances, the vectorcall offset can be set using a "__vectorcalloffset__" PyMemberDef (see the end of the docs section, and read the linked docs carefully -- vectorcall is fast but, compared to tp_call, relatively tricky to implement).
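
To make the distinction concrete, here is a small pure-Python illustration. It is not part of the original comment, and the class name `Greeter` is made up for the example. Calling the type is dispatched through the metatype and builds a new instance, which is the path a `tp_vectorcall` on the type would speed up. Calling an instance is dispatched through the instance's own `__call__`.

```python
class Greeter:
    def __init__(self, name):
        self.name = name

    def __call__(self, greeting):
        return f"{greeting}, {self.name}"


# Calling the type: handled by the metatype (type.__call__),
# which creates and initialises a new instance.
g = Greeter("world")
same_kind = type.__call__(Greeter, "world")

# Calling an instance: handled by Greeter.__call__.
message = g("hello")
```

The two spellings on the "calling the type" side are equivalent here; the explicit `type.__call__` form only makes the dispatch visible.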

I haven't checked if implementing tp_vectorcall correctly is possible -- remember that it'll need to match what tp_call does, including any changes to it in future versions. The bar is higher than with static types.
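
As a rough guide to what "matching tp_call" involves, the default behaviour of calling a type can be approximated in Python as below. This is a simplified sketch, not CPython's implementation: the function name `type_call_sketch` and the two example classes are hypothetical, and the special case for one-argument `type(x)` is left out.

```python
def type_call_sketch(cls, *args, **kwargs):
    """Approximate what calling cls(*args, **kwargs) does by default."""
    obj = cls.__new__(cls, *args, **kwargs)
    # __init__ only runs when __new__ returned an instance of cls.
    if isinstance(obj, cls):
        type(obj).__init__(obj, *args, **kwargs)
    return obj


class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y


class AlwaysNone:
    def __new__(cls):
        return None  # not an instance of cls, so __init__ is skipped

    def __init__(self):
        raise AssertionError("never reached")


p = type_call_sketch(Point, 1, y=2)
n = type_call_sketch(AlwaysNone)
```

A custom vectorcall implementation for a type would presumably have to preserve these observable steps, including skipping `__init__` when `__new__` returns an unrelated object.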

See devguide and vectorcall_limited for how to add tests for this. (If it doesn't make it into 3.12 it'll need to be in a separate file.) The tests should show that it's possible to correctly use the feature with the limited API, as well as cover any edge cases.

@wjakob
Contributor

wjakob commented Aug 25, 2024

I'd like to second this feature request. I was just looking into using vector calls to accelerate object construction for nanobind and realized that it's not possible in the stable API because of the missing type slot.
