PyType_FromSpec should take metaclass as an argument #60074
PyType_FromSpec() is a convenient function to create types dynamically in C extension modules, but its usefulness is limited by the fact that it creates new types using the default metaclass. I suggest adding a new C API function

    PyObject *PyType_FromSpecEx(PyObject *meta, PyType_Spec *spec)

and redefining PyType_FromSpec(spec) as PyType_FromSpecEx((PyObject *)&PyType_Type, spec). This functionality cannot be implemented by users because PyType_FromSpec() requires access to the private slotoffsets table. A (trivial) patch is attached. |
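A sketch of how the proposed function might be called (PyType_FromSpecEx does not exist in CPython; MyMeta_Type and the spec contents below are illustrative assumptions):

```c
/* Hypothetical usage of the proposed API. PyType_FromSpecEx, MyMeta_Type,
 * and my_slots are placeholders for illustration, not existing CPython names. */
static PyType_Slot my_slots[] = {
    {0, NULL},
};

static PyType_Spec my_spec = {
    "mymod.MyType",       /* name */
    sizeof(PyObject),     /* basicsize */
    0,                    /* itemsize */
    Py_TPFLAGS_DEFAULT,   /* flags */
    my_slots,
};

/* Create the type with a custom metaclass instead of the default type. */
PyObject *tp = PyType_FromSpecEx((PyObject *)&MyMeta_Type, &my_spec);
```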
The patch is a bit light: see how type_new also computes the metaclass from the base classes. |
On Thu, Sep 6, 2012 at 12:44 PM, Amaury Forgeot d'Arc
This was intentional. I was looking for a lightweight facility to |
As you can see from my first message, I originally considered PyType_FromSpecEx(PyObject *meta, PyType_Spec *spec) without bases. (In fact I was unaware of the recent addition of PyType_FromSpecWithBases.) Maybe the original signature makes more sense than the one in the patch. Explicitly setting a metaclass is most useful for the most basic type. On the other hand, a fully general function may eventually replace both PyType_FromSpec and PyType_FromSpecWithBases for most uses. |
What is your use case for this API? |
On Sep 6, 2012, at 5:10 PM, Martin v. Löwis <report@bugs.python.org> wrote:
I can describe my use case, but it is somewhat similar to ctypes. I searched the tracker for a PEP-3121 refactoring applied to ctypes and could not find any. I'll try to come up with a PEP-3121 patch for ctypes using the proposed API. |
If it's very special, I'm -0 on this addition. This sounds like this is something very few people would ever need, and they can learn to write more complicated code to achieve the same effect. Convenience API exists to make the common case convenient. I'm -1 on calling it PyType_FromSpecEx. |
On Sep 6, 2012, at 6:25 PM, Martin v. Löwis <report@bugs.python.org> wrote:
I find it encouraging that you commented on the choice of name. :-) I can live with PyType_FromMetatypeAndSpec and leave out bases. PyType_FromTypeAndSpec is fine too. On the substance, I don't think this API is just convenience. In my application I have to replace the metatype after my type is created with PyType_FromSpec. This is fragile and works only for very simple metatypes. Let's get back to this discussion once I have a ctypes patch. If there is a work-around for ctypes, it will probably work for my case. (My case is a little more complicated because I extend the size of my type objects to store custom metadata. Ctypes fudges this issue by hiding extra data in a custom tp_dict.) |
This API may make it easier to declare ABCs in C. |
As for declaring ABCs: I don't think the API is necessary, or even helps. An ABC is best created by *calling* ABCMeta, with the appropriate name, a possibly-empty bases tuple, and a dict. What FromSpec could do is to fill out slots with custom functions, which won't be necessary or desirable for ABCs. The really tedious part may be to put all the abstract methods into the ABC, for which having a TypeSpec doesn't help at all. (But I would certainly agree that simplifying creation of ABCs in extension modules is a worthwhile reason for an API addition.)

For the case that Alexander apparently envisions, i.e. metaclasses where the resulting type objects extend the layout of heap types: it should be possible for an extension module to fill out the entire type "from scratch". This will require knowledge of the layout of heap types, so it can't use just the stable ABI - however, doing this through the stable ABI won't be possible, anyway, since the extended layout needs to know how large a HeapType structure is.

If filling out a type with all slots one-by-one is considered too tedious, and patching ob_type too hacky - here is another approach: Use FromSpec to create a type with all slots filled out, then call the metatype to create a subtype of that. I.e. the type which is based on a metatype would actually be a derived class of the type which has the slots defined. |
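The two-step workaround described above might look like this in an extension module (a sketch; base_spec and Meta_Type are placeholders, and error handling is omitted):

```c
/* Sketch: first build a plain heap type carrying the slots, then call the
 * metatype to derive the real class from it, so the result has the desired
 * metaclass without touching ob_type directly. */
PyObject *base = PyType_FromSpec(&base_spec);   /* slots filled out here */
PyObject *args = Py_BuildValue("(s(O){})", "MyType", base);
/* Calling the metatype creates a subtype whose metaclass is Meta_Type. */
PyObject *tp = PyObject_CallObject((PyObject *)&Meta_Type, args);
Py_DECREF(args);
Py_DECREF(base);
```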
As a matter of fact, this is what the io module is doing (except that |
I know this is quite an old bug that was closed almost 10 years ago. But I wish it had been accepted; it would have been quite useful for my case.

I'm working on a new iteration of the protobuf extension for Python. At runtime we create types dynamically, one for each message defined in a .proto file, e.g. from "message Foo" we dynamically construct a "class Foo". I need to support class variables like Foo.BAR_FIELD_NUMBER, but I don't want to put all these class variables into tp_dict because there are a lot of them and they are rarely used. So I want to implement __getattr__ for the class, which requires having a metaclass. This is where the proposed PyType_FromSpecEx() would have come in very handy.

The existing protobuf extension gets around this by directly calling PyType_Type.tp_new() to create a type with a given metaclass: It's unclear to me if PyType_Type.tp_new() is intended to be a supported/public API. But in any case, it's not available in the limited API, and I am trying to restrict myself to the limited API. (I also can't use PyType_GetSlot(PyType_Type, Py_tp_new) because PyType_Type is not a heap type.)

Put more succinctly, I do not see any way to use a metaclass from the limited C API. Possible solutions I see:
|
You can also call (PyObject_Call*) the metaclass with (name, bases, namespace); this should produce a class. Or not:

    >>> class Foo(metaclass=print):
    ...     def foo(self): pass
    ...
    Foo () {'__module__': '__main__', '__qualname__': 'Foo', 'foo': <function Foo.foo at 0x7f6e9ddd9e50>}

PyType_FromSpecEx will surely need to limit the metaclass to subtypes of type. What other limitations are there? How closely can we approach the behavior of the
I wouldn't recommend doing that after PyType_Ready is called. Including indirectly, which the type-creation functions in the stable ABI do. |
But won't that just call my metaclass's tp_new? I'm trying to do this from my metaclass's tp_new, so I can customize the class creation process. Then Python code can use my metaclass to construct classes normally.
Why not? What bad things will happen? It seems to be working so far.

Setting ob_type directly actually solves another problem that I had been having with the limited API. I want to implement tp_getattro on the metaclass, but I want to first delegate to PyType_Type.tp_getattro to return any entry that may be present in the type's tp_dict. With the full API I could call self->ob_type->tp_base->tp_getattro() to do the equivalent of super(), but with the limited API I can't access type->tp_getattro (and PyType_GetSlot() can't be used on non-heap types). I find that this does what I want:

    PyTypeObject *saved_type = self->ob_type;
    self->ob_type = &PyType_Type;
    PyObject *ret = PyObject_GetAttr(self, name);
    self->ob_type = saved_type;

Previously I had tried:

    PyObject *super = PyObject_CallFunction((PyObject *)&PySuper_Type, "OO",
                                            self->ob_type, self);
    PyObject *ret = PyObject_GetAttr(super, name);
    Py_DECREF(super);

But for some reason this didn't work. |
It breaks the unwritten contract that "once PyType_Ready is called, the C struct will not be modified". This is implemented in PyPy, since calling PyType_Ready creates the PyPy object in the interpreter based on the C structure. Any further changes will not be reflected in the PyPy interpreter object, so now the python-level and c-level objects do not agree on what type(obj) is. We have discussed this in the PyPy team, and would like to propose relaxing the contract to state that "if the c-level contents of an object are modified, PyType_Modified must be called to re-sync the python-level and c-level objects" |
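Under the relaxed contract proposed above, a post-PyType_Ready mutation would be followed by a notification call. A minimal sketch (MyType_Type and my_getattro are placeholders; PyType_Modified is the existing CPython function that invalidates type caches):

```c
/* Mutate a slot after PyType_Ready has already run... */
MyType_Type.tp_getattro = my_getattro;
/* ...then tell the interpreter the type changed, so caches (and, under
 * the proposal, alternative implementations like PyPy) can re-sync. */
PyType_Modified(&MyType_Type);
```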
Since PyPy does not use the Limited API, PySide can quite easily work around the limitations by directly working with the type object. But the usage of PyType_Modified() would make very much sense for PySide’s new switchable features. That would work immediately without any change, because we already use that function to invalidate Python 3.10’s type cache. |
I found a way to use metaclasses with the limited API. I found that I can access PyType_Type.tp_new by creating a heap type derived from PyType_Type:

    /* Spec contents were elided above; this is a minimal reconstruction. */
    static PyType_Slot dummy_slots[] = {
        {0, NULL},
    };

    static PyType_Spec dummy_spec = {
        "module.DummyType", 0, 0, Py_TPFLAGS_DEFAULT, dummy_slots,
    };

    PyObject *bases = Py_BuildValue("(O)", &PyType_Type);
    PyObject *type = PyType_FromSpecWithBases(&dummy_spec, bases);
    Py_DECREF(bases);
    newfunc type_new = (newfunc)PyType_GetSlot((PyTypeObject *)type, Py_tp_new);
    Py_DECREF(type);
    #ifndef Py_LIMITED_API
    assert(type_new == PyType_Type.tp_new);
    #endif
    // Creates a type using a metaclass.
    PyObject *uses_metaclass = type_new(metaclass, args, NULL);

PyType_GetSlot() can't be used on PyType_Type directly, since it is not a heap type. But a heap type derived from PyType_Type will inherit tp_new, and we can call PyType_GetSlot() on that. Once we have PyType_Type.tp_new, we can use it to create a new type using a metaclass. This avoids any of the class-switching tricks I was trying before. We can also get other slots of PyType_Type like tp_getattro to do the equivalent of super().

The PyType_FromSpecEx() function proposed in this bug would still be a nicer solution to my problem. Calling type_new() doesn't let you specify object size or slots. To work around this, I derive from a type I created with PyType_FromSpec(), relying on the fact that the size and slots will be inherited. This works, but it introduces an extra class into the hierarchy that ideally could be avoided. Still, I do have a workaround that appears to work, and it avoids the problems associated with setting ob_type directly (like PyPy incompatibility). |
nvm, I just realized that you were speaking about Windows specifically here. I believe you that on Windows "static" makes no difference in this case. The second point stands: if you consider LoadLibrary()/dlopen() to be outside the bounds of what the C standard speaks to, then the docs shouldn't invoke the C standard to explain the behavior. |
The section of documentation you reference explains that this behaviour is not covered by the standard ("applied to a non-static variable like PyBaseObject_Type() is not required to produce an address constant"), and so static addresses of exported symbols do not have to be supported. It also says that gcc supports it (I assume by generating dynamic code for getting the address) while MSVC does not (requiring you to write your own dynamic code). The conclusion, "tp_base should be set in the extension module’s init function," is exactly the right conclusion if you want your code to work across all the supported compilers. Invoking the C standard to explain why this looks similar to standard code but actually is not is totally fine. Though I do note that the text can obviously be clearer. I assume it was written this way because of a discussion that started "but the C standard says ..." and so it was clarified to point out that this isn't actually the part of the spec that someone thought it was. If we can make it clearer, happy to, but it's certainly not incorrect as it stands. |
This behavior is covered by the standard. The following C translation unit is valid according to C99:

    struct PyTypeObject;
    extern struct PyTypeObject Foo_Type;
    struct PyTypeObject *ptr = &Foo_Type;

Specifically, &Foo_Type is an "address constant" per the standard because it is a pointer to an object of static storage duration (6.6p9). The Python docs contradict this with the following incorrect statement:

This statement is incorrect: MSVC rejects this standard-conforming TU when __declspec(dllimport) is added: https://godbolt.org/z/GYrfTqaGn

I am pretty sure this is out of compliance with C99. |
Windows/MSVC defines DLLs as separate programs, with their own lifetime and entry point (e.g. you can reload a DLL multiple times and it will be reinitialised each time). So there's no conflict with the standard here, and certainly nothing that affects the real discussion. If you'd like to continue this sideline, feel free to take it elsewhere - I'm done with it. Let's keep the focus on making sure the added feature is useful for users. |
All of this is true of so's in ELF also. It doesn't mean that the implementation needs to reject standards-conforming programs. I still think the Python documentation is incorrect on this point. I filed https://bugs.python.org/issue45306 to track this separately. |
Just to note, that there are two – somewhat distinct – issues here in my opinion:
My patch tries to address the first (the class creator has to take care that this is reasonable for the metaclass). I had hoped the On the discussion on There is no reason that a |
One thing I noticed that isn't copied is the string pointed to by tp_name: Line 3427 in 0c50b8c
This isn't an issue if tp_name is initialized from a string literal. But if tp_name is created dynamically, it could lead to a dangling pointer. If the general message is that "everything is copied by _FromSpec", it might make sense to copy the tp_name string too.
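The hazard described above might look like this (a sketch; it assumes _FromSpec does not copy tp_name, per the line referenced, and `slots` is a placeholder slot array):

```c
/* Sketch of the dangling-pointer hazard: tp_name built at runtime. */
char *name = strdup("mymod.DynamicType");
PyType_Spec spec = {name, sizeof(PyObject), 0, Py_TPFLAGS_DEFAULT, slots};
PyObject *tp = PyType_FromSpec(&spec);
free(name);  /* tp_name inside the new type now points at freed memory */
```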
Yes that seems reasonable. I generally prefer static declarations, since they will end up in .data instead of .text and will avoid a copy to the stack at runtime. But these are very minor differences, especially for code that only runs once at startup, and a safe-by-default recommendation of always initializing PyType_* on the stack makes sense. |
I will guess this is probably just an oversight/bug since the main aim was to move towards heap-types, and opened an issue: https://bugs.python.org/issue45315 |
And I'm surprised that you're surprised :) But I'd still say that best practice is to make specs static if possible. (And I have some plans to make static specs useful in type checking, since we can assume that types made from the same spec share the memory layout.)
Whoa, I missed the patch completely -- 2021 looks too much like 2012, I'm used to patches being old since we use pull requests now, and the conversation turned to slots too quickly... but missing that you mentioned it is completely on me. Sorry! Do you want to go through with the patch yourself, or should I take over? It still needs:
|
Yeah, I will try and have a look. I had posted the patch, because the test looked like a bit of a larger chunk of work ;).
:). I am coming from a completely different angle, probably. Just if you are curious, I use a from-spec like API (not public yet) in NumPy and dynamic use seemed natural/logical in that use-case, e.g.: https://github.com/numpy/numpy/blob/c92864091e5928d92bc109d1505febe35f3909f1/numpy/core/src/multiarray/convert_datatype.c#L2434 But, I am trying to understand the preference for static better. There is probably something to learn or even important I am missing.
(I don't understand how it is useful, unless you reuse slot structs?) |
The new issue is bpo-45383. There's a sprint next week; I'll probably get to it then.
The original motivating use case is not "create types dynamically on demand", but "create types without details of the C structure being compiled into the extension". That is, make it possible for modules to be compatible with several versions of Python.
This is quickly getting out of scope here, but if you're interested, I put down some notes here: encukou/abi3#19 |
I just stumbled across this issue along with #89546 and was wondering what the current status of these issues is? My use case is for a binding tool named nanobind. I would like to allocate a type that is larger than a The puzzle in my case is that
|
A side comment: the initial patch associated with this PR looked perfectly good to me (modulo, perhaps, the naming that was being discussed). Short and simple. What do you think about reviving this change and rebasing it on the latest Python version? |
@wjakob I think you should look at gh-89546 where I think we managed to clarify a bit how this should develop. I (and maybe also Petr) mean to come back to this at some point for NumPy (I need to make the new NumPy DTypes compatible with the "limited API"), but it has not managed to top my priority/interest list for a while. |
I commented here, because this other PR does not look like it would fix my issue: there might not be a base class with a suitable metaclass -- it should be possible to allocate a sufficiently large PyTypeObject even in such a case. |
I am also curious what was wrong with the solution proposed here? It seems that the discussion got hung up on a minor technical issue (whether to pass in the metaclass as a function argument or as a slot member, and if it's valid to use a |
I cannot say why each of the proposals was not really followed up on. Probably because it is tricky and nobody pushed much more. I had a PR to just increase the allocation size, but never polished up the tests (plus I now think it should be restricted a bit more than I did). However, I now think that we figured out some stuff in gh-89546, in that:
But I don't know; maybe the approach proposed here has its advantages. I can't say I reread the whole discussion here; things just made a lot more sense to me after the discussion in gh-89546, so I thought it might be the more worthwhile path to pursue. |
Added a new stable API function ``PyMetaType_FromModuleAndSpec``, which mirrors the behavior of ``PyType_FromModuleAndSpec`` except that it takes an additional metaclass argument. This is, e.g., useful for language binding tools that need to store additional information in the type object.
Added a new stable API function ``PyType_FromMetaModuleAndSpec``, which mirrors the behavior of ``PyType_FromModuleAndSpec`` except that it takes an additional metaclass argument. This is, e.g., useful for language binding tools that need to store additional information in the type object.
Added a new stable API function ``PyType_FromMetaclass``, which mirrors the behavior of ``PyType_FromModuleAndSpec`` except that it takes an additional metaclass argument. This is, e.g., useful for language binding tools that need to store additional information in the type object.
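Per the news entry above, the function that landed is PyType_FromMetaclass, which takes the metaclass, an optional module, the spec, and optional bases. A minimal usage sketch (MyMeta_Type and my_spec are placeholders):

```c
/* PyType_FromMetaclass(metaclass, module, spec, bases); passing NULL for
 * module and bases uses the defaults, as with PyType_FromModuleAndSpec. */
PyObject *tp = PyType_FromMetaclass(&MyMeta_Type, NULL, &my_spec, NULL);
```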
Yes, with a note that #89546 follows up on this. |