Performance of typing._ProtocolMeta._get_protocol_attrs and isinstance #74690
In 3.6.0, an invocation of `isinstance()` calls `typing._ProtocolMeta._get_protocol_attrs`. My program uses `isinstance()` to allow flexibility in parameter types in certain key functions. I realize that using `isinstance()` is frowned upon, but it seems to make sense in my case. As a result, >95% of the program's run time is spent inside `typing._ProtocolMeta._get_protocol_attrs` (!).

I have created a simple wrapper around `isinstance()` which caches its results in a `Dict[Tuple[type, type], bool]`. This solved the performance problem, but introduced a different problem: type checking. I use mypy and type annotations, and my code type-checks cleanly (with the occasional `# type: ignore`). If I switch to using my own `isinstance()` function, mypy's type inference no longer treats it as special, so it starts complaining about all uses of values guarded by `if isinstance(value, SomeType): ...`

I propose that either the private `typing._ProtocolMeta.__subclasscheck__` (which invokes `_get_protocol_attrs`) or the public `isinstance()` be modified to cache its results. I can create a PR for either approach, if this is acceptable.
Thanks for reporting! The runtime implementation of protocol classes will be thoroughly reworked as part of PEP 544; see also python/typing#417 for a proof-of-concept runtime implementation. There is also another ongoing discussion, python/typing#432, about a global refactoring of the typing module that will significantly improve performance. I would therefore hold off on any large PRs until those two efforts are settled. If you still want to propose a small PR, you can do so at the upstream typing repo: https://github.com/python/typing
@orenbenkiki @ilevkivskyi What is the status of this issue, five years on?
TBH I am not sure this issue is still relevant (at least I haven't heard any recent complaints about it).
Hm, actually the short-circuiting proposed in that comment may be a good idea (unless I am forgetting something) for a simple perf optimization. (Btw, FWIW, this is what mypy does internally: it always checks nominal subtyping first.)
For color, the current implementation lives in Beartype and is divided among two classes. Happy to discuss further, if desired.
Caching may need more careful consideration (especially w.r.t. monkey-patching classes). I was specifically referring to this idea:

```python
class _ProtocolMeta(ABCMeta):
    # …
    def __instancecheck__(cls, instance):
        if super().__instancecheck__(instance):
            # Short circuit for direct inheritors
            return True
        else:
            # … existing implementation that checks method names, etc. …
            return False
```

@posita Do you have a good code base where you could benchmark this against the current implementation?
I can probably do something crude with it.
Okay, this was a bit more involved than I thought it would be, but I have some rough numbers. My approach was to create a metaclass deriving from `typing._ProtocolMeta`:

```python
from typing import _ProtocolMeta as ProtocolMeta

class ProtocolMetaShortCircuit(ProtocolMeta):
    def __instancecheck__(cls, instance):
        return super(ProtocolMeta, cls).__instancecheck__(instance) or super(
            ProtocolMetaShortCircuit, cls
        ).__instancecheck__(instance)
```

This is used to create a special version of the big, scary (non-caching) `SupportsLotsOfNumberStuff` protocol:

```python
@runtime_checkable
class SupportsLotsOfNumberStuffShortCircuit(
    SupportsLotsOfNumberStuff,
    Protocol,
    metaclass=ProtocolMetaShortCircuit,
):
    pass

# ABC registration
SupportsLotsOfNumberStuffShortCircuit.register(Fraction)
assert isinstance(Fraction(), SupportsLotsOfNumberStuffShortCircuit)

# Direct derivation
class FractionShortCircuit(Fraction, SupportsLotsOfNumberStuffShortCircuit):
    pass
```

Initial results:
For built-in types, the short-circuit implementation imposes a small overhead of ~300 ns on my machine (which could likely be reduced), and is about twice as fast¹ for registered and directly derived types. You can play around with the notebook in Binder. I tinkered with using this metaclass in my own battery of unit tests.

¹ The registry check is likely O(1) due to caching (if I'm reading that implementation right).
Commit references:

- … of runtime-checkable protocols
- … `_get_protocol_attrs` twice in `_ProtocolMeta.__instancecheck__` (#103141): Speed up `isinstance()` calls against runtime-checkable protocols
- Improve performance of `isinstance()` checks against runtime-checkable protocols
- … `_callable_members_only` at protocol class creation time, not during `isinstance()` checks (#103160)
According to my benchmarks (PGO-optimised, non-debug build on Windows), this is already a substantial speedup. The question is whether we should go further and cache the result of the whole check. Consider:

```python
>>> from typing import *
>>> @runtime_checkable
... class HasX(Protocol):
...     x: int
...
>>> class Foo: ...
...
>>> f = Foo()
>>> isinstance(f, HasX)
False
>>> f.x = 42
>>> isinstance(f, HasX)  # evaluates to `True` on `main`, but would be `False` if we cached the whole call to `_ProtocolMeta.__instancecheck__`
True
```

The resultant behaviour if we cache the whole call can be pretty surprising, in my opinion, and it would be a big behaviour change, given that protocols have been in the stdlib for some time now. But, in some ways, the new behaviour would be consistent with the behaviour we already get from `ABCMeta`:

```python
>>> from collections.abc import Iterable
>>> class Foo:
...     def __iter__(self):
...         yield from [1, 2, 3]
...
>>> isinstance(Foo(), Iterable)
True
>>> del Foo.__iter__
>>> isinstance(Foo(), Iterable)
True
>>> iter(Foo())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'Foo' object is not iterable
```

I'm curious if @carljm or @hauntsaninja have any thoughts here.
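The non-cached semantics shown in the first snippet can be pinned down as a small standalone script (a sketch using the same `HasX` example, not code from the thread):

```python
from typing import Protocol, runtime_checkable

@runtime_checkable
class HasX(Protocol):
    x: int

class Foo: ...

f = Foo()
before = isinstance(f, HasX)  # False: f has no attribute x yet
f.x = 42
after = isinstance(f, HasX)   # True: the structural check is re-evaluated on each
                              # call, which is what whole-call caching would break
print(before, after)
```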
I don't like the cached semantics; they seem surprising to me too. So, given that we are already making this a lot faster than it previously was, I'd be inclined to call it fast enough for now. Except: the fact that `ABCMeta` already does the caching does give me pause. That's clearly the closest analogue to `Protocol` in the stdlib, and it makes sense to match its behavior. So... I don't know :) I think if I had to make the call, I would leave it as is, but I wouldn't object to either choice.
If I were designing this API from scratch, I would probably cache calls to `__instancecheck__`, but given how long protocols have been in the stdlib, I'll leave the current semantics alone. I'll add NEWS/Whatsnew/docs, and then close this out.
For other readers of this issue: note that it is also now clearly documented that runtime-checkable protocols probably shouldn't be used in performance-critical code; see the note on `@runtime_checkable` in the `typing` documentation.
Commit reference: `typing._ProtocolMeta.__instancecheck__`: Exit early for protocols that only have callable members (#103310)
I wanted to have accurate numbers to put in Whatsnew, so I did some more benchmarking. Unfortunately, the results depend on how many members the protocol has. This first snippet performs well:

```python
from typing import *
from dataclasses import dataclass

@runtime_checkable
class Foo(Protocol):
    a: int

@dataclass
class Bar:
    a: int

isinstance(Bar(42), Foo)
```

...This one has two members:

```python
from typing import *
from dataclasses import dataclass

@runtime_checkable
class Foo(Protocol):
    a: int
    b: int

@dataclass
class Bar:
    a: int
    b: int

isinstance(Bar(42, 42), Foo)
```

...This one is a little slower...

```python
from typing import *
from dataclasses import dataclass

@runtime_checkable
class Foo(Protocol):
    a: int
    b: int
    c: int

@dataclass
class Bar:
    a: int
    b: int
    c: int

isinstance(Bar(42, 42, 42), Foo)
```

...And this one is much slower:

```python
from typing import *
from dataclasses import dataclass

@runtime_checkable
class Foo(Protocol):
    a: int
    b: int
    c: int
    d: int

@dataclass
class Bar:
    a: int
    b: int
    c: int
    d: int

isinstance(Bar(42, 42, 42, 42), Foo)
```

This all makes sense, because the more members the protocol has, the more work has to be done in the member-checking loop (`typing.py` lines 2037 to 2045 in 23cf1e2).
As we know, that kind of per-member attribute checking isn't cheap. The good news is that I think most protocols in the wild are generally pretty simple, and don't have very many members. However, this still isn't great. Ideally, we'd make `isinstance()` faster against protocols with many members, too.
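The scaling with member count can be reproduced with a crude `timeit` sketch (the protocol and class names here are made up, and absolute numbers vary by machine and Python version, so no expected timings are claimed):

```python
import timeit
from dataclasses import make_dataclass
from typing import Protocol, runtime_checkable

@runtime_checkable
class OneMember(Protocol):
    a: int

@runtime_checkable
class FourMembers(Protocol):
    a: int
    b: int
    c: int
    d: int

# A simple class that structurally satisfies both protocols.
Bar = make_dataclass("Bar", ["a", "b", "c", "d"])
bar = Bar(1, 2, 3, 4)

for proto in (OneMember, FourMembers):
    # Time repeated isinstance() checks against each protocol.
    elapsed = timeit.timeit(lambda: isinstance(bar, proto), number=10_000)
    print(f"{proto.__name__}: {elapsed:.4f}s")
```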
Here is a short profiling script with some expensive `isinstance()` checks against a large protocol:

```python
from typing import *
import cProfile

@runtime_checkable
class Foo(Protocol):
    a: int
    b: int
    c: int
    d: int
    e: int
    f: int
    g: int
    h: int
    i: int
    j: int

class Bar:
    def __init__(self):
        for attrname in 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j':
            setattr(self, attrname, 42)

bars = [Bar() for _ in range(100_000)]

cProfile.run("""\
for bar in bars:
    isinstance(bar, Foo)
"""
)
```

And here are the profiling results:
We're spending a lot of time in the per-member attribute checks.
Following #103318, we're now faster than 3.11 for protocols with 4 members or fewer (but slower than 3.11 for protocols with 5+ members). New profiling results:
Following #103321, we're now faster than 3.11 for protocols with 6 members or fewer (but slower than 3.11 for protocols with 7+ members). New profiling results:
I think that might be just about acceptable now. We're still spending a lot of time in the per-member attribute checks, though.
Commit references:

- …GH-103347 (cherry picked from commit 800382a); co-authored by Alex Waygood <Alex.Waygood@Gmail.com>
- #103348; co-authored by Jelle Zijlstra <jelle.zijlstra@gmail.com>
Many thanks to everybody (and especially @carljm!) for the time spent reviewing these PRs!

Thanks for all the investigation and experimentation!
Linked PRs:

- `_get_protocol_attrs` twice in `_ProtocolMeta.__instancecheck__` (#103141)
- `typing._get_protocol_attrs` (#103152)
- `_ProtocolMeta.__instancecheck__` (#103159)
- `_get_protocol_attrs` and `_callable_members_only` at protocol class creation time, not during `isinstance()` checks (#103160)
- `typing._ProtocolMeta.__instancecheck__` (#103280)
- `typing._ProtocolMeta.__instancecheck__`: Exit early for protocols that only have callable members (#103310)