Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change in semantics and much worse performance for enum members. #93910

Open
markshannon opened this issue Jun 16, 2022 · 36 comments
Open

Change in semantics and much worse performance for enum members. #93910

markshannon opened this issue Jun 16, 2022 · 36 comments
Assignees
Labels
3.11 3.12 performance release-blocker stdlib

Comments

@markshannon
Copy link
Member

@markshannon markshannon commented Jun 16, 2022

Given the enum:

class Colours(Enum):
    RED = 1

In Python 3.9 and 3.10:

>>> Colours.__dict__["RED"] is Colours.RED
True
>>> Colours.RED.RED
<Colours.RED: 1>

In Python 3.11:

>>> Colours.__dict__["RED"] is Colours.RED
False
>>> Colours.RED.RED
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/mark/repos/cpython/Lib/enum.py", line 198, in __get__
    raise AttributeError(
    ^^^^^^^^^^^^^^^^^^^^^
AttributeError: <enum 'Colours'> member has no attribute 'RED'

While these might seem like minor semantic changes, there is also a large performance impact.
Lookup of Colours.RED is simple and efficient in 3.10, but involves a lot of indirection and dispatching through the enum.property class in 3.11.
The performance impact is likely to get worse in 3.12, as we optimize more kinds of attributes.

Introduced in c314e60, I believe.

@pablogsal
@ethanfurman

@markshannon markshannon added type-bug 3.11 3.12 labels Jun 16, 2022
@markshannon markshannon added deferred-blocker performance labels Jun 16, 2022
@AlexWaygood AlexWaygood added the stdlib label Jun 16, 2022
@ethanfurman ethanfurman removed type-bug deferred-blocker labels Jun 22, 2022
@ethanfurman
Copy link
Member

@ethanfurman ethanfurman commented Jun 22, 2022

Those changes are intentional, and remedy an issue present since 3.5. As for performance, the work @markshannon and others have done has already had a considerable impact on making enum faster in 3.12.

@ethanfurman ethanfurman self-assigned this Jun 22, 2022
@markshannon
Copy link
Member Author

@markshannon markshannon commented Jun 22, 2022

What issue does the change remedy?

I'm puzzled by your claim that the work we've done has sped up enums for 3.12? It hasn't. Without c314e60, it would have.

@markshannon
Copy link
Member Author

@markshannon markshannon commented Jun 22, 2022

It looks like the change in Colours.RED.RED has been brought up before: #87328
Failing on 3.11 seems to be in violation of PEP 387. A warning should be issued until 3.12.

What I don't understand is why this change is necessary at all.
What benefits does it bring? Accessing class attributes from instances is fairly universal, why should enums be special?

@ethanfurman
Copy link
Member

@ethanfurman ethanfurman commented Jun 22, 2022

The rudimentary timings I did on my system for member access showed enums steadily getting faster over the releases, with a big jump in performance between 3.11 and 3.12.

The Colors.RED.RED behavior was illegal/missing in 3.4, added in 3.5 for performance (but warned against), and SC permission recieved for skipping the programmatic warning in 3.11

Enums are special, and this change brings them back into line with the original, intended, specification.

@markshannon
Copy link
Member Author

@markshannon markshannon commented Jun 24, 2022

The performance of enum lookup is much slower than it should be in 3.11.
It would be a shame to have to advise people not to use enums if they care about performance.

The SC permission pertains to changes in repr(), str(), and format(), AFAICT. No mention is made of the the change in semantics of attribute lookup and performance.

@warsaw Could you confirm?

@markshannon
Copy link
Member Author

@markshannon markshannon commented Jun 24, 2022

I'm seeing an approx x9 slowdown on attribute access.

class Color(Enum):
    RED = "Red"
    BLUE = "Blue"
    GREEN = "Green"

class FastColor:
    RED = Color.RED
    BLUE = Color.BLUE
    GREEN = Color.GREEN
    
def f():
    for _ in range(1000):
        Color.RED
        Color.BLUE
        Color.GREEN
        
import timeit
print(timeit.timeit('f()', number=10000, globals=globals()))

Color = FastColor
print(timeit.timeit('f()', number=10000, globals=globals()))

With an optimized, but non pgo, no lto, build.

0.8739701807498932
0.09454824822023511

@markshannon
Copy link
Member Author

@markshannon markshannon commented Jun 24, 2022

OOI, why is it necessary that code like if self == self.START: no longer works?

@pablogsal
Copy link
Member

@pablogsal pablogsal commented Jun 24, 2022

I am marking this as a release blocker so we can discuss this properly. If we don't reach consensus quickly, please, let's send an email to python-dev so we can collectively reach an agreement.

@hroncok
Copy link
Contributor

@hroncok hroncok commented Jun 24, 2022

For the record, in #87328 (comment) @ethanfurman said:

DeprecationWarning will be active in 3.10 and 3.11 with removal in 3.12.

This was not followed, for reasons I don't understand.

This once again breaks real projects, e.g. googleapis/proto-plus-python#326

@JelleZijlstra
Copy link
Member

@JelleZijlstra JelleZijlstra commented Jun 24, 2022

The performance of enum lookup is much slower than it should be in 3.11. It would be a shame to have to advise people not to use enums if they care about performance.

The SC permission pertains to changes in repr(), str(), and format(), AFAICT. No mention is made of the the change in semantics of attribute lookup and performance.

@warsaw Could you confirm?

The permission was granted in python/steering-council#80 (linked in the last line of the linked message).

But I don't think the SC's permission means we must make the change in 3.11. A 9x performance regression is very severe and doesn't seem worth the minor benefit of disallowing Color.RED.GREEN.

@gpshead
Copy link
Member

@gpshead gpshead commented Jun 25, 2022

Agreed, the existing checked in solution to getting rid of Color.RED.BLUE behavior wart makes the default case performance in all code worse. So we shouldn't "fix" the wart that way. Within 3.11-beta the best thing to do is revert that change.

Finding a well performing attribute of an attribute wart fix remains for a future version, if it ever gets fixed.

@warsaw
Copy link
Member

@warsaw warsaw commented Jun 25, 2022

The SC permission pertains to changes in repr(), str(), and format(), AFAICT. No mention is made of the the change in semantics of attribute lookup and performance.
@warsaw Could you confirm?

That SC exception very specifically was for removing Color.RED.BLUE in 3.11 without a deprecation warning in the code. At the time, there was no discussion about any potential performance impact.

The SC rejection notice for PEP 663 had this to say about str() and format():

One aspect of the PEP we do agree with is the alignment of IntEnum’s str() and format(). It is confusing that they give different results, and that is worth the small compatibility breakage to fix. We are uncertain whether it’s better to change str to be more like format or vice versa. Our recommendation is to take this discussion to python-dev and see if consensus on this topic can be reached.

This one's a little less definitive (others on that edition of the SC can also chime in; this was a unanimous decision and agreed upon communication). This was saying that we thought it would be good to align IntEnum's str() and format() output in 3.11, although we urged consensus building on python-dev in order to decide how exactly to resolve that misalignment. We thought it was worth a compatibility break (implied: without warning or transition period) to align those in 3.11.

But I don't think the SC's permission means we must make the change in 3.11. A 9x performance regression is very severe and doesn't seem worth the minor benefit of disallowing Color.RED.GREEN.

Personally, I agree with this, but I don't have a vote on the SC any more. I would argue to not incur the performance loss in 3.11 and take the time to remove the wart performantly in 3.12, if that's even possible.

@ethanfurman
Copy link
Member

@ethanfurman ethanfurman commented Jun 26, 2022

@markshannon How did you measure the performance? If you give me the same steps so I have something to work towards I'll try to remedy the performance impact.

Out of curiosity, are regular Python properties slower, or not speeding up, with the work being done? The new Enum members are basically properties, so I'm a bit surprised they're so much slower

@markshannon
Copy link
Member Author

@markshannon markshannon commented Jun 26, 2022

Using the script above building with ./configure (non-debug, no-pgo, no-lto build) for 3.10 and 3.11 on ubuntu linux.

@markshannon
Copy link
Member Author

@markshannon markshannon commented Jun 26, 2022

We haven't sped up regular properties up for 3.11, but have for 3.12. #93912

@markshannon
Copy link
Member Author

@markshannon markshannon commented Jun 26, 2022

Could someone point me to a reference explaining why accessing one enum value from another is considered a wart?

I'm curious what is wrong with code like if state == state.START: or while state != state.END: that one might write in generic state machine code.

@AlexWaygood
Copy link
Member

@AlexWaygood AlexWaygood commented Jun 26, 2022

Could someone point me to a reference explaining why accessing one enum value from another is considered a wart?

It can lead to confusing behaviour, as is detailed in the documentation here:

>>> from enum import Enum
>>> class FieldTypes(Enum):
...     name = 0
...     value = 1
...     size = 2
...
>>> FieldTypes.value.size
<FieldTypes.size: 2>
>>> FieldTypes.size.value
2

@markshannon
Copy link
Member Author

@markshannon markshannon commented Jun 26, 2022

That a.b.c != a.c.b is only surprising if you expect the . operator to be commutative for some reason.
It would be better to recommend avoid using value as an enum member, IMO. Possibly even rejecting value at class creation time.

If you pick silly enough names, weird things will happen.

>>> class FieldTypes(Enum):
...     __class__ = 0

@gpshead
Copy link
Member

@gpshead gpshead commented Jun 26, 2022

Could someone point me to a reference explaining why accessing one enum value from another is considered a wart?

Because it violates common sense if you think of enums as merely being named values. No other values allow you to get access to other same type values as attributes. getattr(3, '2') is an error, not 2. getattr(tuple, 'dict') is an error, not dict. When I define a class with multiple attributes, I cannot access those attributes from one another.

Regardless, going back to https://peps.python.org/pep-0435/ whence our enum arrived in Python: It was never explicitly specified.

If you take "The type of an enumeration member is the enumeration it belongs to" from PEP-435 to its logical conclusion, then Enum values being instances of the type would indeed imply attribute access on the value gives access to the other values if you presume normal Python class behavior. But as a required implementation detail: enums are not "normal" classes (too much dunder magic and metaclass sorcery involved).

Blocking such infinite nested attribute access cycles strives for an ideal to avoid confusion in code. The very first code error snippet over in googleapis/proto-plus-python#326 is an example of some code that was allowed to unintentionally be written in a confusing manner because attributes of enum values were allowed to look up enum values. It allows there to be more than one way to do things.

So that is why I consider it a wart. The reality of the spec is that it was underspecified and can be interpreted as a requirement or intended side effect API based on what was specified.


I'm just adding context. I still think we should revert the behavior change in favor of performance.

@AlexWaygood
Copy link
Member

@AlexWaygood AlexWaygood commented Jun 26, 2022

I'm just adding context. I still think we should revert the behavior change in favor of performance.

(To be clear, if the performance impact is this severe, I concur.)

@ethanfurman
Copy link
Member

@ethanfurman ethanfurman commented Jun 27, 2022

Saying enums are 9x slower is a bit disingenuous, because that implies that they are slower than they used to be. On my machine, compairing enums to enums and not to standard classes, their speed is:

3.5 2.24
3.6 2.32
3.7 2.96
3.8 2.47
3.9 2.02
3.10 2.37
3.11 6.99
3.12 4.40

So in 3.11 they are roughly three times slower than they used to be, but in 3.12 that improves to only two times slower than they used to be.

We can certainly work on getting performance even better, but it seems to me that that is hardly cause for panic.

@gpshead
Copy link
Member

@gpshead gpshead commented Jun 27, 2022

compairing enums to enums and not to standard classes, their speed is:

It entirely unclear to me what you claim to be comparing in that comment @ethanfurman. Can you clarify and include microbenchmark reproduction code/instructions? (I'd expect comparing enums to enums to uselessly always be 1.0 by definition.)

@gpshead
Copy link
Member

@gpshead gpshead commented Jun 27, 2022

A draft PR reverting the change leading to the microbenchmark performance hit on the 3.11 branch to send through non-micro pyperformance seems like it'd provide meaningful decision making data.

@hroncok
Copy link
Contributor

@hroncok hroncok commented Jun 27, 2022

If you consider keeping this backward-incompatible change in 3.11, please make sure it is loudly communicated via https://docs.python.org/3.11/whatsnew/3.11.html

@ethanfurman
Copy link
Member

@ethanfurman ethanfurman commented Jun 27, 2022

@gpshead I'm using the Enum portion of Mark's code on the various versions of Python to show actual performance changes. Comparing Enum to normal class access in one version only shows the difference between Enum and normal class access -- it says nothing about any actual change in performance of Enum itself.

Put another way, Mark's original test is similar to comparing a row boat to a jet-ski in one year, and the next year comparing the row boat to a speed boat, and then saying the row boat suffered massive performance degradation.

@ethanfurman
Copy link
Member

@ethanfurman ethanfurman commented Jun 27, 2022

I'll try to get a PR done tomorrow -- today is my 30th wedding anniversary. :-) .

@pablogsal
Copy link
Member

@pablogsal pablogsal commented Jun 27, 2022

today is my 30th wedding anniversary. :-) .

Congrats 🎉 Enjoy the day!! :)

@markshannon
Copy link
Member Author

@markshannon markshannon commented Jun 28, 2022

I don't know about rowboats and jet-skis, but here's a table comparing normal classes and enums for 3.9 to 3.12 with the same build on the same machine.
Non-debug, no-pgo,no-lto build. Fastest of 3 on dedicated core. i9-8950HK CPU @ 2.90GHz. Ubuntu, gcc .9.4.0.

The code
from enum import Enum
import sys
import timeit

class Color(Enum):
    RED = "Red"
    BLUE = "Blue"
    GREEN = "Green"

class FastColor:
    RED = Color.RED
    BLUE = Color.BLUE
    GREEN = Color.GREEN
    
class OldStyleColor:
    RED = "Red"
    BLUE = "Blue"
    GREEN = "Green"
    
def f():
    for _ in range(1000):
        Color.RED
        Color.BLUE
        Color.GREEN
        
print(sys.version)
        
print(f"{timeit.timeit('f()', number=10000, globals=globals()):.2f}")

Color = FastColor
print(f"{timeit.timeit('f()', number=10000, globals=globals()):.2f}")

Color = OldStyleColor
print(f"{timeit.timeit('f()', number=10000, globals=globals()):.2f}")
Version Enum "Fast Enum" Normal class
3.9 3.40 1.15 1.15
3.10 3.40 1.20 1.21
3.11 11.85 1.23 1.22
3.12 (main) 7.41 1.11 0.37

I believe the reason that "Fast Enum" is slower than the normal class for 3.12, is that Python classes are mutable, so might become a descriptor. We know that the str class will never become a descriptor, so we can skip that step.

P.S. I still don't see what's wrong with accessing class members through instances. It's basically how methods work (sort of).

@ethanfurman
Copy link
Member

@ethanfurman ethanfurman commented Jun 28, 2022

@markshannon Thanks for taking that in good humor. :)

#94392 has the patch for storing members directly in the class dict -- it appears to have zero impact on the timings.

If somebody could run this through the macro performance benchmarks I would appreciate it.

hroncok added a commit to hroncok/proto-plus-python that referenced this issue Jun 28, 2022
There are two changes:

Changes in the actual code:

 - _member_names changed from a list to a dict in python/cpython#28907
    - we instance-check and remove by list-specific or dict-specific way

Change in the tests only:

 - accessing other enum members via instance attributes is no longer possible
    - we access them via the class instead
    - we leave the original test in a try-except block

Some of the Python enum changes might get reverted,
see python/cpython#93910
But the fix is backwards compatible.
hroncok added a commit to hroncok/proto-plus-python that referenced this issue Jun 28, 2022
There are two changes:

Changes in the actual code:

 - _member_names changed from a list to a dict in python/cpython#28907
    - we instance-check and remove by list-specific or dict-specific way

Change in the tests only:

 - accessing other enum members via instance attributes is no longer possible
    - we access them via the class instead
    - we leave the original test in a try-except block

Some of the Python enum changes might get reverted,
see python/cpython#93910
But the fix is backwards compatible.

Fixes googleapis#326
@ethanfurman
Copy link
Member

@ethanfurman ethanfurman commented Jun 30, 2022

Given that the performance decrease is only 3x, and using the members directly from __dict__ appears to make no difference, does this still need to be a release blocker?

@JelleZijlstra
Copy link
Member

@JelleZijlstra JelleZijlstra commented Jul 1, 2022

I think a 3x performance regression is still severe enough to be a release blocker. Accessing an enum member is not an uncommon operation, and slowing it down will make people think twice about using enums.

@markshannon
Copy link
Member Author

@markshannon markshannon commented Jul 1, 2022

Compared to normal classes, enums are almost 10x slower. That is definitely too slow.
I don't think the fact that they were 3x slower in 3.10, is an excuse for them being even slower in 3.11.

@mdboom
Copy link
Contributor

@mdboom mdboom commented Jul 1, 2022

If somebody could run this through the macro performance benchmarks I would appreciate it.

Pyperformance benchmark results

Basically "same", which isn't surprising, given that none of the benchmarks spend significant time in enums. This, IMHO, is more indicative of insufficient benchmark coverage than this not being a real problem.

@ethanfurman
Copy link
Member

@ethanfurman ethanfurman commented Jul 3, 2022

@markshannon 3.9 and 3.10 are the same performance level, the 3x slowdown is in 3.11, and that improves to a 2x slowdown in 3.12. (Oh, you were comparing to normal classes -- I compare with previous Python versions of Enum.)

The best solution is to rewrite Enum in C, and while I could maintain it once converted, it would take me far more hours to do the conversion than I have available.

@ethanfurman
Copy link
Member

@ethanfurman ethanfurman commented Jul 3, 2022

Also, in cases where Enum performance is critical, the members can be exported to the module namespace:

The code
import time
import enum

class Color(enum.Enum):
    RED = "Red"
    BLUE = "Blue"
    GREEN = "Green"
RED, BLUE, GREEN = Color

class FastColor:
    RED = Color.RED
    BLUE = Color.BLUE
    GREEN = Color.GREEN

def f():
    for _ in range(10000000):
        Color.RED
        Color.BLUE
        Color.GREEN

def g():
    for _ in range(10000000):
        FastColor.RED
        FastColor.BLUE
        FastColor.GREEN
def h():
    for _ in range(10000000):
        RED
        BLUE
        GREEN
   
import timeit
print(round(timeit.timeit('f()', number=1, globals=globals()), 2), end='  ')
print(round(timeit.timeit('g()', number=1, globals=globals()), 2), end='  ')
print(round(timeit.timeit('h()', number=1, globals=globals()), 2), end='\n\n')
Version Enum "Fast Enum" Global Enum
3.9 2.31 0.80 0.40
3.10 2.57 0.86 0.49
3.11 7.22 0.70 0.21
3.12 4.50 0.69 0.15

@encukou
Copy link
Member

@encukou encukou commented Jul 5, 2022

The clock for 3.11 is ticking, the python-dev discussion requested by the release manager never happened, so we discussed this issue in the Steering Council and decided that a 3× slowdown in 3.11 (and a 10× slowdown compared to normal classes) is too much to remove this wart.

@ethanfurman, can you please roll back the change, and restore the 3.10 behavior (and performance)?

— Petr, onbehalf of the Steering Council

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.11 3.12 performance release-blocker stdlib
Projects
Status: No status
Development

No branches or pull requests

10 participants