Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpo-39102: Increase Enum performance up to 10x times (3x average) #17669

Open
wants to merge 21 commits into
base: master
from

Conversation

@MrMrRobat
Copy link

MrMrRobat commented Dec 20, 2019

Increase Enum performance

https://bugs.python.org/issue39102

Partial benchmark result, to see more, check full benchmark result and source code

ATTR_ACCESS:
Testing with 10000 repeats, result is average of 10 tests:

    >>> NewEnum.foo  # 0.6069349023270748 ms
    <NewEnum.foo: 1>

    >>> OldEnum.foo  # 4.5089948174334005 ms
    <OldEnum.foo: 1>

    >>> NewEnum.foo.value  # 0.965516420197984 ms
    1

    >>> OldEnum.foo.value  # 5.234090615261106 ms
    1

    >>> NewEnum.value.value  # 0.6496817059283957 ms
    3

    >>> OldEnum.value.value  # 19.718928336346387 ms
    3

    >>> try:     NewEnum.value.value = 'new' except: pass  # 4.842046525117963 ms
    AttributeError("Can't set attribute")

    >>> try:     OldEnum.value.value = 'new' except: pass  # 24.484108522222897 ms
    AttributeError("can't set attribute")


NewEnum: total: 7.0641796 ms, average: 1.1652198 ms (Fastest)
OldEnum: total: 53.9461223 ms, average: 10.3317085 ms, ~ x8.87 times slower than NewEnum 

Full benchmark result and source code

Changes

  • Optimized Enum.__new__
  • Remove EnumMeta.__getattr__,
  • Store Enum.name and .value in members __dict__ for faster access
  • Deprecate getting values from Enum._name_ and ._value_, should use public .name and .value now
  • Replace Enum._member_names_ with ._unique_member_map_ for faster lookups and iteration
  • Replace _EmumMeta._member_names and ._last_values with .members mapping (deprecation warning on using old attrs)
  • Various minor improvements
  • Add support for direct setting and getting class attrs on DynamicClassAttribute without need to use slow __getattr__

Also, does compatibility with python <3.6 need to be maintained?
If not, we could use type annotations for name and values attrs. (However I wasn't able to import typing.Any in enum module due to circular import)
Ok, I see that there are no type annotations in stdlib at all, but the question is still valid, because if compatibility doesn't need to be maintained, we could use the fact that python 3.8 dicts and sets which are preserve order to speed things up even more.

https://bugs.python.org/issue39102

@MrMrRobat MrMrRobat requested a review from ethanfurman as a code owner Dec 20, 2019
@the-knights-who-say-ni

This comment has been minimized.

Copy link

the-knights-who-say-ni commented Dec 20, 2019

Hello, and thanks for your contribution!

I'm a bot set up to make sure that the project can legally accept this contribution by verifying everyone involved has signed the PSF contributor agreement (CLA).

CLA Missing

Our records indicate the following people have not signed the CLA:

@MrMrRobat

For legal reasons we need all the people listed to sign the CLA before we can look at your contribution. Please follow the steps outlined in the CPython devguide to rectify this issue.

If you have recently signed the CLA, please wait at least one business day
before our records are updated.

You can check yourself to see if the CLA has been received.

Thanks again for the contribution, we look forward to reviewing it!

MrMrRobat and others added 2 commits Dec 20, 2019
@auvipy
auvipy approved these changes Dec 22, 2019
super().__setitem__(key, value)

@property

This comment has been minimized.

Copy link
@auvipy

auvipy Dec 22, 2019

the deprecation part could be addressed in a separate commit.

This comment has been minimized.

Copy link
@MrMrRobat

MrMrRobat Dec 22, 2019

Author

I don't quite understand how it should look like, though. Should I change anything now?

Just realized that _value_ and _name_ attrs remained independent from
name and value and could be different. This commit fixes it.
MrMrRobat added a commit to MrMrRobat/fastenum that referenced this pull request Dec 22, 2019
Run builtin python tests directly from python test module
Refactor patch applying
Copy link
Member

ethanfurman left a comment

Sorry a proper review is taking so long -- there is a lot to go over here.
What I can say so far:

  • name and value must show up in a member's dir();
  • _leading_underscore_names and plain_names must not be used in EnumMeta nor Enum as then the user cannot use those same names -- that's why _sunder_ and __dunder__ names are used instead;

I do have some private code that I haven't merged in yet that does away with DynamicClassAttribute -- I'll try to get that done in the next few weeks. I think a lot of the work you have done here will still be applicable.

Thank you for your efforts!

@MrMrRobat

This comment has been minimized.

Copy link
Author

MrMrRobat commented Dec 23, 2019

  • name and value must show up in a member's dir();

Yeah, looks like I just deleted them by mistake. Fixed yesterday, pushed now :)

  • _leading_underscore_names and plain_names must not be used in EnumMeta nor Enum as then the user cannot use those same names -- that's why _sunder_ and __dunder__ names are used instead;

Oh, I get it. So I guess it will be fine to return list(_unique_member_map_) or rename _member_map to __member_map__?
First one is obviously slower, second one doesn't require any additional names. Since _member_names_ considered deprecated and not used internally, the first one seems ok for me. What do you think?

I do have some private code that I haven't merged in yet that does away with DynamicClassAttribute -- I'll try to get that done in the next few weeks.

Does it mean that this PR won't be merged until your work is done?

I think a lot of the work you have done here will still be applicable.

Thanks, great to hear that :)

Oh, and I have a question from https://bugs.python.org/issue39102, maybe it should be mentioned here:

And another thing to think about: maybe we can calculate values returned by __str__, __repr__ and __invert__ once on member creation, since they not supposed to change during its life?
For example __invert__ on Flag does a lot of work on every call:

    def __invert__(self):
        cls = self.__class__
        members, uncovered = _decompose(cls, self._value_)
        inverted = cls(0)
        for m in cls:
            if m not in members and not (m._value_ & self._value_):
                inverted = inverted | m
        return cls(inverted)
@auvipy
auvipy approved these changes Dec 23, 2019
@@ -369,7 +366,7 @@ def __members__(cls):
return MappingProxyType(cls._member_map_)

def __repr__(cls):
return "<enum %r>" % cls.__name__
return f'<enum {cls.__name__!r}'

This comment has been minimized.

Copy link
@animalize

animalize Dec 24, 2019

Contributor

miss a > here.
(I didn't review the other code.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
6 participants
You can’t perform that action at this time.