
gh-92777: Add LOAD_METHOD_LAZY_DICT #92778

Open · wants to merge 6 commits into base: main
Conversation

@Fidget-Spinner (Member) commented May 13, 2022

Fixes #92777. This specializes LOAD_METHOD for lazy dictionaries, which account for 40% of the misses.

I'm sad that I missed the 3.11 beta freeze for this specialization. It's straightforward and is likely to account for the majority of LOAD_METHOD calls in real-world code, since lazy __dict__ is now commonplace.
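For context, here is a minimal sketch of the pattern this specialization targets: repeated bound-method calls on instances whose __dict__ has not been materialized, versus code (like typing) that touches __dict__ directly. Point and norm2 are invented names for illustration, not anything from the patch.

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def norm2(self):
        return self.x * self.x + self.y * self.y

p = Point(3, 4)
# In 3.11+, p starts with no materialized __dict__; repeated lookups like
# p.norm2 are the LOAD_METHOD pattern the lazy-dict specialization targets.
total = sum(p.norm2() for _ in range(10))
print(total)  # 250

# Touching __dict__ (as typing frequently does) forces the instance dict
# to be created, taking the object off the lazy-dict fast path.
print(vars(p))  # {'x': 3, 'y': 4}
```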

@Fidget-Spinner (Member, Author) commented May 13, 2022

Hah, looks like I was wrong; it wasn't that straightforward after all :).

@AlexWaygood added the type-feature and performance labels May 13, 2022
@markshannon (Member) left a comment

A few minor issues, but generally looks good.
What are the stats for the LOAD_METHOD_LAZY_DICT instruction?

Three review threads (outdated, resolved) on Python/ceval.c and Python/specialize.c.
@Fidget-Spinner (Member, Author) commented May 13, 2022

How do you collect stats for pyperformance and create that nice table on faster-cpython? I'm frankly clueless (I only know how to use the one that dumps stats out to the terminal or file). Sorry.

On test suite code, I get a 0.3% improvement on hits and 0.6% more misses. But I want to point out that typing loves messing around with __dict__, so test_typing may not be representative. Pyperformance will likely see a noticeable bump in hits with fewer misses.

./python -m test test_typing test_re test_dis test_zlib


Before:
opcode[160].specializable : 1
    opcode[160].specialization.success : 1395
    opcode[160].specialization.failure : 1146
    opcode[160].specialization.hit : 1026699
    opcode[160].specialization.deferred : 78612
    opcode[160].specialization.miss : 8664
    opcode[160].specialization.deopt : 156
    opcode[160].execution_count : 18435
    opcode[160].specialization.failure_kinds[0] : 63
    opcode[160].specialization.failure_kinds[1] : 39
    opcode[160].specialization.failure_kinds[2] : 505
    opcode[160].specialization.failure_kinds[4] : 406
    opcode[160].specialization.failure_kinds[9] : 4
    opcode[160].specialization.failure_kinds[10] : 3
    opcode[160].specialization.failure_kinds[17] : 33
    opcode[160].specialization.failure_kinds[18] : 23
    opcode[160].specialization.failure_kinds[19] : 1
    opcode[160].specialization.failure_kinds[22] : 69


After:
opcode[160].specializable : 1
    opcode[160].specialization.success : 1399
    opcode[160].specialization.failure : 1107
    opcode[160].specialization.hit : 1029041
    opcode[160].specialization.deferred : 76217
    opcode[160].specialization.miss : 8717
    opcode[160].specialization.deopt : 157
    opcode[160].execution_count : 18488
    opcode[160].specialization.failure_kinds[0] : 63
    opcode[160].specialization.failure_kinds[2] : 505
    opcode[160].specialization.failure_kinds[4] : 406
    opcode[160].specialization.failure_kinds[9] : 4
    opcode[160].specialization.failure_kinds[10] : 3
    opcode[160].specialization.failure_kinds[17] : 33
    opcode[160].specialization.failure_kinds[18] : 23
    opcode[160].specialization.failure_kinds[19] : 1
    opcode[160].specialization.failure_kinds[22] : 69

@Fidget-Spinner (Member, Author) commented May 13, 2022

Wow, looks like my expectations were proven wrong by the stats again: after removing test_typing, I get a 0.18% increase in hits at the expense of 0.45% more misses. So test_typing was actually bolstering the numbers!

The part of the stdlib I've found that frequently uses this instruction is the _io module's objects. But I don't know how to get stats on those, as their tests use subprocesses.

I'm not feeling too confident about this optimization now. It seems like something that would boost our pyperformance numbers but maybe not in the real world?

./python -m test test_re test_dis test_zlib


Before:
opcode[160].specializable : 1
    opcode[160].specialization.success : 2113
    opcode[160].specialization.failure : 4126
    opcode[160].specialization.hit : 5506338
    opcode[160].specialization.deferred : 306030
    opcode[160].specialization.miss : 44980
    opcode[160].specialization.deopt : 745
    opcode[160].execution_count : 59474
    opcode[160].specialization.failure_kinds[0] : 365
    opcode[160].specialization.failure_kinds[1] : 172
    opcode[160].specialization.failure_kinds[2] : 770
    opcode[160].specialization.failure_kinds[4] : 2417
    opcode[160].specialization.failure_kinds[9] : 12
    opcode[160].specialization.failure_kinds[10] : 4
    opcode[160].specialization.failure_kinds[17] : 150
    opcode[160].specialization.failure_kinds[18] : 58
    opcode[160].specialization.failure_kinds[19] : 2
    opcode[160].specialization.failure_kinds[22] : 139
    opcode[160].specialization.failure_kinds[23] : 37



After:
opcode[160].specializable : 1
    opcode[160].specialization.success : 2127
    opcode[160].specialization.failure : 3954
    opcode[160].specialization.hit : 5516541
    opcode[160].specialization.deferred : 295625
    opcode[160].specialization.miss : 45182
    opcode[160].specialization.deopt : 748
    opcode[160].execution_count : 59676
    opcode[160].specialization.failure_kinds[0] : 365
    opcode[160].specialization.failure_kinds[2] : 770
    opcode[160].specialization.failure_kinds[4] : 2417
    opcode[160].specialization.failure_kinds[9] : 12
    opcode[160].specialization.failure_kinds[10] : 4
    opcode[160].specialization.failure_kinds[17] : 150
    opcode[160].specialization.failure_kinds[18] : 58
    opcode[160].specialization.failure_kinds[19] : 2
    opcode[160].specialization.failure_kinds[22] : 139
    opcode[160].specialization.failure_kinds[23] : 37
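As a sanity check on the quoted percentages, the deltas can be recomputed from the hit and miss counts in the dumps above (a quick sketch; the numbers are copied from the "Before" and "After" blocks):

```python
# Hit/miss counts for LOAD_METHOD (opcode 160) from the dumps above.
before = {"hit": 5_506_338, "miss": 44_980}
after = {"hit": 5_516_541, "miss": 45_182}

def pct_change(b, a):
    """Percentage change from b to a."""
    return (a - b) / b * 100

hit_delta = pct_change(before["hit"], after["hit"])
miss_delta = pct_change(before["miss"], after["miss"])
print(f"hits: {hit_delta:+.3f}%  misses: {miss_delta:+.3f}%")
# hits: +0.185%  misses: +0.449%
```

These round to the 0.18% more hits and 0.45% more misses quoted in the comment.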

@markshannon (Member) commented May 13, 2022

Generating the table is somewhat manual and hacky. I mean to automate it, but for now here's the procedure:

  1. Create a new branch and cherry-pick this commit: faster-cpython@a9c92c0
  2. Run pyperformance compile on that branch. I use this config.ini file: https://gist.github.com/markshannon/26f4e8db2b715c991eee1508f430f6b2 You will need to modify it for your machine and repo.
  3. While it is in the installing phase, create /tmp/py_stats (if needed) and clear it out: rm -r /tmp/py_stats/*
  4. About the time that the installing phase finishes and the benchmarks start, run one final rm -r /tmp/py_stats/*

The table is created by running ./python Tools/scripts/summarize_stats.py
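For the curious, the summarization step is essentially a merge of per-process counter dumps. Here is a hedged sketch of that aggregation, assuming each file in /tmp/py_stats holds "key : value" lines like the dumps above; aggregate is an invented helper for illustration, not the actual script's API:

```python
import collections
import io

def aggregate(streams):
    """Sum 'key : value' counter lines across multiple stat dumps."""
    totals = collections.Counter()
    for stream in streams:
        for line in stream:
            if ":" not in line:
                continue
            key, _, value = line.partition(":")
            totals[key.strip()] += int(value.strip())
    return totals

# Two per-process dumps being merged:
a = io.StringIO(
    "opcode[160].specialization.hit : 100\n"
    "opcode[160].specialization.miss : 2\n"
)
b = io.StringIO("opcode[160].specialization.hit : 50\n")
totals = aggregate([a, b])
print(totals["opcode[160].specialization.hit"])  # 150
```

The real summarize_stats.py additionally formats the merged counters into the markdown table shown on faster-cpython.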

Labels: awaiting core review, performance, type-feature

4 participants