gh-92041: Improve performance of inspect.getmodule #92042

mdeck · 2022-04-29T03:18:31Z

getmodule maintains a cache of module names & associated file names.
Whenever it encounters a module filename not in its cache, it iterates
over sys.modules and repopulates its cache.

An object which has no associated module filename will trigger this
repopulation loop every time getmodule is called. The cost is easily
seen with inspect.getstack

def run_test(module):
    # cols have stack depths
    # rows have len(sys.modules)

    def print_line(h, *vals):
        if not h:
            print("%5s" % h, *["%6i" % v for v in vals])
        else:
            print("%5s" % h, *["%6.1f" % v for v in vals])

    def add_modules(n):
        import random
        import sys
        for _ in range(n):
            sys.modules[f"foo_{random.randint(0,2**64)}"] = module

    def measure(depth):
        import inspect
        import timeit

        def nest(level):
            if level > 0:
                return nest(level-1)
            else:
                dur = timeit.timeit(lambda: inspect.stack(), number=1)
                return dur * 1000

        return nest(depth)

    import sys
    print_line("", *[2**e for e in range(7)])
    for i in range(1, 5):
        add_modules(n=10**i)
        times = [measure(depth=2**e) for e in range(7)]
        print_line(len(sys.modules), *times)

python 3.11.0a7+ optimized build ====

# sys has no __file__ and doesn't enter getmodule cache
>>> import sys
>>> run_test(sys)
           1      2      4      8     16     32     64
   87    2.1    0.8    1.0    1.3    2.0    3.3    6.1
  187    2.0    2.5    2.7    3.6    5.5    9.5   16.8
 1187   13.6   15.7   19.1   26.3   39.6   67.4  123.3
11187  134.1  150.3  183.4  252.0  383.5  750.3 1189.8
>>> exit()

>>> import random
>>> run_test(random)
           1      2      4      8     16     32     64
   87    2.1    0.7    0.9    1.2    1.7    2.9    5.6
  187    2.3    1.1    1.5    1.8    2.5    4.3    7.8
 1187   17.0    4.3    5.6    7.1   11.1   18.6   33.9
11187  164.6   37.4   45.7   61.6   94.7  161.0  297.7
>>> exit()

python 3.11.0a7+ optimized build, with changes ====

>>> import sys
>>> run_test(sys)
           1      2      4      8     16     32     64
   87    1.7    0.2    0.2    0.3    0.4    0.6    1.2
  187    0.2    0.2    0.2    0.2    0.3    0.5    0.9
 1187    0.2    0.2    0.2    0.3    0.3    0.5    0.9
11187    0.3    0.3    0.3    0.3    0.5    0.8    1.3
>>> exit()

>>> import random
>>> run_test(random)
           1      2      4      8     16     32     64
   87    1.8    0.2    0.2    0.3    0.4    0.6    1.2
  187    0.2    0.2    0.2    0.3    0.3    0.6    0.9
 1187    0.2    0.2    0.2    0.3    0.4    0.5    0.9
11187    0.3    0.2    0.2    0.3    0.3    0.6    1.0
>>> exit()

Issue: Slow inspect.getmodule #92041

ghost · 2022-04-29T03:18:32Z

All commit authors signed the Contributor License Agreement.

bedevere-bot · 2022-04-29T03:18:33Z

Every change to Python requires a NEWS entry.

Please, add it using the blurb_it Web app or the blurb command-line tool.

bedevere-bot · 2022-04-29T04:21:24Z

Every change to Python requires a NEWS entry.

Please, add it using the blurb_it Web app or the blurb command-line tool.

eendebakpt

Looks good. Is the new case also covered by tests?

Lib/inspect.py

mdeck · 2022-04-30T13:02:33Z

The recursion-break test suggested code objects should be distinguished by object instance rather than by their hash alone. I'm using id & hash together as best effort to uniquely identify code objects that have no module. I assume identity comparison with weakref would be the ideal solution, but the performance is O(N) rather than O(1), with N = len(sys.modules). The identity approximation seemed a reasonable trade-off to me.

mdeck · 2022-05-01T18:04:24Z

Rebased past some bad upstream that broke CI.

You can view the weakref implementation I tested here..
I'm not sure why the performance degraded so much with weakref.

Also worth noting, the tests I've shown were run by pasting into interactive console. With same test code in a module that is executed, it will run faster (though still slower than with this fix) -- inspect.stack ends up calling getmodule on each stack frame. This is slower with interactive console frames, as they are moduleless and don't get cached.

mdeck · 2022-05-01T18:16:10Z

Also it seems as though the CI has stalled since last update. Not sure if I need to make some random change and force a re-run, or if it's not running because I'm a "first-time contributor" -- the description is a bit vague. 4 expected checks, 2 workflows awaiting approval.. not sure what a 'workflow' vs a 'check' is.

mdeck · 2022-05-01T19:02:31Z

run_test shows indirect changes via inspect.stack. Here are numbers with getmodule directly. One change to test:
dur = timeit.timeit(lambda: inspect.getmodule(inspect.currentframe()), number=1)

Reminder, columns are stack depth, rows are len(sys.modules), numbers are milliseconds.

Without changes:

>>> run_test(sys)
           1      2      4      8     16     32     64
   89    2.5    0.2    0.2    0.2    0.2    0.2    0.2
  189    0.7    0.6    0.6    0.6    0.6    0.6    0.6
 1189    5.1    5.1    5.0    5.1    5.5    5.0    5.1
11189   49.0   49.3   50.0   49.3   50.6   49.5   49.7
>>>

With this PR changes:

>>> run_test(sys)
           1      2      4      8     16     32     64
   89    2.5    0.0    0.0    0.0    0.0    0.0    0.0
  189    0.0    0.0    0.0    0.0    0.0    0.0    0.0
 1189    0.0    0.0    0.0    0.0    0.0    0.0    0.0
11189    0.0    0.0    0.0    0.0    0.0    0.0    0.0
>>>

Also testing with weakref version:

>>> run_test(sys)
           1      2      4      8     16     32     64
   91    2.5    0.0    0.0    0.0    0.0    0.0    0.0
  191    0.0    0.0    0.0    0.0    0.0    0.0    0.0
 1191    0.0    0.0    0.0    0.0    0.0    0.0    0.0
11191    0.0    0.0    0.0    0.0    0.0    0.0    0.0
>>>

Interesting! getmodule performance looks equivalent with weakref! So why did I see slower inspect.stack times when using weakref? Let's revisit the inspect.stack times, using original run_test code.

With this PR changes:

>>> run_test(sys)
           1      2      4      8     16     32     64
   89    3.8    0.4    0.6    0.6    0.8    1.3    2.2
  189    0.4    0.4    0.5    0.6    0.8    1.6    2.2
 1189    0.4    0.4    0.4    0.5    1.0    1.3    2.2
11189    0.4    0.5    0.5    0.5    1.0    1.3    2.2
>>>

With weakref version:

>>> run_test(sys)
           1      2      4      8     16     32     64
   91    4.1    0.5    0.6    0.7    1.0    1.5    2.6
  191    0.7    0.7    1.0    1.0    1.2    1.8    2.8
 1191    2.9    3.0    3.5    3.1    3.5    3.9    5.0
11191   26.4   25.3   25.2   25.4   25.6   26.2   27.3
>>>

I will take a closer look at what else inspect.stack is doing, to try and explain this.

mdeck · 2022-05-01T19:25:46Z

Also the getmodule times were all 0.0; Here are those numbers in microseconds, rather than milliseonds.

With this PR changes:

>>> run_test(sys)
           1      2      4      8     16     32     64
   89 2447.6    6.2    5.5    6.1    7.1    4.4    7.3
  189    5.8    4.2    4.2    4.7    4.2    4.2    3.9
 1189    4.8    4.4    3.9    4.0    3.7    4.5    4.1
11189   15.3    4.6    4.1    4.0    4.2    4.4    4.1

With weakref impl:

>>> run_test(sys)
           1      2      4      8     16     32     64
   91 2546.2    9.6    8.0    7.3    8.7    6.8   10.8
  191    8.4    6.6    6.1    6.4    5.9    6.5    5.9
 1191    7.1    6.1    6.0    5.8    5.9    5.9   12.9
11191   22.6    7.4    6.6    6.2    5.9    6.2    6.0

At higher module counts, either implementation uses less microseconds than the current implementation in milliseconds..

mdeck · 2022-05-01T21:47:20Z

The docs indicate weakref doesn't work with code objects. They appear to work with simple tests in interactive console. However, debugging the weakref implementation I linked earlier, the _moduleless cache fills with dead weakrefs indexed by the same code object id(). The current use of id ^ hash seems optimal.

I have previously explored defining _moduleless as an LRU cache. When testing I got the sense getmodule returning None was rare, so the LRU complexity was unnecessary. I am curious if that was a safe assumption - perhaps there are use cases that involve dynamic generated code objects being queried. I may explore that again to see the additional complexity & cost. Interested in any thoughts here.

mdeck · 2022-05-03T06:02:00Z

Hmm, looks like I need a maintainer to approve running tests. I can't seem to trigger them despite pressing all the buttons.

eendebakpt · 2022-11-12T09:45:24Z

Hmm, looks like I need a maintainer to approve running tests. I can't seem to trigger them despite pressing all the buttons.

@mdeck could you merge/rebase to current main? Perhaps that will trigger the tests to build

Misc/NEWS.d/next/Library/2022-04-29-04-24-06.gh-issue-92041.B9JTOl.rst

Lib/inspect.py

bdraco · 2023-06-01T23:36:15Z

Home Assistant startup has 7,184,347 calls to ismodule at startup from this function.

bedevere-app · 2024-02-12T08:42:19Z

Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool.

If this change has little impact on Python users, wait for a maintainer to apply the skip news label instead.

mdeck · 2024-02-12T09:22:44Z

Here, sys._getframe() is moduleless, and is cached:

$ git checkout main
Switched to branch 'main'
Your branch is up to date with 'upstream/main'.
$ ./python -m timeit -s 'import inspect, sys' 'inspect.getmodule(sys._getframe())'
2000 loops, best of 5: 115 usec per loop

$ git checkout fix_inspect
Switched to branch 'fix_inspect'
Your branch is up to date with 'origin/fix_inspect'.
$ ./python -m timeit -s 'import inspect, sys' 'inspect.getmodule(sys._getframe())'
100000 loops, best of 5: 3.36 usec per loop

inspect.stack() durations with various len(sys.modules) and call stack depths. We use -c to ensure a moduleless frame is in call stack (initial import test3 is <string> frame).

$ git checkout main
Switched to branch 'main'
Your branch is up to date with 'upstream/main'.
$ ./python -c 'import test3'
                        2         8        64  <- stack depth
            100   8110 us   2662 us   6045 us
           1000  72495 us  11043 us  12711 us
          10000 726973 us  96112 us 100057 us
len sys.modules
                        2         8        64  <- stack depth
            100   2433 us   2335 us   4473 us
           1000   9680 us  10715 us  13248 us
          10000  92595 us  89032 us  95080 us
len sys.modules
resource.struct_rusage(ru_utime=1.451359, ru_stime=0.04727, ru_maxrss=16676, ru_ixrss=0, ru_idrss=0, ru_isrss=0, ru_minflt=2974, ru_majflt=0, ru_nswap=0, ru_inblock=0, ru_oublock=72, ru_msgsnd=0, ru_msgrcv=0, ru_nsignals=0, ru_nvcsw=0, ru_nivcsw=3)

$ git checkout fix_inspect
Switched to branch 'fix_inspect'
Your branch is up to date with 'origin/fix_inspect'.
$ ./python -c 'import test3'
len moduleless: 0
                        2         8        64  <- stack depth
            100   8015 us   1117 us   3504 us
           1000    907 us   1066 us   3952 us
          10000   1833 us   1137 us   4820 us
len sys.modules
len moduleless: 6
                        2         8        64  <- stack depth
            100   1529 us   1083 us   3530 us
           1000   1073 us   1580 us   4290 us
          10000   1155 us   1106 us   4663 us
len sys.modules
resource.struct_rusage(ru_utime=0.220348, ru_stime=0.009914, ru_maxrss=18916, ru_ixrss=0, ru_idrss=0, ru_isrss=0, ru_minflt=3466, ru_majflt=0, ru_nswap=0, ru_inblock=0, ru_oublock=272, ru_msgsnd=0, ru_msgrcv=0, ru_nsignals=0, ru_nvcsw=0, ru_nivcsw=0)

# test3.py
import gc
import inspect
import random
import time
import sys 

def run_test(module):
    def print_line(h, *vals):
        if not h:
            print("%15s" % h, *["%9i" % v for v in vals], " <- stack depth")
        else:
            print("%15s" % h, *["%6.0f us" % v for v in vals])
    #
    len_sys_modules = [100, 1_000, 10_000]
    def add_modules(n):
        while len(sys.modules) < n:
            sys.modules[f"foo_{random.randint(0,2**64)}"] = module
    #
    depths = [2, 8, 64]
    def nest(depth):
        if depth > 0:
            return nest(depth-1)
        t = time.time()
        _ = inspect.stack()
        dur = time.time() - t
        return dur * 1000 * 1000
    #
    gc.collect()
    if hasattr(inspect, "_moduleless"):
        print(f"len moduleless: {len(inspect._moduleless)}")
        inspect._moduleless.clear()
    #
    orig_modules = sys.modules.copy()
    print_line("", *depths)
    for len_modules in len_sys_modules:
        add_modules(n=len_modules)
        times = [nest(depth) for depth in depths]
        print_line(len(sys.modules), *times)
        sys.modules = orig_modules.copy()
    print("len sys.modules")

run_test(random)
run_test(sys)

import resource
print(resource.getrusage(resource.RUSAGE_SELF))

bedevere-bot added the awaiting review label Apr 29, 2022

mdeck changed the title ~~Improve performance of inspect.getmodule~~ gh-92041: Improve performance of inspect.getmodule Apr 29, 2022

mdeck force-pushed the fix_inspect branch from a2d2f08 to 074dce7 Compare April 29, 2022 04:21

eendebakpt reviewed Apr 29, 2022

View reviewed changes

Lib/inspect.py Outdated Show resolved Hide resolved

eendebakpt reviewed Apr 29, 2022

View reviewed changes

Lib/inspect.py Show resolved Hide resolved

eendebakpt approved these changes Apr 29, 2022

View reviewed changes

bedevere-bot added awaiting core review and removed awaiting review labels Apr 29, 2022

mdeck force-pushed the fix_inspect branch 6 times, most recently from 9bbf165 to 725a326 Compare April 30, 2022 18:26

mdeck requested review from berkerpeksag, pablogsal, lysnikolaou and a team as code owners April 30, 2022 18:26

mdeck marked this pull request as draft April 30, 2022 18:43

mdeck force-pushed the fix_inspect branch from 725a326 to 91c8363 Compare April 30, 2022 20:26

mdeck marked this pull request as ready for review April 30, 2022 20:49

mdeck force-pushed the fix_inspect branch from 91c8363 to 1b9a95a Compare May 3, 2022 05:04

mdeck closed this May 3, 2022

mdeck reopened this May 3, 2022

mdeck closed this May 3, 2022

mdeck reopened this May 3, 2022

AlexWaygood added performance Performance or resource usage stdlib Python modules in the Lib dir labels May 4, 2022

pablogsal removed their request for review May 5, 2022 10:51

mdeck force-pushed the fix_inspect branch from 11520f1 to 86299d7 Compare May 7, 2022 02:34

mdeck mentioned this pull request May 8, 2022

Slow inspect.getmodule #92041

Open

hauntsaninja reviewed Dec 22, 2022

View reviewed changes

Misc/NEWS.d/next/Library/2022-04-29-04-24-06.gh-issue-92041.B9JTOl.rst Outdated Show resolved Hide resolved

hauntsaninja reviewed Feb 20, 2023

View reviewed changes

Lib/inspect.py Show resolved Hide resolved

mwarkentin mentioned this pull request Jan 26, 2024

ref: replace inspect.stack() with a more performant iteration of the current stack getsentry/sentry#63965

Merged

Add lru cache for moduleless objects

7ed9351

mdeck force-pushed the fix_inspect branch from af2d609 to 7ed9351 Compare February 12, 2024 08:42

📜🤖 Added by blurb_it.

f727217

bdraco mentioned this pull request Feb 15, 2024

SkyConnect zha discovery blocks the event loop for over 1s home-assistant/core#110660

Closed

bdraco mentioned this pull request Mar 2, 2024

Avoid trying to import platforms that do not exist home-assistant/core#112028

Merged

20 tasks

Uh oh!

gh-92041: Improve performance of inspect.getmodule #92042

Are you sure you want to change the base?

gh-92041: Improve performance of inspect.getmodule #92042

Conversation

mdeck commented Apr 29, 2022 • edited by bedevere-bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ghost commented Apr 29, 2022 • edited by ghost Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

bedevere-bot commented Apr 29, 2022

Uh oh!

bedevere-bot commented Apr 29, 2022

Uh oh!

eendebakpt left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

mdeck commented Apr 30, 2022

Uh oh!

mdeck commented May 1, 2022

Uh oh!

mdeck commented May 1, 2022

Uh oh!

mdeck commented May 1, 2022

Uh oh!

mdeck commented May 1, 2022

Uh oh!

mdeck commented May 1, 2022

Uh oh!

mdeck commented May 3, 2022

Uh oh!

eendebakpt commented Nov 12, 2022

Uh oh!

Uh oh!

Uh oh!

bdraco commented Jun 1, 2023

Uh oh!

bedevere-app bot commented Feb 12, 2024

Uh oh!

mdeck commented Feb 12, 2024

Uh oh!

Uh oh!

mdeck commented Apr 29, 2022 •

edited by bedevere-bot

Loading

ghost commented Apr 29, 2022 •

edited by ghost

Loading