Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpo-33587: inspect.getsource: reorder stat on file in linecache #6805

Merged
merged 4 commits into from Aug 26, 2022

Conversation

pankajp
Copy link
Contributor

@pankajp pankajp commented May 14, 2018

The check for os.path.exists() on source file is postponed in
inspect.getsourcefile() until needed avoiding an expensive filesystem stat call.

https://bugs.python.org/issue33587

@pankajp
Copy link
Contributor Author

pankajp commented May 14, 2018

This looks like a simple change to me. If I were to create a bug on bpo, what should it mention? Would something like "inspect.getsource() does unnecessary filesystem stat when file is already in linecache" suffice?

@matrixise
Copy link
Member

matrixise commented May 14, 2018

Hi @pankajp

Thank you for your Pull Requests

but we need an issue from bpo and an explanation about your change.

Thank you,

@serhiy-storchaka
Copy link
Member

serhiy-storchaka commented May 14, 2018

getmodule() is a complex function, it can perform multiple calls of os.path.realpath() which can perform multiple expensive filesystem stat calls.

Please open an issue on the bug tracker for discussion and provide your benchmark results.

@pankajp pankajp changed the title inspect.getsource: reorder stat on file in linecache bpo-33587: inspect.getsource: reorder stat on file in linecache May 20, 2018
@pankajp
Copy link
Contributor Author

pankajp commented May 20, 2018

Thanks for your advise @matrixise and @serhiy-storchaka . I have added a bug https://bugs.python.org/issue33587 and posted a cProfile performance comparison there. Which is the canonical place to discuss the patch (performance numbers and other things), over here at github or at bpo?

Here's the patch performance difference before and after the patch:

test_code.py:

import inspect


class A():
    def __init__(self):
        self.stack = inspect.stack()
        self.children = []

    def add_child(self, cls):
        child = cls()
        self.children.append(child)
        return child

class B(A):
    pass

class C(A):
    pass

node_classes = [A, B, C]

def load_nodes(N):
    root = A()
    for i in range(N):
        child_i = root.add_child(node_classes[i%len(node_classes)])
        for j in range(N):
            child_j = child_i.add_child(node_classes[(i*j)%len(node_classes)])

load_nodes(20)

Before:

Sun May 20 21:42:32 2018    prof1.stat

         1188991 function calls (1188851 primitive calls) in 4.821 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     18/1    0.006    0.000    4.824    4.824 {built-in method builtins.exec}
        1    0.000    0.000    4.824    4.824 inspect_stack_perf.py:1(<module>)
        1    0.001    0.001    4.790    4.790 inspect_stack_perf.py:22(load_nodes)
      421    0.001    0.000    4.787    0.011 inspect_stack_perf.py:5(__init__)
      421    0.001    0.000    4.786    0.011 C:\Users\ppandey\AppData\Local\Continuum\Anaconda3\lib\inspect.py:1492(stack)
      421    0.011    0.000    4.785    0.011 C:\Users\ppandey\AppData\Local\Continuum\Anaconda3\lib\inspect.py:1464(getouterframes)
     4630    0.031    0.000    4.770    0.001 C:\Users\ppandey\AppData\Local\Continuum\Anaconda3\lib\inspect.py:1425(getframeinfo)
      420    0.001    0.000    4.739    0.011 inspect_stack_perf.py:9(add_child)
    13994    4.159    0.000    4.159    0.000 {built-in method nt.stat}
     4630    0.042    0.000    3.223    0.001 C:\Users\ppandey\AppData\Local\Continuum\Anaconda3\lib\inspect.py:760(findsource)
     9322    0.043    0.000    2.960    0.000 C:\Users\ppandey\AppData\Local\Continuum\Anaconda3\lib\inspect.py:680(getsourcefile)
     9322    0.011    0.000    2.832    0.000 C:\Users\ppandey\AppData\Local\Continuum\Anaconda3\lib\genericpath.py:16(exists)
     4630    0.016    0.000    1.339    0.000 C:\Users\ppandey\AppData\Local\Continuum\Anaconda3\lib\linecache.py:53(checkcache)
     4630    0.097    0.000    0.364    0.000 C:\Users\ppandey\AppData\Local\Continuum\Anaconda3\lib\inspect.py:714(getmodule)
     4692    0.008    0.000    0.135    0.000 C:\Users\ppandey\AppData\Local\Continuum\Anaconda3\lib\inspect.py:702(getabsfile)
     4754    0.008    0.000    0.091    0.000 C:\Users\ppandey\AppData\Local\Continuum\Anaconda3\lib\ntpath.py:538(abspath)
     4754    0.036    0.000    0.074    0.000 C:\Users\ppandey\AppData\Local\Continuum\Anaconda3\lib\ntpath.py:471(normpath)
   162600    0.071    0.000    0.071    0.000 {built-in method builtins.hasattr}
...

After the patch:

Sun May 20 21:39:44 2018    prof2.stat

         2639991 function calls (2639727 primitive calls) in 2.841 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     18/1    0.005    0.000    2.844    2.844 {built-in method builtins.exec}
        1    0.000    0.000    2.844    2.844 inspect_stack_perf.py:1(<module>)
        1    0.001    0.001    2.814    2.814 inspect_stack_perf.py:22(load_nodes)
      421    0.001    0.000    2.812    0.007 inspect_stack_perf.py:5(__init__)
      421    0.001    0.000    2.811    0.007 C:\Users\ppandey\AppData\Local\Continuum\Anaconda3\lib\inspect.py:1492(stack)
      421    0.010    0.000    2.810    0.007 C:\Users\ppandey\AppData\Local\Continuum\Anaconda3\lib\inspect.py:1464(getouterframes)
      420    0.001    0.000    2.802    0.007 inspect_stack_perf.py:9(add_child)
     4630    0.029    0.000    2.795    0.001 C:\Users\ppandey\AppData\Local\Continuum\Anaconda3\lib\inspect.py:1425(getframeinfo)
     4630    0.040    0.000    2.380    0.001 C:\Users\ppandey\AppData\Local\Continuum\Anaconda3\lib\inspect.py:760(findsource)
     4674    1.631    0.000    1.631    0.000 {built-in method nt.stat}
     4630    0.014    0.000    1.630    0.000 C:\Users\ppandey\AppData\Local\Continuum\Anaconda3\lib\linecache.py:53(checkcache)
13952/13890    0.281    0.000    0.907    0.000 C:\Users\ppandey\AppData\Local\Continuum\Anaconda3\lib\inspect.py:714(getmodule)
9322/9260    0.038    0.000    0.703    0.000 C:\Users\ppandey\AppData\Local\Continuum\Anaconda3\lib\inspect.py:680(getsourcefile)
    13952    0.018    0.000    0.259    0.000 C:\Users\ppandey\AppData\Local\Continuum\Anaconda3\lib\inspect.py:702(getabsfile)
    14014    0.017    0.000    0.215    0.000 C:\Users\ppandey\AppData\Local\Continuum\Anaconda3\lib\ntpath.py:538(abspath)
   478166    0.177    0.000    0.177    0.000 {built-in method builtins.hasattr}
...

Total runtime reduced from 4.8 s to 2.8 s, and the major difference can be seen in the nt.stat call. This is on a machine with SSD. In workplace where python is installed in cluster on nfs, I'm sure the performance is worse.

@taleinat
Copy link
Contributor

taleinat commented Jun 10, 2018

@pankajp, the place to discuss proposed changes and bugs is indeed the issue tracker.

The check for os.path.exists() on source file is postponed in
inspect.getsourcefile() until needed avoiding an expensive filesystem
stat call and PEP 302 module loader check is moved last for performance
since it is an uncommon case.
Copy link
Contributor

@taleinat taleinat left a comment

This LGTM.

@iritkatriel
Copy link
Member

iritkatriel commented May 21, 2021

LGTM, but the merge conflicts need to be resolved.

Lib/inspect.py Outdated Show resolved Hide resolved
@mdboom
Copy link
Contributor

mdboom commented Aug 5, 2022

@pankajp: While I understand it's been a long time, it looks still relevant. Are you interested in rebasing this and re-running the benchmarks to get this merged?

@iritkatriel iritkatriel merged commit c1581a9 into python:main Aug 26, 2022
12 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Performance or resource usage skip news
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet