
Allow the linux perf profiler to see Python calls #96143

Open
pablogsal opened this issue Aug 20, 2022 · 5 comments
Labels: 3.12, type-feature (A feature request or enhancement)

Comments

@pablogsal
Member

pablogsal commented Aug 20, 2022

The Linux perf profiler is a very powerful tool, but unfortunately it is not able to see Python calls (only the C stack), and therefore neither it nor its very complete ecosystem can be used to profile Python applications and extensions.

It turns out that Node and the JVM have developed a way to leverage the perf profiler for Java and JavaScript frames. They use their JIT compilers to generate a unique area in memory where they place assembly code that in turn calls the frame evaluator function. These JIT-compiled areas are unique per function/code object. They then use perf maps: perf allows a process to place a map file at /tmp/perf-PID.map with information mapping the JIT-ed areas to a string that identifies them, which lets perf map Java/JavaScript names to those areas, basically showing the non-native function names on the stack.
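For reference, the map file perf reads is plain text, one line per JIT-ed region in the form `START SIZE name` with START and SIZE in hex. A hypothetical `/tmp/perf-1234.map` (the addresses and file names below are made up, using the `py::` naming the Python version adopts):

```
7f3a5c000000 b0 py::fibonacci:/home/user/fib.py
7f3a5c0000b0 b0 py::main:/home/user/fib.py
```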

We can do a simple version of this idea in Python by using a very simple JIT compiler that compiles an assembly template, which is then used to jump to PyEval_EvalFrameDefault, and by placing the code object names and filenames in the special perf map file. This allows perf to see Python calls as well:

[image: perf_names]
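To make the bookkeeping concrete, here is a minimal Python sketch of the map-writing side, assuming the trampolines are generated elsewhere (the real implementation lives in C inside the interpreter; `open_perf_map`, `write_map_entry`, `trampoline_addr` and `trampoline_size` are hypothetical names for illustration):

```python
import os

def open_perf_map():
    # perf looks for /tmp/perf-<PID>.map when symbolizing JIT-ed regions
    return open(f"/tmp/perf-{os.getpid()}.map", "a")

def write_map_entry(map_file, trampoline_addr, trampoline_size, code):
    # One "START SIZE name" line per trampoline; START and SIZE in hex.
    # The py:: prefix makes the Python frames easy to spot (and filter).
    name = f"py::{code.co_qualname}:{code.co_filename}"
    map_file.write(f"{trampoline_addr:x} {trampoline_size:x} {name}\n")
```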

And this works with all the tools in the perf ecosystem, like flamegraphs:

[image: perf_flame]

See also:
https://www.brendangregg.com/Slides/KernelRecipes_Perf_Events.pdf

@pablogsal
Member Author

pablogsal commented Aug 21, 2022

It is also very easy to transform these into Python-only flamegraphs by filtering on the py:: prefix:
[image: perf]
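A minimal sketch of such a filter, assuming the usual flamegraph pipeline (`perf script | stackcollapse-perf.pl | <filter> | flamegraph.pl`) and its collapsed `frame;frame;frame count` line format:

```python
import sys

# Keep only frames carrying the py:: prefix written into the perf map,
# dropping native frames so the resulting flamegraph is Python-only.
for line in sys.stdin:
    stack, _, count = line.rpartition(" ")
    frames = [f for f in stack.split(";") if f.startswith("py::")]
    if frames:
        print(f"{';'.join(frames)} {count.strip()}")
```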

miss-islington pushed a commit that referenced this issue Aug 30, 2022
⚠️  ⚠️ Note for reviewers, hackers and fellow systems/low-level/compiler engineers ⚠️ ⚠️ 

If you have a lot of experience with this kind of shenanigans and want to improve the **first** version, **please make a PR against my branch** or **reach out by email** or **suggest code changes directly on GitHub**. 

If you have any **refinements or optimizations**, please wait until the first version is merged before hacking on or proposing them, so we can keep this PR productive.
pablogsal added a commit that referenced this issue Aug 30, 2022
…#96433)

* gh-96132: Add some comments and minor fixes missed in the original PR

* Update Doc/using/cmdline.rst

Co-authored-by: Kumar Aditya <59607654+kumaraditya303@users.noreply.github.com>

erlend-aasland added a commit to erlend-aasland/cpython that referenced this issue Aug 30, 2022
@gpshead
Member

gpshead commented Aug 31, 2022

  • A Linux buildbot with PYTHONPERFSUPPORT=1 and the relevant CFLAGS=-fno-omit-frame-pointer needs to be set up.
  • Something needs to garbage collect its /tmp/perf-$pid.map files as well.

miss-islington pushed a commit that referenced this issue Sep 1, 2022
minor missed test cleanup to use the modern API from the big review.

Automerge-Triggered-By: GH:gpshead
@pablogsal
Member Author

pablogsal commented Sep 1, 2022

Something needs to garbage collect its /tmp/perf-$pid.map files as well.

That's really up to the user, unfortunately. The files must be available after the process finishes and at report time, so I don't see how we can clean these files up automatically: we don't know when the user is finished with them.


Or do you mean in the buildbot? In that case, the tests already delete the files they create, so they should not pollute the machine and won't pile up on the buildbots.

@gpshead
Member

gpshead commented Sep 2, 2022

Or do you mean in the buildbot? In that case, the tests already delete the files they create, so they should not pollute the machine and won't pile up on the buildbots.

Yes, I was talking about the desired buildbot config. A tmpwatcher of some form, set to remove tmp files more than a few hours old, is likely sufficient. With the environment variable set to enable perf everywhere, a single test run probably has hundreds if not thousands of PIDs. ;)

(This is where the buildbot design really shows age. A fresh container per buildbot worker test session would make sense.)
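For illustration, a minimal sketch of the kind of tmpwatcher described above, assuming a periodic (cron-style) job; the three-hour threshold is an arbitrary choice:

```python
import glob
import os
import time

MAX_AGE = 3 * 60 * 60  # assumed threshold: three hours

now = time.time()
for path in glob.glob("/tmp/perf-*.map"):
    try:
        if now - os.stat(path).st_mtime > MAX_AGE:
            os.remove(path)  # stale map from a long-finished process
    except FileNotFoundError:
        pass  # a parallel cleanup or the owner removed it first
```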

@pablogsal
Member Author

pablogsal commented Sep 2, 2022

Yes, I was talking about the desired buildbot config. A tmpwatcher of some form, set to remove tmp files more than a few hours old, is likely sufficient. With the environment variable set to enable perf everywhere, a single test run probably has hundreds if not thousands of PIDs. ;)

Ah, in that case I don't think it's needed. The tests check the perf files before and after running and delete any new file that matches a PID spawned during the tests. I will revise this logic to ensure it works correctly when running parallel test suites, but I think that should be enough.
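A hedged sketch of that before/after bookkeeping (the real logic lives in the CPython test suite; `perf_map_snapshot` is a made-up helper for illustration):

```python
import glob
import os

def perf_map_snapshot():
    # every perf map file currently present in /tmp
    return set(glob.glob("/tmp/perf-*.map"))

before = perf_map_snapshot()
# ... run the perf-support tests here, spawning child interpreters ...
for leftover in perf_map_snapshot() - before:
    os.remove(leftover)  # delete only the maps created during the run
```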
