bpo-37958: Adding get_profile_dict to pstats #15495

Olshansk · 2019-08-26T00:35:48Z

pstats is really useful for profiling and printing the output of the execution of some block of code, but on multiple occasions I've wanted to access this output directly as an easily usable dictionary that I can further analyze or manipulate.

The proposal is to add a function called get_profile_dict inside of pstats that'll automatically return this data in an easily accessible dict.

The output of the following script:

import cProfile, pstats
import pprint
from pstats import f8

def fib(n):
    if n == 0:
        return 0
    if n == 1:
        return 1
    return fib(n-1) + fib(n-2)

pr = cProfile.Profile()
pr.enable()
fib(5)
pr.create_stats()

ps = pstats.Stats(pr).sort_stats('tottime', 'cumtime')

def get_profile_dict(self, keys_filter=None):
    """
        Returns a dict where the key is a function name and the value is a dict
        with the following keys:
            - ncalls
            - tottime
            - percall_tottime
            - cumtime
            - percall_cumtime
            - file_name
            - line_number

        keys_filter can be optionally set to limit the key-value pairs in the
        retrieved dict.
    """
    pstats_dict = {}
    func_list = self.fcn_list[:] if self.fcn_list else list(self.stats.keys())

    if not func_list:
        return pstats_dict

    pstats_dict["total_tt"] = float(f8(self.total_tt))
    for func in func_list:
        cc, nc, tt, ct, callers = self.stats[func]
        file, line, func_name = func
        ncalls = str(nc) if nc == cc else (str(nc) + '/' + str(cc))
        tottime = float(f8(tt))
        percall_tottime = -1 if nc == 0 else float(f8(tt/nc))
        cumtime = float(f8(ct))
        percall_cumtime = -1 if cc == 0 else float(f8(ct/cc))
        func_dict = {
            "ncalls": ncalls,
            "tottime": tottime,  # time spent in this function alone
            "percall_tottime": percall_tottime,
            "cumtime": cumtime,  # time spent in this function plus everything it called
            "percall_cumtime": percall_cumtime,
            "file_name": file,
            "line_number": line
        }
        func_dict_filtered = func_dict if not keys_filter else { key: func_dict[key] for key in keys_filter }
        pstats_dict[func_name] = func_dict_filtered

    return pstats_dict

pp = pprint.PrettyPrinter(depth=6)
pp.pprint(get_profile_dict(ps))

will produce:

{"<method 'disable' of '_lsprof.Profiler' objects>": {'cumtime': 0.0,
                                                      'file_name': '~',
                                                      'line_number': 0,
                                                      'ncalls': '1',
                                                      'percall_cumtime': 0.0,
                                                      'percall_tottime': 0.0,
                                                      'tottime': 0.0},
 'create_stats': {'cumtime': 0.0,
                  'file_name': '/usr/local/Cellar/python/3.7.4/Frameworks/Python.framework/Versions/3.7/lib/python3.7/cProfile.py',
                  'line_number': 50,
                  'ncalls': '1',
                  'percall_cumtime': 0.0,
                  'percall_tottime': 0.0,
                  'tottime': 0.0},
 'fib': {'cumtime': 0.0,
         'file_name': 'get_profile_dict.py',
         'line_number': 5,
         'ncalls': '15/1',
         'percall_cumtime': 0.0,
         'percall_tottime': 0.0,
         'tottime': 0.0},
 'total_tt': 0.0}

As an example, this can be used to generate a stacked column chart using various visualization tools which will assist in easily identifying program bottlenecks.
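A minimal sketch of that kind of downstream use, assuming the dict shape shown in the output above. The timing values in `sample` are invented for illustration (the real run above rounds everything to 0.0), and both `top_by` and the `helper` entry are hypothetical, not part of pstats. Note that `total_tt` sits alongside the per-function entries, so it has to be skipped when iterating.

```python
def top_by(profile, key="cumtime", n=3):
    """Return the n (function name, value) pairs with the largest `key`."""
    rows = [
        (name, entry[key])
        for name, entry in profile.items()
        if isinstance(entry, dict)  # skips the top-level 'total_tt' float
    ]
    return sorted(rows, key=lambda row: row[1], reverse=True)[:n]


# Invented numbers, shaped like the output of get_profile_dict above.
sample = {
    "total_tt": 0.012,
    "fib": {"ncalls": "15/1", "tottime": 0.009, "cumtime": 0.009},
    "create_stats": {"ncalls": "1", "tottime": 0.002, "cumtime": 0.003},
    "helper": {"ncalls": "4", "tottime": 0.001, "cumtime": 0.001},
}

ranking = top_by(sample, key="cumtime", n=2)
for name, value in ranking:
    # Crude text bar: share of total time, scaled to 20 characters.
    share = value / sample["total_tt"]
    print(f"{name:<15}{value:>8.3f}s {'#' * round(share * 20)}")
```

The same `rows` list could be handed to a charting library instead of being printed.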

https://bugs.python.org/issue37958

Automerge-Triggered-By: @gpshead


        Adding get_profile_dict to pstats

the-knights-who-say-ni · 2019-08-26T00:35:50Z

Hello, and thanks for your contribution!

I'm a bot set up to make sure that the project can legally accept your contribution by verifying you have signed the PSF contributor agreement (CLA).

Unfortunately we couldn't find an account corresponding to your GitHub username on bugs.python.org (b.p.o) to verify you have signed the CLA (this might be simply due to a missing "GitHub Name" entry in your b.p.o account settings). This is necessary for legal reasons before we can look at your contribution. Please follow the steps outlined in the CPython devguide to rectify this issue.

You can check yourself to see if the CLA has been received.

Thanks again for your contribution, we look forward to reviewing it!

pablogsal · 2019-08-26T22:37:33Z

Hi @Olshansk and thank you for opening a PR. Before we can review, could you:

- Open an issue on the Python issue tracker so we can discuss the feature (https://bugs.python.org/)
- Sign the CLA (check out @the-knights-who-say-ni's comment).


        📜🤖 Added by blurb_it.

Olshansk · 2019-08-27T03:58:01Z

@the-knights-who-say-ni I've signed the CLA

@pablogsal I've created https://bugs.python.org/issue37958, but I'm not sure if I filled everything out correctly. I also mentioned it in the title and created a blurb (Olshansk@2bdb8ab)

Olshansk · 2019-08-29T17:28:58Z

@pablogsal I was wondering whom I should message to get some feedback about this idea?


        Cleaned up the code so it's potentially prod ready


        Fix documentation

gpshead

This needs unittest coverage (see Lib/test/test_pstats.py).

bedevere-bot · 2019-09-25T22:01:48Z

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase "I have made the requested changes; please review again". I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

Misc/NEWS.d/next/Library/2019-08-27-03-57-25.bpo-37958.lRORI3.rst

Lib/pstats.py

WIP


        Reply to all comments

- Move to using dataclasses
- Add comments/documentation
- Add tests
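As a rough illustration of what "move to using dataclasses" could mean for the return value, here is a sketch that replaces the nested dicts from the description with typed records. The class names, field layout, and the `build_profile` helper are assumptions made for this example; the PR's actual diff is not shown in this thread. Unlike the original script, the sketch keeps raw floats rather than rounding through `f8`, and uses 0.0 rather than -1 when a call count is zero.

```python
import cProfile
import pstats
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class FunctionProfile:
    ncalls: str  # "total/primitive" when the function recursed
    tottime: float
    percall_tottime: float
    cumtime: float
    percall_cumtime: float
    file_name: str
    line_number: int


@dataclass
class StatsProfile:
    total_tt: float
    func_profiles: Dict[str, FunctionProfile] = field(default_factory=dict)


def build_profile(stats: pstats.Stats) -> StatsProfile:
    """Hypothetical helper: convert Stats.stats into dataclass records."""
    profile = StatsProfile(total_tt=stats.total_tt)
    for (file_name, line_number, func_name), row in stats.stats.items():
        cc, nc, tt, ct, _callers = row  # primitive calls, total calls, times
        profile.func_profiles[func_name] = FunctionProfile(
            ncalls=str(nc) if nc == cc else f"{nc}/{cc}",
            tottime=tt,
            percall_tottime=tt / nc if nc else 0.0,
            cumtime=ct,
            percall_cumtime=ct / cc if cc else 0.0,
            file_name=file_name,
            line_number=line_number,
        )
    return profile


def fib(n):
    return n if n < 2 else fib(n - 1) + fib(n - 2)


pr = cProfile.Profile()
pr.enable()
fib(5)
pr.disable()

profile = build_profile(pstats.Stats(pr))
print(profile.func_profiles["fib"].ncalls)
```

Attribute access (`profile.func_profiles["fib"].cumtime`) gives editors and type checkers something to work with, and `total_tt` no longer shares a namespace with function names. Keying by bare function name still collapses same-named functions from different files, as in the dict version.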

Olshansk · 2019-12-08T23:48:46Z

I have made the requested changes; please review again.

Sorry for being MIA and taking so long to follow up on this, but I've replied to all the comments and would appreciate it if you took another look!

bedevere-bot · 2019-12-08T23:48:49Z

Thanks for making the requested changes!

@gpshead: please review the changes made to this pull request.

Olshansk · 2019-12-29T14:20:23Z

@gpshead Just wanted to follow up on this.

Lib/test/test_pstats.py

Lib/pstats.py

bedevere-bot · 2020-01-08T23:31:51Z

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase "I have made the requested changes; please review again". I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.


        Follow up on comments

- Return correct default value
- Make tests output better errors with assertIn
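For context on the assertIn point, a small sketch of why it gives better failure output than a bare truth check. The test class and the stand-in data are hypothetical and are not the tests added to Lib/test/test_pstats.py.

```python
import unittest


class ProfileKeysTest(unittest.TestCase):
    def setUp(self):
        # Stand-in for a profile result; the keys are illustrative.
        self.func_profiles = {"fib": object(), "create_stats": object()}

    def test_contains_function(self):
        # On failure, assertIn names the missing member and the container
        # ("'fib' not found in {...}"), whereas
        # assertTrue("fib" in self.func_profiles) only reports
        # "False is not true".
        self.assertIn("fib", self.func_profiles)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(ProfileKeysTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```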

Olshansk · 2020-01-09T03:15:46Z

I have made the requested changes; please review again.

bedevere-bot · 2020-01-09T03:15:50Z

Thanks for making the requested changes!

@gpshead: please review the changes made to this pull request.


        adjust news entry wording


        fix my news typo :)

miss-islington · 2020-01-15T22:47:25Z

@Olshansk: Status check is done, and it's a failure ❌ .

miss-islington · 2020-01-15T22:51:52Z

@Olshansk: Status check is done, and it's a success ✅ .

Olshansk · 2020-01-16T17:28:33Z

This is awesome. Thanks a lot, @gpshead, for helping me get this in!


        bpo-37958: Adding get_profile_dict to pstats (pythonGH-15495)

Adding get_profile_dict to pstats

a7c7d49

the-knights-who-say-ni added the CLA not signed label Aug 26, 2019

bedevere-bot added the awaiting review label Aug 26, 2019

the-knights-who-say-ni added CLA signed and removed CLA not signed labels Aug 27, 2019

Olshansk changed the title ~~Adding get_profile_dict to pstats~~ bpo37958 - Adding get_profile_dict to pstats Aug 27, 2019

📜🤖 Added by blurb_it.

2bdb8ab

Olshansk changed the title ~~bpo37958 - Adding get_profile_dict to pstats~~ bpo-37958: Adding get_profile_dict to pstats Sep 23, 2019

Daniel Olshansky added 2 commits Sep 23, 2019

Cleaned up the code so it's potentially prod ready

abacfed

Fix documentation

858c4cd

gpshead requested changes Sep 25, 2019

bedevere-bot removed the awaiting review label Sep 25, 2019

bedevere-bot added the awaiting changes label Sep 25, 2019

Misc/NEWS.d/next/Library/2019-08-27-03-57-25.bpo-37958.lRORI3.rst

Lib/pstats.py

Lib/pstats.py

Daniel Olshansky and others added 2 commits Dec 8, 2019

WIP

448a1e5

Reply to all comments

21d3002

- Move to using dataclasses
- Add comments/documentation
- Add tests

bedevere-bot added awaiting change review and removed awaiting changes labels Dec 8, 2019

bedevere-bot requested a review from gpshead Dec 8, 2019

gpshead requested changes Jan 8, 2020

Lib/test/test_pstats.py

Lib/pstats.py

bedevere-bot removed the awaiting change review label Jan 8, 2020

bedevere-bot added the awaiting changes label Jan 8, 2020

gpshead self-assigned this Jan 8, 2020

Follow up on comments

04792f4

- Return correct default value
- Make tests output better errors with assertIn

bedevere-bot added awaiting change review and removed awaiting changes labels Jan 9, 2020

bedevere-bot requested a review from gpshead Jan 9, 2020

gpshead added 2 commits Jan 15, 2020

adjust news entry wording

b0a8bab

fix my news typo :)

4818b0e

gpshead approved these changes Jan 15, 2020

bedevere-bot added awaiting merge and removed awaiting change review labels Jan 15, 2020

gpshead added the 🤖 automerge label Jan 15, 2020

bedevere-bot removed the awaiting merge label Jan 15, 2020
