Join GitHub today
GitHub is home to over 40 million developers working together to host and review code, manage projects, and build software together.
Sign upbpo-37958: Adding get_profile_dict to pstats #15495
Conversation
This comment has been minimized.
This comment has been minimized.
the-knights-who-say-ni
commented
Aug 26, 2019
Hello, and thanks for your contribution! I'm a bot set up to make sure that the project can legally accept your contribution by verifying you have signed the PSF contributor agreement (CLA). Unfortunately we couldn't find an account corresponding to your GitHub username on bugs.python.org (b.p.o) to verify you have signed the CLA (this might be simply due to a missing "GitHub Name" entry in your b.p.o account settings). This is necessary for legal reasons before we can look at your contribution. Please follow the steps outlined in the CPython devguide to rectify this issue. You can check yourself to see if the CLA has been received. Thanks again for your contribution, we look forward to reviewing it! |
This comment has been minimized.
This comment has been minimized.
Hi @Olshansk and thank you for opening a PR. Before we can review, could you:
|
This comment has been minimized.
This comment has been minimized.
@the-knights-who-say-ni I've signed the CLA @pablogsal I've created https://bugs.python.org/issue37958, but not sure if I filled everything out correctly. I also mentioned it in the title and created a blurb (Olshansk@2bdb8ab) |
This comment has been minimized.
This comment has been minimized.
@pablogsal I was wondering whom I should message to get some feedback about this idea? |
This needs unittest coverage (see |
This comment has been minimized.
This comment has been minimized.
bedevere-bot
commented
Sep 25, 2019
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase |
Misc/NEWS.d/next/Library/2019-08-27-03-57-25.bpo-37958.lRORI3.rst
Outdated
Show resolved
Hide resolved
This comment has been minimized.
This comment has been minimized.
I have made the requested changes; please review again. Sorry for being MIA and taking so long to followup on this, but I've replied to all the comments and would appreciate if you take another look! |
This comment has been minimized.
This comment has been minimized.
bedevere-bot
commented
Dec 8, 2019
Thanks for making the requested changes! @gpshead: please review the changes made to this pull request. |
This comment has been minimized.
This comment has been minimized.
@gpshead Just wanted to follow up on this. |
This comment has been minimized.
This comment has been minimized.
bedevere-bot
commented
Jan 8, 2020
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase |
- Return correct default value - Make tests output better errors with assertIn
This comment has been minimized.
This comment has been minimized.
I have made the requested changes; please review again. |
This comment has been minimized.
This comment has been minimized.
bedevere-bot
commented
Jan 9, 2020
Thanks for making the requested changes! @gpshead: please review the changes made to this pull request. |
This comment has been minimized.
This comment has been minimized.
miss-islington
commented
Jan 15, 2020
@Olshansk: Status check is done, and it's a failure |
This comment has been minimized.
This comment has been minimized.
miss-islington
commented
Jan 15, 2020
@Olshansk: Status check is done, and it's a success |
01602ae
into
python:master
This comment has been minimized.
This comment has been minimized.
This is awesome. Thanks a lot for your help @gpshead for helping me get this in! |
pstats is really useful or profiling and printing the output of the execution of some block of code, but I've found on multiple occasions when I'd like to access this output directly in an easily usable dictionary on which I can further analyze or manipulate. The proposal is to add a function called get_profile_dict inside of pstats that'll automatically return this data the data in an easily accessible dict. The output of the following script: ``` import cProfile, pstats import pprint from pstats import func_std_string, f8 def fib(n): if n == 0: return 0 if n == 1: return 1 return fib(n-1) + fib(n-2) pr = cProfile.Profile() pr.enable() fib(5) pr.create_stats() ps = pstats.Stats(pr).sort_stats('tottime', 'cumtime') def get_profile_dict(self, keys_filter=None): """ Returns a dict where the key is a function name and the value is a dict with the following keys: - ncalls - tottime - percall_tottime - cumtime - percall_cumtime - file_name - line_number keys_filter can be optionally set to limit the key-value pairs in the retrieved dict. """ pstats_dict = {} func_list = self.fcn_list[:] if self.fcn_list else list(self.stats.keys()) if not func_list: return pstats_dict pstats_dict["total_tt"] = float(f8(self.total_tt)) for func in func_list: cc, nc, tt, ct, callers = self.stats[func] file, line, func_name = func ncalls = str(nc) if nc == cc else (str(nc) + '/' + str(cc)) tottime = float(f8(tt)) percall_tottime = -1 if nc == 0 else float(f8(tt/nc)) cumtime = float(f8(ct)) percall_cumtime = -1 if cc == 0 else float(f8(ct/cc)) func_dict = { "ncalls": ncalls, "tottime": tottime, # time spent in this function alone "percall_tottime": percall_tottime, "cumtime": cumtime, # time spent in the function plus all functions that this function called, "percall_cumtime": percall_cumtime, "file_name": file, "line_number": line } func_dict_filtered = func_dict if not keys_filter else { key: func_dict[key] for key in keys_filter } pstats_dict[func_name] = func_dict_filtered return pstats_dict pp = pprint.PrettyPrinter(depth=6) pp.pprint(get_profile_dict(ps)) ``` will produce: ``` {"<method 'disable' of '_lsprof.Profiler' objects>": {'cumtime': 0.0, 'file_name': '~', 'line_number': 0, 'ncalls': '1', 'percall_cumtime': 0.0, 'percall_tottime': 0.0, 'tottime': 0.0}, 'create_stats': {'cumtime': 0.0, 'file_name': '/usr/local/Cellar/python/3.7.4/Frameworks/Python.framework/Versions/3.7/lib/python3.7/cProfile.py', 'line_number': 50, 'ncalls': '1', 'percall_cumtime': 0.0, 'percall_tottime': 0.0, 'tottime': 0.0}, 'fib': {'cumtime': 0.0, 'file_name': 'get_profile_dict.py', 'line_number': 5, 'ncalls': '15/1', 'percall_cumtime': 0.0, 'percall_tottime': 0.0, 'tottime': 0.0}, 'total_tt': 0.0} ``` As an example, this can be used to generate a stacked column chart using various visualization tools which will assist in easily identifying program bottlenecks. https://bugs.python.org/issue37958 Automerge-Triggered-By: @gpshead
pstats is really useful or profiling and printing the output of the execution of some block of code, but I've found on multiple occasions when I'd like to access this output directly in an easily usable dictionary on which I can further analyze or manipulate. The proposal is to add a function called get_profile_dict inside of pstats that'll automatically return this data the data in an easily accessible dict. The output of the following script: ``` import cProfile, pstats import pprint from pstats import func_std_string, f8 def fib(n): if n == 0: return 0 if n == 1: return 1 return fib(n-1) + fib(n-2) pr = cProfile.Profile() pr.enable() fib(5) pr.create_stats() ps = pstats.Stats(pr).sort_stats('tottime', 'cumtime') def get_profile_dict(self, keys_filter=None): """ Returns a dict where the key is a function name and the value is a dict with the following keys: - ncalls - tottime - percall_tottime - cumtime - percall_cumtime - file_name - line_number keys_filter can be optionally set to limit the key-value pairs in the retrieved dict. """ pstats_dict = {} func_list = self.fcn_list[:] if self.fcn_list else list(self.stats.keys()) if not func_list: return pstats_dict pstats_dict["total_tt"] = float(f8(self.total_tt)) for func in func_list: cc, nc, tt, ct, callers = self.stats[func] file, line, func_name = func ncalls = str(nc) if nc == cc else (str(nc) + '/' + str(cc)) tottime = float(f8(tt)) percall_tottime = -1 if nc == 0 else float(f8(tt/nc)) cumtime = float(f8(ct)) percall_cumtime = -1 if cc == 0 else float(f8(ct/cc)) func_dict = { "ncalls": ncalls, "tottime": tottime, # time spent in this function alone "percall_tottime": percall_tottime, "cumtime": cumtime, # time spent in the function plus all functions that this function called, "percall_cumtime": percall_cumtime, "file_name": file, "line_number": line } func_dict_filtered = func_dict if not keys_filter else { key: func_dict[key] for key in keys_filter } pstats_dict[func_name] = func_dict_filtered return pstats_dict pp = pprint.PrettyPrinter(depth=6) pp.pprint(get_profile_dict(ps)) ``` will produce: ``` {"<method 'disable' of '_lsprof.Profiler' objects>": {'cumtime': 0.0, 'file_name': '~', 'line_number': 0, 'ncalls': '1', 'percall_cumtime': 0.0, 'percall_tottime': 0.0, 'tottime': 0.0}, 'create_stats': {'cumtime': 0.0, 'file_name': '/usr/local/Cellar/python/3.7.4/Frameworks/Python.framework/Versions/3.7/lib/python3.7/cProfile.py', 'line_number': 50, 'ncalls': '1', 'percall_cumtime': 0.0, 'percall_tottime': 0.0, 'tottime': 0.0}, 'fib': {'cumtime': 0.0, 'file_name': 'get_profile_dict.py', 'line_number': 5, 'ncalls': '15/1', 'percall_cumtime': 0.0, 'percall_tottime': 0.0, 'tottime': 0.0}, 'total_tt': 0.0} ``` As an example, this can be used to generate a stacked column chart using various visualization tools which will assist in easily identifying program bottlenecks. https://bugs.python.org/issue37958 Automerge-Triggered-By: @gpshead
pstats is really useful or profiling and printing the output of the execution of some block of code, but I've found on multiple occasions when I'd like to access this output directly in an easily usable dictionary on which I can further analyze or manipulate. The proposal is to add a function called get_profile_dict inside of pstats that'll automatically return this data the data in an easily accessible dict. The output of the following script: ``` import cProfile, pstats import pprint from pstats import func_std_string, f8 def fib(n): if n == 0: return 0 if n == 1: return 1 return fib(n-1) + fib(n-2) pr = cProfile.Profile() pr.enable() fib(5) pr.create_stats() ps = pstats.Stats(pr).sort_stats('tottime', 'cumtime') def get_profile_dict(self, keys_filter=None): """ Returns a dict where the key is a function name and the value is a dict with the following keys: - ncalls - tottime - percall_tottime - cumtime - percall_cumtime - file_name - line_number keys_filter can be optionally set to limit the key-value pairs in the retrieved dict. """ pstats_dict = {} func_list = self.fcn_list[:] if self.fcn_list else list(self.stats.keys()) if not func_list: return pstats_dict pstats_dict["total_tt"] = float(f8(self.total_tt)) for func in func_list: cc, nc, tt, ct, callers = self.stats[func] file, line, func_name = func ncalls = str(nc) if nc == cc else (str(nc) + '/' + str(cc)) tottime = float(f8(tt)) percall_tottime = -1 if nc == 0 else float(f8(tt/nc)) cumtime = float(f8(ct)) percall_cumtime = -1 if cc == 0 else float(f8(ct/cc)) func_dict = { "ncalls": ncalls, "tottime": tottime, # time spent in this function alone "percall_tottime": percall_tottime, "cumtime": cumtime, # time spent in the function plus all functions that this function called, "percall_cumtime": percall_cumtime, "file_name": file, "line_number": line } func_dict_filtered = func_dict if not keys_filter else { key: func_dict[key] for key in keys_filter } pstats_dict[func_name] = func_dict_filtered return pstats_dict pp = pprint.PrettyPrinter(depth=6) pp.pprint(get_profile_dict(ps)) ``` will produce: ``` {"<method 'disable' of '_lsprof.Profiler' objects>": {'cumtime': 0.0, 'file_name': '~', 'line_number': 0, 'ncalls': '1', 'percall_cumtime': 0.0, 'percall_tottime': 0.0, 'tottime': 0.0}, 'create_stats': {'cumtime': 0.0, 'file_name': '/usr/local/Cellar/python/3.7.4/Frameworks/Python.framework/Versions/3.7/lib/python3.7/cProfile.py', 'line_number': 50, 'ncalls': '1', 'percall_cumtime': 0.0, 'percall_tottime': 0.0, 'tottime': 0.0}, 'fib': {'cumtime': 0.0, 'file_name': 'get_profile_dict.py', 'line_number': 5, 'ncalls': '15/1', 'percall_cumtime': 0.0, 'percall_tottime': 0.0, 'tottime': 0.0}, 'total_tt': 0.0} ``` As an example, this can be used to generate a stacked column chart using various visualization tools which will assist in easily identifying program bottlenecks. https://bugs.python.org/issue37958 Automerge-Triggered-By: @gpshead
Olshansk commentedAug 26, 2019
•
edited by miss-islington
pstats is really useful or profiling and printing the output of the execution of some block of code, but I've found on multiple occasions when I'd like to access this output directly in an easily usable dictionary on which I can further analyze or manipulate.
The proposal is to add a function called get_profile_dict inside of pstats that'll automatically return this data the data in an easily accessible dict.
The output of the following script:
will produce:
As an example, this can be used to generate a stacked column chart using various visualization tools which will assist in easily identifying program bottlenecks.
https://bugs.python.org/issue37958
Automerge-Triggered-By: @gpshead