Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-100344: Provide C implementation for asyncio.current_task #100345

Merged
merged 11 commits into from Dec 22, 2022

Conversation

itamaro
Copy link
Contributor

@itamaro itamaro commented Dec 19, 2022

Implementing it in C makes it about 4x-6x faster

Microbenchmark:

# bench.py

import time
import timeit
import asyncio

ITERS: int = 10**6
NANO: int = 10**9
NANO_PER_ITER: float = NANO / ITERS

async def main():
   # avoid load attr noise
   py_current_task = asyncio.tasks._py_current_task
   c_current_task = asyncio.tasks._c_current_task

   asyncio.current_task() # warmup
   py_current_task() # warmup
   c_current_task() # warmup

   print(
      "current_task: {}ns".format(timeit.timeit(py_current_task, number=ITERS, timer=time.process_time) * NANO_PER_ITER)
   )
   print(
      "current_task: {}ns".format(timeit.timeit(c_current_task, number=ITERS, timer=time.process_time) * NANO_PER_ITER)
   )

asyncio.run(main())

a few runs on MacBook Pro
2.4 GHz 8-Core Intel Core i9
64 GB 2667 MHz DDR4:

debug build:

~/work/pyexe/main-dbg ⌚ 9:57:34
$ ./python.exe bench.py
[py] current_task: 606.234ns
[c] current_task: 104.47699999999993ns

~/work/pyexe/main-dbg ⌚ 9:57:59
$ ./python.exe bench.py
[py] current_task: 631.856ns
[c] current_task: 110.22500000000002ns

~/work/pyexe/main-dbg ⌚ 9:58:08
$ ./python.exe bench.py
[py] current_task: 637.746ns
[c] current_task: 105.03899999999999ns

~/work/pyexe/main-dbg ⌚ 9:58:16
$ ./python.exe bench.py
[py] current_task: 621.3169999999999ns
[c] current_task: 103.01300000000002ns

opt build:

~/work/pyexe/main-opt ⌚ 10:33:17
$ ./python.exe bench.py
[py] current_task: 128.743ns
[c] current_task: 31.997999999999998ns

~/work/pyexe/main-opt ⌚ 10:33:24
$ ./python.exe bench.py
[py] current_task: 126.388ns
[c] current_task: 32.64599999999998ns

~/work/pyexe/main-opt ⌚ 10:33:26
$ ./python.exe bench.py
[py] current_task: 137.053ns
[c] current_task: 32.066999999999986ns

~/work/pyexe/main-opt ⌚ 10:33:28
$ ./python.exe bench.py
[py] current_task: 131.17800000000003ns
[c] current_task: 32.06600000000001ns

Co-authored-by: @pranavtbhat

Implementing it in C makes it about 4x-6x faster

Microbenchmark:

```
 # bench.py

import time
import timeit
import asyncio

ITERS: int = 10**6
NANO: int = 10**9
NANO_PER_ITER: float = NANO / ITERS

async def main():
   # avoid load attr noise
   py_current_task = asyncio.tasks._py_current_task
   c_current_task = asyncio.tasks._c_current_task

   asyncio.current_task() # warmup
   py_current_task() # warmup
   c_current_task() # warmup

   print(
      "current_task: {}ns".format(timeit.timeit(py_current_task, number=ITERS, timer=time.process_time) * NANO_PER_ITER)
   )
   print(
      "current_task: {}ns".format(timeit.timeit(c_current_task, number=ITERS, timer=time.process_time) * NANO_PER_ITER)
   )

asyncio.run(main())
```

a few runs on MacBook Pro
2.4 GHz 8-Core Intel Core i9
64 GB 2667 MHz DDR4:

debug build:

```
~/work/pyexe/main-dbg  9:57:34
$ ./python.exe bench.py
[py] current_task: 606.234ns
[c] current_task: 104.47699999999993ns

~/work/pyexe/main-dbg  9:57:59
$ ./python.exe bench.py
[py] current_task: 631.856ns
[c] current_task: 110.22500000000002ns

~/work/pyexe/main-dbg  9:58:08
$ ./python.exe bench.py
[py] current_task: 637.746ns
[c] current_task: 105.03899999999999ns

~/work/pyexe/main-dbg  9:58:16
$ ./python.exe bench.py
[py] current_task: 621.3169999999999ns
[c] current_task: 103.01300000000002ns
```

opt build:

```
~/work/pyexe/main-opt  10:33:17
$ ./python.exe bench.py
[py] current_task: 128.743ns
[c] current_task: 31.997999999999998ns

~/work/pyexe/main-opt  10:33:24
$ ./python.exe bench.py
[py] current_task: 126.388ns
[c] current_task: 32.64599999999998ns

~/work/pyexe/main-opt  10:33:26
$ ./python.exe bench.py
[py] current_task: 137.053ns
[c] current_task: 32.066999999999986ns

~/work/pyexe/main-opt  10:33:28
$ ./python.exe bench.py
[py] current_task: 131.17800000000003ns
[c] current_task: 32.06600000000001ns
```
@bedevere-bot
Copy link

bedevere-bot commented Dec 19, 2022

🤖 New build scheduled with the buildbot fleet by @pablogsal for commit 76cbbc6 🤖

If you want to schedule another build, you need to add the :hammer: test-with-refleak-buildbots label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-refleak-buildbots Test PR w/ refleak buildbots; report in status section label Dec 19, 2022
@pablogsal
Copy link
Member

pablogsal commented Dec 19, 2022

Congrats @itamaro! Your PR has been selected to try the new refleak buildbot label! 😉

@itamaro
Copy link
Contributor Author

itamaro commented Dec 20, 2022

Your PR has been selected to try the new refleak buildbot label! 😉

Such an honor, thank you @pablogsal 😛

Checking out a few of the additional refleak test runs, I saw (across different buildbots): timeout, failure of test_nntplib, and failure of test_logging.
While none of these indicates new refleaks nor seems obviously related to this PR, it did nudge me to take a closer look, and I did find refleaks using:

./python.exe -m test test_asyncio.test_tasks -v -m "*current_task*" -R 3:3
...
OK
.
test_asyncio.test_tasks leaked [2990, 2988, 2990] references, sum=8968
test_asyncio.test_tasks leaked [896, 894, 894] memory blocks, sum=2684
test_asyncio.test_tasks failed (reference leak)

== Tests result: FAILURE ==

1 test failed:
    test_asyncio.test_tasks

Total duration: 23.6 sec
Tests result: FAILURE

I think the issue is not decrefing the loop when getting it from _asyncio_get_running_loop_impl. I'm pushing a fix that made it pass for me locally - let's see what the buildbots think 🤞

@gvanrossum
Copy link
Member

gvanrossum commented Dec 20, 2022

I'm waiting for the benchmark test suggested by Mark in the issue.

@itamaro
Copy link
Contributor Author

itamaro commented Dec 20, 2022

I'm waiting for the benchmark test suggested by Mark in the issue.

replied on the issue and pushed the python optimization @markshannon suggested for ~40% speedup.
my suggestion is to do both optimizations 😄

Lib/asyncio/tasks.py Outdated Show resolved Hide resolved
@kumaraditya303 kumaraditya303 self-assigned this Dec 21, 2022
@kumaraditya303 kumaraditya303 added the performance Performance or resource usage label Dec 21, 2022
Copy link
Contributor

@kumaraditya303 kumaraditya303 left a comment

The tests need to be adjusted to test both pure python and c implementation. Also looks like there are merge conflicts.

@bedevere-bot
Copy link

bedevere-bot commented Dec 21, 2022

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

@itamaro
Copy link
Contributor Author

itamaro commented Dec 21, 2022

I have made the requested changes; please review again

@bedevere-bot
Copy link

bedevere-bot commented Dec 21, 2022

Thanks for making the requested changes!

@kumaraditya303: please review the changes made to this pull request.

Copy link
Contributor

@kumaraditya303 kumaraditya303 left a comment

LGTM, I'll wait for @gvanrossum's review before merging.

Copy link
Member

@gvanrossum gvanrossum left a comment

@kumaraditya303 If you agree go ahead and merge.

@itamaro
Copy link
Contributor Author

itamaro commented Dec 22, 2022

Is it still ok to push another change to revert the Python implementation as discussed on the issue, after approval?

@kumaraditya303
Copy link
Contributor

kumaraditya303 commented Dec 22, 2022

Is it still ok to push another change to revert the Python implementation as discussed on the issue, after approval?

That's fine, I'll review anyways before merging so go ahead!

Copy link
Contributor

@kumaraditya303 kumaraditya303 left a comment

LGTM

@kumaraditya303 kumaraditya303 merged commit 4cc63e0 into python:main Dec 22, 2022
16 checks passed
@kumaraditya303
Copy link
Contributor

kumaraditya303 commented Dec 22, 2022

Merged, thanks for working on this!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
expert-asyncio performance Performance or resource usage
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants