deprecate the asyncio child watchers system and policy system #94597

Open
graingert opened this issue Jul 6, 2022 · 27 comments
Labels
3.12 expert-asyncio stdlib Python modules in the Lib dir

Comments

@graingert
Contributor

graingert commented Jul 6, 2022

Deprecate the child watchers system and the policy system in favour of
asyncio.Runner(loop_factory=asyncio.ProactorEventLoop/asyncio.SelectorEventLoop/uvloop.new_event_loop)
that would be deprecating:

asyncio.get_event_loop() # already deprecated unless the loop is running 
asyncio.set_event_loop()  # asyncio.set_event_loop(None) should probably be exempt
asyncio.get_event_loop_policy()
asyncio.set_event_loop_policy()  # asyncio.set_event_loop_policy(None) should probably be exempt
asyncio.set_child_watcher()
asyncio.get_child_watcher()  # `_make_subprocess_transport` will instead attempt to use os.pidfd_open and fall back to starting a thread

I'd also like to introduce a new API: asyncio.EventLoop implemented as:

if sys.platform == "win32":
    EventLoop = ProactorEventLoop
else:
    EventLoop = SelectorEventLoop

asyncio.new_event_loop() will issue a DeprecationWarning if the current policy is not the default policy, and then in 3 releases become an alias of asyncio.EventLoop

Originally posted by @graingert in #93896 (comment)

@graingert
Contributor Author

graingert commented Jul 6, 2022

Some background to this proposed deprecation:

This is a more comprehensive version of #82772

@serhiy-storchaka proposed deprecating set_event_loop() in #93453 (comment)

But maybe we should first deprecate set_event_loop()? It will be a no-op now.

@asvetlov noted that
get_event_loop_policy().get_event_loop() was not deprecated by oversight in #83710 (comment)

IMHO, asyncio.set_event_loop() and policy.get_event_loop()/policy.set_event_loop() are not deprecated by oversight.

@gvanrossum
Member

gvanrossum commented Jul 6, 2022

We are in dire need of more asyncio experts. @1st1 this isn't urgent but would be nice to have your perspective in time to do this in 3.12.

@kumaraditya303
Contributor

kumaraditya303 commented Jul 6, 2022

For 3.12, IMO we should deprecate MultiLoopChildWatcher (#82504) and the others, which have race conditions and other issues. Once that's done we may deprecate the entire child watcher system, but just removing MultiLoopChildWatcher would be a good start.

@graingert
Contributor Author

graingert commented Jul 7, 2022

@kumaraditya303 #94648

@gvanrossum
Member

gvanrossum commented Oct 6, 2022

We discussed this at the sprint and we agree that there are many things wrong with the child watchers and the policy system.

Deprecating child watchers: @1st1 thinks these should be done per loop (like uvloop does), not globally by the policy. Much more discussion on this topic is already in #82772. Bottom line, we agree to deprecate it, details remain to be seen.

Deprecating policies: Yes please. The policies no longer serve a real purpose. Loops are always per thread, there is no need to have a "current loop" when no loop is currently running. The only thing we still need is a loop factory, so perhaps instead of an API for getting/setting a global "policy", we could have an API for getting/setting a global "loop factory".

I'm fine with the EventLoop alias (it ties up a loose end), but I recommend that the API for creating a new event loop (when not using runners) should be asyncio.new_event_loop(), not asyncio.EventLoop().

We should totally deprecate set_event_loop() (even with None argument). At that point we can make get_event_loop() an alias for get_running_loop() (or the other way around -- I prefer calling get_event_loop() :-).

@graingert
Contributor Author

graingert commented Oct 7, 2022

so perhaps instead of an API for getting/setting a global "policy", we could have an API for getting/setting a global "loop factory".

I disagree: that's the most painful part of the policy system, and it's what I'm looking to deprecate here in favor of passing an explicit loop_factory to asyncio.Runner. The behavior of Runner should be to pick the best event loop by default. If people need to change the behavior of Runner, they should pass an explicit factory; if they really need to patch this behavior globally for the whole process, they can use a monkey patch.

@gvanrossum
Member

gvanrossum commented Oct 7, 2022

that's the most painful part of the policy system

What exactly is most painful? That there's a global default for something? To me the painful thing is that the policy system is over-engineered: you have to create a class that overrides new_event_loop, instantiate it, and call set_event_loop_policy with the instance. That is just classic Java. You shouldn't have to create a class, just a function.

I totally agree that most people should use run() or Runner, but I disagree that we should deprecate all other workflows. To me, Runner is just a convenience class.

@gvanrossum
Member

gvanrossum commented Oct 7, 2022

I discussed this with Yury and he convinced me that we don't need a global loop factory. Instead we should just have a loop_factory=None keyword arg to asyncio.run().

We still have to come up with a way to transition to a world where child watching is per-loop instead of global though.

@gvanrossum
Member

gvanrossum commented Oct 7, 2022

Good news. @1st1 has a simple refactoring of PidfdChildWatcher that makes it independent from the main loop -- just like ThreadedChildWatcher. Once that is merged (PR is forthcoming) we can also merge @kumaraditya303's PR GH-98024, and then we can start deprecating all other child watcher implementations.

We can then also deprecate set_child_watcher() (both the asyncio function and the policy method) and eventually we can move the child watcher out of the policy. There's some hand-waving here because in theory people could subclass the default policy class and override get_child_watcher() to construct their own child watcher -- we'll have to deprecate that too somehow.

But all this builds a road to a world where policies are no longer needed and eventually no longer exist.

@graingert
Contributor Author

graingert commented Oct 7, 2022

@gvanrossum I have a PR for PidfdChildWatcher already: #94184

@graingert
Contributor Author

graingert commented Oct 7, 2022

It would also be good for the child watchers to be responsible for calling the callback on the event loop thread. Currently the callback needs to defensively call call_soon_threadsafe even when it's redundant, e.g. with the PidfdChildWatcher.

E.g.:

self.call_soon_threadsafe(self.call_soon, transp._process_exited, returncode)
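
For context, a small sketch of why a thread-based watcher needs call_soon_threadsafe at all: the child is reaped on a helper thread, so the result must be handed back to the loop thread safely. This is an illustration with hypothetical names, not asyncio's actual watcher code:

```python
import asyncio
import subprocess
import sys
import threading


async def wait_in_thread(args: list[str]) -> int:
    loop = asyncio.get_running_loop()
    exited = loop.create_future()
    proc = subprocess.Popen(args)

    def reap() -> None:
        # Runs on a helper thread, so touching the future directly
        # would not be safe; schedule the call on the loop thread.
        returncode = proc.wait()
        loop.call_soon_threadsafe(exited.set_result, returncode)

    threading.Thread(target=reap, daemon=True).start()
    return await exited


def demo() -> int:
    child = [sys.executable, "-c", "raise SystemExit(3)"]
    return asyncio.run(wait_in_thread(child))
```

A watcher that already delivers its notification on the loop thread could call the callback directly, which is the redundancy described above.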

@gvanrossum
Member

gvanrossum commented Oct 8, 2022

I just read the comments in GH-93453: "Make get_event_loop() an alias of get_running_loop()". This makes me want to go slow with the whole "deprecate policies" part. (I am still fine with deprecating watchers ASAP.)

Maybe we could start by deprecating just set_event_loop_policy(), hence making the policy (eventually) just a global singleton that stores some state (in particular the current thread's loop, even if it's not running)?

@kumaraditya303
Contributor

kumaraditya303 commented Oct 8, 2022

Let's deprecate child watchers first and then we can think about policy since it will require more discussion.

@gvanrossum
Member

gvanrossum commented Oct 8, 2022

Maybe we can also deprecate set_child_watcher now?

@graingert
Contributor Author

graingert commented Oct 8, 2022

I'd like to see set_child_watcher and get_child_watcher deprecated with a private _get_child_watcher added temporarily that returns whatever was set until set/get_child_watcher is removed

@gvanrossum
Member

gvanrossum commented Oct 8, 2022

But we can't guarantee that the private _get_child_watcher is supported as long as set_event_loop_policy can be called. We could of course just check whether it exists on the policy object and call it only if it exists, otherwise call the public API.

Would it be acceptable if set_child_watcher was a no-op that reported a warning? Then the unix event loop could just do its own event handling and never use the policy to get the watcher. (IIUC uvloop already doesn't use the watcher API.)

@graingert
Contributor Author

graingert commented Oct 8, 2022

Would it be acceptable if set_child_watcher was a no-op that reported a warning?

Probably not, I think people would still expect to get whatever was set during the deprecation period. Maybe the asyncio subprocess API could just ignore it like uvloop does

@gvanrossum
Member

gvanrossum commented Oct 9, 2022

Probably not, I think people would still expect to get whatever was set during the deprecation period. Maybe the asyncio subprocess API could just ignore it like uvloop does

But then wouldn't they also expect the watcher they set to be used?

Who knows, we need to do some searching. A possible approach would be to deprecate get_child_watcher() and set_child_watcher() but keep their implementation the same, but also stop calling get_child_watcher() in _make_subprocess_transport(). (Arguably if we do this, we should simplify the implementation and always set up a ThreadedChildWatcher in _init_watcher().)
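
A rough sketch of that approach, using stand-in functions with hypothetical names (set_watcher, get_watcher) in place of the real asyncio API: the getter and setter keep working and keep their state, but each call emits a DeprecationWarning.

```python
import warnings

_current_watcher = None  # stand-in for the watcher stored on the policy


def set_watcher(watcher) -> None:
    """Deprecated setter: behaviour unchanged, but it now warns."""
    global _current_watcher
    warnings.warn("set_watcher() is deprecated", DeprecationWarning, stacklevel=2)
    _current_watcher = watcher


def get_watcher():
    """Deprecated getter: still returns whatever was set."""
    warnings.warn("get_watcher() is deprecated", DeprecationWarning, stacklevel=2)
    return _current_watcher
```

Under this scheme, callers that set a watcher still get it back during the deprecation period, while the subprocess machinery would be free to stop consulting it.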

@gvanrossum
Member

gvanrossum commented Oct 9, 2022

Searching for set_child_watcher I found gbulb, a library that integrates the GLib main event loop with asyncio. (I actually found glibcoro first, and its README mentions gbulb.)

The importance of this find is that gbulb defines its own child watcher class that integrates with the GLib event loop. They also have a custom policy that manages this watcher, and (of course) a custom event loop. I have a feeling they really need to use their custom watcher in their event loop, because of how GLib works (although I don't know anything about GLib).

I'm guessing in the long run they can refactor their code to avoid using get/set_child_watcher, but the deprecation might be inconvenient for them. (Then again they override the policy methods so they wouldn't get the deprecation warnings.)

This is the first non-trivial mention of set_child_watcher I've found (there are lots of dummies and copies around -- a lot of people somehow emulate asyncio).

@gvanrossum
Member

gvanrossum commented Oct 9, 2022

There's also an intriguing custom watcher in chaperone, but this package appears unmaintained (last commits in 2016). It appears to be a modified clone of FastChildWatcher.

carljm added a commit to carljm/cpython that referenced this issue Oct 9, 2022
* main:
  Minor edits to the Descriptor HowTo Guide (pythonGH-24901)
  Fix link to Lifecycle of a Pull Request in CONTRIBUTING (python#98102)
  pythonGH-94597: deprecate `SafeChildWatcher`, `FastChildWatcher` and `MultiLoopChildWatcher` child watchers  (python#98089)
  Auto-cancel old builds when new commit pushed to branch (python#98009)
  pythongh-95011: Migrate syslog module to Argument Clinic (pythonGH-95012)
carljm added a commit to carljm/cpython that referenced this issue Oct 9, 2022
* main: (5519 commits)
  Minor edits to the Descriptor HowTo Guide (pythonGH-24901)
  Fix link to Lifecycle of a Pull Request in CONTRIBUTING (python#98102)
  pythonGH-94597: deprecate `SafeChildWatcher`, `FastChildWatcher` and `MultiLoopChildWatcher` child watchers  (python#98089)
  Auto-cancel old builds when new commit pushed to branch (python#98009)
  pythongh-95011: Migrate syslog module to Argument Clinic (pythonGH-95012)
  pythongh-68686: Retire eptag ptag scripts (python#98064)
  pythongh-97922: Run the GC only on eval breaker (python#97920)
  GitHub Workflows security hardening (python#96492)
  Add `@ezio-melotti` as codeowner for `.github/`. (python#98079)
  pythongh-97913 Docs: Add walrus operator to the index (python#97921)
  [doc] Fix broken links to C extensions accelerating stdlib modules (python#96914)
  pythongh-97822: Fix http.server documentation reference to test() function (python#98027)
  pythongh-91052: Add PyDict_Unwatch for unwatching a dictionary (python#98055)
  pythonGH-98023: Change default child watcher to PidfdChildWatcher on supported systems (python#98024)
  pythonGH-94182: Run the PidfdChildWatcher on the running loop (python#94184)
  pythongh-92886: make test_ast pass with -O (assertions off) (pythonGH-98058)
  pythongh-92886: make test_coroutines pass with -O (assertions off) (pythonGH-98060)
  pythongh-57179: Add note on symlinks for os.walk (python#94799)
  pythongh-94808: Fix regex on exotic platforms (python#98036)
  pythongh-90085: Remove vestigial -t and -c timeit options (python#94941)
  ...
@kumaraditya303
Contributor

kumaraditya303 commented Oct 9, 2022

We need a better plan for this, here's my plan:

mpage pushed a commit to mpage/cpython that referenced this issue Oct 11, 2022
@gvanrossum
Member

gvanrossum commented Oct 11, 2022

Are you thinking of doing all of these in 3.12 except the final checkbox? If so we should probably do "asyncio ignores set child watcher and instead always uses PidfdChildWatcher or ThreadedChildWatcher" next, before "Deprecate all child watcher configuration methods and functions ..."
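
The "always uses PidfdChildWatcher or ThreadedChildWatcher" step amounts to a capability check at runtime. A minimal sketch, assuming the choice is made by probing os.pidfd_open (Linux only) and falling back to a thread otherwise; the function names and the returned strings are illustrative:

```python
import os


def can_use_pidfd() -> bool:
    # os.pidfd_open only exists on Linux builds of Python 3.9+.
    if not hasattr(os, "pidfd_open"):
        return False
    try:
        # Probe with our own PID; old kernels or sandboxes may refuse.
        fd = os.pidfd_open(os.getpid(), 0)
    except OSError:
        return False
    os.close(fd)
    return True


def pick_child_wait_strategy() -> str:
    # "pidfd" stands for a PidfdChildWatcher-style wait,
    # "thread" for a ThreadedChildWatcher-style wait.
    return "pidfd" if can_use_pidfd() else "thread"
```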

@kumaraditya303
Contributor

kumaraditya303 commented Oct 12, 2022

I first want to get agreement on whether we should raise DeprecationWarning when overriding the child watcher or just ignore it. I analyzed this with https://cs.github.com/ and it is mostly used to work around an old asyncio bug where ThreadedChildWatcher wasn't used by default and some other watcher was used instead. This hasn't been an issue since about Python 3.8.

gvanrossum pushed a commit that referenced this issue Oct 15, 2022
This is the next step for deprecating child watchers.

Until we've removed the API completely we have to use it, so this PR is mostly suppressing a lot of warnings when using the API internally.

Once the child watcher API is totally removed, the two child watcher implementations we actually use and need (Pidfd and Thread) will be turned into internal helpers.
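
The warning suppression described in this commit message can be sketched as follows. The two functions are hypothetical stand-ins, not the actual asyncio internals: the public function warns, and the internal caller silences that warning for its own call only.

```python
import warnings


def deprecated_get_watcher() -> str:
    # Stand-in for a public, deprecated accessor.
    warnings.warn(
        "deprecated_get_watcher() is deprecated",
        DeprecationWarning,
        stacklevel=2,
    )
    return "watcher"


def internal_get_watcher() -> str:
    # Internal code still needs the API until it is removed, so it
    # suppresses the warning in the narrowest possible scope.
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", DeprecationWarning)
        return deprecated_get_watcher()
```

Keeping the catch_warnings() block tight matters because the filter change applies to everything executed inside it.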
@gvanrossum
Member

gvanrossum commented Oct 15, 2022

I think the remaining three tasks cannot be done until 3.14.

@kumaraditya303
Contributor

kumaraditya303 commented Oct 16, 2022

This is done for 3.12, now onto the policy minefield.

carljm added a commit to carljm/cpython that referenced this issue Oct 17, 2022
* main: (31 commits)
  pythongh-95913: Move subinterpreter exper removal to 3.11 WhatsNew (pythonGH-98345)
  pythongh-95914: Add What's New item describing PEP 670 changes (python#98315)
  Remove unused arrange_output_buffer function from zlibmodule.c. (pythonGH-98358)
  pythongh-98174: Handle EPROTOTYPE under macOS in test_sendfile_fallback_close_peer_in_the_middle_of_receiving (python#98316)
  pythonGH-98327: Reduce scope of catch_warnings() in _make_subprocess_transport (python#98333)
  pythongh-93691: Compiler's code-gen passes location around instead of holding it on the global compiler state (pythonGH-98001)
  pythongh-97669: Create Tools/build/ directory (python#97963)
  pythongh-95534: Improve gzip reading speed by 10% (python#97664)
  pythongh-95913: Forward-port int/str security change to 3.11 What's New in main (python#98344)
  pythonGH-91415: Mention alphabetical sort ordering in the Sorting HOWTO (pythonGH-98336)
  pythongh-97930: Merge with importlib_resources 5.9 (pythonGH-97929)
  pythongh-85525: Remove extra row in doc (python#98337)
  pythongh-85299: Add note warning about entry point guard for asyncio example (python#93457)
  pythongh-97527: IDLE - fix buggy macosx patch (python#98313)
  pythongh-98307: Add docstring and documentation for SysLogHandler.createSocket (pythonGH-98319)
  pythongh-94808: Cover `PyFunction_GetCode`, `PyFunction_GetGlobals`, `PyFunction_GetModule` (python#98158)
  pythonGH-94597: Deprecate child watcher getters and setters (python#98215)
  pythongh-98254: Include stdlib module names in error messages for NameErrors (python#98255)
  Improve speed. Reduce auxiliary memory to 16.6% of the main array. (pythonGH-98294)
  [doc] Update logging cookbook with an example of custom handling of levels. (pythonGH-98290)
  ...
@gvanrossum
Member

gvanrossum commented Oct 18, 2022

FWIW it seems that Jupyter has a legitimate reason to override the default policy (with another one of the predefined ones), see #93453 (comment).

@graingert
Contributor Author

graingert commented Oct 18, 2022

FWIW it seems that Jupyter has a legitimate reason to override the default policy (with another one of the predefined ones), see #93453 (comment).

The WindowsSelectorEventLoopPolicy shouldn't be needed with tornado 6.2 where it runs a selector in a background thread so the main event loop can be the ProactorEventLoop
