New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bpo-35378: Fix multiprocessing.Pool references #11627
Conversation
Fix a reference issue inside multiprocessing.Pool that caused the pool to remain alive if it was deleted without being closed or terminated explicitly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An implementation question below.
Lib/multiprocessing/pool.py
Outdated
@@ -656,13 +682,14 @@ def __exit__(self, exc_type, exc_val, exc_tb): | |||
|
|||
class ApplyResult(object): | |||
|
|||
def __init__(self, cache, callback, error_callback): | |||
def __init__(self, pool, callback, error_callback, cache=None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why the separate cache
argument?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a leftover from the last PR.
Originally, in the last PR, I made the cache as a keyword argument to decouple the cache from the pool, as this allows a broader set of strategies (in the future) to have these separated and therefore reducing dependency cycles and lifetime chains.
In this PR I am targeting the simplest solution, so I am going to remove the cache argument and just get the cache out of the pool.
In edee524 I have reverted to use always the cache from the pool.
Lib/multiprocessing/pool.py
Outdated
@@ -701,16 +729,16 @@ def _set(self, i, obj): | |||
|
|||
class MapResult(ApplyResult): | |||
|
|||
def __init__(self, cache, chunksize, length, callback, error_callback): | |||
ApplyResult.__init__(self, cache, callback, | |||
def __init__(self, pool, chunksize, length, callback, error_callback, cache=None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Check my answer in the previous comment.
It seems like this chang is more for https://bugs.python.org/issue34172 no ? |
Does this change affect the following code?
|
This is combining https://bugs.python.org/issue35378 and https://bugs.python.org/issue34172 because both issues neet to be solved at the same time (one fix does not have total sense without the other). This PR breaks the reference cycle and ties the lifetime of the iterators so there is no change in behaviour.
No, that's the point of solving both issues (bpos) at the same time: it preserves backwards compatibility: This patch
python3.7
python3.6
|
@@ -0,0 +1,3 @@ | |||
Fix a reference issue inside :class:`multiprocessing.Pool` that caused | |||
the pool to remain alive if it was deleted without being closed or | |||
terminated explicitly. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand that your change also adds a strong reference to the pool in iterators. Would you mind to also explain that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The PR looks fine to me, though it would be better if there could be some tests.
Thanks for the review! I agree that tests should be desirable but sadly, the reason there is no test is that any test that automatically checks this behaviour needs to eliminate the pool before joining the pool to check that the pool object is garbaged collected/does not hang but doing this will potentially leak threads and processes and is full of races. I would advise against anything that risks more races on multiprocessing testing. Being said that if someone knows of a reliable way of testing this, I am happy to implement it. :) |
Well, if your fix is right, the threads and processes shouldn't leak, they should just finish after a while. So the test suite's helpers for reaping threads and processes should do their job, no? (assuming we call them) |
But the test forces a situation in which the leaks happen because you need to destroy the pool without joining it before. This PR prevents a deadlock situation and a big leak, not a perfect finalization of the pool in every situation. And the hang happens mostly when you don't finalize properly the pool. |
Thanks for the explanation. Then I'd say it's good for merging :-) |
return self._ctx.Process(*args, **kwds) | ||
@staticmethod | ||
def Process(ctx, *args, **kwds): | ||
return ctx.Process(*args, **kwds) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why did this need to change? I feel like pool.Process
existed solely as a convenience method, and this breaks any code that used it. Instead of changing this function, you could just use ctx.Process
instead of self.Process
below
Use a strong reference between the Pool and associated iterators
Rework PR bpo-34172: multiprocessing.Pool leaks resources after being deleted #8450 to eliminate a cycle in the Pool.
There is no test in this PR because any test that automatically tests this behaviour needs to eliminate the pool before joining the pool to check that the pool object is garbaged collected/does not hang. But doing this will potentially leak threads and processes (see https://bugs.python.org/issue35413).
https://bugs.python.org/issue35378