bpo-35378: Fix multiprocessing.Pool references #11627

pablogsal · 2019-01-21T02:13:44Z

Use a strong reference between the Pool and associated iterators
Rework PR bpo-34172: multiprocessing.Pool leaks resources after being deleted #8450 to eliminate a cycle in the Pool.

There is no test in this PR because any test that automatically checks this behaviour needs to eliminate the pool before joining it, to check that the pool object is garbage collected and does not hang. Doing this will potentially leak threads and processes (see https://bugs.python.org/issue35413).

https://bugs.python.org/issue35378

…results

Fix a reference issue inside multiprocessing.Pool that caused the pool to remain alive if it was deleted without being closed or terminated explicitly.
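
A minimal sketch of the lifetime link described above, using toy classes rather than the real `multiprocessing.Pool` (no processes or threads are started). `ToyPool` and `ToyIterator` are hypothetical names invented for this illustration; the only point is that an iterator holding a strong reference to its pool keeps the pool alive until the iterator itself goes away.

```python
import gc
import weakref


class ToyPool:
    """Hypothetical stand-in for a pool; it starts no workers."""

    def imap(self, func, iterable):
        # The iterator receives the pool itself, not just its cache.
        return ToyIterator(self, func, iterable)


class ToyIterator:
    def __init__(self, pool, func, iterable):
        self._pool = pool  # strong reference: the pool lives as long as we do
        self._results = iter([func(item) for item in iterable])

    def __iter__(self):
        return self

    def __next__(self):
        return next(self._results)


def demo():
    # The iterator is the only object referring to the pool.
    iterator = ToyPool().imap(int, ["4", "3"])
    pool_ref = weakref.ref(iterator._pool)
    gc.collect()
    alive_during = pool_ref() is not None
    values = list(iterator)
    del iterator
    gc.collect()
    alive_after = pool_ref() is not None
    return values, alive_during, alive_after
```

Calling `demo()` reports whether the toy pool was still reachable while the iterator existed and after it was dropped. It says nothing about worker threads or processes, which is where the real difficulty lies.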

pitrou

An implementation question below.

pitrou · 2019-01-22T09:54:59Z

Lib/multiprocessing/pool.py

@@ -656,13 +682,14 @@ def __exit__(self, exc_type, exc_val, exc_tb):

 class ApplyResult(object):

-    def __init__(self, cache, callback, error_callback):
+    def __init__(self, pool, callback, error_callback, cache=None):


Why the separate cache argument?

This is a leftover from the last PR.

Originally, in the last PR, I made the cache a keyword argument to decouple it from the pool. Keeping the two separate would allow a broader set of strategies in the future and would reduce dependency cycles and lifetime chains.

In this PR I am targeting the simplest solution, so I am going to remove the cache argument and just get the cache out of the pool.

In edee524 I reverted to always using the cache from the pool.
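
A rough sketch of the shape this leads to, assuming the result object keeps the pool and reads the cache from it. `SketchPool` and `SketchResult` are illustrative names, and modelling `_cache` as a plain dict keyed by job id is an assumption made for the example; this is not a copy of `pool.py`.

```python
import itertools

_job_counter = itertools.count()


class SketchPool:
    """Hypothetical pool that only owns a result cache."""

    def __init__(self):
        self._cache = {}


class SketchResult:
    def __init__(self, pool, callback, error_callback):
        self._pool = pool          # keeps the pool alive while the result is pending
        self._cache = pool._cache  # always taken from the pool, no separate argument
        self._callback = callback
        self._error_callback = error_callback
        self._job = next(_job_counter)
        self._cache[self._job] = self  # register so a handler could find this result

    def _set(self, value):
        # Deliver the value and drop the cache entry, as a handler would.
        if self._callback is not None:
            self._callback(value)
        del self._cache[self._job]
        self._pool = None  # assumption: release the pool once the result is complete
```

With a single `pool` argument there is only one source of truth for the cache, which is the simplification the comment above describes.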

pitrou · 2019-01-22T09:55:14Z

Lib/multiprocessing/pool.py

@@ -701,16 +729,16 @@ def _set(self, i, obj):

 class MapResult(ApplyResult):

-    def __init__(self, cache, chunksize, length, callback, error_callback):
-        ApplyResult.__init__(self, cache, callback,
+    def __init__(self, pool, chunksize, length, callback, error_callback, cache=None):


Check my answer in the previous comment.

vstinner · 2019-01-23T16:54:21Z

It seems like this change is more for https://bugs.python.org/issue34172, no?

vstinner · 2019-01-23T16:55:58Z

Does this change affect the following code?


import multiprocessing

def the_test():
    print("Begin")
    for x in multiprocessing.Pool().imap(int,
            ["4", "3"]):
        print(x)
    print("End")

the_test()

Ref: https://bugs.python.org/issue34172#msg330864

pablogsal · 2019-01-23T22:45:50Z

It seems like this change is more for https://bugs.python.org/issue34172, no?

This combines https://bugs.python.org/issue35378 and https://bugs.python.org/issue34172 because both issues need to be solved at the same time (one fix does not fully make sense without the other). This PR breaks the reference cycle and ties the lifetime of the pool to its iterators, so there is no change in behaviour.

Does this change affect the following code?

No, that's the point of solving both issues (bpos) at the same time: it preserves backwards compatibility:

This patch

Begin
/home/pablogsal/github/cpython/Lib/multiprocessing/pool.py:234: ResourceWarning: unclosed running multiprocessing pool <multiprocessing.pool.Pool state=RUN pool_size=12>
  _warn(f"unclosed running multiprocessing pool {self!r}",
ResourceWarning: Enable tracemalloc to get the object allocation traceback
4
3
End

python3.7

Begin
4
3
End

python3.6

Begin
4
3
End
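
For comparison, a sketch of the same loop with the pool managed explicitly, so that nothing depends on when the pool object is garbage collected. It uses `multiprocessing.pool.ThreadPool` (a `Pool` subclass) only so that the snippet runs without spawning worker processes; that substitution and the fixed size of 2 are choices made for this example.

```python
from multiprocessing.pool import ThreadPool


def the_test():
    results = []
    # Leaving the with-block terminates the pool, so no running pool
    # is left behind for the garbage collector to deal with.
    with ThreadPool(2) as pool:
        for x in pool.imap(int, ["4", "3"]):
            results.append(x)
    return results
```

`imap` yields results in input order, so the values come back in the same order as in the outputs above.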

vstinner · 2019-01-24T11:29:30Z

Misc/NEWS.d/next/Library/2019-01-21-02-15-20.bpo-35378.4oF03i.rst

@@ -0,0 +1,3 @@
+Fix a reference issue inside :class:`multiprocessing.Pool` that caused
+the pool to remain alive if it was deleted without being closed or
+terminated explicitly.


I understand that your change also adds a strong reference to the pool in iterators. Would you mind also explaining that?

pablogsal · 2019-02-07T00:05:58Z

@pitrou @vstinner I have added a note about the strong references in the NEWS entry. Could you review/accept the PR? I would like to be able to merge this early in the dev cycle so this fix can be tested better.

pitrou

The PR looks fine to me, though it would be better if there could be some tests.

pablogsal · 2019-02-07T16:01:24Z

The PR looks fine to me, though it would be better if there could be some tests.

Thanks for the review! I agree that tests would be desirable, but sadly the reason there is no test is that any test that automatically checks this behaviour needs to eliminate the pool before joining it, to check that the pool object is garbage collected and does not hang. Doing this will potentially leak threads and processes and is full of races.

I would advise against anything that risks more races on multiprocessing testing.

That being said, if someone knows of a reliable way of testing this, I am happy to implement it. :)

pitrou · 2019-02-07T16:04:45Z

Well, if your fix is right, the threads and processes shouldn't leak, they should just finish after a while. So the test suite's helpers for reaping threads and processes should do their job, no? (assuming we call them)

pablogsal · 2019-02-07T16:12:38Z

Well, if your fix is right, the threads and processes shouldn't leak, they should just finish after a while. So the test suite's helpers for reaping threads and processes should do their job, no? (assuming we call them)

But the test forces a situation in which the leaks happen, because you need to destroy the pool without joining it first. This PR prevents a deadlock situation and a big leak; it does not guarantee a perfect finalization of the pool in every situation. And the hang happens mostly when you don't properly finalize the pool.

pitrou · 2019-02-07T16:30:05Z

Thanks for the explanation. Then I'd say it's good for merging :-)

eric-wieser · 2019-06-19T17:41:43Z

Lib/multiprocessing/pool.py

-        return self._ctx.Process(*args, **kwds)
+    @staticmethod
+    def Process(ctx, *args, **kwds):
+        return ctx.Process(*args, **kwds)


Why did this need to change? I feel like pool.Process existed solely as a convenience method, and this breaks any code that used it. Instead of changing this function, you could just use ctx.Process instead of self.Process below.
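
A small sketch of the compatibility concern, using toy classes instead of the real pool. `FakeContext`, `OldStylePool`, `NewStylePool` and `call_as_before` are hypothetical names; the two `Process` definitions mirror the before and after shapes in the diff above.

```python
class FakeContext:
    """Hypothetical context; Process just records its arguments."""

    def Process(self, *args, **kwds):
        return ("process", args, kwds)


class OldStylePool:
    def __init__(self, ctx):
        self._ctx = ctx

    def Process(self, *args, **kwds):
        # Instance method: the context comes from self.
        return self._ctx.Process(*args, **kwds)


class NewStylePool:
    def __init__(self, ctx):
        self._ctx = ctx

    @staticmethod
    def Process(ctx, *args, **kwds):
        # Static method: the caller must pass the context explicitly.
        return ctx.Process(*args, **kwds)


def call_as_before(pool):
    """Call pool.Process the way code written against the old method would."""
    try:
        return pool.Process(target=print)
    except TypeError as exc:
        return exc
```

In this toy setup, a keyword-only call that worked against the instance method fails against the static method because the `ctx` argument is missing.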

pablogsal added 4 commits January 21, 2019 01:43

bpo-35378: Link the lifetime of the pool to the pool's iterators and …

b89c67b

…results

Use a longer (parametrised) timeout for the slower buildbots

514cd68

Remove test as it can leak resources

c7f0260

multiprocessing.Pool leaks resources after being deleted

2b29f41

Fix a reference issue inside multiprocessing.Pool that caused the pool to remain alive if it was deleted without being closed or terminated explicitly.

the-knights-who-say-ni added the CLA signed label Jan 21, 2019

bedevere-bot added the awaiting merge label Jan 21, 2019

pablogsal requested review from vstinner and pitrou January 21, 2019 02:13

pablogsal added the type-bug and tests labels Jan 21, 2019

pablogsal force-pushed the bpo35378 branch from a718011 to c47dd29 Compare January 21, 2019 02:16

Add News entry

ae81354

pablogsal force-pushed the bpo35378 branch from c47dd29 to ae81354 Compare January 21, 2019 02:20

pitrou reviewed Jan 22, 2019


Always get the cache from the pool

edee524

vstinner reviewed Jan 24, 2019


Make a note about strong references in the NEWS

b0a5c89

pitrou approved these changes Feb 7, 2019


pablogsal merged commit 3766f18 into python:master Feb 11, 2019

bedevere-bot removed the awaiting merge label Feb 11, 2019

pablogsal deleted the bpo35378 branch February 11, 2019 17:29

eric-wieser reviewed Jun 19, 2019
