bpo-40402: Fix race condition in multiprocessing.connection.Connection #19790

Open: remilapeyre wants to merge 7 commits into main


Contributor

@remilapeyre remilapeyre commented Apr 29, 2020

@corona10 corona10 requested a review from pitrou April 29, 2020 14:42
Member

@pitrou pitrou left a comment

Thank you for submitting this PR! The solution looks good, just some small comments.

t.join()
self.assertTrue(isinstance(self.exc, OSError))
self.assertEqual(str(self.exc), "handle is closed")
del self.exc
Member

Is this line necessary?

Contributor Author

I initially added that to break the reference cycle with the exception without giving it much thought. I tried removing it, and apparently it's needed; otherwise the tests fail.
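
For readers unfamiliar with the cycle mentioned here: an exception stored on an object keeps its traceback, and the traceback keeps the frame (and every local in it) alive. The following is a minimal sketch of that effect, not the PR's test code; `Holder` and `Payload` are hypothetical names used only for illustration.

```python
import gc
import weakref


class Payload:
    """Hypothetical stand-in for any object that lives in the frame's locals."""


class Holder:
    """Hypothetical stand-in for a test case that stores an exception on self."""

    def run(self):
        payload = Payload()
        self.ref = weakref.ref(payload)  # lets us observe payload's lifetime
        try:
            raise OSError("handle is closed")
        except OSError as e:
            # exc -> __traceback__ -> frame -> locals (payload, self) -> exc
            self.exc = e


holder = Holder()
holder.run()
gc.collect()
alive_before = holder.ref() is not None  # frame locals are still reachable

del holder.exc  # drop the exception, and with it the traceback and frame
gc.collect()
alive_after = holder.ref() is not None
```

Under these assumptions, `payload` stays alive until the `exc` attribute is deleted, which is the effect `del self.exc` has in the test.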

Member

What does the failure look like?

Contributor Author

It gave this (https://github.com/python/cpython/runs/630755938#step:6:1387):

Ran 359 tests in 139.628s

OK (skipped=34)

== Tests result: ENV CHANGED ==

401 tests OK.

...

2 tests altered the execution environment:
    test_multiprocessing_forkserver test_multiprocessing_spawn

...

Total duration: 16 min
Tests result: ENV CHANGED
make: *** [buildbottest] Error 3
##[error]Process completed with exit code 2.

Member

@vstinner Is it possible to get any details when one gets the "ENV CHANGED" message?

Member

> @vstinner Is it possible to get any details when one gets the "ENV CHANGED" message?

Search for "Warning -- ". I found:

2020-04-29T19:02:23.0562020Z Warning -- Unraisable exception

But the unraisable exception itself is not logged. I recently modified libregrtest to log "Warning -- Unraisable exception" to sys.__stderr__, since some tests love to redirect sys.stderr.

Recently, I had a similar issue with test_concurrent_futures. First, no "Warning -- " was logged. I modified libregrtest which showed "Warning -- Unraisable exception" but not the actual exception. In fact, many tests redirected sys.stderr. I modified tests to not redirect sys.stderr when it was not needed.

Here are my notes on how to debug a race condition:
https://pythondev.readthedocs.io/unstable_tests.html

You may try:

./python -m test.bisect_cmd -o bisect --fail-env-changed test_multiprocessing_forkserver -v

Meanwhile, stress your machine to make race conditions more likely. I like to run "./python -m test -j10 -r" in another terminal. Use a number higher than 10 if that's not enough.

Good luck ;-)
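
As background for the "Unraisable exception" warning discussed above: since Python 3.8, exceptions that cannot be propagated (for example, one raised in a finalizer) are passed to `sys.unraisablehook`. The sketch below shows how a custom hook can record the details; it is an illustration only and not how libregrtest implements its warning. `Noisy` and `record_unraisable` are hypothetical names.

```python
import sys

captured = []


def record_unraisable(unraisable):
    # Store only plain strings; keeping references to the exception or the
    # object itself could extend their lifetime.
    captured.append((unraisable.exc_type.__name__, str(unraisable.exc_value)))


class Noisy:
    """Hypothetical class whose finalizer raises."""

    def __del__(self):
        raise ValueError("boom in finalizer")


old_hook = sys.unraisablehook
sys.unraisablehook = record_unraisable
try:
    obj = Noisy()
    del obj  # the finalizer raises; the error is reported to the hook
finally:
    sys.unraisablehook = old_hook
```

After this runs, `captured` holds the exception type name and message, which is the kind of detail missing from the bare "Warning -- Unraisable exception" line.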

Contributor Author

Thanks @vstinner, I did that and apparently it's this new test that causes the issue. The only result is:

test.test_multiprocessing_forkserver.TestInvalidHandle.test_closed_handled

Here's what happens when I run only this test:

(venv) ➜  cpython git:(bpo-40402) ✗ ./python -m test --fail-env-changed  test_multiprocessing_forkserver -m test_closed_handled -v          
== CPython 3.9.0a5+ (heads/bpo-40402-dirty:3221eb94a5, May 3 2020, 00:07:56) [Clang 11.0.3 (clang-1103.0.32.29)]
== macOS-10.15.4-x86_64-i386-64bit little-endian
== cwd: /Users/remi/src/cpython/build/test_python_87640
== CPU count: 12
== encodings: locale=UTF-8, FS=utf-8
0:00:00 load avg: 3.13 Run tests sequentially
0:00:00 load avg: 3.13 [1/1] test_multiprocessing_forkserver
test_closed_handled (test.test_multiprocessing_forkserver.TestInvalidHandle) ... ok
Warning -- Unraisable exception
Exception ignored in: <_io.BytesIO object at 0x107353210>
Traceback (most recent call last):
  File "/Users/remi/src/cpython/Lib/test/support/__init__.py", line 1562, in gc_collect
    gc.collect()
BufferError: Existing exports of data: object cannot be re-sized

----------------------------------------------------------------------

Ran 1 test in 1.123s

OK
test_multiprocessing_forkserver failed (env changed)

== Tests result: ENV CHANGED ==

1 test altered the execution environment:
    test_multiprocessing_forkserver

Total duration: 1.5 sec
Tests result: ENV CHANGED

It seems that manually deleting the reference to the exception fixes the issue. I'm not sure what I need to do next.
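
For context on the message in the log above: `io.BytesIO` raises `BufferError` when it is modified while a `memoryview` obtained from `getbuffer()` is still exported. The sketch below reproduces that error class in isolation. Whether the same sequence occurs inside the test (for instance, a buffer view kept alive by the stored exception's frame) is an assumption, not something the log confirms. `write_with_live_export` is a hypothetical helper name.

```python
import io


def write_with_live_export():
    """Try to write to a BytesIO while a view of its buffer is exported."""
    buf = io.BytesIO(b"abc")
    view = buf.getbuffer()  # exports the underlying buffer
    try:
        buf.write(b"x" * 1024)  # not allowed while the export exists
    except BufferError as exc:
        return type(exc).__name__, str(exc)
    finally:
        view.release()  # release the export so buf can be cleaned up normally
    return None


result = write_with_live_export()
```

Here the view is released explicitly, so cleanup is quiet. If such a view is only released when the garbage collector tears down a cycle, the error can surface during collection instead, where it cannot be raised normally.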

@bedevere-bot

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

@csabella
Contributor

@remilapeyre, it looks like you've made all the requested changes. Is this ready to be reviewed again? Thanks!

@remilapeyre
Contributor Author

Hi @csabella, thanks for the ping. I had another look at it and I think that given the bisect result the current form is correct.

I have made the requested changes; please review again

@bedevere-bot

Thanks for making the requested changes!

@pitrou, @vstinner: please review the changes made to this pull request.
