Skip to content

bpo-32608: Adding a server into socketserver that handles client connection in new "multiprocessing.Process" processes. #5258

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 17 commits into
base: main
Choose a base branch
from

Conversation

deliberist
Copy link

@deliberist deliberist commented Jan 21, 2018

…e passed into sub-processes created by the multiprocessing package.
@the-knights-who-say-ni
Copy link

Hello, and thanks for your contribution!

I'm a bot set up to make sure that the project can legally accept your contribution by verifying you have signed the PSF contributor agreement (CLA).

Unfortunately our records indicate you have not signed the CLA. For legal reasons we need you to sign this before we can look at your contribution. Please follow the steps outlined in the CPython devguide to rectify this issue.

Thanks again to your contribution and we look forward to looking at it!

@deliberist
Copy link
Author

I signed the CLA yesterday, does it take a few days for that to be seen?

The :class:`ForkingMixIn` class is used in the same way, except that the server
will spawn a new process for each request.
The :class:`ForkingMixIn` and :class:`ProcessingMixIn` classes can be used in
the same way, except that the server will spawn a new process for each request.
Available only on POSIX platforms that support :func:`~os.fork`.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Presumably ProcessingMixIn can be used on Windows, since multiprocessing.Process can

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct, that was my intent.

Since the multiprocessing.Process object is more portable, it makes sense to potentially use that if a developer wanted to. I didn't think removing the existing ForkingMixIn class would be productive since that would break anything that uses it, and there is probably still good use cases to use a lighter weight "os.fork()" implementation on POSIX systems.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I had meant to say there was that the documentation was incorrect by implying that ProcessingMixIn was posix-only

Copy link
Author

@deliberist deliberist Jan 23, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I understand now, it wasn't necessarily documentation that I changed but more documentation that I didn't change. I'm updating the rst file now.

If interested, the specific comment for this comment is:
15ec16b

…ggesting ProcessingMixIn (and associated server classes) can still be used on non-POSIX systems.
@deliberist
Copy link
Author

deliberist commented Feb 10, 2018

For my own sanity, I'm dropping some useful commands I used when building/testing these changes.

Linux builds:

cd cpython
./configure --with-pydebug
make -j
./python -m test.pythoninfo
./python -m test
./python -m test -v test.test_socketserver -uall

Windows builds:

cd cpython
.\PCbuild\clean.bat
.\PCbuild\build.bat -e
.\PCbuild\win32\python.exe -m test.pythoninfo
.\PCbuild\win32\python.exe -m test
.\PCbuild\win32\python.exe -m test -v test.test_socketserver -uall
.\PCbuild\rt.bat -q -uall -u-cpu -rwW --slowest --timeout=1200 --fail-env-changed -j0

@deliberist
Copy link
Author

deliberist commented Feb 10, 2018

Can someone with more Python on Windows experience be able to explain to me why the AppVeyor build is failing?

To me, it looks like the AppVeyor is failing on the test_threading module (which I did not modify) and the test_socketserver (which I did modify). However, I do not understand the circumstances where the test is failing on Windows but not on my LinuxMint 18.1 dev machine.

I would like this pull-request to get merged as soon as we can, but I doubt it will make it much farther with one of the build checks failing.

-- Thanks

Copy link
Member

@pitrou pitrou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for posting this PR. I posted some comments below.
Also, it seems the tests fail on Windows (see the AppVeyor CI failure).

@@ -99,11 +99,12 @@ server classes.


.. class:: ForkingMixIn
ProcessingMixIn
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to add a versionadded marker for the new class.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added.

:class:`ForkingMixIn` and the Forking classes mentioned below are
only available on POSIX platforms that support :func:`~os.fork`.

.. class:: ForkingTCPServer
ForkingUDPServer
ProcessingTCPServer
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto here (need a versionadded marker).

@@ -969,6 +969,13 @@ Changes in the Python API
work as expected on all platforms.
(Contributed by Yury Selivanov in :issue:`32331`.)

* New :mod:`socketserver` classes were added called
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This snippet will have to go into 3.8.rst instead. (also, make sure to add a "Contributed by..." note in parentheses as in the other entries).

# above max_children.
while len(self.active_children) >= self.max_children:
try:
pid, _ = os.waitpid(-1, 0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think using waitpid is ok with the multiprocessing version. Instead, use either one of the Process methods (perhaps join()) or the multiprocessing.connection.wait function if you need to wait on multiple objects (see https://docs.python.org/3/library/multiprocessing.html#multiprocessing.connection.wait ).

# The ProcessingTCPServer and ProcessingUDPServer both modify the
# environment, specifically the "multiprocessing.process._dangling"
# WeakSet() of created processes.
self._multiproc_dangling = multiprocessing.process._dangling.copy()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sounds like a bug. Perhaps you need to call Process.join at some point.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking about this some more, I think you're right. I've refactored the logic a little bit so that the tests do not need to clean up dangling processes.

@bedevere-bot
Copy link

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

@deliberist
Copy link
Author

Hi Antoine (pitrou). I've made the changes you suggested. However the CI builds fail on Windows because of dangling threads that I cannot seem to track down why. Do you have a few minutes have a second set of eyes looking at the errors?

Locally, I print a traceback in threading.py when a new thread is added to the threading._dangling WeakSet. This tells me where the code is when it creates the threads that are still dangling at the end of the test_socketserver tests.

I added this into threading.py, in Thread.__init__():

        import sys
        import traceback
        print()
        print('\tNAME = {}'.format(self._name))
        traceback.print_stack(file=sys.stdout)

This bug is close to being complete, I cannot figure out why the Process objects are creating these extra threads that are not being cleaned up.

-- Thanks!

@deliberist
Copy link
Author

I have looked at the code over and over again. I keep coming to the conclusion that the reason the AppVeyor tests are failing only on Windows is because there is a problem with the way Windows handles multiple processes. For some reason, on Windows there is a dangling thread laying around after the unit tests run. I just cannot find exactly where and why a thread gets created but never cleaned up in the Thread._dangling object.

This scenario does not happen on Linux systems, which leads me to believe this problem is a bit out of scope of this ticket.

Can someone look and confirm? I'd really like to close this ticket so it can be merged into the main baseline, however the failing tests on Windows seems to be preventing this ticket from advancing.

@vstinner
Copy link
Member

vstinner commented Jun 8, 2018

AppVeyor spotted a dangling thread:

0:03:19 [286/416/2] test_socketserver failed (env changed)
test_forking_handled (test.test_socketserver.ErrorHandlerTest) ... skipped 'requires forking'
test_forking_not_handled (test.test_socketserver.ErrorHandlerTest) ... skipped 'requires forking'
test_sync_handled (test.test_socketserver.ErrorHandlerTest) ... ok
test_sync_not_handled (test.test_socketserver.ErrorHandlerTest) ... ok
test_threading_handled (test.test_socketserver.ErrorHandlerTest) ... ok
test_threading_not_handled (test.test_socketserver.ErrorHandlerTest) ... ok
test_all (test.test_socketserver.MiscTestCase) ... ok
test_shutdown_request_called_if_verify_request_false (test.test_socketserver.MiscTestCase) ... ok
test_ForkingTCPServer (test.test_socketserver.SocketServerTest) ... skipped 'requires forking'
test_ForkingUDPServer (test.test_socketserver.SocketServerTest) ... skipped 'requires forking'
test_ForkingUnixDatagramServer (test.test_socketserver.SocketServerTest) ... skipped 'requires Unix sockets'
test_ForkingUnixStreamServer (test.test_socketserver.SocketServerTest) ... skipped 'requires Unix sockets'
test_ProcessingTCPServer (test.test_socketserver.SocketServerTest) ... creating server
ADDR = ('127.0.0.1', 4730)
CLASS = <class 'socketserver.ProcessingTCPServer'>
server running
test client 0
test client 1
test client 2
waiting for server
done
Warning -- threading_cleanup() failed to cleanup 1 threads (count: 1, dangling: 2)
Dangling thread: <Thread(Thread-3, started daemon 2932)>
Dangling thread: <_MainThread(MainThread, started 3804)>
ok
(...)
Warning -- threading._dangling was modified by test_socketserver
  Before: <_weakrefset.WeakSet object at 0x03330ED0>
  After:  <_weakrefset.WeakSet object at 0x03330E50>

1 test altered the execution environment:
    test_socketserver

@deliberist
Copy link
Author

@vstinner, Yup I see that there is a dangling thread. I cannot see what is different when running on Windows that produces the dangling thread that does not happen on Linux.

@csabella csabella requested a review from pitrou May 18, 2019 23:55
rbprogrammer added 3 commits March 16, 2021 19:48
Updating fork with that of the upstream.
Updating socketserver_and_multiprocessing with that of master.
@github-actions
Copy link

This PR is stale because it has been open for 30 days with no activity.

@github-actions github-actions bot added the stale Stale PR or inactive for long period of time. label Aug 14, 2022
@deliberist
Copy link
Author

deliberist commented Aug 14, 2022

FWIW, this could still be merged in, I just never had the time or experience to figure out why the Windows CI tests failed. I've tested the branch on Windows directly and it passed. I do not understand why the CI was failing, but directly on a Windows host passed.

@github-actions github-actions bot removed the stale Stale PR or inactive for long period of time. label Aug 15, 2022
@github-actions
Copy link

This PR is stale because it has been open for 30 days with no activity.

@github-actions github-actions bot added the stale Stale PR or inactive for long period of time. label Feb 27, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
awaiting changes stale Stale PR or inactive for long period of time. topic-multiprocessing
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants