Description
Bug report
This seems like another specific instance of the general issue identified in #50970.
If multiprocessing.Pool.map_async is used with maxtasksperchild and a value returned by a task is of a class not currently imported by the calling process, it can lead to a hang. Here is an example that reliably hangs for me, but which exits cleanly if ElementTree is imported at the top level.
#!/usr/bin/env python

import os
import multiprocessing


def worker(num: int):
    from xml.etree.ElementTree import ElementTree
    print(f"Worker {num} with pid {os.getpid()}")
    return ElementTree()


def main(cores: int = 4, num: int = 6):
    pool = multiprocessing.Pool(processes=cores, maxtasksperchild=1)
    barList = list(pool.map_async(worker, list(range(num))).get())
    print(barList)


if __name__ == "__main__":
    main()
Running py-spy dump on one of the workers shows this backtrace:
Process 47102: python ./demo_core.py
Python v3.10.4 (/usr/bin/python3.10)

Thread 47102 (idle): "Thread-1 (_handle_workers)"
    acquire (<frozen importlib._bootstrap>:120)
    __enter__ (<frozen importlib._bootstrap>:171)
    _find_and_load (<frozen importlib._bootstrap>:1024)
    worker (demo_core.py:8)
    mapstar (multiprocessing/pool.py:48)
    worker (multiprocessing/pool.py:125)
    run (multiprocessing/process.py:108)
    _bootstrap (multiprocessing/process.py:315)
    _launch (multiprocessing/popen_fork.py:71)
    __init__ (multiprocessing/popen_fork.py:19)
    _Popen (multiprocessing/context.py:277)
    start (multiprocessing/process.py:121)
    _repopulate_pool_static (multiprocessing/pool.py:326)
    _maintain_pool (multiprocessing/pool.py:337)
    _handle_workers (multiprocessing/pool.py:513)
    run (threading.py:946)
    _bootstrap_inner (threading.py:1009)
    _bootstrap (threading.py:966)
My guess (without any further proof) is that the main process receives a pickled ElementTree and starts importing the module. Concurrently, another thread realises it needs to start a new worker, so does a fork(). The child process has a half-imported, locked ElementTree module, and tries to import it again, leading to a deadlock.
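The suspected mechanism can be illustrated in isolation with a rough sketch. This is not the importlib code path itself: a plain threading.Lock stands in for the per-module import lock (an assumption made for illustration), and the helper names hold_lock and child_can_acquire_after_fork are hypothetical. It shows that a lock held by another thread at fork() time is still held in the child, where the thread that would release it no longer exists.

```python
import os
import threading

# Stand-in for a per-module import lock (assumption: the real lock lives
# inside importlib and is not a bare threading.Lock).
lock = threading.Lock()


def hold_lock(ready: threading.Event, release: threading.Event) -> None:
    """Hold the lock in a background thread, like a thread mid-import."""
    with lock:
        ready.set()
        release.wait()


def child_can_acquire_after_fork(timeout: float = 0.5) -> bool:
    """Fork while another thread holds the lock; report whether the child
    manages to acquire its copy of the lock within `timeout` seconds."""
    ready, release = threading.Event(), threading.Event()
    holder = threading.Thread(target=hold_lock, args=(ready, release))
    holder.start()
    ready.wait()  # the lock is now held by the holder thread

    pid = os.fork()  # the child copies the held lock, but not the holder thread
    if pid == 0:
        got_it = lock.acquire(timeout=timeout)
        os._exit(0 if got_it else 1)  # leave the child immediately

    _, status = os.waitpid(pid, 0)
    release.set()  # let the holder thread finish in the parent
    holder.join()
    return os.WEXITSTATUS(status) == 0
```

In the sketch the child uses a timeout so that it can report failure instead of blocking; an acquire without a timeout, which is what the backtrace above appears to show, would simply wait forever.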
Note that this has nothing to do with ElementTree specifically: I get the same behaviour with numpy. I chose ElementTree as a reasonably complex module (to maximise the window for the race condition) with a picklable class.
Personally I consider the fork model of multiprocessing to be dangerous: it requires care to ensure all workers are created before doing anything that can conceivably create threads, and it is definitely a bad combination with maxtasksperchild. So I won't shed any tears if the resolution is "won't fix, don't do that". But #50970 (comment) seems to suggest that @vstinner has some appetite for addressing such issues, hence I'm filing this.
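For anyone hitting this in the meantime, one possible way to follow the "don't do that" advice is to request a non-fork start method explicitly. The sketch below is untested against the reproducer above and rests on the assumption that a spawned child, which starts from a fresh interpreter, cannot inherit a half-imported module. The run function and its start_method parameter are hypothetical names; worker is unchanged from the example.

```python
import multiprocessing
import os


def worker(num: int):
    # Same worker as in the reproducer: the class is only imported in the child.
    from xml.etree.ElementTree import ElementTree
    print(f"Worker {num} with pid {os.getpid()}")
    return ElementTree()


def run(cores: int = 4, num: int = 6, start_method: str = "spawn"):
    # get_context() returns a context bound to one start method, so the
    # global default is left untouched for the rest of the program.
    ctx = multiprocessing.get_context(start_method)
    with ctx.Pool(processes=cores, maxtasksperchild=1) as pool:
        return pool.map_async(worker, range(num)).get()
```

As with any spawn-based pool, run() should be called from under an if __name__ == "__main__": guard, and worker must be importable by the child. The cost is slower worker start-up, which maxtasksperchild=1 makes more noticeable.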
Your environment
- CPython versions tested on: 3.8.10, 3.10.4
- Operating system and architecture: Ubuntu 20.04, x86_64