Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Library multiprocess leaks named resources. #90549

Closed
jxdabc mannequin opened this issue Jan 15, 2022 · 15 comments
Closed

Library multiprocess leaks named resources. #90549

jxdabc mannequin opened this issue Jan 15, 2022 · 15 comments
Labels
3.8 3.9 3.10 3.11 expert-multiprocessing performance stdlib

Comments

@jxdabc
Copy link
Mannequin

@jxdabc jxdabc mannequin commented Jan 15, 2022

BPO 46391
Nosy @pitrou, @benjaminp, @1st1, @applio, @arhadthedev, @jxdabc, @BarkingBad
PRs
  • #30617
  • Files
  • screen.png: screenshot
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2022-01-15.18:04:57.278>
    labels = ['3.8', '3.9', '3.10', 'performance', '3.11', 'library']
    title = 'Library multiprocess leaks named resources.'
    updated_at = <Date 2022-02-18.19:39:09.294>
    user = 'https://github.com/jxdabc'

    bugs.python.org fields:

    activity = <Date 2022-02-18.19:39:09.294>
    actor = 'BarkingBad'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Library (Lib)']
    creation = <Date 2022-01-15.18:04:57.278>
    creator = 'milestonejxd'
    dependencies = []
    files = ['50580']
    hgrepos = []
    issue_num = 46391
    keywords = ['patch']
    message_count = 5.0
    messages = ['410654', '410657', '410853', '411456', '413504']
    nosy_count = 8.0
    nosy_names = ['pitrou', 'benjamin.peterson', 'sbt', 'yselivanov', 'davin', 'arhadthedev', 'milestonejxd', 'BarkingBad']
    pr_nums = ['30617']
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'resource usage'
    url = 'https://bugs.python.org/issue46391'
    versions = ['Python 3.8', 'Python 3.9', 'Python 3.10', 'Python 3.11']

    @jxdabc
    Copy link
    Mannequin Author

    @jxdabc jxdabc mannequin commented Jan 15, 2022

    Repo is the standard tool that Google uses to build Android, Chrome OS, Chromium, etc, written in Python.

    Many Repo users have encountered resource leak warnings with Python 3.9.
    https://bugs.chromium.org/p/gerrit/issues/detail?id=14934&q=component%3Arepo&can=5

    I did some work and found that the problem is not caused by the code of Repo, but a bug of the Python library, multiprocess.

    To make it simple, the Python script below leaks named resource even when exiting normally. (And be unlinked by resource_tracker with a warning. )

    import multiprocessing as mp
    
    global_resource = mp.Semaphore()
    
    def submain(): pass
    
    if __name__ == '__main__':
        p = mp.Process(target=submain)
        p.start()
        p.join()
    

    Tested on macOS with Python 3.9.7

    python test.py
    resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown.

    This bug will 100% reproduce when then main module uses named resources as global variables and uses spawn context, which is the case of Repo on macOS.

    This is caused by multiprocess::BaseProcess::_bootstrap.

    When a new process is started with multiprocessing.Process.start() in spawn context.

    1. The main module is reloaded in the subprocess (for pickle) in multiprocessing::spawn::_main.
    2. Named resources (such as the semaphore above) in the main module resister their _cleanup into multiprocessing::util::_finalizer_registry, which unlink themselves.
    3. multiprocess::BaseProcess::_bootstrap then clears _finalizer_registry.

    When a subprocess is spawned, it is no need to clear util::_finalizer_registry (and no need to call util::_run_after_forkers). Disable clearing _finalizer_registry (and disable call to _run_after_forkers) should fix this bug without breaking anything else.

    And I uploaded a MR.

    @jxdabc jxdabc mannequin added 3.9 stdlib performance labels Jan 15, 2022
    @arhadthedev
    Copy link
    Mannequin

    @arhadthedev arhadthedev mannequin commented Jan 15, 2022

    I added core devs related to multiprocessing into a nosy list so they got a notification and the PR will be evaluated and merged faster. FYI, the devs are Davin Potts and Antoine Pitrou (as per <https://devguide.python.org/experts/\>).

    @jxdabc
    Copy link
    Mannequin Author

    @jxdabc jxdabc mannequin commented Jan 18, 2022

    loop file owner.

    @jxdabc
    Copy link
    Mannequin Author

    @jxdabc jxdabc mannequin commented Jan 24, 2022

    update,

    Confirmed to affect all versions above 3.8+.

    It's really annoying.

    Command line tools like Repo get a warning after every (success) command.
    And mess up with shell.

    -------------------------------------------------------------------

    $ repo status .
    project ksbox/                                  branch dev
    $ /usr/local/Cellar/python@3.9/3.9.7/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/resource_tracker.py:216: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
      warnings.warn('resource_tracker: There appear to be %d '

    $
    $
    $
    $ repo sync .
    Fetching: 100% (1/1), done in 1.184s
    Garbage collecting: 100% (1/1), done in 0.013s
    repo sync has finished successfully.
    $ /usr/local/Cellar/python@3.9/3.9.7/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/resource_tracker.py:216: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
    warnings.warn('resource_tracker: There appear to be %d '

    $
    $
    $
    $ repo status .
    project ksbox/ branch dev
    $ /usr/local/Cellar/python@3.9/3.9.7/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/resource_tracker.py:216: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
    warnings.warn('resource_tracker: There appear to be %d '

    @BarkingBad
    Copy link
    Mannequin

    @BarkingBad BarkingBad mannequin commented Feb 18, 2022

    Also hitting this problem. Would be grateful for fixing it, since there is already PR created

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    @jxdabc
    Copy link
    Contributor

    @jxdabc jxdabc commented Apr 20, 2022

    @pitrou
    Copy link
    Member

    @pitrou pitrou commented Apr 20, 2022

    @PierfrancescoSoffritti
    Copy link

    @PierfrancescoSoffritti PierfrancescoSoffritti commented Apr 27, 2022

    any news on this issue?

    @pchalasani
    Copy link

    @pchalasani pchalasani commented Apr 29, 2022

    Strangely, I see this error only when running inside PyCharm in debug mode, and it happens non-deterministically. When I run in cmd line it does not happen.

    @HFrost0
    Copy link

    @HFrost0 HFrost0 commented May 18, 2022

    This problem is related to spawn method to create a new process (which is the default way in windows and macos since python 3.8). Example below works fine without warning⬇️

    import multiprocessing as mp
    
    mp.set_start_method('fork')  # switch to fork
    global_resource = mp.Semaphore()
    
    
    def submain():
        pass
    
    
    if __name__ == '__main__':
        p = mp.Process(target=submain)
        p.start()
        p.join()

    To avoid this warning in another way, create global_resource inside if __name__ == '__main__' to avoid re-creation in subprocess. This question may be helpful

    import multiprocessing as mp
    
    
    def submain():
        pass
    
    
    if __name__ == '__main__':
        global_resource = mp.Semaphore()
        p = mp.Process(target=submain)
        p.start()
        p.join()

    Test environment is macos python 3.9, example above can be a temp solution, hoping MR fix this soon.

    @jxdabc
    Copy link
    Contributor

    @jxdabc jxdabc commented May 19, 2022

    @HFrost0

    Thanks for your workaround and tracking down.

    Unfortunately, in the case of Google Repo. The global resource is not at the main module. Instead a module which the main module imports creates a global_resource. So, the workaround won't help in this case. : (

    @jxdabc
    Copy link
    Contributor

    @jxdabc jxdabc commented May 25, 2022

    cc @pitrou @gpanders
    Any idea or workaround?

    @stevewatanabe
    Copy link

    @stevewatanabe stevewatanabe commented Jun 1, 2022

    It would be nice to have this fixed....this warning appears on most of the repo commands I use :-(

    @jxdabc
    Copy link
    Contributor

    @jxdabc jxdabc commented Jun 2, 2022

    It would be nice to have this fixed....this warning appears on most of the repo commands I use :-(

    me, too :-(

    pitrou added a commit that referenced this issue Jun 9, 2022
    …awn (#30617)
    
    Co-authored-by: XD Trol <milestonejxd@gmail.com>
    Co-authored-by: Antoine Pitrou <pitrou@free.fr>
    miss-islington pushed a commit to miss-islington/cpython that referenced this issue Jun 9, 2022
    …ing spawn (pythonGH-30617)
    
    Co-authored-by: XD Trol <milestonejxd@gmail.com>
    Co-authored-by: Antoine Pitrou <pitrou@free.fr>
    (cherry picked from commit 30610d2)
    
    Co-authored-by: Leo Trol <milestone.jxd@gmail.com>
    miss-islington pushed a commit to miss-islington/cpython that referenced this issue Jun 9, 2022
    …ing spawn (pythonGH-30617)
    
    Co-authored-by: XD Trol <milestonejxd@gmail.com>
    Co-authored-by: Antoine Pitrou <pitrou@free.fr>
    (cherry picked from commit 30610d2)
    
    Co-authored-by: Leo Trol <milestone.jxd@gmail.com>
    miss-islington added a commit that referenced this issue Jun 10, 2022
    …awn (GH-30617)
    
    Co-authored-by: XD Trol <milestonejxd@gmail.com>
    Co-authored-by: Antoine Pitrou <pitrou@free.fr>
    (cherry picked from commit 30610d2)
    
    Co-authored-by: Leo Trol <milestone.jxd@gmail.com>
    miss-islington added a commit that referenced this issue Jun 10, 2022
    …awn (GH-30617)
    
    Co-authored-by: XD Trol <milestonejxd@gmail.com>
    Co-authored-by: Antoine Pitrou <pitrou@free.fr>
    (cherry picked from commit 30610d2)
    
    Co-authored-by: Leo Trol <milestone.jxd@gmail.com>
    @pitrou
    Copy link
    Member

    @pitrou pitrou commented Jun 10, 2022

    Okay, this is fixed now. Thanks @jxdabc for the contribution!

    @pitrou pitrou closed this as completed Jun 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.8 3.9 3.10 3.11 expert-multiprocessing performance stdlib
    Projects
    None yet
    Development

    No branches or pull requests

    7 participants