
bpo-38119: Fix shmem resource tracking #21516

Open · wants to merge 3 commits into base: master
Conversation

@vinay0410 (Contributor) commented Jul 17, 2020

Eliminates the use of the resource tracker on shared memory segments but preserves its use on semaphores, etc. where it is appropriately applied.

Adds a test that fails with the current code (an independent Python process started via subprocess accesses shared memory created by a different Python process, then exits normally, expecting the original creator process to continue to be able to access that shared memory) but passes with this patch to defend against future regressions.

https://bugs.python.org/issue38119
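
The scenario the new test guards against can be sketched like this (a hedged, self-contained illustration, not the PR's actual test code; depending on interpreter version and platform, the re-attach below either succeeds or fails with FileNotFoundError):

```python
# Reproduce the bpo-38119 scenario: a second Python process attaches to an
# existing segment and exits normally; with the buggy tracking, the child's
# resource tracker unlinks the segment even though the creator still uses it.
import subprocess
import sys
import time
from multiprocessing import shared_memory

shm = shared_memory.SharedMemory(create=True, size=16)
shm.buf[0] = 42

child_code = (
    "from multiprocessing import shared_memory; "
    f"s = shared_memory.SharedMemory(name={shm.name!r}); "
    "s.close()"  # attach, use, then a perfectly normal exit
)
subprocess.run([sys.executable, "-c", child_code], check=True)
time.sleep(1)  # give the child's resource tracker time to act

try:
    probe = shared_memory.SharedMemory(name=shm.name)
    probe.close()
    outcome = "survived"   # desired behaviour, which the patch defends
except FileNotFoundError:
    outcome = "unlinked"   # the bug: the creator lost its segment
print("segment", outcome)

shm.close()
try:
    shm.unlink()
except FileNotFoundError:
    pass
```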

@vinay0410 (Contributor, Author) commented Jul 17, 2020

@tiran, this is a duplicate of #16139 #15989, which had conflicts, so I fixed them and opened this.
These changes will also fix the following issue:
https://bugs.python.org/issue39959

@gvanrossum (Member) commented Jul 27, 2020

Davin (@applio), can you review and approve this? What was the hold-up with your own fix? (This is a copy that fixes the merge conflicts.) Or alternatively, fix the conflicts in #16139 so that it can be merged?

@vinay0410 (Contributor, Author) commented Jul 28, 2020

Correction, I mentioned the wrong pull request earlier.
This is a duplicate of #15989

@DamianBarabonkovQC commented Aug 3, 2020

I would like to ask about this issue to get some more context, because I may be missing something. I have noticed the exact same problem described in issue38119. Is removing resource tracking entirely from SharedMemory the preferred way forward?

I think there is some benefit to this resource tracking. I'm giving you a heads-up that I'm currently working on a patch that resolves this issue and retains the resource tracker. I'll submit it as a PR when it is done.

@gvanrossum (Member) commented Aug 3, 2020

We don't actually know what went on in @applio's head when he wrote this patch, and he doesn't seem to respond to email. But, @DamianBarabonkovQC, before you do more work, can you explain what you are planning to do? (Honestly, I personally know nothing about the resource tracker; I am just hoping to help find a resolution of one type or another.) Probably the better place for your write-up would be in bpo-38119.

@vinay0410 (Contributor, Author) commented Aug 3, 2020

@DamianBarabonkovQC, I would say removing the resource tracker is not the best way forward, but it is the easiest way to fix some of the major issues plaguing shared_memory. You can see my comment detailing why Unix's shared memory will still be inconsistent with Windows.

Windows uses a reference-counting mechanism to clean up shared memory segments: it only destroys/unlinks a segment once all the processes using it are no longer running.

I have mentioned some methods for implementing a reference-counting mechanism to make Unix's implementation consistent with Windows here.

Having said that, there are still major issues in shared_memory which this PR will fix. Consider the scenario mentioned here.
A similar scenario has also been mentioned here.

@DamianBarabonkovQC commented Aug 3, 2020

@vinay0410, on this matter I agree that this PR is a good fix at present. I'm just thinking ahead.

I like the idea of a reference count at the start of the shared memory segment, for its simplicity and clarity. My solution involves the /tmp directory, but as was mentioned in your post, that might not always work depending on the user's settings for /tmp.

@vinay0410 (Contributor, Author) commented Aug 3, 2020

@DamianBarabonkovQC, completely agree. I think for now this should be merged to fix some of the critical issues, and then we should have a discussion with the core developers about a better, more robust solution to this problem.

@DamianBarabonkovQC commented Aug 6, 2020

Thanks for your input on this matter, @gvanrossum. Both Vinay and I agree that, for the time being, a temporary patch would be to simply remove resource tracking from SharedMemory, since it is a source of issues in the current implementation. How could we get this patch accepted?

Moving forward, I have begun a discussion detailing a potential solution here. It is not a very invasive proposal, and it should implement resource tracking correctly.

I would really appreciate your thoughts on it, since maybe my proposal is more invasive than I think, or there is a better approach.

@gvanrossum (Member) commented Aug 6, 2020

You need to get someone to review it who actually understands shared memory. Since @tiran hasn't responded to your review request and @applio has not responded to repeated pleas, I'm not sure who to try next (I don't qualify; I don't understand this code at all). Maybe you can email @ericsnowcurrently and get him interested?

@vinay0410 (Contributor, Author) commented Aug 28, 2020

Hi @gvanrossum, thanks for suggesting possible reviewers. I dropped an email to @ericsnowcurrently two weeks ago, but he hasn't opened it yet. Do you have any other suggestions for getting this PR reviewed, since this is a relatively critical issue in multiprocessing.shared_memory?

@vinay0410 vinay0410 closed this Aug 28, 2020
@vinay0410 (Contributor, Author) commented Aug 28, 2020

Reopened to re-trigger CI/CD.

@vinay0410 vinay0410 reopened this Aug 28, 2020
@gvanrossum (Member) commented Aug 28, 2020

@vinay0410 vinay0410 force-pushed the vinay0410:fix_shmem_resource_tracking branch from a061cd5 to 6d09ce7 Aug 28, 2020
@vinay0410 (Contributor, Author) commented Oct 3, 2020

Hi @gvanrossum, I posted on the python-dev mailing list around a month ago (https://mail.python.org/archives/list/python-dev@python.org/message/O67CR7QWEOJ7WDAJEBKSY74NQ2C4W3AI/) but haven't received any response.
Are there any other next steps?

@gvanrossum (Member) commented Oct 3, 2020

@vinay0410 (Contributor, Author) commented Oct 3, 2020

Sure, I will send you a reminder on October 19th.

@lthakur007 commented Oct 14, 2020

Hello Team,
May I know when this PR will get merged?
Thank You!

@aeros aeros requested a review from pitrou Oct 18, 2020
@vinay0410 (Contributor, Author) commented Oct 19, 2020

Hi @gvanrossum, as suggested, I am reminding you to take up review of this PR during the Python core dev sprint.

@gvanrossum (Member) commented Oct 19, 2020

@vinay0410 Thanks for the ping. I've heard that @pitrou should be able to help out here.

@pitrou pitrou removed the request for review from tiran Oct 19, 2020
@pitrou (Member) commented Oct 19, 2020

Removed @tiran as a reviewer: this is really because the **/*sha* pattern in CODEOWNERS wrongly matches multiprocessing/shared_memory.py (the intent was to match the SHAxxx cryptographic hash algorithms).

@pitrou (Member) commented Oct 19, 2020

This code was originally added in #13222. I don't think we can remove it without some discussion. I'm cc'ing @pierreglaser as well as @tomMoral and @ogrisel to see if they have better suggestions to reconcile the various expectations here.

@pitrou (Member) commented Oct 19, 2020

(otherwise I'll recommend rejecting)

@gvanrossum (Member) commented Oct 19, 2020

Why reject? There is a real problem here (https://bugs.python.org/issue38119).

@pitrou (Member) commented Oct 19, 2020

Well, the problem is really differing expectations. In some cases, you want shared memory segments to be reliably collected when a process dies abruptly (Unix systems don't ensure this, so you have to do it yourself). In other cases, you want shared memory segments to outlive their known consumers. The proper solution, IMHO, would be to make it configurable so that everybody can have it their own way.

In any case, removing this code blindly would produce resource leaks in existing applications.
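
For illustration, the configurable behaviour described here could surface as an opt-out keyword on SharedMemory; newer CPython releases (3.13+) did eventually add a `track` keyword of exactly this shape. A hedged sketch that feature-detects the keyword and falls back to the always-tracked behaviour on older versions:

```python
# Opt out of resource tracking when the interpreter supports it. With
# track=False, cleanup becomes entirely the caller's responsibility.
import inspect
from multiprocessing import shared_memory

params = inspect.signature(shared_memory.SharedMemory.__init__).parameters
kwargs = {"track": False} if "track" in params else {}

shm = shared_memory.SharedMemory(create=True, size=8, **kwargs)
shm.buf[:3] = b"abc"
assert bytes(shm.buf[:3]) == b"abc"
shm.close()
shm.unlink()   # with track=False, nobody else will ever do this for us
```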

@gvanrossum (Member) commented Oct 19, 2020

Ah, I see. First thing I've seen in this entire thread from someone who actually understands both sides of the problem. Thanks!

@vinay0410 (Contributor, Author) commented Oct 20, 2020

@pitrou, @gvanrossum,
I agree that removing this code will cause problems like memory leaks.
As @pitrou suggested making it configurable: how about adding track=True as an optional argument defaulting to True? With track=False, no resource tracking would be used and the user would have to destroy the shared memory segment manually; otherwise resource tracking would work as it does now.

Another approach is to use Windows' mechanism for cleaning up shared memory. Windows keeps a reference count for every shared memory segment, and only destroys a segment when its reference count reaches 0. Using this would also make Unix's mechanism consistent with the current Windows mechanism.
It would be difficult to implement the configurable approach on Windows, because Windows destroys shared memory segments as soon as their reference count reaches 0, even if track=False.

@tomMoral (Contributor) commented Oct 20, 2020

The idea of the resource_tracker is to add a safety net for shared_memory. Correct clean-up cannot always happen, either because processes can die abruptly or because users don't realize they need to clean up. On long-running machines -- such as servers or computing clusters -- these leaked resources can pile up and become a real issue. Moreover, this can be hard to debug, for instance if someone gets a shared-memory-full error that is actually caused by other users' leaked resources.

The resource_tracker avoids this by being perhaps overly conservative, but I don't think removing it completely is the right option. IMO, @pitrou's suggestion of making this configurable is probably the only way to easily allow both behaviours while explicitly leaving the duty of cleaning up the resource to the user -- adding an option that bypasses the resource_tracker for users who want to manage the shared memory themselves, while still being conservative in the default case.

As for the ref count at the beginning of the file: the issue is that I don't see how to ensure we detect that the resource should be cleaned up if a process holding a reference to it dies.

@vinay0410 (Contributor, Author) commented Oct 20, 2020

@tomMoral, for reference counting at the beginning of the file, one would still need the resource tracker to ensure that the shared memory segment is destroyed as soon as the reference count reaches 0; the resource tracker would be responsible for incrementing/decrementing the reference count.

@tomMoral (Contributor) commented Oct 20, 2020

@vinay0410 The issue is with the child processes of the main process.

Only the main process is tracked by the resource_tracker. So if any of the child processes die without decrementing the ref count, how does the resource tracker know how many child processes have incremented it?

Maybe it would be possible to have an internal ref count in the resource tracker -- allowing it to decrement the ref count by the number of processes connected to the resource_tracker that died -- but this is by no means easy and can quite easily lead to resource leaks or early destruction. So I am not sure it is a solution that can work.

Just thinking out loud, maybe another solution would be to make the ref count the number of resource_trackers that hold a reference to the shared memory? This would group the processes per block (main process and descendants), without changing the behaviour when there is only one block, and with a clean-up mechanism aware of the multiple blocks, tracked at the block level -- clean up (decref/unlink) once the main process dies.

@vinay0410 (Contributor, Author) commented Oct 20, 2020

Just thinking out loud, maybe another solution would be to make the ref count the number of resource_trackers that hold a reference to the shared memory? This would group the processes per block (main process and descendants), without changing the behaviour when there is only one block, and with a clean-up mechanism aware of the multiple blocks, tracked at the block level -- clean up (decref/unlink) once the main process dies.

@tomMoral I think I concur. If the shared memory segment is passed to a child process, the reference count shouldn't be increased, and it shouldn't be decreased when the child process dies. So basically, the reference count of a shared memory segment can only be increased or decreased by a process which attaches to it directly or creates it.

Is that what you were also thinking?

Also, I think child processes will have their own separate resource trackers, right?
Because shared memory segments created by child processes are still being tracked (I think).

@tomMoral (Contributor) commented Oct 20, 2020

Also, I think child processes will have their own separate resource trackers, right?
Because shared memory segments created by child processes are still being tracked (I think).

Unless something weird happens, the resource tracker is shared across all descendants of a main process (see here for instance). If a child process creates a new segment, it is also tracked in the same resource_tracker.

@tomMoral I think I concur. If the shared memory segment is passed to a child process, the reference count shouldn't be increased, and it shouldn't be decreased when the child process dies. So basically, the reference count of a shared memory segment can only be increased or decreased by a process which attaches to it directly or creates it.

Is that what you were also thinking?

Actually, the issue is that passing the shared memory segment to a child process is the same as passing it to any other process. The name of the segment is sent and we create a new SharedMemory object; we cannot have different behaviour between these two cases. IMO, incrementing the refcount should be done by the resource_tracker. This ensures that we have only one reference count per "group of processes", as the resource_tracker is shared by all these processes. And if a register event is sent for the same segment name, the refcount won't be increased again.

@vinay0410 (Contributor, Author) commented Nov 2, 2020

@tomMoral, should I then start implementing the refcount using the resource tracker approach? Are you okay with that?

@tomMoral (Contributor) commented Nov 3, 2020

I am okay with the approach of ref counting via the resource_tracker.

Just to summarize and make sure we are on the same page:

  • When creating a SharedMemory segment with create=True, reserve space for an integer that will be the ref count.
  • Whenever a SharedMemory is instantiated, call register in the resource_tracker.
  • The resource_tracker handles incrementing the refcount, and it only increments it when the name of the shared memory segment is unknown to it (once per group of processes). Not sure which primitive can be used to count atomically here; an option is to use a semaphore file (store the name at the beginning of the shm segment), but it feels wasteful.
  • When a group of processes dies, if the segment still exists (it may have been unlinked by a direct call in another process), the refcount is decremented, and if it reaches 0, the resource_tracker calls unlink on the segment.

This seems to provide a simple enough mechanism to ensure that shared memory segments are not left unattended, while providing the flexibility to have multiple unrelated processes using them.

A potential backward-compatibility issue is with cross-language applications (for instance, people using a C program to access this shared memory). I don't know if there is a way to ensure that the refcount is not treated as part of the user-visible data of the segment.
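
The summarized steps can be sketched in pure Python as a layout illustration. All helper names here are hypothetical, and the struct-based read-modify-write below is NOT atomic; a real implementation would still need the atomic primitives discussed further down the thread:

```python
# Layout sketch only: the refcount lives in the last 8 bytes of the segment
# and users are meant to touch only the leading part.
import struct
from multiprocessing import shared_memory

REFCOUNT_SIZE = 8  # bytes reserved at the end for a signed 64-bit counter

def create_segment(user_size):
    shm = shared_memory.SharedMemory(create=True, size=user_size + REFCOUNT_SIZE)
    struct.pack_into("q", shm.buf, user_size, 1)  # creator holds 1 reference
    return shm

def incref(shm, user_size):
    # NOT atomic: placeholder for __atomic_add_fetch on the same bytes.
    n = struct.unpack_from("q", shm.buf, user_size)[0] + 1
    struct.pack_into("q", shm.buf, user_size, n)
    return n

def decref(shm, user_size):
    n = struct.unpack_from("q", shm.buf, user_size)[0] - 1
    struct.pack_into("q", shm.buf, user_size, n)
    return n

shm = create_segment(64)
shm.buf[:5] = b"hello"         # user data never touches the counter
assert incref(shm, 64) == 2    # another group of processes attached
assert decref(shm, 64) == 1    # ...and went away again
if decref(shm, 64) == 0:       # last reference gone: time to unlink
    shm.close()
    shm.unlink()
```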

@vinay0410 (Contributor, Author) commented Nov 3, 2020

Not sure which primitive can be used to count atomically here; an option is to use a semaphore file (store the name at the beginning of the shm segment), but it feels wasteful.

@tomMoral, I was thinking of using the following primitives to count atomically:
https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html

These primitives ensure cross-process atomic updates, and some of them (namely __atomic_store_n and __atomic_load_n) are already used in the CPython source code to implement the GIL:

__atomic_store_n(&((ATOMIC_VAL)->_value), NEW_VAL, ORDER)

@tomMoral (Contributor) commented Nov 3, 2020

@vinay0410 As these are low-level C primitives, the refcount seems complicated to implement and maintain -- but maybe I just don't see how you would use them, because the resource_tracker is implemented in pure Python. What is your idea for calling these primitives?

Also, do you think it is possible to shift the mmap or the memoryview by a certain number of bytes, to avoid writing over the refcount by mistake if it is at the beginning? Or maybe the refcount should be put at the end of the segment, as the mmap and memoryview can be restricted to a smaller size. Then the only danger comes from the fact that a C program could very well overwrite the refcount, but that is low-level stuff, so probably okay.

Note that it might be worth considering a named semaphore (which is typically designed for such atomic counting): add the name of the semaphore at the end of the file, plus the length of the name so it can be read back. This would allow implementing everything in Python (Semaphore and pickle). I can try to do a basic implementation if you want.

@pitrou (Member) commented Nov 3, 2020

as these are low-level C primitives, the refcount seems complicated to implement and maintain

It would be easy enough to expose Python wrappers for the atomic intrinsics in the _multiprocessing module, IMHO.

Note that you need atomic increments and decrements, not mere loads and stores. This is provided in a portable way by the remarkable portable-snippets library: https://github.com/nemequ/portable-snippets/tree/master/atomic

Or maybe the refcount should be put at the end of the segment, as the mmap and memoryview can be restricted to a smaller size.

That sounds like the best solution, since it would maintain alignment (though I'm not sure it matters a lot).

cc @aeros, by the way.

@vinay0410 (Contributor, Author) commented Nov 4, 2020

@pitrou ,

Note that you need atomic increments and decrements, not mere loads and stores.

That's correct. I am planning to use __atomic_add_fetch and __atomic_sub_fetch.

Or maybe the refcount should be put at the end of the segment, as the mmap and memoryview can be restricted to a smaller size.

That sounds like the best solution, since it would maintain alignment (though I'm not sure it matters a lot).

In shared_memory.py, where the buffer is currently created as

self._buf = memoryview(self._mmap)

I was hoping to replace this with:

self._buf = memoryview(self._mmap)
self._ref_count = self._buf[0:4]
self._public_buf = self._buf[4:]

And then replace self._buf with self._public_buf here.

Finally, I will pass self._ref_count to Python functions which will atomically increment or decrement the refcount, implemented using C builtins.
cc: @tomMoral
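
The slicing plan above can be exercised with a plain anonymous mmap standing in for the shared-memory mapping (a hedged illustration only; the variable names mirror the proposal, and it shows that user writes through the offset view cannot clobber the header):

```python
# The first 4 bytes hold the refcount; everything after byte 4 is public.
import mmap
import struct

m = mmap.mmap(-1, 4096)      # anonymous mapping stands in for the shm mmap
buf = memoryview(m)
ref_count = buf[0:4]         # private header view (self._ref_count)
public_buf = buf[4:]         # what would be handed out (self._public_buf)

struct.pack_into("i", ref_count, 0, 1)
public_buf[:5] = b"hello"    # user data starts at byte 4 of the mapping

assert struct.unpack_from("i", ref_count, 0)[0] == 1  # header intact
assert m[4:9] == b"hello"

public_buf.release()
ref_count.release()
buf.release()
m.close()
```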

@vinay0410 (Contributor, Author) commented Nov 6, 2020

Hi @pitrou and @tomMoral, I have raised PR #23174 implementing the above solution. I have not updated tests and documentation yet, because I first wanted some initial feedback on the approach I am using.

I have added functions _posixshmem.shm_inc_refcount(memoryview) and _posixshmem.shm_dec_refcount(memoryview) to atomically increment and decrement the integer referenced by the passed memoryview object.

@aeros (Member) commented Nov 8, 2020

@tomMoral

The issue is with the child processes of the main process.

Only the main process is tracked by the resource_tracker. So if any of the child processes die without decrementing the ref count, how does the resource tracker know how many child processes have incremented it?

Maybe it would be possible to have an internal ref count in the resource tracker -- allowing it to decrement the ref count by the number of processes connected to the resource_tracker that died -- but this is by no means easy and can quite easily lead to resource leaks or early destruction. So I am not sure it is a solution that can work.

I'm not familiar enough with resource_tracker to know for certain whether this would work, but one possible way to handle it could be a SIGCHLD handler that decrements the internal refcount whenever a child process dies or is stopped. Some care would be needed to ensure that, if a handler was already set by an external program, it is restored after all of the child processes have been accounted for.
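
A hedged sketch of that SIGCHLD idea (hypothetical names, not actual resource_tracker code; POSIX-only, since Windows has no SIGCHLD), including the precaution of chaining to a previously installed handler:

```python
# Decrement an internal per-segment refcount whenever a child dies, and
# chain to any handler an external program had already installed.
import os
import signal

segment_refcounts = {"psm_example": 3}    # hypothetical tracked segments

_previous = signal.getsignal(signal.SIGCHLD)

def _on_child_exit(signum, frame):
    while True:                           # reap every child that has exited
        try:
            pid, _status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break                         # no children left at all
        if pid == 0:
            break                         # children exist, none has exited
        for name in segment_refcounts:
            segment_refcounts[name] -= 1  # one reference per dead child
    if callable(_previous):               # chain to an external handler
        _previous(signum, frame)

signal.signal(signal.SIGCHLD, _on_child_exit)
```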
