Join GitHub today
GitHub is home to over 50 million developers working together to host and review code, manage projects, and build software together.
Sign upGitHub is where the world builds software
Millions of developers and companies build, ship, and maintain their software on GitHub — the largest and most advanced development platform in the world.
bpo-38119: Fix shmem resource tracking #21516
Conversation
@tiran , This is a duplicate of |
Correction, I mentioned the wrong pull request earlier. |
I would like to ask about this issue to get some more context, because maybe I may be missing something. I have noticed this exact same problem as in issue38119. Is removing the resource tracking entirely from SharedMemory the preferred way forward? I think that there is some benefit to this resource tracking. I'm giving you a heads up that I'm currently working on a patch that resolves this issue and retains the resource tracker. I'll submit it as a PR when it is done. |
We don't actually know what went on in @applio's head when he wrote this patch, and he doesn't seem to respond to email. But, @DamianBarabonkovQC, before you do more work, can you explain what you are planning to do? (Honestly I personally know nothing about the resource tracker, I am just hoping to help find a resolution of one type or another.) Probably the better place to do your write-up would be the in bpo-38119. |
@DamianBarabonkovQC , I would say removing resource tracker is not the best way to go forward, but it is the easiest way to fix some of the major issues plaguing shared_memory. You can see my comment detailing why unix's shared memory will still be inconsistent with windows. Windows uses a reference counting mechanism to cleanup shared memory segment, it only destroys/unlinks the shared memory segment if all the process using it are no longer running. I have mentioned some methods, how we can implement a reference counting mechanism to make unix's implement consistent with windows here. Having said that, there are still major issues in shared_memory which this PR will fix. Consider the scenario mentioned here. |
@vinay0410, to this matter I agree that this PR is a good fix at present. I'm just thinking forward. I like the idea of a reference count at the start of the shared memory segment, for its simplicity and clarity. My solution involves the /tmp directory, but was was mentioned in your post, that might not always work depending on the user's settings for /tmp. |
@DamianBarabonkovQC , Completely agree, I think for now this should be merged to fix some of the critical issues, and then we should have a discussion with core developers about a better and robust solution to tackle this problem. |
Thanks for your input @gvanrossum to this matter. Both Vinay and I agree that for the time being, a temporary patch to this issue would be to simply remove the resource tracking temporarily from the SharedMemory since it is a source of issue with the current implementation. How could we get this patch accepted? Moving forward, I have began a discussion detailing a potential solution here. This would not be very invasive proposal, that should implement resource tracking correctly. I would really appreciate your thoughts on this manner, since maybe my proposal is more invasive than I think, or there is a better approach. |
You need to get someone to review it who actually understand shared memory. Since @tiran hasn't responded to your review request and @applio has not responded to repeated pleas, I'm not sure who to try next (I don't qualify, I don't understand this code at all). Maybe you can email @ericsnowcurrently and get him interested? |
Hi @gvanrossum , Thanks for suggesting possible reviewers. I dropped and email to @ericsnowcurrently 2 weeks ago, but he hasn't opened it yet. Do you have any other suggestions to get this PR reviewed, since this is relatively critical issue in |
reopened to re-trigger CI-CD. |
Next step is a post on python-dev.
On Thu, Aug 27, 2020 at 22:13 Vinay Sharma ***@***.***> wrote:
Hi @gvanrossum <https://github.com/gvanrossum> , Thanks for suggesting
possible reviewers. I dropped and email to @ericsnowcurrently
<https://github.com/ericsnowcurrently> 2 weeks ago, but he hasn't opened
it yet. Do you have any other suggestions to get this PR reviewed, since
this is relatively critical issue in multiprocessing.shared_memory.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#21516 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAWCWMWFDAGXVTEZHNGCBUTSC44ILANCNFSM4O526PVA>
.
--
--Guido (mobile)
|
…ndle after any one independent process exit.
a061cd5
to
6d09ce7
Hi @gvanrossum , I posted on python-dev mailing list around a month ago (https://mail.python.org/archives/list/python-dev@python.org/message/O67CR7QWEOJ7WDAJEBKSY74NQ2C4W3AI/) but haven't received any response. |
Hi Vinay,
In a few weeks (Oct 19-23) we'll have the (virtual) core dev sprint. Unlike
other years, Davin Potts hasn't signed up, but I will try to get some other
core dev to help decide and review this. Could you send me a reminder
around the 19th?
…--Guido
On Fri, Oct 2, 2020 at 9:52 PM Vinay Sharma ***@***.***> wrote:
Hi @gvanrossum <https://github.com/gvanrossum> , I posted on python-dev
mailing list around a month ago (
***@***.***/message/O67CR7QWEOJ7WDAJEBKSY74NQ2C4W3AI/)
but haven't received any response.
Are there any other next steps ?
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#21516 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAWCWMTDRTTTIQH3PJM6Q3LSI2UYNANCNFSM4O526PVA>
.
--
--Guido van Rossum (python.org/~guido)
*Pronouns: he/him **(why is my pronoun here?)*
<http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/>
|
Sure, I will send you a reminder on October 19th. |
Hello Team, |
Hi @gvanrossum , As suggested I am reminding you to take up review of this PR in python core dev sprint. |
@vinay0410 Thanks for the ping. I've heard that @pitrou should be able to help out here. |
Removed @tiran as a reviewer since this is really because of the |
This code was originally added in #13222. I don't think we can remove it without some discussion. I'm cc'ing @pierreglaser as well as @tomMoral and @ogrisel to see if they have better suggestions to reconcile the various expectations here. |
(otherwise I'll recommend rejecting) |
Why reject? There is a real problem here (https://bugs.python.org/issue38119). |
Well, the problem is really differing expectations. In some cases, you want shared memory segments to be reliably collected when a process dies abruptly (Unix systems don't ensure this, so you have to do it yourself). In other cases, you want shared memory segments to outlive their known consumers. The proper solution, IMHO, would be to make it configurable so that everybody can have it their own way. In any case, removing this code blindly would produce resource leaks in existing applications. |
Ah, I see. First thing I've seen in this entire thread from someone who actually understands both sides of the problem. Thanks! |
@pitrou , @gvanrossum , Another approach is to use Windows' mechanism of cleaning shared memory. Windows keeps reference count for every shared memory segment, and only destructs when reference count becomes 0. Using this will also make Unix's mechanism consistent with current windows' mechanism. |
The idea of the The For the ref counting at the beginning of the file, the issue is that then, I don't see how to ensure that we detect the resource should be cleaned up if a process holding a reference on it dies. |
@tomMoral , for reference counting at the beginning of the file, one would need resource tracker to ensure that the shared memory segment is destructed as soon as reference count becomes 0, and resource tracker will be responsible for decrementing/incrementing reference count. |
@vinay0410 The issue is with the child processes of the main process. Only the main process is tracked by the Maybe it would be possible by having an internal ref count in the resource tracker - allowing to decrement the ref count by the number of process connected to the Just thinking out loud, maybe another solution would be to have a ref count of the number of |
@tomMoral I think I concur. If the shared memory segment is passed to the child process, the reference count shouldn't be increased and it shouldn't be decreased when the child process dies. So basically, the reference count of a shared memory segment can only be increased or decreased by a process which attaches to it directly or creates it. Is that what you were also thinking ? Also, I think child process will have their own separate resource trackers, right ? |
Unless something weird happens, the resource tracker is shared accross all descendant of a main process (see here for instance). If a child process creates a new segment, it is also tracked in the same
Actually, the issue is that passing the shared memory segment to a child process is the same as passing it to another process. The name of the segment is send and we create a new |
@tomMoral , should I then start with implementing refcount using resource tracker approach. Are you okay with the approach ? |
I am okay with the approach of ref counting with the Just to summarize and make sure we are on the same page:
This seems to provide a simple enough mechanism to ensure that shared memory segment are not left unattended, while providing the flexibility to have multiple unrelated processes using it. A potential backward compat issue is with cross language application (for instance people using C program to access this shared memory). I don't know if there is a way to ensure that the refcount is not used as part of the shared memory segment. |
@tomMoral , I was thinking of using the following primitives to count in an atomic way. These primitives ensure cross process atomic updates. And these primitives (namely cpython/Include/internal/pycore_atomic.h Line 94 in 74b4eda |
@vinay0410 as these are low-level C primitives, the refcount seems complicated to implement and maintain but maybe I just don't see how you would use them because the Also, do you think it is possible shift the Note that it might be worth considering using a named semaphore (which is typically designed for such atomic counting). Adding at the end of the file the name of a semaphore and the length of the name to be able to read it. This would allow to implement everything with python ( |
It would be easy enough to expose Python wrappers for the atomic intrinsics in the Note that you need atomic increments and decrements, not mere loads and stores. This is provided in a portable way by the remarkable portable-snippets library: https://github.com/nemequ/portable-snippets/tree/master/atomic
That sounds like the best solution, since it would maintain alignment (though I'm not sure it matters a lot). cc @aeros, by the way. |
@pitrou ,
That's correct I am planning to use,
In cpython/Lib/multiprocessing/shared_memory.py Line 180 in db6434c
And then replace cpython/Lib/multiprocessing/shared_memory.py Line 204 in db6434c Finally I will pass self._ref_count to python functions which will atomically increment or decrement refcount implemented using c builtins. |
Hi @pitrou and @tomMoral , I have raised a PR #23174 , implementing the above solution. Also, I have not updated tests and documentation yet. Because, I just wanted to get some initial feedback on the approach I am using to implement the above solution. I have added functions |
I'm not strongly familiar enough with |
Eliminates the use of the resource tracker on shared memory segments but preserves its use on semaphores, etc. where it is appropriately applied.
Adds a test that fails with the current code (an independent Python process started via subprocess accesses shared memory created by a different Python process, then exits normally, expecting the original creator process to continue to be able to access that shared memory) but passes with this patch to defend against future regressions.
https://bugs.python.org/issue38119