bpo-28053: Allow custom reducer when using multiprocessing #15058
Conversation
You will notice that for the def process(self, *args, **kwargs):
return self._Process(*args, reducer=self.get_reducer(), **kwargs) This comes from the fact that the cpython/Lib/multiprocessing/context.py Lines 220 to 224 in 7397cda For the process class, passing the class Process(process.BaseProcess):
_start_method = None
@staticmethod
def _Popen(process_obj):
if self._ctx is not None:
return self._ctx.get_context()._Process._Popen(process_obj)
return _default_context.get_context()._Process._Popen(process_obj) with So implementation-wise, it is easier to simply pass the But I am opened to suggestions. |
@reducer.setter | ||
def reducer(self, reduction): | ||
globals()['reduction'] = reduction | ||
if not isinstance(reduction, original_reducer.AbstractReducer): |
The checks below aren't strict enough; they pass for a .register()
ed subclass of Abstract<Reducer|Pickler|Unpickler>
, but the code won't actually work with such a class due to depending on undocumented APIs.
I do not understand what you mean by "Undocumented API". This PR comes with documentation for the AbstractReducer
, AbstractPickler
, and AbstractUnpickler
.
What I mean is that the documentation suggests that the below code should work:
class DummyReducer:
def get_pickler_class():
return pickle.Pickler
def get_unpickler_class():
return pickle.Unpickler
AbstractReducer.register(DummyReducer)
multiprocessing.set_reducer(DummyReducer())
However, in practice, it appears that running that code (and then doing something that uses reduction) will produce a (rather cryptic) error message in some cases. Either the documentation should be clear that virtual subclasses don't work and the code should raise an error when a virtual subclass is passed, or the code should be updated to handle virtual subclasses.
What I mean is that the documentation suggests that the below code should work
Not really: the get_pickler_class
API documentation specifies that get_pickler_class
must return anAbstractPickler
subclass, but your DummyReducer
(which itself, should be an AbstractReducer
subclass) simply returns a Pickler
, which is not an AbstractPickler
subclass. Or maybe I misunderstood your message?
You didn't misunderstand anything, my example was accidentally bogus, and I misunderstood the code. The actual bug I was trying to point out appears not to exist with reducers, but instead with picklers.
class DummyReducer(AbstractReducer):
def get_pickler_class():
return DummyPickler
class DummyPickler(pickle.Pickler):
pass
AbstractPickler.register(DummyPickler)
multiprocessing.set_reducer(DummyReducer())
This seemingly harmless code appears to have the effect of breaking the passing of all of the custom types that multiprocessing
registers, in an entirely non-intuitive way, which I think should either be fixed or documented.
@pitrou what could I do to ease the review process of this PR? |
@pierreglaser The first necessary condition is for me to find some time... sorry :-/ |
@pitrou no worries :) |
'''Abstract base class for use in implementing a Reduction class | ||
suitable for use in replacing the standard reduction mechanism | ||
used in multiprocessing.''' | ||
ForkingPickler = ForkingPickler | ||
register = register | ||
dump = dump | ||
send_handle = send_handle |
None of the attributes defined between here and __init__
appear to be used.
Same comment as below.
|
||
def __init__(self, *args): | ||
register(type(_C().f), _reduce_method) | ||
register(type(list.append), _reduce_method_descriptor) | ||
register(type(int.__add__), _reduce_method_descriptor) | ||
register(functools.partial, _reduce_partial) | ||
register(socket.socket, _reduce_socket) | ||
|
||
@property | ||
def HAVE_SEND_HANDLE(self): |
This property does not appear to be used.
This is work taken from @pablogsal previous commits (see #9959). I assume there was a good reason for this, but he'll probably know better than me.
def get_unpickler_class(self): | ||
return AbstractUnpickler | ||
|
||
def register(self, type_, reduce_): |
This method (a) doesn't appear to be used and (b) doesn't necessarily work, because nothing requires the get_pickler_class
to define a register
method.
Regarding (b), this is stated in the documentation that get_pickler_class
should return a subclass of an AbstractPickler
, that itself (as a subclass of ForkingPickler
), implements a register
method. Regarding (a), same comment as above (this @pablogsal's work), but I'll think about whether this function can be removed.
@pablogsal friendly ping: I am curious to know what are your thoughts on this. |
@@ -0,0 +1 @@ | |||
Fix the implementation of custom reducers for in `multiprocessing`. |
'for in' Either a word is missing or there is an extra word.
To me, it looks like "for" is the extra word. I'd personally go with:
Fix the implementation of custom reducers for in `multiprocessing`. | |
Fix the implementation of custom reducers in :mod:`multiprocessing`. |
(Also adds the sphinx role :mod:
to provide an inline link to the module. It's optional for Misc/NEWS.d
, but we may as well include it with the typo fix. This makes it easier to move over to the "What's New" doc later.)
Rebasing is, as evidenced by repeated failures like this, error prone. Besides the obvious noise, removing the spurious review requests does not unsubscribe people. If you were just trying to update your branch to your current repository/fork master (once synchronized), rebasing is also not needed. After checking out the branch, |
This PR is a follow up on #9959. It implements the suggestions I made here #9959 (comment).
I'm making it a new PR to get the CI working now and gather separate feedback.
context
toProcess
objectsAbstractPickler
intoAbstractPickler
andAbstractUnpickler
https://bugs.python.org/issue28053
The text was updated successfully, but these errors were encountered: