Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

always serialize the function using cloudpickle #281

Open
wants to merge 1 commit into
base: master
from

Conversation

@basnijholt
Copy link
Member

@basnijholt basnijholt commented May 28, 2020

Description

This was a suggestion of @jbweston some time ago.

Using this, the learners are picklable regardless of the serialization package chosen.

This also means that we can e.g. use lambda functions when using a ProcessPoolExecutor and do not need loky per se.

I am making these changes now because I am finding that in some cases it's not preferable to use cloudpickle to serialize simple dictionaries (see cloudpipe/cloudpickle#375).

Checklist

  • Fixed style issues using pre-commit run --all (first install using pip install pre-commit)
  • pytest passed

Type of change

Check relevant option(s).

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update
This was a suggestion of @jbweston some time ago.

Using this, the leaners are picklable regardless of the serialization package chosen.
@codecov-commenter
Copy link

@codecov-commenter codecov-commenter commented May 28, 2020

Codecov Report

Merging #281 into master will increase coverage by 0.15%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #281      +/-   ##
==========================================
+ Coverage   80.57%   80.72%   +0.15%     
==========================================
  Files          34       34              
  Lines        4591     4601      +10     
  Branches      828      828              
==========================================
+ Hits         3699     3714      +15     
+ Misses        769      764       -5     
  Partials      123      123              
Impacted Files Coverage Δ
adaptive/learner/average_learner.py 86.51% <100.00%> (+0.30%) ⬆️
adaptive/learner/integrator_learner.py 91.42% <100.00%> (+0.05%) ⬆️
adaptive/learner/learner1D.py 92.85% <100.00%> (+0.04%) ⬆️
adaptive/learner/learner2D.py 79.43% <100.00%> (+0.14%) ⬆️
adaptive/learner/sequence_learner.py 88.46% <100.00%> (+0.30%) ⬆️
adaptive/runner.py 70.56% <0.00%> (+0.70%) ⬆️
adaptive/_version.py 47.66% <0.00%> (+2.80%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c339f01...574ba35. Read the comment docs.

@ogrisel
Copy link

@ogrisel ogrisel commented May 29, 2020

As noted in cloudpipe/cloudpickle#375, the performance problem when serializing large collections (list, dicts, sets) of small object is caused by the fact that prior to Python 3.8, cloudpickle had not other choice than subclass the Python implementation of the Pickler class which is slow.

Since Python 3.8, cloudpickle can subclass the C implementation of the Pickler class which should fix this performance problem.

@basnijholt
Copy link
Member Author

@basnijholt basnijholt commented May 29, 2020

@ogrisel, thanks a lot for your explanation (also in cloudpipe/cloudpickle#375 (comment))!

@jbweston jbweston self-requested a review May 29, 2020
Copy link
Contributor

@jbweston jbweston left a comment

LGTM, but what about LearnerND?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

4 participants