-
-
Notifications
You must be signed in to change notification settings - Fork 32.2k
bpo-39667: Sync zipp 3.0 #18540
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bpo-39667: Sync zipp 3.0 #18540
Conversation
…rect performance degradation as found in zipp 3.0
Codecov Report
@@ Coverage Diff @@
## master #18540 +/- ##
==========================================
- Coverage 82.19% 82.06% -0.14%
==========================================
Files 1958 1955 -3
Lines 589743 584080 -5663
Branches 44457 44457
==========================================
- Hits 484767 479332 -5435
+ Misses 95311 95118 -193
+ Partials 9665 9630 -35
Continue to review full report at Codecov.
|
@@ -2880,7 +2881,7 @@ def test_open(self): | |||
a, b, g = root.iterdir() | |||
with a.open() as strm: | |||
data = strm.read() | |||
assert data == b"content of a" | |||
assert data == "content of a" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This changes the return type of Zipfile.Path.open
from bytes to string and could be backwards incompatible on backporting to 3.8
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for highlighting this concern. It's this concern that's kept me from fast-tracking this change. I'll try to enumerate the rationale and concerns and alternatives (to simply breaking it).
Rationale
zipfile.Path
was intended to be compatible with a subset ofpathlib.Path
.- Recent efforts in importlib_resources revealed that a
pathlib.Path
compatible implementation of.open()
would be useful if not necessary. - It would be beneficial for users for the two
Path
objects to have a consistent and compatible interface. - The
.open()
probably should have been kept as a private (._open()
) interface until it was needed. - Users relying on
.open()
today can readily callPath.root.open()
to get the current behavior. - Impacted users of backport (
zipp
) would have already been exposed to this change (via zipp 3.0) and likely have adapted. No concerns were raised.
Concerns
- Users of Python 3.8 may be relying on the existing
.open()
implementation. - The documentation explicitly declares support for the existing behavior of
.open()
. If not for this concern, one might have been able to make the argument that.open()
was an internal implementation detail. - The "compatible subset" of pathlib.Path has previously been only informally defined, although there is an ABC in importlib_resources (slated for inclusion in Python 3.9) that aims to formalize that interface.
- There's no usage a user could supply that would work on both
.open()
methods as currently implemented.pathlib.Path
requiresmode='rb'
to return bytes andzipfile.Path
fails if 'b' is passed. - The interface isn't just used by callers of
zipfile
, but also by consumers of APIs that returnzipfile.Path
(i.e.importlib.metadata
).
Alternatives
I'm considering several possible approaches.
- Simply break compatibility. The impact may be negligible given the limited exposure of
zipfile.Path
. - Exclude this change in behavior from the (3.8) backport... and simply require all Python 3.8 users to deal with the incompatible interface.
- Provide a compatibility method like
_future_open
only on Python 3.8 that users could call to get the future behavior. - Provide a compatible subclass,
FuturePath
only on Python 3.8 that would implement the alternate interface and would be supplied byimportlib.metadata
and others. - Provide a backward-compatibility method like
_legacy_open
that provides the legacy behavior. That seems unnecessary given thatzipfile.Path.root.open
exists.
Regardless of how we proceed, there are some actions.
- Update docs to reflect change in API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After mulling this over throughout the week and capturing the rationale and concerns above, I'm leaning toward making this backward-incompatible change in Python 3.8.x, assuming you agree and think the concern is manageable. Here's how that would work:
- Backport this code change as presented in this PR.
- Docs would be updated to reflect the change happening in 3.8.(next).
- "What's New" would describe for users relying on
zipfile.Path.open
in Python 3.8.(next-1) to usezipfile.Path.root.open
instead for easy, straightforward compatibility with the old behavior.
@ambv What do you think? Let me know if there's a better forum in which to discuss.
…rformance is better. Ref python/cpython#18540 (comment).
Thanks @jaraco for the PR 🌮🎉.. I'm working now to backport this PR to: 3.8. |
GH-18701 is a backport of this pull request to the 3.8 branch. |
bpo-39667: Sync zipp 3.0 (pythonGH-18540)
…voked on missing items or directories. Fixes #46. Jason R. Coombs (7): Remove unused import Rely on Python syntax for pwd keyword argument. Rely on dict on Python 3.7 and later where ordering is guaranteed. Performance is better. Ref python/cpython#18540 (comment). Add test capturing expectation that write works and can be binary or text. Add tests capturing expectation about specific Exceptions when opening directories. Ref #46. Raise more appropriate exceptions when 'open' is invoked on missing items or directories. Fixes #46. Update changelog. Ref #47.
* bpo-39667: Sync zipp 3.0 (GH-18540) * bpo-39667: Improve pathlib.Path compatibility on zipfile.Path and correct performance degradation as found in zipp 3.0 * 📜🤖 Added by blurb_it. * Update docs for new zipfile.Path.open * Rely on dict, faster than OrderedDict. * Syntax edits on docs Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com> (cherry picked from commit 0aeab5c) Co-authored-by: Jason R. Coombs <jaraco@jaraco.com> * Clarify the change in behavior with a couple of workaround options. * Restore API compatibility while backporting performance improvements. Co-authored-by: Jason R. Coombs <jaraco@jaraco.com>
Improve pathlib.Path compatibility on zipfile.Path and correct performance degradation as found in zipp 3.0.
https://bugs.python.org/issue39667