Skip to content

bpo-39667: Sync zipp 3.0 #18540

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Feb 29, 2020
Merged

bpo-39667: Sync zipp 3.0 #18540

merged 5 commits into from
Feb 29, 2020

Conversation

jaraco
Copy link
Member

@jaraco jaraco commented Feb 17, 2020

Improve pathlib.Path compatibility on zipfile.Path and correct performance degradation as found in zipp 3.0.

https://bugs.python.org/issue39667

…rect performance degradation as found in zipp 3.0
@codecov
Copy link

codecov bot commented Feb 17, 2020

Codecov Report

Merging #18540 into master will decrease coverage by 0.13%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #18540      +/-   ##
==========================================
- Coverage   82.19%   82.06%   -0.14%     
==========================================
  Files        1958     1955       -3     
  Lines      589743   584080    -5663     
  Branches    44457    44457              
==========================================
- Hits       484767   479332    -5435     
+ Misses      95311    95118     -193     
+ Partials     9665     9630      -35     
Impacted Files Coverage Δ
Modules/_decimal/libmpdec/fnt.c 0.00% <0.00%> (-83.34%) ⬇️
Lib/distutils/tests/test_bdist_rpm.py 30.00% <0.00%> (-65.00%) ⬇️
Lib/distutils/command/bdist_rpm.py 7.63% <0.00%> (-56.88%) ⬇️
Lib/test/test_curses.py 36.60% <0.00%> (-54.79%) ⬇️
Lib/curses/__init__.py 20.51% <0.00%> (-30.77%) ⬇️
Modules/_decimal/libmpdec/umodarith.h 80.76% <0.00%> (-19.24%) ⬇️
Lib/curses/textpad.py 7.53% <0.00%> (-14.39%) ⬇️
Lib/test/test_urllib2net.py 76.92% <0.00%> (-13.85%) ⬇️
Lib/test/test_smtpnet.py 78.57% <0.00%> (-12.86%) ⬇️
Lib/ftplib.py 63.85% <0.00%> (-6.06%) ⬇️
... and 337 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4c1b6a6...34d7114. Read the comment docs.

@@ -2880,7 +2881,7 @@ def test_open(self):
a, b, g = root.iterdir()
with a.open() as strm:
data = strm.read()
assert data == b"content of a"
assert data == "content of a"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This changes the return type of Zipfile.Path.open from bytes to string and could be backwards incompatible on backporting to 3.8

Copy link
Member Author

@jaraco jaraco Feb 22, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for highlighting this concern. It's this concern that's kept me from fast-tracking this change. I'll try to enumerate the rationale and concerns and alternatives (to simply breaking it).

Rationale

  • zipfile.Path was intended to be compatible with a subset of pathlib.Path.
  • Recent efforts in importlib_resources revealed that a pathlib.Path compatible implementation of .open() would be useful if not necessary.
  • It would be beneficial for users for the two Path objects to have a consistent and compatible interface.
  • The .open() probably should have been kept as a private (._open()) interface until it was needed.
  • Users relying on .open() today can readily call Path.root.open() to get the current behavior.
  • Impacted users of backport (zipp) would have already been exposed to this change (via zipp 3.0) and likely have adapted. No concerns were raised.

Concerns

  • Users of Python 3.8 may be relying on the existing .open() implementation.
  • The documentation explicitly declares support for the existing behavior of .open(). If not for this concern, one might have been able to make the argument that .open() was an internal implementation detail.
  • The "compatible subset" of pathlib.Path has previously been only informally defined, although there is an ABC in importlib_resources (slated for inclusion in Python 3.9) that aims to formalize that interface.
  • There's no usage a user could supply that would work on both .open() methods as currently implemented. pathlib.Path requires mode='rb' to return bytes and zipfile.Path fails if 'b' is passed.
  • The interface isn't just used by callers of zipfile, but also by consumers of APIs that return zipfile.Path (i.e. importlib.metadata).

Alternatives

I'm considering several possible approaches.

  • Simply break compatibility. The impact may be negligible given the limited exposure of zipfile.Path.
  • Exclude this change in behavior from the (3.8) backport... and simply require all Python 3.8 users to deal with the incompatible interface.
  • Provide a compatibility method like _future_open only on Python 3.8 that users could call to get the future behavior.
  • Provide a compatible subclass, FuturePath only on Python 3.8 that would implement the alternate interface and would be supplied by importlib.metadata and others.
  • Provide a backward-compatibility method like _legacy_open that provides the legacy behavior. That seems unnecessary given that zipfile.Path.root.open exists.

Regardless of how we proceed, there are some actions.

  • Update docs to reflect change in API.

Copy link
Member Author

@jaraco jaraco Feb 22, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After mulling this over throughout the week and capturing the rationale and concerns above, I'm leaning toward making this backward-incompatible change in Python 3.8.x, assuming you agree and think the concern is manageable. Here's how that would work:

  • Backport this code change as presented in this PR.
  • Docs would be updated to reflect the change happening in 3.8.(next).
  • "What's New" would describe for users relying on zipfile.Path.open in Python 3.8.(next-1) to use zipfile.Path.root.open instead for easy, straightforward compatibility with the old behavior.

@ambv What do you think? Let me know if there's a better forum in which to discuss.

jaraco added a commit to jaraco/zipp that referenced this pull request Feb 22, 2020
@jaraco jaraco merged commit 0aeab5c into python:master Feb 29, 2020
@miss-islington
Copy link
Contributor

Thanks @jaraco for the PR 🌮🎉.. I'm working now to backport this PR to: 3.8.
🐍🍒⛏🤖

@bedevere-bot
Copy link

GH-18701 is a backport of this pull request to the 3.8 branch.

sthagen added a commit to sthagen/python-cpython that referenced this pull request Feb 29, 2020
@jaraco jaraco deleted the feature/zipp-3 branch February 29, 2020 23:20
clrpackages pushed a commit to clearlinux-pkgs/zipp that referenced this pull request Mar 4, 2020
…voked on missing items or directories. Fixes #46.

Jason R. Coombs (7):
      Remove unused import
      Rely on Python syntax for pwd keyword argument.
      Rely on dict on Python 3.7 and later where ordering is guaranteed. Performance is better. Ref python/cpython#18540 (comment).
      Add test capturing expectation that write works and can be binary or text.
      Add tests capturing expectation about specific Exceptions when opening directories. Ref #46.
      Raise more appropriate exceptions when 'open' is invoked on missing items or directories. Fixes #46.
      Update changelog. Ref #47.
jaraco added a commit that referenced this pull request Apr 15, 2020
* bpo-39667: Sync zipp 3.0 (GH-18540)

* bpo-39667: Improve pathlib.Path compatibility on zipfile.Path and correct performance degradation as found in zipp 3.0

* 📜🤖 Added by blurb_it.

* Update docs for new zipfile.Path.open

* Rely on dict, faster than OrderedDict.

* Syntax edits on docs

Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
(cherry picked from commit 0aeab5c)

Co-authored-by: Jason R. Coombs <jaraco@jaraco.com>

* Clarify the change in behavior with a couple of workaround options.

* Restore API compatibility while backporting performance improvements.

Co-authored-by: Jason R. Coombs <jaraco@jaraco.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants