gh-113767: Adds a zipfile.register_compressor() API. #113768

Draft
wants to merge 2 commits into
base: main

gpshead (Member) commented Jan 6, 2024

This refactors zipfile to add a register_compression() API and uses that API for all of our built-in stdlib-based compressors (zlib's deflate, bz2, and lzma).

This allows additional compression methods to be officially supplied by third party libraries without monkeypatching the zipfile module. It is designed to obsolete the gross hacks that things like https://pypi.org/project/zipfile-zstd/ have to do.

Motivation: zstandard support in zipfile, without adding a zstandard implementation to the standard library and without monkeypatching. But I'm sure others will find it useful as well. Another potential use case is swapping in optimized compressors (think parallelized zlib implementations such as pigz) in place of the stdlib defaults.

TODO items:

  • Overall design review & feature discussions.
  • Code review.
  • Documentation (see the huge zipfile.register_compression docstring for now).
  • NEWS entry & What's New text.

gpshead (Member, Author) commented Jan 6, 2024

Some further deep diving on the API design and its use is needed here. This refactoring, and the dancing around the decompressor flush method and the differing APIs of zlib, bzip2, and lzma, was frustratingly informative. #67389 should be considered; perhaps we should not require a flush() on the decompressors at all?
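For readers unfamiliar with the mismatch, the sketch below shows it using only stdlib calls: zlib's decompression objects have a `flush()` method, while `bz2.BZ2Decompressor` and `lzma.LZMADecompressor` do not. The helper names `has_flush`, `finish`, and `decompress_all` are hypothetical and only illustrate one way a caller could treat `flush()` as optional. They are not taken from this PR.

```python
import bz2
import lzma
import zlib

payload = b"zipfile member data " * 64


def has_flush(obj) -> bool:
    """Report whether an object exposes a callable flush()."""
    return callable(getattr(obj, "flush", None))


# zlib's decompressor has flush(); the bz2 and lzma decompressors do not.
flush_support = {
    "zlib": has_flush(zlib.decompressobj(-15)),
    "bz2": has_flush(bz2.BZ2Decompressor()),
    "lzma": has_flush(lzma.LZMADecompressor()),
}


def finish(decompressor) -> bytes:
    """Hypothetical helper: call flush() only when it exists."""
    flush = getattr(decompressor, "flush", None)
    return flush() if callable(flush) else b""


def decompress_all(decompressor, data: bytes) -> bytes:
    """Decompress one complete stream, with flush() treated as optional."""
    return decompressor.decompress(data) + finish(decompressor)


# Raw deflate (wbits=-15) matches what ZIP archives store.
deflater = zlib.compressobj(zlib.Z_DEFAULT_COMPRESSION, zlib.DEFLATED, -15)
deflated = deflater.compress(payload) + deflater.flush()

restored = {
    "zlib": decompress_all(zlib.decompressobj(-15), deflated),
    "bz2": decompress_all(bz2.BZ2Decompressor(), bz2.compress(payload)),
    "lzma": decompress_all(lzma.LZMADecompressor(), lzma.compress(payload)),
}
```

Making `flush()` optional in this way is one possible reading of the question above; requiring it would instead force bz2- and lzma-style decompressors to be wrapped in an adapter.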
