
zipfile local header extra data is inaccessible #113994

@AnyOldName3


Feature or enhancement

Proposal:

Background

I'm trying to extract data from an HTTrack cache zip, whose format is documented at https://www.httrack.com/html/cache.html. It stores per-file data (which I need) in the extra field of each file's local header. Currently, the extra field of the central directory's file header is accessible via ZipInfo.extra, but as far as I can tell from the spec, the two are not required to be identical. For example, 7zip writes NTFS timestamps to the central directory but not to the local headers, according to https://sourceforge.net/p/sevenzip/bugs/2313/

Proposal

Add some way to access this field, and any other interesting local header fields, through ZipFile. All the other fields are things where it only makes sense for a file to have a single value, like the CRC or filename, so it's probably just this one field that matters if only sane zips need to be supported.

Therefore, there could be a method such as ZipFile.getlocalheaderextra(name), functioning roughly like getinfo or read, and returning a bytes object.

Implementing this could be mostly a copy-and-paste job: ZipFile.open already finds the field and seeks past it here:

zef_file.seek(fheader[_FH_EXTRA_FIELD_LENGTH], whence=1)
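
In the meantime, something like the following might work as a stopgap outside the standard library. It is only a rough sketch: read_local_extra is a hypothetical helper name, not an existing zipfile API, and it relies on ZipFile.fp, which is not a documented attribute. It assumes the archive was opened for reading and is not split across multiple files. The struct layout follows the fixed 30-byte local file header described in the ZIP spec.

```python
import io
import struct
import zipfile

# Fixed-size part of a local file header (30 bytes): signature, version,
# flags, method, mod time, mod date, CRC, compressed size, uncompressed
# size, filename length, extra field length.
_LOCAL_HEADER = struct.Struct("<4s5H3L2H")
_LOCAL_SIGNATURE = b"PK\x03\x04"


def read_local_extra(zf, name):
    """Return the local header's extra field for member `name` as bytes.

    Hypothetical helper; `zf` is a zipfile.ZipFile opened for reading.
    """
    info = zf.getinfo(name)
    fp = zf.fp  # assumption: undocumented attribute holding the open file
    saved = fp.tell()
    try:
        fp.seek(info.header_offset)
        fields = _LOCAL_HEADER.unpack(fp.read(_LOCAL_HEADER.size))
        if fields[0] != _LOCAL_SIGNATURE:
            raise zipfile.BadZipFile("Bad magic number for local file header")
        name_len, extra_len = fields[-2], fields[-1]
        fp.seek(name_len, io.SEEK_CUR)  # skip the filename
        return fp.read(extra_len)
    finally:
        fp.seek(saved)  # leave the file position as we found it
```

Usage would look like read_local_extra(zf, "some/member") inside a with zipfile.ZipFile(path) as zf: block. It returns the raw bytes without interpreting them, which matches the bytes-returning method proposed above.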

Has this already been discussed elsewhere?

No response given

Links to previous discussion of this feature:

No response

Labels: type-feature (A feature request or enhancement)