bpo-18819: tarfile: only set device fields for device files #18080

Open: 1 commit to merge into base `master`


wchargin commented Jan 20, 2020

The GNU docs describe the `devmajor` and `devminor` fields of the tar
header struct only in the context of character and block special files,
suggesting that in other cases they are not populated. Typical utilities
behave accordingly; this patch teaches `tarfile` to do the same.
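
To see the fields in question, here is a minimal sketch (not part of the patch) that builds GNU-format headers in memory with `TarInfo.tobuf()` and slices out the two device fields. The byte offsets 329 and 337 come from the standard ustar header layout; the helper name `device_fields` and the sample entries are made up for illustration.

```python
import tarfile

# Offsets of the two 8-byte device fields in a 512-byte tar header
# (standard ustar layout; assumed to apply to GNU-format headers too).
DEVMAJOR = slice(329, 337)
DEVMINOR = slice(337, 345)


def device_fields(info):
    """Return the raw (devmajor, devminor) header bytes for a TarInfo."""
    header = info.tobuf(format=tarfile.GNU_FORMAT)
    return header[DEVMAJOR], header[DEVMINOR]


# A regular file: the entry type whose device fields this patch changes.
regular = tarfile.TarInfo("important_data")

# A character device (hypothetical example entry with major 1, minor 3).
char_dev = tarfile.TarInfo("null")
char_dev.type = tarfile.CHRTYPE
char_dev.devmajor, char_dev.devminor = 1, 3

print("regular file:", device_fields(regular))
print("char device: ", device_fields(char_dev))
```

For the regular file, the result should depend on whether the interpreter includes this change: without it the fields should hold octal zero digits, and with it they should be NUL-filled. The character device entry should carry octal digits either way.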

Test Plan:
No tests added because none appear to exist for this module. Manually
verified that this enables output that is bit-for-bit compatible with
GNU tar. In particular, this program now passes on my Ubuntu 16.04,
whereas it failed before this patch:

```python
import os
import subprocess
import tarfile
import tempfile

filename = "important_data"
contents = b"The quick brown fox jumps over the lazy dog"

with tempfile.TemporaryDirectory() as tmpdir:
    os.chdir(tmpdir)
    with open(filename, "wb") as outfile:
        outfile.write(contents)
    with tarfile.open("py.tar", "x", format=tarfile.GNU_FORMAT) as outfile:
        outfile.add(filename)
    subprocess.check_call(["tar", "cf", "gnu.tar", filename])
    subprocess.check_call(["shasum", "-a", "256", "py.tar", "gnu.tar"])
    subprocess.check_call(["cmp", "-b", "py.tar", "gnu.tar"])
```

(The exact hashes depend on the calling user and current time but should
always be the same across both output archives.)
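
When `cmp` does report a mismatch, it helps to know which header field the offset falls in. The following is a rough sketch under the assumption that the differing byte lies in the first 512-byte header block and that the standard ustar field layout applies; `first_difference` and `field_at` are hypothetical helper names, not part of `tarfile`.

```python
# (name, offset, size) for each field of a ustar header block.
USTAR_FIELDS = [
    ("name", 0, 100), ("mode", 100, 8), ("uid", 108, 8), ("gid", 116, 8),
    ("size", 124, 12), ("mtime", 136, 12), ("chksum", 148, 8),
    ("typeflag", 156, 1), ("linkname", 157, 100), ("magic", 257, 6),
    ("version", 263, 2), ("uname", 265, 32), ("gname", 297, 32),
    ("devmajor", 329, 8), ("devminor", 337, 8), ("prefix", 345, 155),
]


def first_difference(a, b):
    """Return the 0-based offset of the first differing byte, or None."""
    for offset, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return offset
    # One input may be a strict prefix of the other.
    return None if len(a) == len(b) else min(len(a), len(b))


def field_at(offset):
    """Map a 0-based offset within a header block to its field name."""
    for name, start, size in USTAR_FIELDS:
        if start <= offset < start + size:
            return name
    return None  # padding after the last field, or outside the header
```

For example, `field_at(first_difference(py_bytes, gnu_bytes))` on the contents of the two archives would name the first mismatching field. Note that `cmp` prints 1-based byte offsets, so subtract 1 before passing one to `field_at`.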

wchargin-branch: tarfile-limit-device-headers

https://bugs.python.org/issue18819
