Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tarfiles: AttributeError: '_Stream' object has no attribute 'exception' while trying to open tgz file #107396

Closed
balmeida-nokia opened this issue Jul 28, 2023 · 7 comments · Fixed by #107485
Labels
3.11 only security fixes 3.12 bugs and security fixes 3.13 new features, bugs and security fixes stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@balmeida-nokia
Copy link
Contributor

balmeida-nokia commented Jul 28, 2023

Bug report

If a tar file appears to be a tar.gz file, it can fail in this block of tarfile.py

        while c < size:
            # Skip underlying buffer to avoid unaligned double buffering.
            if self.buf:
                buf = self.buf
                self.buf = b""
            else:
                buf = self.fileobj.read(self.bufsize)
                if not buf:
                    break
            try:
                buf = self.cmp.decompress(buf)
            except self.exception as e:
                raise ReadError("invalid compressed data") from e
            t.append(buf)
            c += len(buf)

at:

buf = self.cmp.decompress(buf)

before self.exception is set.
Because around:

            if comptype == "gz":
                try:
                    import zlib
                except ImportError:
                    raise CompressionError("zlib module is not available") from None
                self.zlib = zlib
                self.crc = zlib.crc32(b"")
                if mode == "r":
                    self._init_read_gz()
                    self.exception = zlib.error
                else:
                    self._init_write_gz()

This: self._init_read_gz()
is called before self.exception = zlib.error

Stack trace:

_read, tarfile.py:548
read, tarfile.py:526
_init_read_gz, tarfile.py:491
__init__, tarfile.py:375
open, tarfile.py:1827
<module>, test.py:2

In my case, buf = self.cmp.decompress(buf) failed to execute but that's a separate issue that may or may not be a bug

Your environment

  • CPython versions tested on: 3.11.4
  • Operating system and architecture: Windows 10 21H2

Linked PRs

@balmeida-nokia balmeida-nokia added the type-bug An unexpected behavior, bug, or error label Jul 28, 2023
@sunmy2019
Copy link
Member

sunmy2019 commented Jul 29, 2023

Looks reasonable. Will you open a PR?

@sunmy2019 sunmy2019 added stdlib Python modules in the Lib dir 3.11 only security fixes 3.12 bugs and security fixes 3.13 new features, bugs and security fixes labels Jul 29, 2023
@balmeida-nokia
Copy link
Contributor Author

Sure

balmeida-nokia added a commit to balmeida-nokia/cpython that referenced this issue Jul 31, 2023
In the stack call of: _init_read_gz()
```
_read, tarfile.py:548
read, tarfile.py:526
_init_read_gz, tarfile.py:491
```
a try;except exists that uses `self.exception`, so it needs to be set before
calling _init_read_gz().
balmeida-nokia added a commit to balmeida-nokia/cpython that referenced this issue Jul 31, 2023
In the stack call of: _init_read_gz()
```
_read, tarfile.py:548
read, tarfile.py:526
_init_read_gz, tarfile.py:491
```
a try;except exists that uses `self.exception`, so it needs to be set before
calling _init_read_gz().

closes: python#107396
balmeida-nokia added a commit to balmeida-nokia/cpython that referenced this issue Jul 31, 2023
In the stack call of: _init_read_gz()
```
_read, tarfile.py:548
read, tarfile.py:526
_init_read_gz, tarfile.py:491
```
a try;except exists that uses `self.exception`, so it needs to be set before
calling _init_read_gz().

closes: python#107396
balmeida-nokia added a commit to balmeida-nokia/cpython that referenced this issue Aug 1, 2023
balmeida-nokia added a commit to balmeida-nokia/cpython that referenced this issue Aug 1, 2023
In the stack call of: _init_read_gz()
```
_read, tarfile.py:548
read, tarfile.py:526
_init_read_gz, tarfile.py:491
```
a try;except exists that uses `self.exception`, so it needs to be set before
calling _init_read_gz().

closes: python#107396
@encukou
Copy link
Member

encukou commented Aug 10, 2023

Hello,
How does this issue manifest? exception appears like an write-only attribute of an internal class.
I'd like to add a test for the PR, but can't think of a reasonable one. Do you have a use case we currently don't have tests for?

@balmeida-nokia
Copy link
Contributor Author

@encukou Please see: #107398

Maybe that one will explain better the sequence where this failure can happen.

My interpretation is that gzip supports backwards skipping and it appears to be rarely used. If I open a tgz with r|* (which, by design, doesn't support backwards skipping) and the gzip stream requests a backwards skip, an exception is raised. At the except statement it has except self.exception which is were the attribute is read, which means the attribute is actually read-write.

@encukou
Copy link
Member

encukou commented Aug 10, 2023

Oh, I see!
Here's a reproducer:

with open('hm.tar.gz', 'rb') as f:
    f = io.BytesIO(
        b'\x1f\x8b'  # header
        + b'\x08'  # compression method
        + b'\x04'  # flags
        + b'\0\0\0\0\0\0'  # padding?
        + b'\0\x01'  # size
        + bytes(5)  # corrupt data (zeros)
    )
    tarfile.open(fileobj=f, mode='r|gz')

If you can add it as a test using with self.assertRaises(tarfile.ReadError): to Lib/test/test_tarfile.py, it'd speed things up. If not I'll get to it later.

@balmeida-nokia
Copy link
Contributor Author

OK. I will.

balmeida-nokia added a commit to balmeida-nokia/cpython that referenced this issue Aug 18, 2023
In the stack call of: _init_read_gz()
```
_read, tarfile.py:548
read, tarfile.py:526
_init_read_gz, tarfile.py:491
```
a try;except exists that uses `self.exception`, so it needs to be set before
calling _init_read_gz().

closes: python#107396
@balmeida-nokia
Copy link
Contributor Author

balmeida-nokia commented Aug 18, 2023

@encukou Made a commit with the tests. I'm unsure if I placed the new class in the correct placement but it seemed to be one of the best places to place it. I tried to figure out if a new class should be done and seemed to be a reasonable choice.
#107485

(sorry for the delay. Ty for waiting)

encukou added a commit to balmeida-nokia/cpython that referenced this issue Aug 21, 2023
encukou pushed a commit that referenced this issue Aug 21, 2023
…7485)


In the stack call of: _init_read_gz()
```
_read, tarfile.py:548
read, tarfile.py:526
_init_read_gz, tarfile.py:491
```
a try;except exists that uses `self.exception`, so it needs to be set before
calling _init_read_gz().
miss-islington pushed a commit to miss-islington/cpython that referenced this issue Aug 21, 2023
…ythonGH-107485)

In the stack call of: _init_read_gz()
```
_read, tarfile.py:548
read, tarfile.py:526
_init_read_gz, tarfile.py:491
```
a try;except exists that uses `self.exception`, so it needs to be set before
calling _init_read_gz().
(cherry picked from commit 37135d2)

Co-authored-by: balmeida-nokia <83089745+balmeida-nokia@users.noreply.github.com>
miss-islington pushed a commit to miss-islington/cpython that referenced this issue Aug 21, 2023
…ythonGH-107485)

In the stack call of: _init_read_gz()
```
_read, tarfile.py:548
read, tarfile.py:526
_init_read_gz, tarfile.py:491
```
a try;except exists that uses `self.exception`, so it needs to be set before
calling _init_read_gz().
(cherry picked from commit 37135d2)

Co-authored-by: balmeida-nokia <83089745+balmeida-nokia@users.noreply.github.com>
Yhg1s pushed a commit that referenced this issue Aug 21, 2023
…GH-107485) (#108207)

gh-107396: tarfiles: set self.exception before _init_read_gz() (GH-107485)

In the stack call of: _init_read_gz()
```
_read, tarfile.py:548
read, tarfile.py:526
_init_read_gz, tarfile.py:491
```
a try;except exists that uses `self.exception`, so it needs to be set before
calling _init_read_gz().
(cherry picked from commit 37135d2)

Co-authored-by: balmeida-nokia <83089745+balmeida-nokia@users.noreply.github.com>
encukou pushed a commit that referenced this issue Aug 21, 2023
…GH-107485) (GH-108208)

gh-107396: tarfiles: set self.exception before _init_read_gz() (GH-107485)

In the stack call of: _init_read_gz()
```
_read, tarfile.py:548
read, tarfile.py:526
_init_read_gz, tarfile.py:491
```
a try;except exists that uses `self.exception`, so it needs to be set before
calling _init_read_gz().
(cherry picked from commit 37135d2)

Co-authored-by: balmeida-nokia <83089745+balmeida-nokia@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.11 only security fixes 3.12 bugs and security fixes 3.13 new features, bugs and security fixes stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants