pickle.loads will crash with self-references inside a custom hash function #124937

Open
@charles-cooper

Bug report

Bug description:

here is a reproduction of the issue:

import pickle

class Foo:
    def __init__(self):
        self.x: object = {self}

    def __hash__(self):
        return hash(self.x)

foo = Foo()

print(pickle.loads(pickle.dumps(foo)))

running this will result in the following exception:

Traceback (most recent call last):
  File "/home/charles/vyper/foo.py", line 10, in <module>
    foo = Foo()
          ^^^^^
  File "/home/charles/vyper/foo.py", line 5, in __init__
    self.x: object = {self}
                     ^^^^^^
  File "/home/charles/vyper/foo.py", line 8, in __hash__
    return hash(self.x)
                ^^^^^^
AttributeError: 'Foo' object has no attribute 'x'

a workaround for the issue has been described at https://stackoverflow.com/a/44888113. however, i consider this a bug in the cpython implementation, because pickle theoretically handles object cycles (e.g., replacing line 5 with self.x = [self] poses no problem for the unpickler).
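
note that the snippet above already raises while constructing `Foo` (the traceback points at `foo = Foo()`), before pickle gets involved. below is a minimal sketch of a variant that constructs fine and fails only inside `pickle.loads`. the `Node` and `PatchedNode` classes, the `name` and `peers` attributes, and the `roundtrip_error` helper are hypothetical names for illustration, not part of the original report. the workaround shown is one possible approach for the self-referencing case and is not necessarily the one from the linked answer: it stores the set as a list in `__getstate__` and rebuilds it in `__setstate__` once the hashed attribute is back.

```python
import pickle


class Node:
    """Hash depends on an attribute, and the object sits inside its own set."""

    def __init__(self, name):
        self.name = name
        self.peers = set()

    def __hash__(self):
        return hash(self.name)

    def __eq__(self, other):
        return isinstance(other, Node) and self.name == other.name


class PatchedNode(Node):
    """Hypothetical workaround: pickle the set as a list, rebuild it later."""

    def __getstate__(self):
        state = self.__dict__.copy()
        state["peers"] = list(self.peers)  # a list never hashes its items
        return state

    def __setstate__(self, state):
        peers = state.pop("peers")
        self.__dict__.update(state)  # restore `name` first
        self.peers = set(peers)      # hashing is safe now


def roundtrip_error(obj):
    """Return the AttributeError raised by a pickle round trip, or None."""
    try:
        pickle.loads(pickle.dumps(obj))
    except AttributeError as exc:
        return exc
    return None


node = Node("a")
node.peers.add(node)  # cycle: node -> peers -> node
print(roundtrip_error(node))  # an AttributeError: hash runs before `name` is restored

patched = PatchedNode("a")
patched.peers.add(patched)
print(roundtrip_error(patched))  # None for this self-referencing case
```

this only covers an object that contains itself. two distinct objects that hold each other in such sets can still fail, because one of them is rebuilt before the other's attributes have been restored.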

i suspect that cpython rehashes all items when reconstructing a dict or set. this makes the issue even more problematic: if the hash function has any side effects, the unpickler will execute them.
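
one way to check this suspicion is to record calls to `__hash__` while `pickle.loads` runs. the sketch below uses a hypothetical `Key` class and a module-level `calls` list, neither of which is from the original report. it should show the hash function being invoked for each element as the set is rebuilt.

```python
import pickle

calls = []  # records every __hash__ invocation


class Key:
    def __init__(self, value):
        self.value = value

    def __hash__(self):
        calls.append(self.value)  # the side effect we want to observe
        return hash(self.value)

    def __eq__(self, other):
        return isinstance(other, Key) and self.value == other.value


blob = pickle.dumps({Key(1), Key(2)})
calls.clear()  # ignore hashing done while building the original set

restored = pickle.loads(blob)
print(sorted(calls))  # values hashed while the unpickler rebuilt the set
```

the hash values are not stored in the pickle stream, so the unpickler has to call `__hash__` again to place each element in the new set.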

build info:

$ python
Python 3.11.10 (main, Sep  7 2024, 18:35:41) [GCC 11.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.

CPython versions tested on:

3.11

Operating systems tested on:

Linux
