Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-58451: Add optional delete_on_close parameter to NamedTemporaryFile #97015

Open
wants to merge 19 commits into
base: main
Choose a base branch
from

Conversation

Ev2geny
Copy link

@Ev2geny Ev2geny commented Sep 22, 2022

This is a reincarnation of the already marked as awaiting merge PR 22431, which was subsequently closed due to some git mistake.

This PR fixes issue 58451: tempfile.NamedTemporaryFile not particularly useful on Windows

tempfile.NamedTemporaryFile is too hard to use portably when you need to open the file by name after writing it. To do that, you need to close the file first (on Windows), which means you have to pass delete=False, which in turn means that you get no help in cleaning up the actual file resource,

Hence at the moment there is no out of the box solution to use tempfile.NamedTemporaryFile on Windows in such scenario (which is often used in unit testing):

  • in test module:
  1. create and open temporary file
  2. write data to it
  3. pass name of the temporary file to the operational code
  • In operational code, being tested
  1. open file, using name of the temporary file
  2. read data from this temporary file

In this Pull Request the issue is solved by adding an additional optional argument to NamedTemporaryFile() 'delete_on_close' (default is True). It works in combination with already existing argument 'delete', and determines the deletion behaviour.

If delete is true (the default) and delete_on_close is true (the default), the file is deleted as soon as it is closed. If delete is true and delete_on_close is false, the file is deleted on context manager exit only, if no context manager was used, then the file is deleted only when the file-like object is finalized. If delete is false, the value of delete_on_close is ignored.

So, the change shall be fully backwards compatible.

@Ev2geny
Copy link
Author

Ev2geny commented Sep 22, 2022

The changes done to the code since it left the PR 22431

  • Merge conflicts are resolved

  • A new functionality is added based on the review feedback from @eryksun to cover the case if delete is true and delete_on_close is false, but the file is never used in a with context. The new functionality is that in this case the file is deleted when the instance is finalized. This is implemented by adding a __del__() method to the _TemporaryFileCloser. This required documentation update as well.
    @eryksun , please have a look at the implementation as well as documentation update.

Doc/whatsnew/3.12.rst Outdated Show resolved Hide resolved
Co-authored-by: Éric <merwok@netwok.org>
@Ev2geny
Copy link
Author

Ev2geny commented Sep 22, 2022

Ok, looks like there are some Ubuntu errors, which I will need to look at

Lib/tempfile.py Outdated Show resolved Hide resolved
…r to _TemporaryFileCloser

Test test_del_by_finalizer_if_no_with is changed from simulating finalizer by calling __del__() to making
sure finalizer really runs
Lib/tempfile.py Outdated Show resolved Hide resolved
Ev2geny added 3 commits Sep 24, 2022
…close is moved to _TemporaryFileWrapper

based on the advice from @eryksun

There are errors in the test (at least on Linux)
Lib/tempfile.py Show resolved Hide resolved
Lib/tempfile.py Outdated Show resolved Hide resolved
Lib/tempfile.py Outdated Show resolved Hide resolved
Doc/library/tempfile.rst Outdated Show resolved Hide resolved
Lib/test/test_tempfile.py Outdated Show resolved Hide resolved
Lib/test/test_tempfile.py Outdated Show resolved Hide resolved
Lib/test/test_tempfile.py Outdated Show resolved Hide resolved
Lib/test/test_tempfile.py Outdated Show resolved Hide resolved
Lib/test/test_tempfile.py Outdated Show resolved Hide resolved
Lib/test/test_tempfile.py Outdated Show resolved Hide resolved
Lib/test/test_tempfile.py Outdated Show resolved Hide resolved
Lib/tempfile.py Outdated Show resolved Hide resolved
Ev2geny and others added 3 commits Sep 25, 2022
Implemented review from @eryksun and @merwok

Co-authored-by: Ezio Melotti <ezio.melotti@gmail.com>
Co-authored-by: Eryk Sun <eryksun@gmail.com>
Co-authored-by: Éric <merwok@netwok.org>
…ch takes into account delete_on_close moved to _TemporaryFileWrapper (suggestion of @ekisun)
@Ev2geny
Copy link
Author

Ev2geny commented Sep 25, 2022

@eryksun , I have implemented all suggestions, but it now fails the unit test for the issue18879

    def test_method_lookup(self):
        # Issue #18879: Looking up a temporary file method should keep it
        # alive long enough.
        f = self.do_create()
        wr = weakref.ref(f)
        write = f.write
        write2 = f.write
        del f
        write(b'foo')

@eryksun
Copy link
Contributor

eryksun commented Sep 25, 2022

I have implemented all suggestions, but it now fails the unit test for the issue18879

The delegated attribute cache is a problem. It leads to a reference cycle if the function wrapper (bpo-18879) is changed to keep a reference to the file wrapper (i.e. self) instead of the file closer (i.e. self._closer). Apparently avoiding this reference cycle was the reason the closer was added.

So, sorry to have misled you, but it looks like we're going to have to implement all delete operations in the closer. The wrapper just needs to change __init__() to instantiate the closer with delete_on_close, and change __exit__() to call the closer's cleanup() method. The latter includes a call to self.file.close() if needed. Since all deleting is handled by the closer, there's no need to implement __del__() in the wrapper. For example:

class _TemporaryFileCloser:
    """A separate object allowing proper closing of a temporary file's
    underlying file object, without adding a __del__ method to the
    temporary file."""

    cleanup_called = False
    close_called = False

    def __init__(self, file, name, delete=True, delete_on_close=True):
        self.file = file
        self.name = name
        self.delete = delete
        self.delete_on_close = delete_on_close

    def cleanup(self, windows=(_os.name == 'nt'), unlink=_os.unlink):
        if not self.cleanup_called:
            self.cleanup_called = True
            try:
                if not self.close_called:
                    self.close_called = True
                    self.file.close()
            finally:
                if self.delete and not (self.delete_on_close and windows):
                    try:
                        unlink(self.name)
                    except FileNotFoundError:
                        pass

    def close(self):
        if not self.close_called:
            self.close_called = True
            try:
                self.file.close()
            finally:
                if self.delete and self.delete_on_close:
                    self.cleanup()

    def __del__(self):
        self.cleanup()
class _TemporaryFileWrapper:
    """Temporary file wrapper

    This class provides a wrapper around files opened for
    temporary use.  In particular, it seeks to automatically
    remove the file when it is no longer needed.
    """

    def __init__(self, file, name, delete=True, delete_on_close=True):
        self.file = file
        self.name = name
        self._closer = _TemporaryFileCloser(file, name, delete,
                                            delete_on_close)

    def __getattr__(self, name):
        # Attribute lookups are delegated to the underlying file
        # and cached for non-numeric results
        # (i.e. methods are cached, closed and friends are not)
        file = self.__dict__['file']
        a = getattr(file, name)
        if hasattr(a, '__call__'):
            func = a
            @_functools.wraps(func)
            def func_wrapper(*args, **kwargs):
                return func(*args, **kwargs)
            # Avoid closing the file as long as the wrapper is alive,
            # see issue #18879.
            func_wrapper._closer = self._closer
            a = func_wrapper
        if not isinstance(a, int):
            setattr(self, name, a)
        return a

    # The underlying __enter__ method returns the wrong object
    # (self.file) so override it to return the wrapper
    def __enter__(self):
        self.file.__enter__()
        return self

    # Need to trap __exit__ as well to ensure the file gets
    # deleted when used in a with statement
    def __exit__(self, exc, value, tb):
        result = self.file.__exit__(exc, value, tb)
        self._closer.cleanup()
        return result

    def close(self):
        """
        Close the temporary file, possibly deleting it.
        """
        self._closer.close()

    # iter() doesn't use __getattr__ to find the __iter__ method
    def __iter__(self):
        # Don't return iter(self.file), but yield from it to avoid closing
        # file as long as it's being used as iterator (see issue #23700).  We
        # can't use 'yield from' here because iter(file) returns the file
        # object itself, which has a close method, and thus the file would get
        # closed when the generator is finalized, due to PEP380 semantics.
        for line in self.file:
            yield line

This passes all tests for me in both Linux and Windows.

@eryksun
Copy link
Contributor

eryksun commented Sep 25, 2022

As I mentioned previously, the exception handler in NamedTemporaryFile() also needs to be changed to check delete_on_close:

    except:
        if name is not None and not (
          _os.name == 'nt' and delete and delete_on_close):
            _os.unlink(name)
        raise

@Ev2geny
Copy link
Author

Ev2geny commented Sep 25, 2022

As I mentioned previously, the exception handler in NamedTemporaryFile() also needs to be changed to check delete_on_close:

Isn't this code trying to remove whatever is left after unsuccessful attempt to create a temporary file? In this case why would we want to check the delete_on_close flag? Which code is going to delete it then, if we don't do it now ?

	modified:   Lib/tempfile.py
@eryksun
Copy link
Contributor

eryksun commented Sep 25, 2022

Isn't this code trying to remove whatever is left after unsuccessful attempt to create a temporary file? In this case why would we want to check the delete_on_close flag? Which code is going to delete it then, if we don't do it now ?

In the io.open() call, if opener() succeeds, then name is not None, and ownership of the file descriptor gets transferred to the io.FileIO instance. If io.open() ultimately fails or instantiating _TemporaryFileWrapper fails, the file descriptor gets closed. In this case, the file will be deleted automatically if _os.name == 'nt' and delete and delete_on_close, else it has to be deleted manually.

Lib/tempfile.py Show resolved Hide resolved
Lib/tempfile.py Show resolved Hide resolved
Ev2geny and others added 3 commits Sep 26, 2022
Co-authored-by: Eryk Sun <eryksun@gmail.com>
Co-authored-by: Eryk Sun <eryksun@gmail.com>
@Ev2geny
Copy link
Author

Ev2geny commented Sep 26, 2022

Isn't this code trying to remove whatever is left after unsuccessful attempt to create a temporary file? In this case why would we want to check the delete_on_close flag? Which code is going to delete it then, if we don't do it now ?

In the io.open() call, if opener() succeeds, then name is not None, and ownership of the file descriptor gets transferred to the io.FileIO instance. If io.open() ultimately fails or instantiating _TemporaryFileWrapper fails, the file descriptor gets closed. In this case, the file will be deleted automatically if _os.name == 'nt' and delete and delete_on_close, else it has to be deleted manually.

OK, thanks. Understood. I have implemented this in this commit

@Ev2geny
Copy link
Author

Ev2geny commented Sep 26, 2022

@eryksun , sorry for asking an educational question, but can you explain me please what is the reason of using class attribute here, rather than an instance attribute?

class _TemporaryFileCloser:

   # these are class attributes
    cleanup_called = False
    close_called = False

    def __init__(self, file, name, delete=True, delete_on_close=True):
        self.file = file
        self.name = name
        self.delete = delete
        self.delete_on_close = delete_on_close

        # why not just setting the default  instance attributes here?
        # cleanup_called = False
        # close_called = False

    def cleanup(self, windows=(_os.name == 'nt'), unlink=_os.unlink):
        if not self.cleanup_called:        #if done, 1st time, this is looks at class attribute
            # But this creates an instance attribute with the same name
            # From now on all calls to  self.cleanup_called will return an instance attribute 
            self.cleanup_called = True  

@eryksun
Copy link
Contributor

eryksun commented Sep 26, 2022

sorry for asking an educational question, but can you explain me please what is the reason of using class attribute here, rather than an instance attribute?

It's just less work and less memory usage. The default value of False can be referenced on the class until it gets overridden to True as an instance attribute.

	modified:   Lib/test/test_tempfile.py
@Ev2geny
Copy link
Author

Ev2geny commented Sep 29, 2022

@zooba , this is a reincarnation of the already approved by you PR 22431 (which was closed due to git error). It has been further reviewed and updated. Can you please look at it now?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants