Exclude cache directories from backups using CACHEDIR.TAG #9018

Merged
merged 1 commit into from
Oct 18, 2020

Conversation

jstasiak
Contributor

This helps to prevent bloating backups with caches.

See https://bford.info/cachedir/ for more information about the
specification – it's supported by Borg, restic, GNU Tar and other
solutions.
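
As a rough illustration of what the specification asks of a backup tool, the sketch below checks whether a directory carries a valid tag. The function name `is_cache_dir` is hypothetical and is not taken from mypy or from any of the tools named above; the signature string is the one given in the specification.

```python
import os

# Per the Cache Directory Tagging Specification, a valid tag file starts
# with exactly these 43 bytes.
CACHEDIR_TAG_SIGNATURE = b"Signature: 8a477f597d28d172789f06886806bc55"


def is_cache_dir(path: str) -> bool:
    """Return True if `path` contains a valid CACHEDIR.TAG (hypothetical helper)."""
    tag = os.path.join(path, "CACHEDIR.TAG")
    # The tag is expected to be a regular file, not a symbolic link.
    if os.path.islink(tag) or not os.path.isfile(tag):
        return False
    try:
        with open(tag, "rb") as f:
            header = f.read(len(CACHEDIR_TAG_SIGNATURE))
    except OSError:
        # An unreadable tag is treated as "not tagged" in this sketch.
        return False
    return header == CACHEDIR_TAG_SIGNATURE
```

A tool that honours the tag would skip the contents of any directory for which such a check succeeds.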

@jstasiak
Contributor Author

I'm wondering whether this should always (re)create the CACHEDIR.TAG file in .mypy_cache, to avoid the case where Mypy is interrupted after creating the directory but before creating the file, which would leave the directory without a CACHEDIR.TAG forever. I estimate the chance of this to be slim, but given enough people using it there will be a handful who hit that condition, I imagine. The code would not be much more complicated; the relevant lines would just read:

if os.path.isdir(options.cache_dir):
    exclude_from_backups(options.cache_dir)
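
For context, here is a minimal sketch of how a check-every-time helper could look. It is an assumption about the shape of `exclude_from_backups`, not a copy of the merged code; the tag text and the wrapper name `ensure_cache_dir_tagged` are hypothetical.

```python
import os

# Hypothetical tag body; only the first line is mandated by the specification.
CACHEDIR_TAG_CONTENT = """Signature: 8a477f597d28d172789f06886806bc55
# This file is a cache directory tag.
# For information about cache directory tags see https://bford.info/cachedir/
"""


def exclude_from_backups(target_dir: str) -> None:
    """Create CACHEDIR.TAG in target_dir unless one is already there."""
    cachedir_tag = os.path.join(target_dir, "CACHEDIR.TAG")
    try:
        # Mode "x" raises FileExistsError if the file exists, so an existing
        # tag is never overwritten and repeated calls are cheap.
        with open(cachedir_tag, "x") as f:
            f.write(CACHEDIR_TAG_CONTENT)
    except FileExistsError:
        pass


def ensure_cache_dir_tagged(cache_dir: str) -> None:
    """Hypothetical wrapper mirroring the two lines quoted above."""
    if os.path.isdir(cache_dir):
        exclude_from_backups(cache_dir)
```

Because the call is idempotent, running it on every invocation repairs a cache directory that an earlier, interrupted run left untagged.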

@jstasiak
Contributor Author

Following up on the concern mentioned above: we could attempt to "create" the directory + gitignore + CACHEDIR.TAG atomically by creating a temporary directory, creating the files in it, and then renaming the temporary directory to the desired path (like I did in a patch to Cargo). This would have to happen somewhere deeper down the stack, though, in the place where the actual cache directory is created now (I'm not sure where that is).
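
The temporary-directory approach could be sketched roughly as follows. Everything here is an assumption made for illustration: the function name `create_cache_dir_atomically`, the file contents, and the temporary prefix are hypothetical, and this is not how mypy or Cargo actually implement it.

```python
import os
import shutil
import tempfile

# Hypothetical file contents, for illustration only.
CACHEDIR_TAG_CONTENT = "Signature: 8a477f597d28d172789f06886806bc55\n"
GITIGNORE_CONTENT = "# Automatically created\n*\n"


def create_cache_dir_atomically(cache_dir: str) -> bool:
    """Create cache_dir already populated; return False if it already existed."""
    parent = os.path.dirname(os.path.abspath(cache_dir))
    os.makedirs(parent, exist_ok=True)
    # Build the directory next to its final location so that the rename
    # stays on one filesystem. Note that mkdtemp creates it with mode 0o700.
    tmp_dir = tempfile.mkdtemp(prefix=".cache-tmp-", dir=parent)
    try:
        with open(os.path.join(tmp_dir, ".gitignore"), "w") as f:
            f.write(GITIGNORE_CONTENT)
        with open(os.path.join(tmp_dir, "CACHEDIR.TAG"), "w") as f:
            f.write(CACHEDIR_TAG_CONTENT)
        # The directory appears under its final name only once it is complete.
        os.rename(tmp_dir, cache_dir)
        return True
    except OSError:
        # Typically: cache_dir already exists and is not empty.
        shutil.rmtree(tmp_dir, ignore_errors=True)
        if os.path.isdir(cache_dir):
            return False
        raise
```

With this shape an interruption can leave a stray temporary directory behind, but never a cache directory that lacks its tag.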

@jstasiak
Contributor Author

FWIW I had a quick look around, and the only other open source backup solution I found that supports this out of the box is attic (I only looked at open source ones). Others, like rsync and bacula, support it only in an exclude-directory-if-a-file-named-like-this-exists-in-it way. So while my initial commit message may be technically correct, I wanted the above to be known; I don't want to oversell this solution on false premises.

Collaborator

@msullivan msullivan left a comment

Yeah, sure.

It might be worth handling the interrupted case, yeah. It would be fine to just check every time. If you're excited about that, submit a follow-up.

@msullivan msullivan merged commit 0b865c9 into python:master Oct 18, 2020
@jstasiak
Contributor Author

Cheers.

@jstasiak jstasiak deleted the exclude-cache-dir-from-backups branch October 18, 2020 20:22