Description
Describe the issue:
Accessing the object returned by np.load() on a .npz file raises a BadZipFile exception if a .npz file of the same name is written in the meantime. The error message is either "Bad CRC-32" (see example below) if the array names match, or "File name in directory [...] and header [...] differ" if they don't.
Interestingly, the error does not occur if the first element of the loaded object is accessed before the np.savez_compressed() call that overwrites the file (see the commented-out line in the example code). Accessing a different element beforehand does not prevent the error, but it does no harm either.
Reproduce the code example:
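import numpy as np
fn = "test.npz"
# Write the initial archive.
np.savez_compressed(fn, d1=np.array([1, 2]), d2=np.array([11, 22]))
# Open the archive; the returned object is kept around without being closed.
backup = np.load(fn)
# Uncommenting the next line (accessing the first element now) avoids the error:
# backup[backup.files[0]]
# Overwrite the file while `backup` is still open.
np.savez_compressed(fn, d1=np.array([33, 33]), d2=np.array([3, 3]))
# Raises zipfile.BadZipFile.
backup[backup.files[0]]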
Error message:
File "/home/▓▓▓▓/py_venvs/default/lib/python3.8/site-packages/numpy/lib/npyio.py", line 241, in __getitem__
magic = bytes.read(len(format.MAGIC_PREFIX))
File "/usr/lib/python3.8/zipfile.py", line 940, in read
data = self._read1(n)
File "/usr/lib/python3.8/zipfile.py", line 1030, in _read1
self._update_crc(data)
File "/usr/lib/python3.8/zipfile.py", line 958, in _update_crc
raise BadZipFile("Bad CRC-32 for file %r" % self.name)
zipfile.BadZipFile: Bad CRC-32 for file 'd1.npy'
# OR, with different array names on the second write:
File "/home/▓▓▓▓/py_venvs/default/lib/python3.8/site-packages/numpy/lib/npyio.py", line 240, in __getitem__
bytes = self.zip.open(key)
File "/usr/lib/python3.8/zipfile.py", line 1556, in open
raise BadZipFile(
zipfile.BadZipFile: File name in directory 'd1.npy' and header b'd3.npy' differ.
NumPy/Python version information:
Python 3.8.10 / NumPy 1.23.4
Context for the issue:
This happens when trying to back up previous data in order to restore it later, after the new data has already been written. There are numerous ways around it, such as the above-mentioned hack, or immediately reading the arrays into a dict and then closing the loaded file object properly (which, if I understand correctly, you should do anyway).
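For reference, a minimal sketch of the dict-based workaround. The helper name `load_npz_snapshot` and the temporary directory are only illustrative, not part of NumPy; the sketch assumes the arrays fit in memory.

```python
import os
import tempfile

import numpy as np


def load_npz_snapshot(path):
    """Read every array of a .npz file into a plain dict, then close the file.

    Hypothetical helper for illustration only; not a NumPy API.
    """
    # np.load() on a .npz file returns an NpzFile, which can be used as a
    # context manager so the underlying zip file is closed on exit.
    with np.load(path) as npz:
        # Indexing inside the with-block reads each array fully into memory.
        return {name: npz[name] for name in npz.files}


# Use a temporary directory so the sketch does not touch existing files.
fn = os.path.join(tempfile.mkdtemp(), "test.npz")

np.savez_compressed(fn, d1=np.array([1, 2]), d2=np.array([11, 22]))
backup = load_npz_snapshot(fn)  # plain dict, no open file handle left behind

# Overwriting the file no longer affects the backup.
np.savez_compressed(fn, d1=np.array([33, 33]), d2=np.array([3, 3]))
print(backup["d1"], backup["d2"])
```

Because the backup is an ordinary dict of in-memory arrays, it can be written back later with np.savez_compressed(fn, **backup).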