Already tz-aware when reading hdf5 generated by Python2 in Python3 #17145

Closed
@Denisevi4

This is a follow-up to Issue 14307. A reproducible example was requested there, but the original reporter did not respond, so that issue was closed.

I am currently experiencing the same issue, and I posted a reproducible example on StackOverflow here:

StackOverflow

Here is a copy of that post:

I'm using Anaconda Python 3.4.5 and 2.7.3. Both use pandas 0.18.1.

Here is a reproducible example:

generate.py (to be executed with Python2):

```python
import pandas as pd
from pandas import HDFStore

# Two tz-aware timestamps in the CST6CDT zone
index = pd.DatetimeIndex(['2017-06-20 06:00:06.984630-05:00', '2017-06-20 06:03:01.042616-05:00'], dtype='datetime64[ns, CST6CDT]', freq=None)

p1 = [0, 1]
p2 = [0, 2]

# df1 carries the timestamps as the index; df2 carries them as a regular column
df1 = pd.DataFrame({"p1":p1, "p2":p2}, index=index)
df2 = pd.DataFrame({"p1":p1, "p2":p2, "i":index})

store = HDFStore("./test_issue.h5")
store['df'] = df1  # swap in df2 to reproduce the second error
store.close()
```

read_issue.py:
```python
import pandas as pd
from pandas import HDFStore

store = HDFStore("./test_issue.h5", mode="r")
df = store['/df']

store.close()
print(df)
```
Running read_issue.py in Python 2 works without any problem, but running it in Python 3 fails with this traceback:

    Traceback (most recent call last):
      File "read_issue.py", line 5, in <module>
        df = store['df']
      File "/home/denper/anaconda3/envs/py34/lib/python3.4/site-packages/pandas/io/pytables.py", line 417, in __getitem__
        return self.get(key)
      File "/home/denper/anaconda3/envs/py34/lib/python3.4/site-packages/pandas/io/pytables.py", line 634, in get
        return self._read_group(group)
      File "/home/denper/anaconda3/envs/py34/lib/python3.4/site-packages/pandas/io/pytables.py", line 1272, in _read_group
        return s.read(**kwargs)
      File "/home/denper/anaconda3/envs/py34/lib/python3.4/site-packages/pandas/io/pytables.py", line 2779, in read
        ax = self.read_index('axis%d' % i)
      File "/home/denper/anaconda3/envs/py34/lib/python3.4/site-packages/pandas/io/pytables.py", line 2367, in read_index
        _, index = self.read_index_node(getattr(self.group, key))
      File "/home/denper/anaconda3/envs/py34/lib/python3.4/site-packages/pandas/io/pytables.py", line 2492, in read_index_node
        _unconvert_index(data, kind, encoding=self.encoding), **kwargs)
      File "/home/denper/anaconda3/envs/py34/lib/python3.4/site-packages/pandas/indexes/base.py", line 153, in __new__
        result = DatetimeIndex(data, copy=copy, name=name, **kwargs)
      File "/home/denper/anaconda3/envs/py34/lib/python3.4/site-packages/pandas/util/decorators.py", line 91, in wrapper
        return func(*args, **kwargs)
      File "/home/denper/anaconda3/envs/py34/lib/python3.4/site-packages/pandas/tseries/index.py", line 321, in __new__
        raise TypeError("Already tz-aware, use tz_convert "
    TypeError: Already tz-aware, use tz_convert to convert.
    Closing remaining open files:./test_issue.h5...done

So the tz-aware index is not read back correctly. However, if generate.py saves df2 instead (datetimes as a column, not as the index), then read_issue.py under Python 3 fails with a different error:

    Traceback (most recent call last):
      File "read_issue.py", line 5, in <module>
        df = store['/df']
      File "/home/denper/anaconda3/envs/py34/lib/python3.4/site-packages/pandas/io/pytables.py", line 417, in __getitem__
        return self.get(key)
      File "/home/denper/anaconda3/envs/py34/lib/python3.4/site-packages/pandas/io/pytables.py", line 634, in get
        return self._read_group(group)
      File "/home/denper/anaconda3/envs/py34/lib/python3.4/site-packages/pandas/io/pytables.py", line 1272, in _read_group
        return s.read(**kwargs)
      File "/home/denper/anaconda3/envs/py34/lib/python3.4/site-packages/pandas/io/pytables.py", line 2788, in read
        placement=items.get_indexer(blk_items))
      File "/home/denper/anaconda3/envs/py34/lib/python3.4/site-packages/pandas/core/internals.py", line 2518, in make_block
        return klass(values, ndim=ndim, fastpath=fastpath, placement=placement)
      File "/home/denper/anaconda3/envs/py34/lib/python3.4/site-packages/pandas/core/internals.py", line 90, in __init__
        len(self.mgr_locs)))
    ValueError: Wrong number of items passed 2, placement implies 1
    Closing remaining open files:./test_issue.h5...done

Also, if generate.py is executed in Python 3 (saving either df1 or df2), then read_issue.py runs without problems in both Python 3 and Python 2.
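
Since both failures involve tz-aware data written by Python 2, one possible stopgap is to keep timezone information out of the stored frame: convert the index to tz-naive UTC before writing and re-attach the zone after reading. The sketch below shows only that conversion round trip, in memory. The helper names `strip_tz` and `restore_tz` are hypothetical, `America/Chicago` stands in for the zone purely for illustration, and I have not verified that this avoids the errors above for files written under Python 2.

```python
import pandas as pd


def strip_tz(df):
    """Return a copy with a tz-naive UTC index, plus the original zone name."""
    tz_name = str(df.index.tz)
    out = df.copy()
    # Convert to UTC first so no information is lost, then drop the tz.
    out.index = df.index.tz_convert("UTC").tz_localize(None)
    return out, tz_name


def restore_tz(df, tz_name):
    """Re-attach the zone to a frame whose index holds tz-naive UTC values."""
    out = df.copy()
    out.index = df.index.tz_localize("UTC").tz_convert(tz_name)
    return out


# Same two timestamps as in generate.py; the zone name is an assumption.
index = pd.to_datetime(
    ["2017-06-20 06:00:06.984630-05:00", "2017-06-20 06:03:01.042616-05:00"],
    utc=True,
).tz_convert("America/Chicago")
df = pd.DataFrame({"p1": [0, 1], "p2": [0, 2]}, index=index)

naive_df, tz_name = strip_tz(df)        # this frame would go into the HDFStore
restored = restore_tz(naive_df, tz_name)  # applied after reading it back
```

The zone name would have to be kept somewhere alongside the data (for example in a filename or a separate key), which is the main cost of this approach.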
