ENH: DataFrame Constructions from Data Classes #37577
Comments
We already support that in the main DataFrame constructor, right? https://pandas.pydata.org/docs/user_guide/dsintro.html?highlight=dataclass#from-a-list-of-dataclasses
Wow! It really works. Nice! Well, I guess that an explicit mention of data classes in the API reference is needed. I guess that many people (especially mature users) look things up in the reference and do not read the user guide intro.
sure would take updated docs. also could take a PR to import dataclass at the top level (as this was for 3.6 compat before): https://github.com/pandas-dev/pandas/blob/master/pandas/core/dtypes/inference.py#L419
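For reference, the constructor support mentioned in the first comment can be sketched as follows. This is a minimal example, assuming pandas 1.1 or later; the `Point` class is made up for illustration and does not come from this thread.

```python
from dataclasses import dataclass

import pandas as pd


@dataclass
class Point:  # illustrative class, not part of pandas
    x: int
    y: int


# The main DataFrame constructor accepts a list of dataclass instances;
# the field names become the column labels.
df = pd.DataFrame([Point(0, 0), Point(0, 3), Point(2, 3)])
print(df)
```

The column order follows the field order of the data class.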
Is your feature request related to a problem?
I wish to construct a `pandas.DataFrame` from an iterable of `dataclasses.dataclass` instances, the same way `DataFrame.from_records` does from an iterable of tuples. The rationale is that a data class is a more strongly typed object than a general tuple or dictionary. Also, data classes are more memory efficient than `tuple`s. This makes data classes attractive to use instead of `dict`s or `tuple`s whenever the schema is known.
Describe the solution you'd like
I would like a class method `.from_dataclasses` which allows `DataFrame` construction and type inference from a uniform (for simplicity) sequence of data classes. See example below.
In the example above, the schema of the `DataFrame` is inferred from the `Record.__annotations__` dictionary, which contains the user-provided type information. The API could also provide ways to validate the schema at runtime by comparing the actual type of each value with the type specified for its column.
API breaking implications
There is no API break in general, but there is a minimum Python version requirement (3.7).
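The original example from this issue was not preserved here, so the following is only a rough sketch of what the requested `.from_dataclasses` behaviour might look like. Everything in it is an assumption: `from_dataclasses` is written as a standalone hypothetical function (it is not a pandas API), the `Record` fields are invented, and the schema is read with `dataclasses.fields` rather than `Record.__annotations__` directly.

```python
import dataclasses
from dataclasses import dataclass

import pandas as pd


@dataclass
class Record:  # hypothetical fields, chosen only for illustration
    name: str
    price: float
    quantity: int


def from_dataclasses(records, validate=False):
    """Hypothetical helper (not part of pandas): build a DataFrame
    from a uniform sequence of dataclass instances."""
    records = list(records)
    if not records:
        raise ValueError("need at least one record to infer the schema")
    cls = type(records[0])
    if not dataclasses.is_dataclass(cls):
        raise TypeError(f"{cls.__name__} is not a data class")

    # Schema: field name -> annotated type, in declaration order.
    schema = {f.name: f.type for f in dataclasses.fields(cls)}

    if validate:
        # Optional runtime check: compare each value with its annotation.
        for i, rec in enumerate(records):
            if type(rec) is not cls:
                raise TypeError(f"record {i} is not a {cls.__name__}")
            for name, expected in schema.items():
                value = getattr(rec, name)
                # Only plain classes are checked; typing generics are skipped.
                if isinstance(expected, type) and not isinstance(value, expected):
                    raise TypeError(
                        f"record {i}: field {name!r} expected "
                        f"{expected.__name__}, got {type(value).__name__}"
                    )

    rows = [tuple(getattr(rec, name) for name in schema) for rec in records]
    df = pd.DataFrame.from_records(rows, columns=list(schema))

    # Cast only columns whose annotation is a simple numeric or bool type.
    casts = {n: t for n, t in schema.items() if t in (int, float, bool)}
    return df.astype(casts)


df = from_dataclasses([Record("apple", 1.5, 2), Record("pear", 0.5, 3)])
print(df.dtypes)
```

With `validate=True`, a record such as `Record("apple", "cheap", 2)` would raise a `TypeError`, because `"cheap"` does not match the `float` annotation of `price`.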