Fix inconsistent results for empty datasets describe #41716

weikhor · 2021-05-29T09:55:20Z

- [x] closes BUG: GroupBy.describe produces inconsistent results for empty datasets #41575
- [x] tests added / passed
- [x] Ensure all linting tests pass, see here for how to run them

weikhor · 2021-05-29T17:06:55Z

Hi @jreback, this PR fixes #41575. Thanks.

pandas/core/groupby/generic.py

pandas/tests/groupby/test_groupby.py

jreback · 2021-05-31T21:57:27Z

this currently fails on master right?

pep8speaks · 2021-06-01T15:56:36Z

Hello @weikhor! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

In the file pandas/core/groupby/generic.py:

Line 1864:36: W292 no newline at end of file

In the file pandas/tests/groupby/test_function.py:

Line 1210:1: E302 expected 2 blank lines, found 1

Comment last updated at 2021-06-08 16:24:04 UTC

weikhor · 2021-06-01T16:30:08Z

> this currently fails on master right?

Yes

github-actions · 2021-07-09T00:02:25Z

This pull request is stale because it has been open for thirty days with no activity. Please update or respond to this comment if you're still interested in working on this.

rhshadrach

Thanks for the PR! I don't understand why the errors raised in _cython_transform and _cython_agg_general are being changed. Can you explain why that's needed?

rhshadrach · 2021-07-15T21:40:34Z

pandas/core/groupby/generic.py

@@ -675,7 +682,7 @@ def nunique(self, dropna: bool = True) -> Series:
    @doc(Series.describe)
    def describe(self, **kwargs):
        result = self.apply(lambda x: x.describe(**kwargs))
-        if self.axis == 1:
+        if self.axis == 1 or not isinstance(result.index, MultiIndex):


As @jreback commented, you can check that the result is not empty. I think the code would be more readable if we added that as a separate check without the transpose. I couldn't figure out why you'd want to take the transpose until I realized the frame is empty.

    if result.empty:
        return result
    elif self.axis == 1:
        return result.T
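For illustration, the suggested control flow can be sketched outside pandas as a standalone function. This is only a sketch under assumptions: `finish_describe` is a hypothetical helper name, not part of pandas, the final `unstack()` step is assumed to be how the non-empty case is reshaped, and the input Series is built by hand rather than produced by `groupby(...).apply(...)`.

```python
import pandas as pd


def finish_describe(result, axis=0):
    """Hypothetical helper mirroring the reviewer's suggested branching."""
    if result.empty:
        # Nothing to reshape: hand the empty object back untouched.
        return result
    elif axis == 1:
        return result.T
    # Assumed non-empty path: pivot the inner index level (the statistic
    # names) into columns, one row per group.
    return result.unstack()


# Hand-built stand-in for a per-group describe() result:
# outer level = group key, inner level = statistic name.
stacked = pd.Series(
    [1.0, 2.0, 3.0, 4.0],
    index=pd.MultiIndex.from_product([["a", "b"], ["count", "mean"]]),
)
wide = finish_describe(stacked)

# An empty frame takes the early return and is passed through as-is.
empty = pd.DataFrame(columns=["count", "mean"])
same = finish_describe(empty)
```

Keeping the empty check as its own branch makes the early return explicit, so a reader does not have to infer from a transpose that the frame was empty.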

rhshadrach · 2021-07-15T21:42:07Z

pandas/core/groupby/generic.py

@@ -1850,4 +1861,4 @@ def func(df):

        return self._python_apply_general(func, self._obj_with_exclusions)

-    boxplot = boxplot_frame_groupby
+    boxplot = boxplot_frame_groupby


Add a newline here

rhshadrach · 2021-07-15T21:43:08Z

pandas/tests/groupby/test_function.py

@@ -1206,3 +1206,10 @@ def test_groupby_sum_below_mincount_nullable_integer():
    result = grouped.sum(min_count=2)
    expected = DataFrame({"b": [pd.NA] * 3, "c": [pd.NA] * 3}, dtype="Int64", index=idx)
    tm.assert_frame_equal(result, expected)
+
+def test_groupby_empty_dataset():
+    # 41575


Can you prefix with GH or GH # or GH#? We're not consistent, but I believe there is usually some prefix.

rhshadrach · 2021-07-15T21:44:29Z

pandas/tests/groupby/test_function.py

+def test_groupby_empty_dataset():
+    # 41575
+    df = DataFrame(columns=["A", "B", "C"])
+    result = df.groupby("A").B.describe().reset_index(drop=True)


Can you remove the reset index? The test should just test the behavior of describe.
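A minimal sketch of what the test could look like once both review comments are applied (a `GH#` prefix and no `reset_index`). The function name and the assertions are assumptions for illustration, not the test from this PR. It only checks that the result is an empty DataFrame, which is what recent pandas releases are expected to return; the exact columns and dtypes are deliberately left unasserted because they may vary by version.

```python
import pandas as pd


def check_groupby_describe_empty_dataset():
    # GH#41575
    df = pd.DataFrame(columns=["A", "B", "C"])
    # Call describe() directly, without reset_index, so that only the
    # behavior of describe itself is exercised.
    result = df.groupby("A").B.describe()
    assert isinstance(result, pd.DataFrame)
    assert len(result) == 0
    return result


described = check_groupby_describe_empty_dataset()
```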

mroeschke · 2021-08-17T02:03:23Z

Thanks for the PR, but it appears that this has gone stale and needs to address the reviews. Closing due to inactivity but happy to reopen if you're interested in continuing.

weikhor added 8 commits May 29, 2021 17:01

produces inconsistent results for empty datasets

931aba0

add test for inconsistent results for empty datasets

014faaf

reverse automatic change the lambda function to def created from auto…

5b492b3

…pep8

reverse automatic change the lambda function to def created from auto…

cda0ced

…pep8

change double quote to single quote

9cc26b9

if else

a271919

add

fb89500

resolve standard

242ce40

weikhor changed the title ~~Resolve about inconsistent results for empty datasets describe~~ Fix inconsistent results for empty datasets describe May 29, 2021

weikhor added 2 commits May 30, 2021 01:40

resolve different types if the DataFrame is empty

7e5eb1b

resolve line spacing

ff4e0ad

jreback requested changes May 31, 2021

pandas/core/groupby/generic.py (outdated)

pandas/tests/groupby/test_groupby.py (outdated)

jreback added Bug Groupby labels May 31, 2021

Khor Chean Wei added 2 commits June 1, 2021 23:48

Merge branch 'pandas-dev:master' into master

68539ba

add test group by

6dd9964

Khor Chean Wei added 3 commits June 1, 2021 23:57

Delete test_aggregate.py

998ca52

Update frame.py

7946226

Update frame.py

a70d337

Khor Chean Wei added 5 commits June 8, 2021 22:57

Merge branch 'pandas-dev:master' into master

0df7fc1

Merge branch 'master' of https://github.com/weikhor/pandas into group…

7a3dbe9

…by_describe_empty_dataset

update multiindex

510f757

git pull from master

49c556d

git pull from master

69ea1cd

github-actions bot added the Stale label Jul 9, 2021

rhshadrach requested changes Jul 15, 2021

mroeschke closed this Aug 17, 2021
