
BUG: aggfunc use different default arguments in pivot_table #36508

Closed
@dcsaba89

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • (optional) I have confirmed this bug exists on the master branch of pandas.



Code Sample, a copy-pastable example

import pandas as pd
import numpy as np
df = pd.DataFrame({'A': ['a', 'a'], 'X': [5, 9]})

# standard deviation of X:
np.std(df['X'])
# 2.0

# aggfunc returns the same standard deviation as np.std as expected with this trick
pd.pivot_table(df, index='A', values='X', aggfunc=lambda x: np.std(x))
#        X
#  A
#  a     2

# aggfunc returns the unbiased standard deviation unexpectedly
pd.pivot_table(df, index='A', values='X', aggfunc=np.std)

#               X
#  A
#  a     2.828427

Problem description

The default value of ddof for np.std is 0:
numpy.std(a, axis=None, dtype=None, out=None, ddof=0, keepdims=<no value>)

When np.std is passed to aggfunc to calculate the standard deviation of X, it returns the unbiased (sample) standard deviation, because ddof=1 is used instead of np.std's own default of ddof=0.
The expected behavior, on the other hand, is that for any function f, aggfunc=f and aggfunc=lambda x: f(x) return exactly the same result.
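For reference, the two numbers in the code sample differ only in the divisor used for the variance. The following is a minimal sketch of that arithmetic in plain Python, using the same values of X. The variable names are chosen for illustration and are not part of pandas or numpy.

```python
import math
import statistics

values = [5, 9]                      # column X from the example above
n = len(values)
mean = sum(values) / n               # 7.0
sum_sq_dev = sum((v - mean) ** 2 for v in values)   # 8.0

# ddof=0: divide by n (population standard deviation, np.std's default)
population_std = math.sqrt(sum_sq_dev / (n - 0))
# ddof=1: divide by n - 1 (sample standard deviation)
sample_std = math.sqrt(sum_sq_dev / (n - 1))

print(population_std)                # 2.0
print(sample_std)                    # about 2.828427

# The standard library exposes both conventions under separate names.
assert math.isclose(statistics.pstdev(values), population_std)
assert math.isclose(statistics.stdev(values), sample_std)
```

So 2.0 corresponds to ddof=0 and 2.828427 to ddof=1, which matches the two pivot_table outputs reported above.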

Expected Output

  1. When np.std is passed to aggfunc directly, it uses ddof=1 (this is unexpected, as the default ddof for np.std is 0).
  2. When np.std is passed to aggfunc wrapped as lambda x: np.std(x), it correctly uses numpy's default ddof=0, as expected.

To summarize, the expected behavior is that a function passed to pd.pivot_table as aggfunc is called with its own default arguments.
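Until this is resolved, one possible workaround is to make ddof explicit so the result does not depend on which default gets applied. This is only a sketch: population_std is a hypothetical helper name introduced here, and converting the group to a plain array with np.asarray is my own choice to keep the call on numpy's side.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': ['a', 'a'], 'X': [5, 9]})

def population_std(x):
    # Hypothetical helper: state ddof explicitly rather than relying on defaults.
    # np.asarray turns the group into a plain ndarray before np.std is called.
    return np.std(np.asarray(x), ddof=0)

result = pd.pivot_table(df, index='A', values='X', aggfunc=population_std)
print(result)
```

Because the helper is an ordinary user-defined function, it behaves like the lambda in the code sample and gives 2.0 for group a.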

Output of pd.show_versions()

INSTALLED VERSIONS

commit : 2a7d332
python : 3.8.5.final.0
python-bits : 32
OS : Windows
OS-release : 10
Version : 10.0.19041
machine : AMD64
processor : Intel64 Family 6 Model 142 Stepping 12, GenuineIntel
byteorder : little
LC_ALL : None
LANG : None
LOCALE : English_United States.1252
pandas : 1.1.2
numpy : 1.19.2


Labels: Docs; Reshaping (Concat, Merge/Join, Stack/Unstack, Explode)
