pandas-dev / pandas Public
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Center rolling window for time offset #38780
Conversation
take |
pandas/_libs/window/indexers.pyx
Outdated
@@ -11,7 +11,7 @@ def calculate_variable_window_bounds( | |||
int64_t num_values, | |||
int64_t window_size, | |||
object min_periods, # unused but here to match get_window_bounds signature | |||
object center, # unused but here to match get_window_bounds signature | |||
object center, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be bint center
now if supported
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for your quick response, this is done now. :)
pandas/core/window/rolling.py
Outdated
@@ -450,7 +457,36 @@ def homogeneous_func(values: np.ndarray): | |||
if values.size == 0: | |||
return values.copy() | |||
|
|||
def _calculate_center_offset(window) -> int: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
None of this code below should be needed.
The center
calculation should just adjust the indexers return in calculate_variable_window_bounds
and everything else should work correctly.
pandas/core/window/rolling.py
Outdated
@@ -334,6 +334,13 @@ def _get_window_indexer(self) -> BaseIndexer: | |||
""" | |||
if isinstance(self.window, BaseIndexer): | |||
return self.window | |||
|
|||
if self.is_freq_type: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This clause shouldn't be needed either. self,center
should just be passed to VariableWindowIndexer
in the clause below
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is corrected too
pandas/core/window/rolling.py
Outdated
or isinstance( | ||
self._on, (ABCDatetimeIndex, ABCTimedeltaIndex, ABCPeriodIndex) | ||
) | ||
) and isinstance(self.window, (str, BaseOffset, timedelta)): | ||
|
||
self._validate_monotonic() | ||
|
||
# we don't allow center | ||
if self.center: | ||
# we don't allow center for offset based windows |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This entire clause should be removed once center
is supported
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed this now, too! :)
pandas/core/window/rolling.py
Outdated
_calculate_center_offset(self.window) | ||
if self.center | ||
and not isinstance( | ||
self._get_window_indexer(self.window), VariableWindowIndexer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think you should be passing an argument here, just self._get_window_indexer()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed the argument here, too. Thanks for the feedback!
pandas/_libs/window/indexers.pyx
Outdated
@@ -74,14 +78,27 @@ def calculate_variable_window_bounds( | |||
# right endpoint is open | |||
else: | |||
end[0] = 0 | |||
if center_window: | |||
for j in range(0, num_values+1): | |||
if (index[j] == index[0] + index_growth_sign*window_size/2 and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use spaces around operators & parens to format this (we don't lint these files automatically)
pandas/core/window/rolling.py
Outdated
def _center_window(self, result: np.ndarray, offset: int) -> np.ndarray: | ||
""" | ||
Center the result in the window for weighted rolling aggregations. | ||
""" | ||
if self.axis > result.ndim - 1: | ||
raise ValueError("Requested axis is larger then no. of argument dimensions") | ||
|
||
if offset > 0: | ||
lead_indexer = [slice(None)] * result.ndim | ||
lead_indexer[self.axis] = slice(offset, None) | ||
result = np.copy(result[tuple(lead_indexer)]) | ||
return result | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In an effort to make the tests run again before introducing new ones, I encountered an error with "'Rolling' object has no attribute '_center_window'".
I tried to solve the error by adding the _center_window()
method to the BaseWindow
class.
However, this resulted in new errors.
I am not sure whether it is correct to follow this debugging path. Could you quickly comment @jreback , @mroeschke or @arw2019?
Meanwhile, I am working on adding the changes you asked for. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As mentioned in #38780 (comment), I think you shouldn't need any additional code in this _apply
method.
The changes should just need to occur in pandas/_libs/window/indexers.pyx
in the calculate_variable_window_bounds
function to make it return indexes that "center" the window.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For example, for non time offests, center is handled when calculating the indexes of each window
pandas/pandas/core/window/indexers.py
Line 81 in 67d4cae
if center: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As mentioned in #38780 (comment), I think you shouldn't need any additional code in this
_apply
method.The changes should just need to occur in
pandas/_libs/window/indexers.pyx
in thecalculate_variable_window_bounds
function to make it return indexes that "center" the window.
Just removed the unnecessary lines from rolling.py and that made the tests work again, thanks for that hint! At first I understood I had to migrate those lines to the pandas/_libs/window/indexers.pyx
instead of ditching them all together.
pandas/_libs/window/indexers.pyx
Outdated
@@ -60,6 +62,8 @@ def calculate_variable_window_bounds( | |||
|
|||
if index[num_values - 1] < index[0]: | |||
index_growth_sign = -1 | |||
if center: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sine we typed center
as bint
. This isnt needed anymore
pandas/core/window/rolling.py
Outdated
@@ -334,9 +334,12 @@ def _get_window_indexer(self) -> BaseIndexer: | |||
""" | |||
if isinstance(self.window, BaseIndexer): | |||
return self.window | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you undo this white space?
pandas/core/window/rolling.py
Outdated
@@ -451,6 +454,7 @@ def homogeneous_func(values: np.ndarray): | |||
return values.copy() | |||
|
|||
def calc(x): | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you undo this white space?
pandas/core/window/rolling.py
Outdated
@@ -1880,20 +1884,14 @@ def validate(self): | |||
# we allow rolling on a datetimelike index | |||
if ( | |||
self.obj.empty | |||
# TODO: add "or self.is_datetimelike"? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't need this TODO
pls also add a whatsnew note (enh is ok), as well as see if any updates are needed in the https://pandas.pydata.org/pandas-docs/dev/user_guide/window.html, also the doc-string of rolling might need a small update as well
pandas/tests/window/test_rolling.py
Outdated
|
||
tm.assert_frame_equal(result, expected) | ||
|
||
|
||
def test_closed_fixed_binary_col(): | ||
def test_datetimelike_centered_selections(closed, arithmetic_win_operators): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you paramterize on closed (e.g. directly put your window selection up there with the closed parameter), eg.. don't use the fixture
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea, this is done, too.
pandas/tests/window/test_rolling.py
Outdated
+ [77.0] * 2 | ||
) | ||
|
||
else: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if doing this pls paramterize outside the function
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done!
doc/source/whatsnew/v1.3.0.rst
Outdated
used (:issue:`38780`). | ||
For example: | ||
|
||
.. code-block:: ipython |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use ipython:: python
so the code runs (don't need the prompts or output just the code)
@@ -134,6 +134,30 @@ a copy will no longer be made (:issue:`32960`) | |||
The default behavior when not passing ``copy`` will remain unchanged, i.e. | |||
a copy will be made. | |||
|
|||
Centered Datetime-Like Rolling Windows |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would not object to an example in window.rst as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added the same example also to the window.rst now
doc/source/whatsnew/v1.3.0.rst
Outdated
index=date_range("2020", periods=5, freq="1D") | ||
) | ||
In [2]: df.rolling("2D", center=True).mean() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't need the In
Out
annotations when using .. ipython:: python
(the code will execute line by line
def test_even_number_window_alignment(): | ||
# see discussion in GH 38780 | ||
s = Series(range(3), index=date_range(start="2020-01-01", freq="D", periods=3)) | ||
|
||
# behavior of index- and datetime-based windows differs here! | ||
# s.rolling(window=2, min_periods=1, center=True).mean() | ||
|
||
result = s.rolling(window="2D", min_periods=1, center=True).mean() | ||
|
||
expected = Series([0.5, 1.5, 2], index=s.index) | ||
|
||
tm.assert_series_equal(result, expected) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mroeschke I added this test to check for the behavior when operating with even sized windows. As mentioned by @sevberg this currently does behave differently than the classic index-based windows.
Is that what you had in mind here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test would also work with N
instead of D
as frequency, therefore I am not sure if this is what you meant when referring to nano seconds?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I believe this test exercises the difference in behavior which is good.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great, thanks!
just a minor doc comment, ping on green (I think you addressed all of @mroeschke comments but if not pls do)
df = pd.DataFrame( | ||
{"A": [0, 1, 2, 3, 4]}, index=pd.date_range("2020", periods=5, freq="1D") | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same comment as above
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
@@ -156,6 +156,16 @@ By default the labels are set to the right edge of the window, but a | |||
s.rolling(window=5).mean() | |||
s.rolling(window=5, center=True).mean() | |||
This can also be applied to datetime-like indices. | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add a versionadded 1.3 tag here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, too!
thanks @luholgit very nice! |
Big thanks to you, @jreback and @mroeschke! I've learned a lot while working on this. :) |
black pandas
git diff upstream/master -u -- "*.py" | flake8 --diff
Added center functionality for
VariableWindowIndexer
. Note - I am unsure if the NotImplementedError in lines 1966-1969 in rolling.py still correctly raises an error for offset based windows.Finalizes previous PR #36097
The text was updated successfully, but these errors were encountered: