bpo-38490: statistics: Add covariance, Pearson's correlation, and simple linear regression #16813
Hello, and thanks for your contribution! I'm a bot set up to make sure that the project can legally accept this contribution by verifying everyone involved has signed the PSF contributor agreement (CLA).
Recognized GitHub username
We couldn't find a bugs.python.org (b.p.o) account corresponding to the following GitHub usernames:
This might be simply due to a missing "GitHub Name" entry in one's b.p.o account settings. This is necessary for legal reasons before we can look at this contribution. Please follow the steps outlined in the CPython devguide to rectify this issue.
You can check yourself to see if the CLA has been received.
Thanks again for the contribution, we look forward to reviewing it! |
I'd recommend opening the PR after the discussion on https://bugs.python.org/issue38490 is finalized. :) |
The only current outstanding issue is the order of returned values from ISTM given the short discussion here that the order of I'd like to get @rhettinger's approval, or otherwise, for this before moving forward. |
Given that this is the statistics module, not linear algebra, I think we
ought to stick to the common convention taught in stats classes, which
is
y = a + b x
i.e. (intercept, gradient) in that order. Unless Raymond strongly
objects, I say go with this order.
Do we have time to add a new feature to this before feature-freeze? I
would like to add two methods to the tuple returned:
    def predict_y(self, x):
        """Return the predicted y value for the given x.

        Returns the predicted response or dependent variable.
        """
        return self.intercept + self.gradient*x

    def predict_x(self, y):
        """Return the predicted x value for the given y.

        Returns the predicted explanatory or independent variable.
        """
        return (y - self.intercept)/self.gradient
Predicting the x or y values from the linear regression line is a very
common operation. I think it would be very useful to offer these methods
without expecting users to create the functions themselves.
--
Steve
|
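For readers following the proposal above, here is a minimal sketch of what a result tuple with `predict_y` and `predict_x` could look like. It is an illustration only, not the code in this PR: the names `Line` and `fit_line` are hypothetical, the fit is an ordinary least-squares fit written out by hand, and the field order (intercept, gradient) simply follows the `y = a + b x` convention argued for in the comment.

```python
from typing import NamedTuple, Sequence


class Line(NamedTuple):
    """Hypothetical result type; field order follows y = a + b x."""
    intercept: float
    gradient: float

    def predict_y(self, x: float) -> float:
        """Return the predicted y (response) for the given x."""
        return self.intercept + self.gradient * x

    def predict_x(self, y: float) -> float:
        """Return the predicted x (explanatory value) for the given y."""
        # Raises ZeroDivisionError when the fitted line is horizontal.
        return (y - self.intercept) / self.gradient


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Line:
    """Ordinary least-squares fit of y = intercept + gradient * x."""
    n = len(xs)
    if n != len(ys):
        raise ValueError("xs and ys must have the same length")
    if n < 2:
        raise ValueError("at least two data points are required")
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    # Sums of cross-products and of squared deviations about the means.
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    sxx = sum((x - xbar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("xs must not be constant")
    gradient = sxy / sxx
    intercept = ybar - gradient * xbar
    return Line(intercept, gradient)


if __name__ == "__main__":
    line = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
    print(line)               # Line(intercept=1.0, gradient=2.0)
    print(line.predict_y(5))  # 11.0
    print(line.predict_x(11)) # 5.0
```

Because `Line` is still a plain tuple, callers who only want the two coefficients can unpack it directly, which is one way the helper methods could coexist with a simple tuple return value.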
@stevendaprano I'd be against having |
It's been two weeks and the discussion seems to have gotten stuck. Is there anything more I should change about the PR? |
Please add an entry to |
I'm also -1 on adding |
Apart from my minor suggestion for the What's New entry, this looks good to me. @rhettinger, would you like to take another look? Any other comments? |
Co-authored-by: Tal Einat <taleinat+github@gmail.com>
@twolodzko I am very impressed by this, thank you, and I look forward to using it. I just commented on what looks like a broken piece of ReST and a slightly unusual choice of wording, but apart from this I am very happy with your patch and I think it should be merged. |
If I'm not missing something, all issues seem to be resolved right now. |
Indeed. |
It's been over 20 days since the last discussion on this PR. I don't want to be pushy, but this PR was posted over a year ago and at some point I would forget about it as well, and all the time we all spent on it would be wasted. If there's anything I can do about it, I'm open to it, but right now it is not exactly clear to me whether we are waiting for something. |
Unfortunately, it seems like there are some minor grammatical errors in the new docstrings and documentation. |
@ZackerySpytz could you be more specific? |
I don't know what is to be fixed. Hope this will get merged at some point. If there's anything I need to do, please tag me. |
This PR is ready. It is waiting for review and merge. If there is anything I can do about it, please let me know. |
Merged commit 09aa6f9 into python:master
@taleinat: Please replace |
Merged! This indeed reached a very good state a while ago and was reviewed favorably by several core devs. Definitely good to have this for 3.10. |
This PR adds functions for calculating bivariate covariance and Pearson's correlation.
https://bugs.python.org/issue38490
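
As a rough illustration of the two quantities named in the description, the sketch below computes them by hand. It is not the code added by this PR: the function names `sample_covariance` and `pearson_correlation` are hypothetical, and the use of the sample (n - 1) denominator for the covariance is an assumption made for this example.

```python
from math import sqrt
from typing import Sequence


def _check(xs: Sequence[float], ys: Sequence[float]) -> int:
    """Validate the inputs and return the number of data points."""
    n = len(xs)
    if n != len(ys):
        raise ValueError("xs and ys must have the same length")
    if n < 2:
        raise ValueError("at least two data points are required")
    return n


def sample_covariance(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Sample covariance of two inputs (assumes an n - 1 denominator)."""
    n = _check(xs, ys)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    return sxy / (n - 1)


def pearson_correlation(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson's correlation coefficient r, a value in [-1, 1]."""
    n = _check(xs, ys)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    sxx = sum((x - xbar) ** 2 for x in xs)
    syy = sum((y - ybar) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("neither input may be constant")
    # The n - 1 factors cancel, so r only needs the raw sums.
    return sxy / sqrt(sxx * syy)
```

For perfectly linear data such as `xs = [1, 2, 3, 4]` and `ys = [3, 5, 7, 9]`, the correlation from this sketch is 1 up to floating-point rounding, and reversing `ys` flips its sign.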