Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpo-38490: statistics: Add covariance, Pearson's correlation, and simple linear regression #16813

Merged
merged 31 commits into from Apr 25, 2021

Conversation

@twolodzko
Copy link
Contributor

@twolodzko twolodzko commented Oct 15, 2019

This PR adds functions for calculating bivariate covariance and Pearson's correlation.

https://bugs.python.org/issue38490

@the-knights-who-say-ni
Copy link

@the-knights-who-say-ni the-knights-who-say-ni commented Oct 15, 2019

Hello, and thanks for your contribution!

I'm a bot set up to make sure that the project can legally accept this contribution by verifying everyone involved has signed the PSF contributor agreement (CLA).

Recognized GitHub username

We couldn't find a bugs.python.org (b.p.o) account corresponding to the following GitHub usernames:

@twolodzko

This might be simply due to a missing "GitHub Name" entry in one's b.p.o account settings. This is necessary for legal reasons before we can look at this contribution. Please follow the steps outlined in the CPython devguide to rectify this issue.

You can check yourself to see if the CLA has been received.

Thanks again for the contribution, we look forward to reviewing it!

Tim and others added 14 commits Oct 15, 2019
Tim
Tim
Tim
Tim
Tymoteusz Wołodźko
Tymoteusz Wołodźko
Tymoteusz Wołodźko
Tymoteusz Wołodźko
Tymoteusz Wołodźko
Tymoteusz Wołodźko
Tymoteusz Wołodźko added 4 commits Oct 17, 2019
Tymoteusz Wołodźko
Tymoteusz Wołodźko
@corona10
Copy link
Member

@corona10 corona10 commented Oct 18, 2019

I'd like to recommend you to open the PR after the discussion on https://bugs.python.org/issue38490 is finalized. :)

@twolodzko twolodzko changed the title bpo-38490: Add covariance and Pearson's correlation bpo-38490: Add covariance, Pearson's correlation, and simple linear regression Oct 22, 2019
@twolodzko twolodzko changed the title bpo-38490: Add covariance, Pearson's correlation, and simple linear regression bpo-38490: statistics: Add covariance, Pearson's correlation, and simple linear regression Oct 22, 2019
Doc/library/statistics.rst Outdated Show resolved Hide resolved
Doc/library/statistics.rst Outdated Show resolved Hide resolved
@taleinat
Copy link
Contributor

@taleinat taleinat commented Oct 19, 2020

The only current outstanding issue is the order of returned values from linear_regression().

ISTM given the short discussion here that the order of (intercept, gradient) could go either way, and that in the context of statistics the current order seems fine. And since this will be a named tuple, the order is not as important as it would be otherwise. As long as the documentation and doc-string are consistent and clear on this, which they are, IMO this is fine.

I'd like to get @rhettinger's approval, or otherwise, for this before moving forward.

@stevendaprano
Copy link
Member

@stevendaprano stevendaprano commented Oct 19, 2020

@twolodzko
Copy link
Contributor Author

@twolodzko twolodzko commented Oct 19, 2020

@stevendaprano I'd be against having predict_x as this is incorrect if this is a statistical model y = a + bx + noise, because it ignores the noise.

@twolodzko
Copy link
Contributor Author

@twolodzko twolodzko commented Nov 2, 2020

It's been two weeks and the discussion seems to got stuck. Is there anything more I should change about the PR?

@ZackerySpytz
Copy link
Contributor

@ZackerySpytz ZackerySpytz commented Nov 2, 2020

Please add an entry to Doc/whatsnew/3.10.rst and add versionadded directives to the documentation for the new functions.

@taleinat
Copy link
Contributor

@taleinat taleinat commented Nov 2, 2020

@stevendaprano I'd be against having predict_x as this is incorrect if this is a statistical model y = a + bx + noise, because it ignores the noise.

I'm also -1 on adding .predict_{x,y} methods. I feel that they would be beyond the scope of what this module sets out to achieve, and they'd also be surprising for many to find on a named tuple.

Doc/whatsnew/3.10.rst Outdated Show resolved Hide resolved
@taleinat
Copy link
Contributor

@taleinat taleinat commented Nov 2, 2020

Apart from my minor suggestion for the What's New entry, this looks good to me.

@rhettinger, would you like to take another look? Any other comments?

Co-authored-by: Tal Einat <taleinat+github@gmail.com>
Copy link
Member

@stevendaprano stevendaprano left a comment

@twolodzko I am very impressed in this, thank you, and I look forward to using it. I just commented on what looks like a broken piece of ReST and a slightly unusual choice of wording, but apart from this I am very happy with your patch and I think it should be merged.

@twolodzko
Copy link
Contributor Author

@twolodzko twolodzko commented Nov 3, 2020

If I'm not missing something, all issues seem to be resolved right now.

@taleinat
Copy link
Contributor

@taleinat taleinat commented Nov 3, 2020

If I'm not missing something, all issues seem to be resolved right now.

Indeed.

@twolodzko
Copy link
Contributor Author

@twolodzko twolodzko commented Nov 25, 2020

It's been over 20 days since last discussion on this PR. I don't want to be pushy, but this PR was posted over a year ago and at some point I would forget about it as well, and all the time we all spend on it would be wasted. If there's anything I can do about it, I'm open to it, but right now it is not exactly clear for me if we are waiting for something?

@ZackerySpytz
Copy link
Contributor

@ZackerySpytz ZackerySpytz commented Nov 26, 2020

Unfortunately, it seems like there are some minor grammatical errors in the new docstrings and documentation.

@twolodzko
Copy link
Contributor Author

@twolodzko twolodzko commented Nov 26, 2020

@ZackerySpytz could you be more specific?

@twolodzko
Copy link
Contributor Author

@twolodzko twolodzko commented Nov 28, 2020

I don't know what is to be fixed. Hope this will get merged at some point. If there's anything I need to do, please tag me.

@twolodzko twolodzko requested review from rhettinger and stevendaprano Feb 24, 2021
@twolodzko
Copy link
Contributor Author

@twolodzko twolodzko commented Apr 22, 2021

This PR is ready. It waits for reviews and merge. If there is anything I can do about it, please let me know.

@taleinat taleinat merged commit 09aa6f9 into python:master Apr 25, 2021
10 checks passed
10 checks passed
Docs
Details
Check for source changes
Details
Windows (x86)
Details
Windows (x64)
Details
macOS
Details
Ubuntu
Details
Azure Pipelines PR #20201110.55 succeeded
Details
Travis CI - Pull Request Build Passed
Details
bedevere/issue-number Issue number 38490 found
Details
bedevere/news News entry found in Misc/NEWS.d
@bedevere-bot
Copy link

@bedevere-bot bedevere-bot commented Apr 25, 2021

@taleinat: Please replace # with GH- in the commit message next time. Thanks!

@taleinat
Copy link
Contributor

@taleinat taleinat commented Apr 25, 2021

Merged!

This has indeed reached a very good state a while ago and was reviewed favorably by several core devs. Definitely good to have this for 3.10.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
8 participants