New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gh-72793: C implementation of parts of copy.deepcopy #91610
base: main
Are you sure you want to change the base?
Conversation
@pablogsal Would you be able to review this PR? |
@serhiy-storchaka As the latest core develop working on this file, would you be able to review this PR? |
83741a6
to
2f8e489
Compare
af2a201
to
502c747
Compare
I don't feel comfortable reviewing the C code, but you will need to add tests to ensure the fallback and C implementation have the same inputs/outputs/etc -- the easiest way would be to duplicate the tests and run one set for the C accelerator and one for the Python version.
A
Misc/NEWS.d/next/Core and Builtins/2022-04-16-20-21-13.gh-issue-72793.qc-BP-.rst
Outdated
Show resolved
Hide resolved
2c18cb6
to
c0fd5c9
Compare
@AA-Turner The Note: in earlier versions of the PR the fallback there was a funny |
If both functions exist and can be theoretically used, both must be tested. You can A |
Modules/Setup.bootstrap.in
Outdated
@@ -25,6 +25,7 @@ _weakref _weakref.c | |||
|
|||
# commonly used core modules | |||
_abc _abc.c | |||
_copy _copy.c |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The build rule should be either added to Setup.bootstrap.in
or Setup.stdlib.in
rules. Is the module required to run setup.py
? In case it it not required, it should not be added to the bootstrap file.
I don't think we have any documentation concerning new modules in the dev guide yet.
for (i = 0; i < N_DISPATCHERS; ++i) { | ||
dd = &deepcopy_dispatch[i]; | ||
if (Py_TYPE(x) != dd->type) | ||
continue; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be faster to replace the atomic types and dispatch types code with a _Py_hashtable_t
table.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
_hashopenssl.c
has an example how to use the hash table API.
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase |
There are some issues with the |
@tiran noted the fallback will be used by PyPy, so it must indeed be tested. I will add the required tests |
We have helper code to block / force imports to test both pure and C accelerated features.
|
I addressed some of the review comments. @serhiy-storchaka Could you indicate which issues with the |
Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>
Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>
See https://github.com/orgs/python/projects/9. Note that some issues are common for the |
Good work so far!
- If you pass
PyObject *module
to all functions then you can access the module state for free.PyState_FindModule
is costly. - You have to provide callbacks to clear/free the module state.
- We prefer a different approach to test C and Py implementations of a module.
@@ -0,0 +1,60 @@ | |||
"""Unit tests for the copy module without the C accelerator.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We use a different approach to test pure Python and C accelerated code:
- create a test class for C and Python implementation
- remove
import copy
- replace
copy.*
withself.module....
py_copy = import_helper.import_fresh_module(
'copy', blocked=['_copy']
)
c_copy = import_helper.import_fresh_module(
'copy', fresh=['_copy']
)
class TestCopy:
module = None
def test_exceptions(self):
self.assertIs(self.module.Error, self.module.error)
self.assertTrue(issubclass(self.module.Error, Exception))
class TestCopyPy(TestCopy, unittest.TestCase):
module = py_copy
@unittest.skipUnless(c_copy, 'requires _copy')
class TestCopyC(TestCopy, unittest.TestCase):
module = c_copy
# XXX: import_fresh_module() is supposed to leave sys.module cache untouched, | ||
# XXX: but it does not, so we have to cleanup ourselves. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please open a new bug report for this issue.
static struct PyModuleDef copy_moduledef = { | ||
PyModuleDef_HEAD_INIT, | ||
"_copy", | ||
"C implementation of deepcopy", | ||
sizeof(PyObject *), | ||
functions, | ||
NULL, | ||
NULL, | ||
NULL, | ||
NULL | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
New code should use named struct initialization, e.g. .m_name = "_copy",
.
Since you store a strong Python reference to a mutable object in the module state, you must provide traverse, clear, and free callbacks.
And please don't reuse the module state pointer as a PyObject *
. It's a clever hack -- a bit too clever and surprising for the next person that reads the code. :)
Co-authored-by: Christian Heimes <christian@python.org>
Co-authored-by: Christian Heimes <christian@python.org>
Co-authored-by: Christian Heimes <christian@python.org>
Co-authored-by: Christian Heimes <christian@python.org>
The original idea and implementation are from @Villemoes. The original issue has been inactive for a long time.
Benchmark on deepcopy of a
dict
anddataclass
:Benchmark details
Test script
Fixes #72793
Pyperformance results
Pyperformance results show small speedup, although this could very well be a random fluctuation. (there are no explicit calls to deepcopy in the pyperformance tests)
Notes