Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-72793: C implementation of parts of copy.deepcopy #91610

Draft
wants to merge 29 commits into
base: main
Choose a base branch
from

Conversation

eendebakpt
Copy link
Contributor

@eendebakpt eendebakpt commented Apr 16, 2022

The original idea and implementation are from @Villemoes. The original issue has been inactive for a long time.

Benchmark on deepcopy of a dict and dataclass:

deepcopy dict: Mean +- std dev: [b] 7.28 us +- 0.27 us -> [d] 982 ns +- 24 ns: 7.41x faster
deepcopy dataclass: Mean +- std dev: [b] 6.49 us +- 0.07 us -> [d] 3.69 us +- 0.11 us: 1.76x faster

Geometric mean: 3.61x faster

Benchmark details

Test script
import pyperf
runner = pyperf.Runner()

setup="""
import copy

a={'list': [1,2,3,43], 't': (1,2,3), 'str': 'hello', 'subdict': {'a': True}}

from dataclasses import dataclass

@dataclass
class A:
    a : list
    b : str
    c : bool
    
dc=A([1,2,3], 'hello', True)
"""

runner.timeit(name=f"deepcopy dict", stmt=f"b=copy.deepcopy(a)", setup=setup)
runner.timeit(name=f"deepcopy dataclass", stmt=f"b=copy.deepcopy(dc)", setup=setup)

Fixes #72793

Pyperformance results

Pyperformance results show small speedup, although this could very well be a random fluctuation. (there are no explicit calls to deepcopy in the pyperformance tests)

2to3: Mean +- std dev: [base] 325 ms +- 5 ms -> [patch] 322 ms +- 4 ms: 1.01x faster
chaos: Mean +- std dev: [base] 87.8 ms +- 0.8 ms -> [patch] 88.3 ms +- 1.1 ms: 1.01x slower
float: Mean +- std dev: [base] 103 ms +- 2 ms -> [patch] 98.5 ms +- 2.5 ms: 1.04x faster
go: Mean +- std dev: [base] 163 ms +- 1 ms -> [patch] 165 ms +- 2 ms: 1.01x slower
json_dumps: Mean +- std dev: [base] 15.1 ms +- 0.5 ms -> [patch] 14.9 ms +- 0.1 ms: 1.01x faster
json_loads: Mean +- std dev: [base] 28.0 us +- 0.3 us -> [patch] 28.1 us +- 0.3 us: 1.01x slower
logging_format: Mean +- std dev: [base] 8.11 us +- 0.14 us -> [patch] 8.18 us +- 0.12 us: 1.01x slower
logging_silent: Mean +- std dev: [base] 133 ns +- 1 ns -> [patch] 130 ns +- 1 ns: 1.02x faster
meteor_contest: Mean +- std dev: [base] 128 ms +- 1 ms -> [patch] 126 ms +- 1 ms: 1.01x faster
nbody: Mean +- std dev: [base] 111 ms +- 2 ms -> [patch] 113 ms +- 2 ms: 1.01x slower
nqueens: Mean +- std dev: [base] 106 ms +- 1 ms -> [patch] 106 ms +- 1 ms: 1.00x slower
pathlib: Mean +- std dev: [base] 23.0 ms +- 0.4 ms -> [patch] 23.2 ms +- 0.7 ms: 1.01x slower
pickle: Mean +- std dev: [base] 10.5 us +- 0.1 us -> [patch] 10.6 us +- 0.4 us: 1.01x slower
pickle_dict: Mean +- std dev: [base] 30.9 us +- 0.2 us -> [patch] 31.3 us +- 0.3 us: 1.02x slower
pickle_list: Mean +- std dev: [base] 4.41 us +- 0.04 us -> [patch] 4.55 us +- 0.06 us: 1.03x slower
pyflate: Mean +- std dev: [base] 532 ms +- 8 ms -> [patch] 535 ms +- 4 ms: 1.01x slower
regex_compile: Mean +- std dev: [base] 163 ms +- 1 ms -> [patch] 165 ms +- 1 ms: 1.01x slower
regex_dna: Mean +- std dev: [base] 238 ms +- 2 ms -> [patch] 213 ms +- 1 ms: 1.12x faster
regex_effbot: Mean +- std dev: [base] 3.50 ms +- 0.03 ms -> [patch] 3.08 ms +- 0.03 ms: 1.14x faster
regex_v8: Mean +- std dev: [base] 29.2 ms +- 0.7 ms -> [patch] 25.8 ms +- 0.9 ms: 1.13x faster
richards: Mean +- std dev: [base] 57.4 ms +- 1.5 ms -> [patch] 58.2 ms +- 1.3 ms: 1.01x slower
scimark_fft: Mean +- std dev: [base] 447 ms +- 10 ms -> [patch] 452 ms +- 2 ms: 1.01x slower
scimark_lu: Mean +- std dev: [base] 142 ms +- 2 ms -> [patch] 145 ms +- 1 ms: 1.02x slower
scimark_sor: Mean +- std dev: [base] 155 ms +- 1 ms -> [patch] 156 ms +- 2 ms: 1.01x slower
scimark_sparse_mat_mult: Mean +- std dev: [base] 6.18 ms +- 0.11 ms -> [patch] 6.03 ms +- 0.06 ms: 1.02x faster
spectral_norm: Mean +- std dev: [base] 150 ms +- 7 ms -> [patch] 146 ms +- 3 ms: 1.02x faster
unpack_sequence: Mean +- std dev: [base] 52.6 ns +- 1.0 ns -> [patch] 52.2 ns +- 0.6 ns: 1.01x faster
unpickle_list: Mean +- std dev: [base] 5.77 us +- 0.06 us -> [patch] 5.88 us +- 0.10 us: 1.02x slower
xml_etree_parse: Mean +- std dev: [base] 176 ms +- 3 ms -> [patch] 173 ms +- 2 ms: 1.02x faster

Benchmark hidden because not significant (17): deltablue, fannkuch, hexiom, logging_simple, pickle_pure_python, pidigits, python_startup, python_startup_no_site, raytrace, scimark_monte_carlo, sqlite_synth, telco, unpickle, unpickle_pure_python, xml_etree_iterparse, xml_etree_generate, xml_etree_process

Geometric mean: 1.01x faster

Notes

@eendebakpt eendebakpt marked this pull request as draft Apr 16, 2022
@eendebakpt eendebakpt marked this pull request as ready for review Apr 16, 2022
@eendebakpt
Copy link
Contributor Author

@eendebakpt eendebakpt commented Apr 28, 2022

@pablogsal Would you be able to review this PR?

@eendebakpt
Copy link
Contributor Author

@eendebakpt eendebakpt commented May 27, 2022

@serhiy-storchaka As the latest core develop working on this file, would you be able to review this PR?

@eendebakpt eendebakpt force-pushed the deepcopy_c_implementation branch from 83741a6 to 2f8e489 Compare Jun 7, 2022
@eendebakpt eendebakpt force-pushed the deepcopy_c_implementation branch 2 times, most recently from af2a201 to 502c747 Compare Jun 7, 2022
@AA-Turner AA-Turner added the extension-modules label Jun 7, 2022
Copy link
Member

@AA-Turner AA-Turner left a comment

I don't feel comfortable reviewing the C code, but you will need to add tests to ensure the fallback and C implementation have the same inputs/outputs/etc -- the easiest way would be to duplicate the tests and run one set for the C accelerator and one for the Python version.

A

Lib/test/test_capi.py Show resolved Hide resolved
@eendebakpt eendebakpt force-pushed the deepcopy_c_implementation branch from 2c18cb6 to c0fd5c9 Compare Jun 8, 2022
@eendebakpt
Copy link
Contributor Author

@eendebakpt eendebakpt commented Jun 8, 2022

I don't feel comfortable reviewing the C code, but you will need to add tests to ensure the fallback and C implementation have the same inputs/outputs/etc -- the easiest way would be to duplicate the tests and run one set for the C accelerator and one for the Python version.

A

@AA-Turner The _deepcopy_fallback is only used for objects not handled by the C deepcopy method (and vice versa). therefore we cannot test both the fallback and C on the same inputs. The deepcopy is the public method (and _deepcopy_fallback is called via deepcopy), so I think we only need to test on deepcopy. If there are any tests you would like me to add, let me know.

Note: in earlier versions of the PR the fallback there was a funny try-except statement to fix a build problem. This I resolved by making _copy a builtin module.

@AA-Turner
Copy link
Member

@AA-Turner AA-Turner commented Jun 8, 2022

The _deepcopy_fallback is only used for objects not handled by the C deepcopy method (and vice versa). therefore we cannot test both the fallback and C on the same inputs.

If both functions exist and can be theoretically used, both must be tested. You can import copy._deepcopy_fallback and import _copy.deepcopy to test the functions independently. If I understand correctly, in your current patch _deepcopy_fallback is only ever called from the C layer. If so, you should make this more clear -- oftentimes the C accelerator has a pure Python fallback which implements the same method when the extension module can't be loaded.

A

@@ -25,6 +25,7 @@ _weakref _weakref.c

# commonly used core modules
_abc _abc.c
_copy _copy.c
Copy link
Member

@tiran tiran Jun 8, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The build rule should be either added to Setup.bootstrap.in or Setup.stdlib.in rules. Is the module required to run setup.py? In case it it not required, it should not be added to the bootstrap file.

I don't think we have any documentation concerning new modules in the dev guide yet.

Modules/Setup.stdlib.in Show resolved Hide resolved
Lib/copy.py Outdated Show resolved Hide resolved
Modules/_copy.c Outdated Show resolved Hide resolved
Modules/_copy.c Outdated Show resolved Hide resolved
for (i = 0; i < N_DISPATCHERS; ++i) {
dd = &deepcopy_dispatch[i];
if (Py_TYPE(x) != dd->type)
continue;
Copy link
Member

@tiran tiran Jun 8, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be faster to replace the atomic types and dispatch types code with a _Py_hashtable_t table.

Copy link
Member

@tiran tiran Jun 8, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_hashopenssl.c has an example how to use the hash table API.

@bedevere-bot
Copy link

@bedevere-bot bedevere-bot commented Jun 8, 2022

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

@serhiy-storchaka
Copy link
Member

@serhiy-storchaka serhiy-storchaka commented Jun 8, 2022

There are some issues with the copy module. I prefer to resolve them prior to adding the C implementation.

@eendebakpt
Copy link
Contributor Author

@eendebakpt eendebakpt commented Jun 8, 2022

The _deepcopy_fallback is only used for objects not handled by the C deepcopy method (and vice versa). therefore we cannot test both the fallback and C on the same inputs.

If both functions exist and can be theoretically used, both must be tested. You can import copy._deepcopy_fallback and import _copy.deepcopy to test the functions independently. If I understand correctly, in your current patch _deepcopy_fallback is only ever called from the C layer. If so, you should make this more clear -- oftentimes the C accelerator has a pure Python fallback which implements the same method when the extension module can't be loaded.

A

@tiran noted the fallback will be used by PyPy, so it must indeed be tested. I will add the required tests

@tiran
Copy link
Member

@tiran tiran commented Jun 8, 2022

We have helper code to block / force imports to test both pure and C accelerated features.

from test.support.import_helper import import_fresh_module

copy_py = import_fresh_module('copy', blocked=['_copy'])
try:
    copy_c = import_fresh_module('copy', fresh=['_copy'])
except ImportError:
    copy_c = None

@eendebakpt eendebakpt requested a review from as a code owner Jun 8, 2022
@eendebakpt
Copy link
Contributor Author

@eendebakpt eendebakpt commented Jun 8, 2022

I addressed some of the review comments.

@serhiy-storchaka Could you indicate which issues with the copy module there are, and whether there already is a timeline on addressing them?

Lib/copy.py Outdated Show resolved Hide resolved
Lib/copy.py Show resolved Hide resolved
Lib/test/test_capi.py Show resolved Hide resolved
eendebakpt and others added 2 commits Jun 8, 2022
Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>
Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>
@serhiy-storchaka
Copy link
Member

@serhiy-storchaka serhiy-storchaka commented Jun 9, 2022

@serhiy-storchaka Could you indicate which issues with the copy module there are, and whether there already is a timeline on addressing them?

See https://github.com/orgs/python/projects/9. Note that some issues are common for the pickle and copy modules. Some issues are about copying concrete classes, they can be ignored in this context. But #93627 is specially about inconsistency between implementations, it should be addressed first.

Copy link
Member

@tiran tiran left a comment

Good work so far!

  • If you pass PyObject *module to all functions then you can access the module state for free. PyState_FindModule is costly.
  • You have to provide callbacks to clear/free the module state.
  • We prefer a different approach to test C and Py implementations of a module.

@@ -0,0 +1,60 @@
"""Unit tests for the copy module without the C accelerator."""
Copy link
Member

@tiran tiran Jun 9, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We use a different approach to test pure Python and C accelerated code:

  • create a test class for C and Python implementation
  • remove import copy
  • replace copy.* with self.module....
py_copy = import_helper.import_fresh_module(
    'copy', blocked=['_copy']
)
c_copy = import_helper.import_fresh_module(
    'copy', fresh=['_copy']
)


class TestCopy:
    module = None

    def test_exceptions(self):
        self.assertIs(self.module.Error, self.module.error)
        self.assertTrue(issubclass(self.module.Error, Exception))


class TestCopyPy(TestCopy, unittest.TestCase):
    module = py_copy


@unittest.skipUnless(c_copy, 'requires _copy')
class TestCopyC(TestCopy, unittest.TestCase):
    module = c_copy

# XXX: import_fresh_module() is supposed to leave sys.module cache untouched,
# XXX: but it does not, so we have to cleanup ourselves.
Copy link
Member

@tiran tiran Jun 9, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please open a new bug report for this issue.

Modules/_copy.c Outdated Show resolved Hide resolved
Modules/_copy.c Outdated Show resolved Hide resolved
Modules/_copy.c Outdated Show resolved Hide resolved
Modules/_copy.c Outdated Show resolved Hide resolved
static struct PyModuleDef copy_moduledef = {
PyModuleDef_HEAD_INIT,
"_copy",
"C implementation of deepcopy",
sizeof(PyObject *),
functions,
NULL,
NULL,
NULL,
NULL
};
Copy link
Member

@tiran tiran Jun 9, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New code should use named struct initialization, e.g. .m_name = "_copy",.

Since you store a strong Python reference to a mutable object in the module state, you must provide traverse, clear, and free callbacks.

And please don't reuse the module state pointer as a PyObject *. It's a clever hack -- a bit too clever and surprising for the next person that reads the code. :)

eendebakpt and others added 4 commits Jun 10, 2022
Co-authored-by: Christian Heimes <christian@python.org>
Co-authored-by: Christian Heimes <christian@python.org>
Co-authored-by: Christian Heimes <christian@python.org>
Co-authored-by: Christian Heimes <christian@python.org>
@eendebakpt eendebakpt marked this pull request as draft Jun 10, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
awaiting changes extension-modules performance
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants