gh-91049: Introduce set vectorcall field API for PyFunctionObject #92257

adphrost · 2022-05-03T19:33:02Z

Avoid specializing functions with overridden vectorcall field

bedevere-bot · 2022-05-03T19:33:05Z

Every change to Python requires a NEWS entry.

Please, add it using the blurb_it Web app or the blurb command-line tool.

adphrost · 2022-05-03T19:39:26Z

@markshannon Can you let me know if there should be any other checks in ceval.c besides the one added in this PR?

For context, this check fixes an edge case where calling function with a set vectorfield only worked with a unpacked tuple, e.g. f(1) failed but f(*(1,)) succeeded (now both succeed)

itamaro

thanks for working on this @adphrost !

Lib/test/test_call.py

itamaro · 2022-05-03T19:48:44Z

Python/ceval.c

@@ -4792,7 +4792,7 @@ _PyEval_EvalFrameDefault(PyThreadState *tstate, _PyInterpreterFrame *frame, int
            PyObject *function = PEEK(total_args + 1);
            int positional_args = total_args - KWNAMES_LEN();
            // Check if the call can be inlined or not
-            if (Py_TYPE(function) == &PyFunction_Type && tstate->interp->eval_frame == NULL) {
+            if (Py_TYPE(function) == &PyFunction_Type && tstate->interp->eval_frame == NULL && ((PyFunctionObject *)function)->vectorcall == _PyFunction_Vectorcall) {


this probably exceeds line length

What is the max line length allowed? And is there a linter I can run to handle this for me?

vstinner · 2022-05-03T19:50:52Z

Misc/NEWS.d/next/C API/2022-05-03-19-35-37.gh-issue-92193.61VoFL.rst

+
+This PR introduces an API call where vectorcall field can be modified like so:
+
+`void PyFunction_SetVectorcall(PyFunctionObject *func, vectorcallfunc vectorcall);`


Please document the function in Doc/c-api/function.rst and in the NEWS entry, refer to use with:

:c:func:`PyFunction_SetVectorcall`

New public functions should also be documented in What's New in Python 3.11 > C API > New features: Doc/whatsnew/3.11.rst.

markshannon

A few comments.
Have you attempted to measure the performance impact of this? (I'd expect it to be small or negligible).

Objects/funcobject.c

Modules/_testcapimodule.c

Change how assertion is done Co-authored-by: Itamar Ostricher <itamarost@gmail.com>

cpython-cla-bot · 2022-05-03T21:00:13Z

All commit authors signed the Contributor License Agreement.

itamaro · 2022-05-04T19:28:59Z

Have you attempted to measure the performance impact of this? (I'd expect it to be small or negligible).

@markshannon is there a simple way to ask a buildbot to run pyperformance comparing this against the base revision? we can do that manually, but a buildbot doing it for us would be nice :)

itamaro · 2022-05-19T03:43:10Z

circling back to @markshannon 's performance impact question. -- tldr Geometric mean: 1.00x faster

some more details:

rebased this PR onto a4460f2 and ran pyperformance on a4460f2 and on the rebased branch

both baseline and branch were built using ./configure --enable-optimizations --with-lto && make -j

used pyperformance 1.0.5 (installed directly from GitHub) on bare metal AWS machine (Linux-5.13.0-1023-aws-x86_64-with-glibc2.31 with 72 logical CPUs), and excluded the sqlalchemy benchmarks because they tried building greenlet and failed

pyperformance run --benchmarks=-sqlalchemy_declarative,-sqlalchemy_imperative --python builds/3.12-opt/python -o main.json --rigorous
pyperformance run --benchmarks=-sqlalchemy_declarative,-sqlalchemy_imperative --python builds/3.12-gh-91049-vectorcall-opt/python -o gh-91049.json --rigorous

compare results:

pyperf compare_to main.json gh-91049.json -G
Slower (20):
- regex_effbot: 3.02 ms +- 0.01 ms -> 3.17 ms +- 0.01 ms: 1.05x slower
- meteor_contest: 110 ms +- 2 ms -> 114 ms +- 4 ms: 1.04x slower
- pidigits: 200 ms +- 0 ms -> 208 ms +- 0 ms: 1.04x slower
- telco: 7.17 ms +- 0.21 ms -> 7.39 ms +- 0.34 ms: 1.03x slower
- html5lib: 64.0 ms +- 2.7 ms -> 65.6 ms +- 2.6 ms: 1.02x slower
- unpickle_list: 5.30 us +- 0.08 us -> 5.40 us +- 0.05 us: 1.02x slower
- pyflate: 430 ms +- 3 ms -> 438 ms +- 4 ms: 1.02x slower
- sympy_sum: 168 ms +- 2 ms -> 171 ms +- 2 ms: 1.02x slower
- xml_etree_iterparse: 110 ms +- 1 ms -> 112 ms +- 3 ms: 1.02x slower
- async_tree_memoization: 640 ms +- 25 ms -> 650 ms +- 25 ms: 1.01x slower
- chaos: 71.8 ms +- 0.7 ms -> 72.8 ms +- 0.6 ms: 1.01x slower
- hexiom: 6.71 ms +- 0.06 ms -> 6.80 ms +- 0.06 ms: 1.01x slower
- sympy_str: 299 ms +- 3 ms -> 302 ms +- 4 ms: 1.01x slower
- spectral_norm: 101 ms +- 1 ms -> 103 ms +- 1 ms: 1.01x slower
- sympy_integrate: 21.5 ms +- 0.2 ms -> 21.8 ms +- 0.2 ms: 1.01x slower
- logging_silent: 105 ns +- 1 ns -> 107 ns +- 3 ns: 1.01x slower
- raytrace: 303 ms +- 2 ms -> 306 ms +- 4 ms: 1.01x slower
- regex_v8: 23.1 ms +- 0.3 ms -> 23.3 ms +- 0.1 ms: 1.01x slower
- unpickle_pure_python: 239 us +- 2 us -> 240 us +- 2 us: 1.01x slower
- 2to3: 282 ms +- 2 ms -> 283 ms +- 1 ms: 1.00x slower

Faster (25):
- genshi_xml: 57.1 ms +- 2.7 ms -> 54.3 ms +- 1.8 ms: 1.05x faster
- scimark_sparse_mat_mult: 4.93 ms +- 0.06 ms -> 4.70 ms +- 0.10 ms: 1.05x faster
- scimark_lu: 112 ms +- 2 ms -> 109 ms +- 2 ms: 1.03x faster
- json_dumps: 13.3 ms +- 0.2 ms -> 12.9 ms +- 0.1 ms: 1.03x faster
- scimark_fft: 337 ms +- 3 ms -> 331 ms +- 3 ms: 1.02x faster
- nbody: 97.6 ms +- 1.8 ms -> 95.9 ms +- 1.8 ms: 1.02x faster
- pathlib: 23.0 ms +- 1.2 ms -> 22.6 ms +- 0.8 ms: 1.02x faster
- pickle_pure_python: 324 us +- 3 us -> 318 us +- 3 us: 1.02x faster
- genshi_text: 23.2 ms +- 0.4 ms -> 22.8 ms +- 0.3 ms: 1.02x faster
- unpack_sequence: 46.5 ns +- 0.6 ns -> 45.7 ns +- 0.7 ns: 1.02x faster
- fannkuch: 401 ms +- 6 ms -> 396 ms +- 4 ms: 1.01x faster
- chameleon: 7.06 ms +- 0.06 ms -> 6.98 ms +- 0.08 ms: 1.01x faster
- pickle_dict: 27.5 us +- 0.1 us -> 27.2 us +- 1.0 us: 1.01x faster
- pickle: 10.2 us +- 0.1 us -> 10.1 us +- 0.1 us: 1.01x faster
- xml_etree_parse: 170 ms +- 2 ms -> 168 ms +- 3 ms: 1.01x faster
- xml_etree_process: 56.8 ms +- 0.5 ms -> 56.3 ms +- 0.5 ms: 1.01x faster
- nqueens: 88.0 ms +- 1.1 ms -> 87.2 ms +- 1.1 ms: 1.01x faster
- json_loads: 26.3 us +- 0.3 us -> 26.1 us +- 0.2 us: 1.01x faster
- async_tree_none: 549 ms +- 15 ms -> 544 ms +- 12 ms: 1.01x faster
- async_tree_cpu_io_mixed: 771 ms +- 13 ms -> 766 ms +- 10 ms: 1.01x faster
- mako: 10.5 ms +- 0.0 ms -> 10.4 ms +- 0.1 ms: 1.01x faster
- crypto_pyaes: 77.2 ms +- 0.8 ms -> 76.8 ms +- 0.9 ms: 1.01x faster
- xml_etree_generate: 81.2 ms +- 0.6 ms -> 80.8 ms +- 0.5 ms: 1.00x faster
- regex_compile: 142 ms +- 1 ms -> 141 ms +- 1 ms: 1.00x faster
- scimark_sor: 121 ms +- 1 ms -> 120 ms +- 1 ms: 1.00x faster

Benchmark hidden because not significant (18): async_tree_io, deltablue, django_template, dulwich_log, float, go, logging_format, logging_simple, pickle_list, python_startup, python_startup_no_site, regex_dna, richards, scimark_monte_carlo, sqlite_synth, sympy_expand, tornado_http, unpickle

Geometric mean: 1.00x faster

thanks @ansley for doing the perf analysis on this :)

Andrew Frost added 2 commits May 3, 2022

Enable setting vectorcall field on PyFunctionObjects

9cf96ec

check if vectorcall is nondefault before inlining call

282e3dc

adphrost requested a review from markshannon as a code owner May 3, 2022

bedevere-bot added the awaiting review label May 3, 2022

📜🤖 Added by blurb_it.

edcdcf1

itamaro reviewed May 3, 2022

View changes

vstinner reviewed May 3, 2022

View changes

markshannon reviewed May 3, 2022

View changes

Objects/funcobject.c Outdated Show resolved Hide resolved

Modules/_testcapimodule.c Outdated Show resolved Hide resolved

Modules/_testcapimodule.c Outdated Show resolved Hide resolved

Apply suggestions from code review

debf5a9

Change how assertion is done Co-authored-by: Itamar Ostricher <itamarost@gmail.com>

itamaro mentioned this pull request May 3, 2022

gh-92193: Check for overridden vectorcall field on PyFunctionObject #92194

Closed

addressed comments

473d18d

markshannon mentioned this pull request May 3, 2022

GH-92239: Respect PEP 523 #92245

Merged

added to docs

76e8c9d

itamaro mentioned this pull request May 25, 2022

Allow extensions to set the vectorcall field on instances of PyFunctionObject #91049

Open

python / cpython Public

gh-91049: Introduce set vectorcall field API for PyFunctionObject #92257

gh-91049: Introduce set vectorcall field API for PyFunctionObject #92257

adphrost commented May 3, 2022

bedevere-bot commented May 3, 2022

adphrost commented May 3, 2022

itamaro left a comment

itamaro May 3, 2022

adphrost May 3, 2022

vstinner May 3, 2022

markshannon left a comment

cpython-cla-bot bot commented May 3, 2022 •

edited

itamaro commented May 4, 2022

itamaro commented May 19, 2022


		This PR introduces an API call where vectorcall field can be modified like so:

		`void PyFunction_SetVectorcall(PyFunctionObject *func, vectorcallfunc vectorcall);`

python / cpython Public

gh-91049: Introduce set vectorcall field API for PyFunctionObject #92257

Are you sure you want to change the base?

gh-91049: Introduce set vectorcall field API for PyFunctionObject #92257

Conversation

adphrost commented May 3, 2022

bedevere-bot commented May 3, 2022

adphrost commented May 3, 2022

itamaro left a comment

itamaro May 3, 2022

Choose a reason for hiding this comment

adphrost May 3, 2022

Choose a reason for hiding this comment

vstinner May 3, 2022

Choose a reason for hiding this comment

markshannon left a comment

cpython-cla-bot bot commented May 3, 2022 • edited

itamaro commented May 4, 2022

itamaro commented May 19, 2022

cpython-cla-bot bot commented May 3, 2022 •

edited