gh-102471: Add PyLong import and export API #121339
base: main
Conversation
See also issue #111415.
Just my 2c.
The gmpy2 code, used for benchmarking, can be found in my fork:
https://github.com/skirpichev/gmpy/tree/trying-py-import-export
Objects/longobject.c
Outdated
PyUnstable_Long_Export(PyObject *obj, PyUnstable_LongExport *long_export)
{
    if (!PyLong_Check(obj)) {
        PyErr_Format(PyExc_TypeError, "expect int, got %T", obj);
        return -1;
    }
    PyLongObject *self = (PyLongObject*)obj;

    long_export->obj = (PyLongObject*)Py_NewRef(obj);
    long_export->negative = _PyLong_IsNegative(self);
    long_export->ndigits = _PyLong_DigitCount(self);
    if (long_export->ndigits == 0) {
        long_export->ndigits = 1;
    }
    long_export->digits = self->long_value.ob_digit;
    return 0;
}
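For readers less familiar with CPython's digit representation, here is a pure-Python sketch of the value that an exported `(negative, ndigits, digits)` triple denotes: digits are base `2**SHIFT`, least significant first. The helper name is hypothetical; `sys.int_info.bits_per_digit` reports the actual digit width of the running build.

```python
import sys

SHIFT = sys.int_info.bits_per_digit  # typically 30, or 15 on some builds

def value_from_export(negative, digits):
    """Reconstruct the integer described by an export-style triple:
    digits are base 2**SHIFT, least significant digit first."""
    value = sum(d << (SHIFT * i) for i, d in enumerate(digits))
    return -value if negative else value
```

Trailing zero digits are insignificant, so `value_from_export(0, [100, 0, 0])` gives the same result as `value_from_export(0, [100])`.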
Since this mostly gives direct access to the PyLongObject, it is almost as fast as using the private API before.
Old code:
$ python -m timeit -r20 -s 'from gmpy2 import mpz;x=10**2' 'mpz(x)'
1000000 loops, best of 20: 232 nsec per loop
$ python -m timeit -r11 -s 'from gmpy2 import mpz;x=10**100' 'mpz(x)'
500000 loops, best of 11: 500 nsec per loop
$ python -m timeit -r20 -s 'from gmpy2 import mpz;x=10**1000' 'mpz(x)'
100000 loops, best of 20: 2.53 usec per loop
With proposed API:
$ python -m timeit -r20 -s 'from gmpy2 import mpz;x=10**2' 'mpz(x)'
1000000 loops, best of 20: 258 nsec per loop
$ python -m timeit -r20 -s 'from gmpy2 import mpz;x=10**100' 'mpz(x)'
500000 loops, best of 20: 528 nsec per loop
$ python -m timeit -r20 -s 'from gmpy2 import mpz;x=10**1000' 'mpz(x)'
100000 loops, best of 20: 2.56 usec per loop
Objects/longobject.c
Outdated
PyObject*
PyUnstable_Long_Import(int negative, size_t ndigits, Py_digit *digits)
{
    return (PyObject*)_PyLong_FromDigits(negative, ndigits, digits);
}
But this is something I would like to avoid: it requires allocating a temporary buffer and using memcpy. Can we offer a writable layout, so mpz_export can write into its digits directly?
Benchmarks for old code:
$ python -m timeit -r11 -s 'from gmpy2 import mpz;x=mpz(10**2)' 'int(x)'
2000000 loops, best of 11: 111 nsec per loop
$ python -m timeit -r11 -s 'from gmpy2 import mpz;x=mpz(10**100)' 'int(x)'
500000 loops, best of 11: 475 nsec per loop
$ python -m timeit -r11 -s 'from gmpy2 import mpz;x=mpz(10**1000)' 'int(x)'
100000 loops, best of 11: 2.39 usec per loop
With new API:
$ python -m timeit -r20 -s 'from gmpy2 import mpz;x=mpz(10**2)' 'int(x)'
2000000 loops, best of 20: 111 nsec per loop
$ python -m timeit -r20 -s 'from gmpy2 import mpz;x=mpz(10**100)' 'int(x)'
500000 loops, best of 20: 578 nsec per loop
$ python -m timeit -r20 -s 'from gmpy2 import mpz;x=mpz(10**1000)' 'int(x)'
100000 loops, best of 20: 2.53 usec per loop
> This requires allocation of a temporary buffer and using memcpy.

Right, PyLongObject has to manage its own memory.

> Can we offer a writable layout, so mpz_export can write into its digits directly?

That sounds strange from the Python point of view and makes the internals less opaque. I would prefer to leak fewer implementation details.
> Right, PyLongObject has to manage its own memory.

I'm not trying to change that. A more complete proposal: vstinner#4
gmpy2 patch: https://github.com/skirpichev/gmpy/tree/trying-py-import-export-v2
New benchmarks:
$ python -m timeit -r20 -s 'from gmpy2 import mpz;x=mpz(10**2)' 'int(x)'
2000000 loops, best of 20: 111 nsec per loop
$ python -m timeit -r20 -s 'from gmpy2 import mpz;x=mpz(10**100)' 'int(x)'
500000 loops, best of 20: 509 nsec per loop
$ python -m timeit -r20 -s 'from gmpy2 import mpz;x=mpz(10**1000)' 'int(x)'
100000 loops, best of 20: 2.44 usec per loop
> I would prefer to leak fewer implementation details.

I don't think this leaks anything; it doesn't leak memory-management details. PyLong_Import will just allocate the memory. Writing the digits will be the job of mpz_export, as before.

Without this, there seems to be a noticeable performance regression for integers of intermediate size: up to 20%, vs 7% on my branch.
Edit: currently, the proposed PyUnstable_Long_ReleaseImport() matches PyUnstable_Long_ReleaseExport(). Perhaps it could be one function (say, PyUnstable_Long_ReleaseDigitArray()), but I'm unsure - maybe that puts some constraints on the internals of PyLongObject.
CC @tornaria, as Sage people might be interested in this feature.
CC @oscarbenjamin, you may want this for python-flint.
I updated my PR:
Misc/NEWS.d/next/C API/2024-07-03-17-26-53.gh-issue-102471.XpmKYk.rst
Outdated
Absolutely. Currently python-flint uses a hex string as an intermediate format when converting between large int and
@skirpichev: I added a PyLongWriter API similar to what @encukou proposed. Example:

PyLongObject *
_PyLong_FromDigits(int negative, Py_ssize_t digit_count, digit *digits)
{
    PyLongWriter *writer = PyLongWriter_Create();
    if (writer == NULL) {
        return NULL;
    }
    if (negative) {
        PyLongWriter_SetSign(writer, -1);
    }
    Py_digit *writer_digits = PyLongWriter_AllocDigits(writer, digit_count);
    if (writer_digits == NULL) {
        goto error;
    }
    memcpy(writer_digits, digits, digit_count * sizeof(digit));
    return (PyLongObject*)PyLongWriter_Finish(writer);
error:
    PyLongWriter_Discard(writer);
    return NULL;
}

>>> import _testcapi; _testcapi.pylong_import(0, [100, 0, 0]) is 100
True
I mark the PR as a draft until we agree on the API.
Yes, that looks better and should fix the speed regression. I'll try to benchmark that, perhaps tomorrow. But the cost is 5 (!) public functions and one new struct, in addition to
I updated the PR to remove the
My concern is to avoid the problem of capi-workgroup/api-evolution#36: avoid exposing
@skirpichev: Would it be useful to add a
Update:
This reverts commit b70a6dd.
Change reverted: capi-workgroup/decisions#35 (comment)
This function always succeeds if *obj* is a Python :class:`int` object or a
subclass.
@vstinner, after some thinking I believe we should drop this contract, taking into account future changes to the internals of CPython's integers. If the single-layout view becomes invalid, this function will probably have to allocate temporary buffers, and that might fail.

Can we offer a different contract instead, something like: "This function always succeeds if *obj* is a Python int object or a subclass and its value can't be converted to a C long"? In that case users have a clear hint: try something like PyLong_AsLongAndOverflow and, if it fails, fall back to this function. If not, I think this is a severe issue with this part of the API.
If, tomorrow, Python internals change and exporting a small integer requires a copy, I would prefer to copy by default. Maybe we can add a "NO_COPY" flag to the function: it would fail if a copy is needed.
That's fine. My point is: this might fail. Then what should the user do in that case? Catch the exception and try some PyLong_As*() function?

Well, if we can't do that, maybe we could say that this function may raise only MemoryError?
That seems reasonable.
I don't see how that would be useful. If this fails because there is not enough memory when allocating a PyLong for a small integer, then it is probably reasonable that there is a MemoryError and that the call should fail. The calling code should likely give up whatever it is trying to do as well. It already says "On error, set an exception and return -1", which implies that the caller should check the return value.
Exactly.
No, currently it says: "This function always succeeds if *obj* is a Python :class:`int` object or a subclass." But we can't just drop this sentence. Should the caller expect other exceptions here? What should they do about them?
I created python/peps#3965 to remove the sentence.
Rename reserved member to _reserved.
On PyPy, a copy is always needed, so the function can fail (with MemoryError) even for small integers: https://discuss.python.org/t/pep-757-c-api-to-import-export-python-integers/63895/12
FYI, in aleaxit/gmpy#517 I did int<->mpz conversion with PyLong_AsNativeBytes/FromNativeBytes. Maybe it's worth including this in the PEP's benchmarks?
Objects/longobject.c
Outdated
const PyLongLayout PyLong_LAYOUT = {
    .bits_per_digit = PyLong_SHIFT,
    .digits_order = -1,  // least significant first
    .endian = PY_LITTLE_ENDIAN ? -1 : 1,
    .digit_size = sizeof(digit),
};
I'm pretty sure that if you make this static const, then you shouldn't need to add this to ignored.tsv.
It fails with:
Objects/longobject.c - PyLong_LAYOUT not supported (typespec (unknown)) static const PyLongLayout
I would prefer to only have a single flavor of the code in the PEP benchmarks. Moreover, the API is still being discussed: https://discuss.python.org/t/pep-757-c-api-to-import-export-python-integers/63895/18 Let's wait and update benchmarks later.
On Mon, Sep 16, 2024 at 10:03:11AM -0700, Victor Stinner wrote:
>> FYI, in aleaxit/gmpy#517 I did int<->mpz conversion with
>> PyLong_AsNativeBytes/FromNativeBytes. Maybe it's worth including this
>> in the PEP's benchmarks?
>
> I would prefer to only have a single flavor of the code in the PEP
> benchmarks. Moreover, the API is still being discussed:
> https://discuss.python.org/t/pep-757-c-api-to-import-export-python-integers/63895/18
> Let's wait and update benchmarks later.

The rationale is to show that using the existing API costs a performance regression wrt the current master (which uses the private API). We will keep just code snippets with the new API, as examples of its usage. (E.g. we don't show the current code in the gmpy2 master.) Just add a link to this PR.

But I agree, let's wait.
Edit:
reading (int->mpz):
| Benchmark | master | bytes | new-api |
|----------------|:-------:|:---------------------:|:---------------------:|
| 1<<7 | 262 ns | 274 ns: 1.05x slower | 267 ns: 1.02x slower |
| 1<<38 | 319 ns | 283 ns: 1.12x faster | 275 ns: 1.16x faster |
| 1<<300 | 514 ns | 913 ns: 1.77x slower | 586 ns: 1.14x slower |
| 1<<3000 | 2.35 us | 4.49 us: 1.91x slower | 2.42 us: 1.03x slower |
| Geometric mean | (ref) | 1.33x slower | 1.01x slower |
writing (mpz->int):
| Benchmark | master | bytes | new-api |
|----------------|:-------:|:---------------------:|:---------------------:|
| 1<<38 | 210 ns | 215 ns: 1.02x slower | not significant |
| 1<<300 | 519 ns | 795 ns: 1.53x slower | 567 ns: 1.09x slower |
| 1<<3000 | 2.26 us | 4.52 us: 2.00x slower | 2.31 us: 1.02x slower |
| Geometric mean | (ref) | 1.33x slower | 1.02x slower |
Benchmark hidden because not significant (1): 1<<7
I think my patch with the *Bytes API can be optimized to avoid the memory allocation, but that will hardly do much to mitigate the speed difference. And it will not work with ``--enable-nails`` GMP builds.
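For context, the *Bytes conversion path benchmarked above corresponds roughly to this pure-Python round trip; the temporary buffer allocated and copied on each direction is the overhead being discussed. This is a sketch of the idea, not the gmpy2 code.

```python
def int_via_bytes(value):
    """Round-trip an int through a bytes buffer, modelling the
    AsNativeBytes/FromNativeBytes conversion path: a temporary
    buffer is allocated and copied in each direction."""
    n = (value.bit_length() + 8) // 8  # room for the sign bit; >= 1
    buf = value.to_bytes(n, "little", signed=True)     # export step
    return int.from_bytes(buf, "little", signed=True)  # import step
```

By contrast, the digit-based export hands out a pointer to the existing digit array, so no intermediate buffer is needed on the read side.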
* Rename functions and structure:
  * PyLong_AsDigitArray() => PyLong_Export()
  * PyLong_FreeDigitArray() => PyLong_FreeExport()
  * PyLong_DigitArray => PyLongExport
  * 'array' => 'export_long'
Since PEP 757 is actively being discussed, I mark this PR as a draft again.

@skirpichev @pitrou: I updated my PR to implement @pitrou's idea: add
Add PyLong_Export() and PyLong_Import() functions and PyLong_LAYOUT structure.
📚 Documentation preview 📚: https://cpython-previews--121339.org.readthedocs.build/