bpo-32780: Fix the PEP3118 format string for ctypes.Structure #5561

eric-wieser · 2018-02-06T05:33:46Z

The summary of this diff is that it:

- adds a `_ctypes_alloc_format_padding` function to append strings like `37x` to a format string to indicate 37 padding bytes
- removes the branches that amount to "give up on producing a valid format string if the struct is packed"
- combines the resulting adjacent `if (isStruct) {` blocks, now that neither is `if (isStruct && !isPacked) {`
- invokes `_ctypes_alloc_format_padding` to add padding between structure fields, and after the last structure field. The computation of the total size is unchanged from what ctypes already used.

This patch does not affect any existing alignment computation; all it does is use subtraction to deduce the amount of padding introduced by the existing code.
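The subtraction described above can be sketched in a few lines of Python. This is an illustration only, not the C code in the patch: the helper names `padding_code` and `struct_format`, and the `(name, code, offset, size)` tuples, are hypothetical, and the sketch emits only the field list rather than a complete format string.

```python
def padding_code(n):
    """Return a padding code such as '7x', or '' when no padding is needed."""
    if n < 0:
        raise ValueError("fields overlap or are out of order")
    return f"{n}x" if n else ""


def struct_format(fields, total_size):
    """Build the field list of a format string from known offsets.

    fields is a list of (name, code, offset, size) tuples (a hypothetical
    layout description). Padding is never computed from alignment rules;
    it is deduced by subtracting where the previous field ended from
    where the next field starts.
    """
    parts = []
    end = 0  # first byte after the previous field
    for name, code, offset, size in fields:
        parts.append(padding_code(offset - end))
        parts.append(f"{code}:{name}:")
        end = offset + size
    # Trailing padding up to the total size of the structure.
    parts.append(padding_code(total_size - end))
    return "".join(parts)


# An int8 followed by an 8-byte-aligned uint64: 7 bytes of padding between them.
print(struct_format([("x", "<b", 0, 1), ("y", "<Q", 8, 8)], 16))
# The same fields packed: no padding at all.
print(struct_format([("x", "<b", 0, 1), ("y", "<Q", 1, 8)], 9))
```

Because only offsets and sizes are consumed, the same routine covers packed and unpacked layouts alike.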

Without this fix, the generated format string would never include padding bytes - an assumption that was only
valid in the case when `_pack_` was set - and this case was explicitly not implemented.
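To see the padding that an unpacked structure really contains, the field descriptors that ctypes already exposes can be inspected. The sketch below is an illustration with a hypothetical `Example` structure and a hypothetical `gaps` helper; it only reads offsets and sizes and makes no claim about the format string a given Python version reports.

```python
import ctypes


class Example(ctypes.Structure):
    # Hypothetical structure: a 1-byte field followed by an 8-byte field.
    _fields_ = [("x", ctypes.c_int8), ("y", ctypes.c_uint64)]


def gaps(struct_type):
    """Return the gap in bytes before each field, then the trailing gap."""
    result = []
    end = 0  # first byte after the previous field
    for name, *_ in struct_type._fields_:
        descriptor = getattr(struct_type, name)
        result.append(descriptor.offset - end)
        end = descriptor.offset + descriptor.size
    result.append(ctypes.sizeof(struct_type) - end)
    return result


# The gap before "y" depends on the platform's alignment of uint64.
print(gaps(Example))
```

On a platform where `c_uint64` is 8-byte aligned, the gap before `y` is 7 bytes, which a format string without padding codes cannot describe.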

This should allow conversion from ctypes structs to numpy structs

Fixes numpy/numpy#10528

https://bugs.python.org/issue32780

(#76961)

the-knights-who-say-ni · 2018-02-06T05:33:48Z

Hello, and thanks for your contribution!

I'm a bot set up to make sure that the project can legally accept your contribution by verifying you have signed the PSF contributor agreement (CLA).

Unfortunately our records indicate you have not signed the CLA. For legal reasons we need you to sign this before we can look at your contribution. Please follow the steps outlined in the CPython devguide to rectify this issue.

Thanks again for your contribution and we look forward to looking at it!

mattip

LGTM. The PR is sufficient to fix the problems described. Perhaps judicious use of spacing between fields could make the format clearer, especially when anonymous padding is added. For instance, I would find "<b:x:7x<Q:y:" clearer as "<b:x: 7x <Q:y:". Not critical as the format is probably going to be machine-parsed anyway.

eric-wieser · 2018-05-23T15:45:02Z

That looks like a reasonable suggestion, but possibly out of scope for this PR.

Less intrusively, I could add whitespace in the test expectations, and remove it before comparing.
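The "whitespace in the test expectations" idea could look roughly like the sketch below. The helper name `strip_format_whitespace` is hypothetical, and the sketch assumes that no field name in the expected string contains whitespace of its own.

```python
def strip_format_whitespace(expected):
    """Remove all whitespace from a human-readable expected format string."""
    return "".join(expected.split())


# Written with spaces for readability, compared without them.
readable = "<b:x: 7x <Q:y:"
print(strip_format_whitespace(readable))
```

The format produced by ctypes stays unchanged; only the expectation in the test is written in the more readable form.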

Let's see how the core devs feel.

Thanks for the review!

eric-wieser · 2018-10-21T07:25:03Z

@abalkin: Any chance you could take a look at this?

eric-wieser · 2019-04-14T06:35:52Z

@brettcannon: I'd argue this is type-bugfix, not type-enhancement - the implementation in master produces invalid buffers in almost all cases. This isn't just adding support for _pack_ (an enhancement), but also fixing the behavior for when _pack_ is absent.

mattip · 2019-09-23T05:30:45Z

ping

eric-wieser · 2019-11-28T00:39:35Z

Resolved merge conflicts with https://bugs.python.org/issue22273

mattip · 2020-05-24T22:13:30Z

Is there a buffer protocol expert who could help move this along?

github-actions · 2022-08-14T00:12:38Z

This PR is stale because it has been open for 30 days with no activity.

mattip · 2022-08-14T08:24:20Z

How can we move this 4-year-old PR forward?

abalkin · 2022-08-14T09:31:39Z

Modules/_ctypes/stgdict.c

+    Py_ssize_t log_n = 0;
+    while (n > 0) {
+        log_n++;
+        n /= 10;


Multiplication is often faster than division. Can this be rewritten by computing powers of 10 until n is exceeded?

Better yet, just inline linear search.

if (n < 10ULL) return 1; if (n < 100ULL) return 2; ... if (n < 10_000_000_000_000_000_000ULL) return 20;

On second thought, I would be surprised if this has not been implemented elsewhere in the CPython code base. Off the top of my head, I cannot recall where it could be, but I will try to search. If someone beats me to it - please leave a note.

> Can this be rewritten by computing powers of 10 until n is exceeded?

This is risky because the power can overflow
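The overflow concern can be illustrated with a small Python simulation. Python integers do not overflow, so the sketch models a 64-bit unsigned limit with a hypothetical `U64_MAX` constant; the function names are also hypothetical and this is not the code from the patch.

```python
U64_MAX = (1 << 64) - 1  # stand-in for the largest value of a 64-bit unsigned type


def digits_by_division(n):
    """Count decimal digits by repeated division, as in the reviewed loop."""
    count = 0
    while n > 0:
        count += 1
        n //= 10
    return count


def digits_by_powers_u64(n):
    """Count decimal digits by growing a power of 10 until it exceeds n.

    Raises OverflowError where a fixed-width C variable would wrap around.
    """
    count = 0
    power = 1
    while power <= n:
        count += 1
        if power > U64_MAX // 10:
            # power * 10 no longer fits in 64 bits.
            raise OverflowError("next power of 10 does not fit in 64 bits")
        power *= 10
    return count


print(digits_by_division(12345), digits_by_powers_u64(12345))
```

The division loop never needs a value larger than its input, while the multiplication loop must compute a power of 10 larger than `n`, which does not exist in 64 bits once `n` reaches 10**19.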

abalkin · 2022-08-14T09:59:46Z

Modules/_ctypes/stgdict.c

+    }
+
+    /* decimal characters + x + null */
+    buf = PyMem_Malloc(clog10(padding) + 2);


I left a comment about log10 implementation above, but looking at the actual use, I don't see why it is needed. Can't we just make buf

char buf[20];

and not allocate it on the heap?

I wanted to avoid risking introducing a buffer overflow by accident by choosing too short a buffer; especially since I don't think we care about performance here. The whole framework for building the format strings here consists of repeated heap allocations, so one more allocation doesn't seem like a big deal.

I think 20 isn't actually enough, as an int64 can need up to 19 digits, and then we need the x and the null.

~~I could ask PyOS_snprintf to compute the size for me if you'd prefer? Although I can't see any evidence that PyOS_snprintf is actually called with a null buffer anywhere in CPython.~~ Never mind, PyOS_snprintf does not support this feature of snprintf (#95993)

I've pushed the version with stack allocation as requested
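The buffer-size arithmetic from this thread can be checked with a short Python sketch. It assumes the padding count is a signed 64-bit value; the names `INT64_MAX` and `padding_buffer_size` are hypothetical.

```python
INT64_MAX = 2**63 - 1  # assumed upper bound for the padding count


def padding_buffer_size(max_value=INT64_MAX):
    """Bytes needed for the longest '<digits>x' string plus a terminating NUL."""
    digits = len(str(max_value))  # decimal digits in the largest value
    return digits + 1 + 1         # digits + 'x' + NUL


print(padding_buffer_size())
```

Under that assumption the worst case needs one byte more than the `char buf[20]` proposed in the review.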

the-knights-who-say-ni added the CLA not signed label Feb 6, 2018

bedevere-bot added the awaiting review label Feb 6, 2018

eric-wieser force-pushed the ctypes-padding branch from fd7e46e to 95157de Compare Feb 6, 2018

eric-wieser mentioned this pull request Feb 6, 2018

BUG: Cannot convert ctypes struct into numpy array numpy/numpy#10528

Closed

eric-wieser force-pushed the ctypes-padding branch 5 times, most recently from d851b6e to 9dbc70a Compare Feb 6, 2018

eric-wieser mentioned this pull request Feb 6, 2018

Numpy does not recognize ctypes arrays with c_wchar field numpy/numpy#10100

Closed

eric-wieser force-pushed the ctypes-padding branch from 9dbc70a to ac4e5bb Compare Feb 6, 2018

eric-wieser mentioned this pull request Feb 7, 2018

BUG: np.dtype(ctype) does not respect endianness numpy/numpy#10533

Closed

eric-wieser mentioned this pull request Apr 25, 2018

WIP: Remove fragile use of __array_interface__ in ctypeslib.as_array numpy/numpy#10970

Merged

eric-wieser mentioned this pull request May 21, 2018

bpo-32782: PEP3118 itemsize of an empty ctypes array should not be 0 #5576

Merged

mattip approved these changes May 23, 2018

bedevere-bot added awaiting core review and removed awaiting review labels May 23, 2018

Mariatta removed the CLA not signed label Jun 15, 2018

the-knights-who-say-ni added the CLA signed label Jun 15, 2018

abalkin self-assigned this Jul 8, 2018

eric-wieser mentioned this pull request Aug 12, 2018

BUG: np.dtype(ctypes.Structure) does not respect _pack_ field numpy/numpy#10532

Closed

brettcannon added the type-feature A feature request or enhancement label Apr 2, 2019

brettcannon added type-bug An unexpected behavior, bug, or error and removed type-feature A feature request or enhancement labels Apr 17, 2019

eric-wieser mentioned this pull request May 26, 2019

BUG: converting list of ctypes.Structure to ndarray numpy/numpy#13628

Open

eric-wieser mentioned this pull request Nov 28, 2019

BUG: Use __array__ during dimension discovery numpy/numpy#14995

Merged

Merge branch 'master' into ctypes-padding

e6021e1

eric-wieser requested review from abalkin and skrah Feb 10, 2020

eric-wieser mentioned this pull request Apr 22, 2020

segfault when "viewing" closed mmap object numpy/numpy#9537

Closed

eric-wieser mentioned this pull request Jul 17, 2020

Custom array containers lose memoryview functionality numpy/numpy#16803

Open

Merge remote-tracking branch 'upstream/main' into ctypes-padding

97e9d95

eric-wieser mannequin mentioned this pull request Jul 7, 2022

ctypes: memoryview gives incorrect PEP3118 format strings for both packed and unpacked structs #76961

Open

eric-wieser added 2 commits Jul 8, 2022

remove unused variable, extend documentation

6bf5a25

comment

3b7f14f

ezio-melotti removed the CLA signed label Jul 13, 2022

github-actions bot added the stale Stale PR or inactive for long period of time. label Aug 14, 2022

abalkin reviewed Aug 14, 2022

github-actions bot removed the stale Stale PR or inactive for long period of time. label Aug 15, 2022

eric-wieser added 2 commits Aug 15, 2022

fix incorrect name and documentation

27b1601

remove the heap allocation as requested

e300e16

eric-wieser requested review from abalkin and removed request for skrah Oct 7, 2022

Merge branch 'main' into ctypes-padding

7e8b4f9
