Join GitHub today
GitHub is home to over 40 million developers working together to host and review code, manage projects, and build software together.
Sign upbpo-39087: Add PyUnicode_GetUTF8Buffer(). #17659
Conversation
from _testcapi import unicode_getutf8buffer, test_unicode_getutf8buffer | ||
|
||
# Run tests wrtten in C. Raise an error when test failed. | ||
test_unicode_getutf8buffer() |
This comment has been minimized.
This comment has been minimized.
vstinner
Dec 20, 2019
Member
Hum. test_capi executes all test_capi attributes with a name starting with "test_". Please check if it's executed. If yes, just rename the method. For example, rename it to unicode_test_getutf8buffer()).
This comment has been minimized.
This comment has been minimized.
methane
Dec 20, 2019
Author
Member
You are right. I confirmed the test_unicode_getutf8buffer is called from test_capi.
I just removed calling test_unicode_getutf8buffer here because there is test_unicode_compare_with_ascii
already.
Let's run the test from test_capi.
@@ -1061,6 +1061,28 @@ These are the UTF-8 codec APIs: | |||
raised by the codec. | |||
.. c:function: int PyUnicode_GetUTF8Buffer(PyObject *unicode, const char errors, Py_buffer *view) |
This comment has been minimized.
This comment has been minimized.
serhiy-storchaka
Dec 21, 2019
Member
I am not sure about the order of parameters. There is a logic in placing the output parameter at the end, but in PyObject_GetBuffer()
(for which I modelled the name) it is not the last parameter. Also, if we will add more similar functions in future, with multiple additional parameters (like flags
and errors
), it is better to place them at the end.
} | ||
Py_ssize_t refcnt = Py_REFCNT(str); | ||
|
||
if (PyUnicode_GetUTF8Buffer(str, NULL, &buf) < 0) { |
This comment has been minimized.
This comment has been minimized.
serhiy-storchaka
Dec 21, 2019
Member
It is not possible that PyUnicode_GetUTF8Buffer()
fails for an ASCII string.
def test_getutf8buffer(self): | ||
from _testcapi import unicode_getutf8buffer | ||
|
||
ascii_ = "foo" |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
methane
Dec 25, 2019
Author
Member
- Follow other tests
- Variable name is a hint why this value is tested.
return PyBuffer_FillInfo(view, unicode, | ||
PyUnicode_UTF8(unicode), | ||
PyUnicode_UTF8_LENGTH(unicode), | ||
1, PyBUF_SIMPLE); |
This comment has been minimized.
This comment has been minimized.
serhiy-storchaka
Dec 21, 2019
Member
1, PyBUF_SIMPLE); | |
1 /* readonly */, PyBUF_SIMPLE); |
} | ||
|
||
if (PyUnicode_UTF8(unicode) != NULL) { | ||
return PyBuffer_FillInfo(view, unicode, |
This comment has been minimized.
This comment has been minimized.
serhiy-storchaka
Dec 21, 2019
Member
We should check that Py_TYPE(unicode)->tp_as_buffer == NULL
. Just for the case if the string subclass implement the buffer protocol.
This comment has been minimized.
This comment has been minimized.
IMHO it's a bad old habit that test_capi runs "all" tests. I prefer that test_unicode tests the C API of Unicode objects. So rename the method instead. |
{ | ||
PyObject *unicode; | ||
const char *errors = NULL; | ||
if(!PyArg_ParseTuple(args, "U|s", &unicode, &errors)) { |
This comment has been minimized.
This comment has been minimized.
serhiy-storchaka
Dec 25, 2019
Member
U
calls PyUnicode_READY()
. This can affect the testing. I suggest to use O
.
"without exception set. (%s:%d)", | ||
__FILE__, __LINE__); | ||
} | ||
Py_DECREF(str); |
This comment has been minimized.
This comment has been minimized.
return NULL; | ||
} | ||
|
||
if (buf.obj == str || !PyBytes_CheckExact(buf.obj)) { |
This comment has been minimized.
This comment has been minimized.
// Test 3: There is a UTF-8 cache | ||
// Reuse str of the previoss test. | ||
|
||
const char *cache = PyUnicode_AsUTF8(str); |
This comment has been minimized.
This comment has been minimized.
serhiy-storchaka
Dec 25, 2019
Member
Would it be too difficult to test also that there was no cache before calling PyUnicode_AsUTF8()
?
"without exception set. (%s:%d)", | ||
__FILE__, __LINE__); | ||
} | ||
Py_DECREF(str); |
This comment has been minimized.
This comment has been minimized.
"Py_REFCNT(str) must not be changed. (%s:%d)", | ||
__FILE__, __LINE__); | ||
PyBuffer_Release(&buf); | ||
Py_DECREF(str); |
This comment has been minimized.
This comment has been minimized.
serhiy-storchaka
Dec 25, 2019
Member
I think that is not safe to decref str
and even call PyBuffer_Release()
if this test fails.
methane commentedDec 19, 2019
•
edited by bedevere-bot
Ref: bpo-39087