test_sqlite3 fails on non-UTF-8 locale #91922

Open
serhiy-storchaka opened this issue Apr 25, 2022 · 4 comments
Labels
3.9 3.10 3.11 expert-unicode type-bug

Comments

@serhiy-storchaka serhiy-storchaka commented Apr 25, 2022

$ LC_ALL=en_US.iso88591 ./python -m test -vuall test_sqlite3
...
======================================================================
ERROR: test_ctx_mgr_rollback_if_commit_failed (test.test_sqlite3.test_dbapi.MultiprocessTests.test_ctx_mgr_rollback_if_commit_failed)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/serhiy/py/cpython/Lib/test/test_sqlite3/test_dbapi.py", line 1767, in test_ctx_mgr_rollback_if_commit_failed
    cx = sqlite.connect(TESTFN, timeout=self.CONNECTION_TIMEOUT)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe6 in position 16: unexpected end of data

======================================================================
ERROR: test_open_uri (test.test_sqlite3.test_dbapi.OpenTests.test_open_uri)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/serhiy/py/cpython/Lib/test/test_sqlite3/test_dbapi.py", line 674, in test_open_uri
    with managed_connect(TESTFN) as cx:
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/serhiy/py/cpython/Lib/contextlib.py", line 137, in __enter__
    return next(self.gen)
           ^^^^^^^^^^^^^^
  File "/home/serhiy/py/cpython/Lib/test/test_sqlite3/test_dbapi.py", line 44, in managed_connect
    cx = sqlite.connect(*args, **kwargs)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe6 in position 16: unexpected end of data

======================================================================
ERROR: test_open_with_path_like_object (test.test_sqlite3.test_dbapi.OpenTests.test_open_with_path_like_object)
Checks that we can successfully connect to a database using an object that
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/serhiy/py/cpython/Lib/test/test_sqlite3/test_dbapi.py", line 670, in test_open_with_path_like_object
    with managed_connect(path) as cx:
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/serhiy/py/cpython/Lib/contextlib.py", line 137, in __enter__
    return next(self.gen)
           ^^^^^^^^^^^^^^
  File "/home/serhiy/py/cpython/Lib/test/test_sqlite3/test_dbapi.py", line 44, in managed_connect
    cx = sqlite.connect(*args, **kwargs)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe6 in position 16: unexpected end of data

======================================================================
ERROR: test_trace_callback_content (test.test_sqlite3.test_hooks.TraceCallbackTests.test_trace_callback_content)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/serhiy/py/cpython/Lib/test/test_sqlite3/test_hooks.py", line 279, in test_trace_callback_content
    con1 = sqlite.connect(TESTFN, isolation_level=None)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe6 in position 16: unexpected end of data

----------------------------------------------------------------------
@serhiy-storchaka serhiy-storchaka added type-bug 3.11 3.10 3.9 labels Apr 25, 2022
@AlexWaygood AlexWaygood added the tests label Apr 25, 2022
@serhiy-storchaka serhiy-storchaka removed the tests label Apr 26, 2022
@serhiy-storchaka serhiy-storchaka commented Apr 26, 2022

It is not just tests. There is a bug in the code.

It is due to this line:

if (PySys_Audit("sqlite3.connect", "s", database) < 0) {

database is a file path that may not be valid UTF-8, and the "s" format tries to decode it as UTF-8. There may be similar bugs related to auditing in other code. @tiran
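
The decoding failure can be reproduced in pure Python. This is a minimal sketch, assuming a hypothetical file name that ends in "æ" (the name and its digits are illustrative only). Encoded as ISO-8859-1, the name ends in byte 0xe6, which UTF-8 treats as the start of a multi-byte sequence that is never completed.

```python
# Hypothetical path, chosen so the non-ASCII character lands at index 16.
name = "@test_123456_tmp\u00e6"

# Under an ISO-8859-1 locale the path reaches C code as these bytes.
raw = name.encode("iso-8859-1")


def decode_like_s_format(data: bytes) -> str:
    """Strict UTF-8 decoding, mirroring what the "s" format is described as doing."""
    return data.decode("utf-8")


try:
    decode_like_s_format(raw)
    error = None
except UnicodeDecodeError as exc:
    error = exc

print(error)
```

The error reports the offending byte, its position, and the reason, which can be compared against the tracebacks above.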

I am also afraid that sqlite3.connect() does not work correctly with non-ASCII paths on Windows. It might be more correct to use sqlite3_open16() on Windows, but it lacks the flags parameter.

@erlend-aasland erlend-aasland commented Apr 26, 2022

I'd rather fix this by documenting that the database path must be UTF-8. I'm afraid that using both sqlite3_open16 and sqlite3_open_v2 will create too much complexity in the code; my initial reaction is that it is not worth the added complexity, but I will absolutely consider it.

@erlend-aasland erlend-aasland commented Apr 26, 2022

Quoting the SQLite docs:

Note to Windows users: The encoding used for the filename argument of sqlite3_open() and sqlite3_open_v2() must be UTF-8, not whatever codepage is currently defined. Filenames containing international characters must be converted to UTF-8 prior to passing them into sqlite3_open() or sqlite3_open_v2().

We should add this information to the docs.
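
As an illustration of the conversion the SQLite note asks for, here is a minimal Python sketch. The helper name path_to_utf8 is hypothetical and is not part of the sqlite3 module; it only shows how a str or path-like object could be turned into UTF-8 bytes independently of the locale encoding.

```python
import os
import pathlib


def path_to_utf8(path) -> bytes:
    """Return a str or path-like object as UTF-8 bytes, ignoring the locale encoding."""
    # os.fsdecode() accepts str, bytes, and os.PathLike and returns str.
    text = os.fsdecode(path)
    # Raises UnicodeEncodeError if the name contains lone surrogates,
    # that is, bytes that could not be decoded in the first place.
    return text.encode("utf-8")


encoded = path_to_utf8(pathlib.PurePath("\u00e6.db"))
print(encoded)
```

A name that cannot be represented in UTF-8 fails loudly here instead of being passed through in the locale encoding.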

@erlend-aasland erlend-aasland commented Apr 27, 2022

I'm troubled by this sentence in the SQLite docs:

The default encoding will be UTF-8 for databases created using sqlite3_open() or sqlite3_open_v2(). The default encoding for databases created using sqlite3_open16() will be UTF-16 in the native byte order.
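
The text encoding of a given database can be inspected with PRAGMA encoding. The sketch below assumes a fresh in-memory database opened through Python's sqlite3 module; the helper name database_encoding is hypothetical.

```python
import sqlite3


def database_encoding(cx: sqlite3.Connection) -> str:
    """Return the text encoding SQLite reports for this database."""
    return cx.execute("PRAGMA encoding").fetchone()[0]


cx = sqlite3.connect(":memory:")
encoding = database_encoding(cx)
cx.close()

print(encoding)
```

Running the same query against a database created through a different open function would show whether its default differs.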
