Running tests in parallel on Windows quits too soon #95027

Closed
terryjreedy opened this issue Jul 19, 2022 · 34 comments
Labels
3.11, 3.12, OS-windows, release-blocker, tests, type-bug

Comments

@terryjreedy
Member

terryjreedy commented Jul 19, 2022

On my Win10 machine the test suite completes when run serially. With -j0, however, it quits too soon on main and 3.11, but not on 3.10. This has occurred with both repository debug builds and an installed 3.11.0b4. The failure is currently deterministic, though its details vary. The presence of -ugui or -uall has no apparent effect.

What happens is that, roughly 100 tests before the end, a 'regrtest worker thread' fails (reported as a 'Warning') with UnicodeDecodeError: 'utf-8' codec can't decode byte 0x91 in position <variable>: invalid start byte. The 14 worker processes are stopped and the usual summary is given, but with an additional list of '<N> tests omitted: <list of test names>'. The test run is called a 'SUCCESS'. This is followed by a traceback for SystemExit(0), followed by 1 or more tracebacks for PermissionError because a temporary test file is supposedly used by another process.

Attaching a file with output starting with the initial warning fails. Will paste separately.

If this is not limited to my system, I think it should be a release blocker.

terryjreedy added the type-bug, tests, 3.11, and 3.12 labels Jul 19, 2022
@terryjreedy
Member Author

terryjreedy commented Jul 19, 2022

Sample failure.

  -- running: test_asyncio (1 min 20 sec), test_distutils (1 min 2 sec),
  test_concurrent_futures (1 min 9 sec), test_multiprocessing_spawn (32.3 sec)
Warning -- regrtest worker thread failed: Traceback (most recent call last):
Warning --   File "C:\Programs\Python311\Lib\test\libregrtest\runtest_mp.py", line 305, in run
Warning --     mp_result = self._runtest(test_name)
Warning --                 ^^^^^^^^^^^^^^^^^^^^^^^^
Warning --   File "C:\Programs\Python311\Lib\test\libregrtest\runtest_mp.py", line 272, in _runtest
Warning --     stdout = stdout_fh.read().strip()
Warning --              ^^^^^^^^^^^^^^^^
Warning --   File "C:\Programs\Python311\Lib\tempfile.py", line 483, in func_wrapper
Warning --     return func(*args, **kwargs)
Warning --            ^^^^^^^^^^^^^^^^^^^^^
Warning --   File "<frozen codecs>", line 322, in decode
Warning -- UnicodeDecodeError: 'utf-8' codec can't decode byte 0x91 in position 1332: invalid start byte
Kill <TestWorkerProcess #1 running test=test_ssl pid=15188 time=10.9 sec>
Kill <TestWorkerProcess #2 running test=test_tomllib pid=11616 time=782 ms>
Kill <TestWorkerProcess #3 running test=test_socket pid=17624 time=11.5 sec>
Kill <TestWorkerProcess #4 running test=test_asyncio pid=3644 time=1 min 21 sec>
Kill <TestWorkerProcess #5 running test=test_tools pid=8556 time=422 ms>
Kill <TestWorkerProcess #7 running test=test_concurrent_futures pid=17068 time=1 min 10 sec>
Kill <TestWorkerProcess #8 running test=test_tk pid=18116 time=1.8 sec>
Kill <TestWorkerProcess #9 running test=test_tarfile pid=444 time=7.3 sec>
Kill <TestWorkerProcess #10 running test=test_trace pid=9376 time=297 ms>
Kill <TestWorkerProcess #11 running test=test_threading pid=11716 time=5.0 sec>
Kill <TestWorkerProcess #12 running test=test_multiprocessing_spawn pid=2664 time=32.6 sec>
Kill <TestWorkerProcess #13 running test=test_subprocess pid=14384 time=8.9 sec>
Kill <TestWorkerProcess #14 running test=test_telnetlib pid=11752 time=6.9 sec>

== Tests result: SUCCESS ==

79 tests omitted:
    test_asyncio test_concurrent_futures test_distutils
    test_multiprocessing_spawn test_socket test_ssl test_subprocess
    test_tarfile test_telnetlib test_threading test_tk test_tomllib
    test_tools test_trace test_traceback test_tracemalloc
    test_ttk_guionly test_ttk_textonly test_tuple test_turtle
    test_type_annotations test_type_cache test_type_comments
    test_typechecks test_typing test_ucn test_unary test_unicode
    test_unicode_file test_unicode_file_functions
    test_unicode_identifiers test_unicodedata test_univnewlines
    test_unpack test_unpack_ex test_unparse test_urllib test_urllib2
    test_urllib2_localnet test_urllib2net test_urllib_response
    test_urllibnet test_urlparse test_userdict test_userlist
    test_userstring test_utf8_mode test_utf8source test_uu test_uuid
    test_venv test_wait3 test_wait4 test_warnings test_wave
    test_weakref test_weakset test_webbrowser test_winconsoleio
    test_winreg test_winsound test_with test_wsgiref test_xdrlib
    test_xml_dom_minicompat test_xml_etree test_xml_etree_c
    test_xmlrpc test_xmlrpc_net test_xxlimited test_xxtestfuzz
    test_yield_from test_zipapp test_zipfile test_zipfile64
    test_zipimport test_zipimport_support test_zlib test_zoneinfo

326 tests OK.

31 tests skipped:
    test_asdl_parser test_check_c_globals test_clinic test_curses
    test_dbm_gnu test_dbm_ndbm test_devpoll test_epoll test_fcntl
    test_fork1 test_gdb test_grp test_ioctl test_kqueue
    test_multiprocessing_fork test_multiprocessing_forkserver test_nis
    test_openpty test_ossaudiodev test_pipes test_poll test_posix
    test_pty test_pwd test_readline test_resource test_smtpnet
    test_socketserver test_spwd test_syslog test_threadsignals

Total duration: 1 min 22 sec
Tests result: SUCCESS
Traceback (most recent call last):
  File "C:\Programs\Python311\Lib\test\support\os_helper.py", line 478, in temp_dir
    yield path
  File "C:\Programs\Python311\Lib\test\support\os_helper.py", line 531, in temp_cwd
    yield cwd_dir
  File "C:\Programs\Python311\Lib\test\libregrtest\main.py", line 701, in main
    self._main(tests, kwargs)
  File "C:\Programs\Python311\Lib\test\libregrtest\main.py", line 758, in _main
    sys.exit(0)
SystemExit: 0

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Programs\Python311\Lib\test\support\__init__.py", line 201, in _force_run
    return func(*args)
           ^^^^^^^^^^^
PermissionError: [WinError 32] The process cannot access the file because it is
 being used by another process:
 'C:\\Users\\Terry\\AppData\\Local\\Temp\\test_python_6764æ\\test_python_worker_17068æ'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "C:\Programs\Python311\Lib\test\__main__.py", line 2, in <module>
    main()
  File "C:\Programs\Python311\Lib\test\libregrtest\main.py", line 763, in main
    Regrtest().main(tests=tests, **kwargs)
  File "C:\Programs\Python311\Lib\test\libregrtest\main.py", line 695, in main
    with os_helper.temp_cwd(test_cwd, quiet=True):
  File "C:\Programs\Python311\Lib\contextlib.py", line 155, in __exit__
    self.gen.throw(typ, value, traceback)
  File "C:\Programs\Python311\Lib\test\support\os_helper.py", line 529, in temp_cwd
    with temp_dir(path=name, quiet=quiet) as temp_path:
  File "C:\Programs\Python311\Lib\contextlib.py", line 155, in __exit__
    self.gen.throw(typ, value, traceback)
  File "C:\Programs\Python311\Lib\test\support\os_helper.py", line 483, in temp_dir
    rmtree(path)
  File "C:\Programs\Python311\Lib\test\support\os_helper.py", line 440, in rmtree
    _rmtree(path)
  File "C:\Programs\Python311\Lib\test\support\os_helper.py", line 383, in _rmtree
    _waitfor(_rmtree_inner, path, waitall=True)
  File "C:\Programs\Python311\Lib\test\support\os_helper.py", line 328, in _waitfor
    func(pathname)
  File "C:\Programs\Python311\Lib\test\support\os_helper.py", line 380, in _rmtree_inner
    _force_run(fullname, os.rmdir, fullname)
  File "C:\Programs\Python311\Lib\test\support\__init__.py", line 212, in _force_run
    return func(*args)
           ^^^^^^^^^^^
PermissionError: [WinError 32] The process cannot access the file because it is
 being used by another process:
 'C:\\Users\\Terry\\AppData\\Local\\Temp\\test_python_6764æ\\test_python_worker_17068æ'

@neonene
Contributor

neonene commented Jul 28, 2022

Probably cp1252 is used as the system locale on your OS, which emits an error message with non-ASCII characters such as 0x91 (left single quotation mark). If so, switching to a UTF-8 locale is the easiest way to finish running the tests after a few retries:

https://websiteforstudents.com/how-to-change-system-locale-in-windows-11/
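
To make the diagnosis above concrete, here is a minimal sketch of why a single cp1252 byte breaks a strict UTF-8 read. The sample message is invented for illustration; only the byte 0x91 comes from the traceback.

```python
# Hypothetical worker output: a cp1252-encoded message that uses curly quotes.
message = "\u2018cl.exe\u2019 is not recognized"
raw = message.encode("cp1252")

# cp1252 encodes the left single quotation mark as the single byte 0x91.
first_byte = raw[0]

# 0x91 is a UTF-8 continuation byte, so it cannot start a character.
try:
    raw.decode("utf-8")
    utf8_error = None
except UnicodeDecodeError as exc:
    utf8_error = exc

# Decoding with the encoding that produced the bytes round-trips cleanly.
decoded = raw.decode("cp1252")
```

The error only appears when some test happens to print such a character, which would be consistent with the varying byte position in the reports.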

Another workaround would be using 'wb+' mode in TestWorkerProcess._runtest() to account for non-utf8 text from subprocess:

    def _runtest(self, test_name: str) -> MultiprocessResult:
        # gh-94026: Write stdout+stderr to a tempfile as workaround for
        # non-blocking pipes on Emscripten with NodeJS.
        with tempfile.TemporaryFile(
            'w+', encoding=sys.stdout.encoding
        ) as stdout_fh:

I'm not sure of the root cause of the race condition when running test_asyncio and test_concurrent_futures.
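
The 'wb+' workaround mentioned above could look roughly like the following sketch. It is not the libregrtest code: `run_and_capture` is a hypothetical helper, and the choice of the `backslashreplace` error handler is an assumption made here so that decoding can never raise.

```python
import subprocess
import sys
import tempfile

def run_and_capture(cmd):
    """Run cmd, capture stdout+stderr as bytes in a temp file, decode leniently."""
    with tempfile.TemporaryFile("wb+") as stdout_fh:
        proc = subprocess.run(cmd, stdout=stdout_fh, stderr=subprocess.STDOUT)
        stdout_fh.seek(0)
        raw = stdout_fh.read()
    # backslashreplace never raises; undecodable bytes show up as \xNN.
    text = raw.decode("utf-8", errors="backslashreplace").strip()
    return proc.returncode, text
```

Reading bytes moves the decoding decision to the parent, where a stray non-UTF-8 byte can be rendered visibly instead of aborting the whole run.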

@neonene
Contributor

neonene commented Jul 29, 2022

Maybe related: gh-91227

@neonene
Contributor

neonene commented Jul 29, 2022

See also: gh-91323 (specific to 3.11 and main)

@terryjreedy
Member Author

terryjreedy commented Aug 18, 2022

Multiprocessing people: due to some regression in 3.11/3.12, parallel tests on my updated but otherwise pretty stock American Win10 machine started failing 29 days ago. They still fail today with essentially the same traceback. I believe they ran fine not too many months before that.

@pablogsal Today, sequential tests also fail by hanging for hours in test_winconsoleio, after taking 81 minutes to get that far
1:21:51 [415/435] test_winconsoleio. (The suite once ran in 18 minutes on my machine.) test_repl alone took 34 minutes.

EDIT: This is with plain python -m test (with -j0 added for parallel).
EDIT2: On rerun, sequential tests ran OK in 55 minutes with 40 skipped. Will rerun again to make sure.

@pablogsal
Member

pablogsal commented Aug 18, 2022

@pablogsal Today, sequential tests also fail by hanging for hours in test_winconsoleio, after taking 81 minutes to get that far
1:21:51 [415/435] test_winconsoleio. (The suite once ran in 18 minutes on my machine.) test_repl alone took 34 minutes.

Unfortunately I don't have a Windows machine :( Could you bisect to find the commit that introduced the multiprocessing regression? I don't think that area changed a lot during 3.11/3.12.

@terryjreedy
Member Author

terryjreedy commented Aug 18, 2022

The second sequential run was again fine, so forget winconsoleio. (I have no idea why the first run could have gone so badly.) Do the tests run in parallel on your non-Windows machine?
@zooba Can you try -j0 on your Windows machine to verify that this is not a local-only problem?

To me, the output pasted above reveals two bugs in the testing program.

  1. The premature shutdown is called a success rather than a failure.
  2. Exiting after (wrongly) reporting success results in more exceptions.

git bisect wants a command that returns a 0/non-0 exit code. Though it would not help here, due to the fake 'success', is there a way to run regrtest, suppress printing, and get an exit code instead? I looked at the 'Special runs' options and could not find anything.

I do not know git beyond the devguide chapter, so I would need some coaching even to find a good version for bisect (other than by manually downloading and re-installing earlier releases). What would be a good way to get an exit-code command/script?
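
One possible shape for such an exit-code script is sketched below. It is an illustration only: `classify` and `bisect_step` are hypothetical names, the marker strings are taken from the output pasted earlier in this thread, and treating unrelated failures as "skip" is a judgment call.

```python
import subprocess
import sys

# Exit codes understood by `git bisect run`:
# 0 = good, 125 = skip, any other code from 1 to 127 = bad.
GOOD, BAD, SKIP = 0, 1, 125

def classify(returncode, output):
    """Map a regrtest run to a bisect verdict, ignoring the reported 'SUCCESS'."""
    if "regrtest worker thread failed" in output:
        return BAD
    if "tests omitted:" in output or "test omitted:" in output:
        return BAD
    # Any other non-zero exit is treated as unrelated to this bug.
    return GOOD if returncode == 0 else SKIP

def bisect_step(args=("-j0",)):
    """Run the test suite quietly and return an exit code for git bisect."""
    proc = subprocess.run(
        [sys.executable, "-m", "test", *args],
        stdout=subprocess.PIPE, stderr=subprocess.STDOUT,
    )
    output = proc.stdout.decode("utf-8", errors="replace")
    return classify(proc.returncode, output)
```

Saved as a script that ends with `sys.exit(bisect_step())` and run with the freshly built interpreter, this could be handed to `git bisect run`. It assumes each bisect step rebuilds Python first, which the sketch does not do.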

@pablogsal
Member

pablogsal commented Aug 18, 2022

You can run git bisect manually: inspect each commit, then run git bisect bad or git bisect good depending on whether it works. It is slower but more reliable.

@zooba
Member

zooba commented Aug 18, 2022

It's unlikely to be UTF-8 vs mbcs, but could well be that it's decoding with 'strict' instead of fsdecode (which uses 'surrogateescape' to avoid exactly this issue). Though if the change to write via a temp file instead of stdout directly is recent, I agree that's an interesting place to start looking.
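
The difference between the 'strict' and 'surrogateescape' error handlers can be seen in a few lines. This is a generic illustration with made-up bytes, not code from libregrtest.

```python
raw = b"error near \x91path\x92"

# 'strict' (the default) rejects the stray cp1252 bytes.
try:
    raw.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# 'surrogateescape' maps each undecodable byte 0xNN to the lone surrogate U+DCNN,
text = raw.decode("utf-8", errors="surrogateescape")

# so the original bytes can be recovered exactly when encoding back.
round_trip = text.encode("utf-8", errors="surrogateescape")
```

Text containing lone surrogates still cannot be written to a strict stream, so the side that prints the captured output needs a lenient handler as well.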

The additional error appears to be in the rmtree helper in the test library, which is supposed to handle these issues and retry them:

  File "C:\Programs\Python311\Lib\test\support\os_helper.py", line 483, in temp_dir
    rmtree(path)
  File "C:\Programs\Python311\Lib\test\support\os_helper.py", line 440, in rmtree
    _rmtree(path)
  File "C:\Programs\Python311\Lib\test\support\os_helper.py", line 383, in _rmtree
    _waitfor(_rmtree_inner, path, waitall=True)
  File "C:\Programs\Python311\Lib\test\support\os_helper.py", line 328, in _waitfor
    func(pathname)
  File "C:\Programs\Python311\Lib\test\support\os_helper.py", line 380, in _rmtree_inner
    _force_run(fullname, os.rmdir, fullname)
  File "C:\Programs\Python311\Lib\test\support\__init__.py", line 212, in _force_run
    return func(*args)
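
For readers unfamiliar with the retry idea, here is a minimal sketch of retrying a file operation while Windows still reports the path as in use. It is not the `os_helper` implementation; `call_with_retry`, the attempt count, and the delays are all assumptions.

```python
import time

def call_with_retry(func, *args, attempts=5, delay=0.01):
    """Call func(*args), retrying with growing delays while PermissionError is raised."""
    for attempt in range(attempts):
        try:
            return func(*args)
        except PermissionError:
            if attempt == attempts - 1:
                raise  # still locked after the last attempt
            time.sleep(delay)
            delay *= 2
```

Retrying only helps when the other process eventually releases its handle; if it never does, the final PermissionError propagates as in the traceback above.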

gvanrossum changed the title from "Running tests in parallel quits too soon" to "Running tests in parallel on Windows quits too soon" Sep 6, 2022
@terryjreedy
Member Author

terryjreedy commented Sep 7, 2022

@zooba Did you or anyone else verify this problem on a machine other than mine? Windows or otherwise?

At this point, I do not think that this should be a release blocker. If we are really concerned about tests running on installations, we need a more concerted effort to test installers. I installed rc1 on my MacBook Air, ran the test suite both serially and in parallel, and it failed both times with a malloc error. I ran test_idle alone and it hangs on the second call to an IDLE test help function. It does the same in the current 3.10 and 3.9 (prior to anything released today). (There is something odd in the debug print behavior, so I still do not know exactly where the failure is.) These two failures are different from the one reported here.

@zooba
Member

zooba commented Sep 7, 2022

Sorry for the delay, I do get the same result as you when testing locally, even when running tests from the build directory.

Haven't got any more details than my analysis above - I'm trying again with less strict encoding errors to see what happens.

zooba added a commit to zooba/cpython that referenced this issue Sep 7, 2022
@zooba
Member

zooba commented Sep 7, 2022

There's one possible fix. There are likely other ways of doing it, so if someone has an approach they'd really rather see, feel free to send a different PR.

@terryjreedy
Member Author

terryjreedy commented Sep 8, 2022

Steve's patch seems to fix the premature quit problem in repository main (see PR). There is a separate issue: test_io fails when run in a subprocess (python -m test -j0 test_io) but not when run directly (python -m test test_io). Also, test_distutils and test_peg_generator changed the execution environment on my machine, and test_wmi ran for 9 minutes, which is terrible, especially in parallel mode, where everything else is done in 5 minutes.

@tim-one
Member

tim-one commented Sep 8, 2022

Here's the proximate cause of the original failure report. It took me weeks to figure this out in the background because, when all the tests are running, this failure shows up at an unpredictable time, and nothing in the Warning messages says which test caused it. Instead, the whole test run just dies abruptly, and cascades of irrelevant errors are also produced because the runner is trying to shut down cleanly but keeps trying to delete files that are still open (a no-no on Windows) due to whatever other tests are running concurrently.

It's test_distutils, which has been failing for me on Windows for about a month, under any form of parallel running, even with just 1 process:

C:\Code\Python\PCbuild>rt -q -j1 test_distutils

C:\Code\Python\PCbuild>"C:\Code\Python\PCbuild\amd64\python.exe"  -u -Wd -E -bb -m test  -j1 test_distutils
0:00:00 Run tests in parallel using 1 child processes
0:00:30 load avg: 0.00 running: test_distutils (30.0 sec)
0:01:00 load avg: 0.00 running: test_distutils (1 min)
Warning -- regrtest worker thread failed: Traceback (most recent call last):
Warning --   File "C:\Code\Python\Lib\test\libregrtest\runtest_mp.py", line 334, in run
Warning --     mp_result = self._runtest(test_name)
Warning --                 ^^^^^^^^^^^^^^^^^^^^^^^^
Warning --   File "C:\Code\Python\Lib\test\libregrtest\runtest_mp.py", line 293, in _runtest
Warning --     stdout = stdout_fh.read().strip()
Warning --              ^^^^^^^^^^^^^^^^
Warning --   File "C:\Code\Python\Lib\tempfile.py", line 483, in func_wrapper
Warning --     return func(*args, **kwargs)
Warning --            ^^^^^^^^^^^^^^^^^^^^^
Warning --   File "<frozen codecs>", line 322, in decode
Warning -- UnicodeDecodeError: 'utf-8' codec can't decode byte 0x91 in position 2829: invalid start byte

== Tests result: NO TEST RUN ==

1 test omitted:
    test_distutils

Total duration: 1 min 6 sec
Tests result: NO TEST RUN

Changing the encoding on stdout_fh (in runtest_mp.py) to latin-1 when it's created makes that go away, but I have no idea whether that's appropriate. There's no problem here if -j isn't used. If you want to use -j, add -x test_distutils and it all runs fine.
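
A short illustration of why latin-1 makes the error disappear, and what it costs. This is a generic property of the codec, shown with made-up variable names.

```python
# Every one of the 256 byte values decodes under latin-1, so reads never fail.
all_bytes = bytes(range(256))
as_latin1 = all_bytes.decode("latin-1")

# The price: bytes map 1:1 to code points, not to the intended characters.
quote_byte = b"\x91"
latin1_char = quote_byte.decode("latin-1")  # U+0091, a C1 control character
cp1252_char = quote_byte.decode("cp1252")   # U+2018, left single quotation mark
```

So latin-1 avoids the crash but can display the wrong characters when the worker actually wrote cp1252.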

test_wmi also took a loooong time for me today.

@vstinner
Member

vstinner commented Sep 8, 2022

Regression introduced recently by #94253 if I followed correctly.

@zooba
Member

zooba commented Sep 8, 2022

It's test_distutils, which has been failing for me on Windows for about a month

Good news, we can stop running it now 😄

I'm not sure why it's still in there. I guess the PR to remove distutils stalled on something...

@zooba
Member

zooba commented Sep 8, 2022

Regression introduced recently by #94253 if I followed correctly.

@vstinner I believe so - see my #96669 patch for one workaround, but feel free to propose something else if you prefer. Possibly UTF-8 mode is sufficient and can be set with xoption? But I think we want to override the errors as well.
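
Whether UTF-8 mode changes the child's stdout encoding can be checked with a small experiment. The sketch below is an assumption-laden probe, not part of any patch: `child_stdout_settings` is a hypothetical helper, and PYTHONIOENCODING is removed from the child's environment so that it cannot override the result.

```python
import os
import subprocess
import sys

def child_stdout_settings(*interpreter_args):
    """Return (encoding, errors) of sys.stdout in a child whose stdout is a pipe."""
    env = {k: v for k, v in os.environ.items() if k != "PYTHONIOENCODING"}
    code = "import sys; print(sys.stdout.encoding, sys.stdout.errors)"
    proc = subprocess.run(
        [sys.executable, *interpreter_args, "-c", code],
        stdout=subprocess.PIPE, env=env, check=True,
    )
    encoding, errors = proc.stdout.decode("ascii").split()
    return encoding.lower(), errors

# Settings when the child is started with the -X utf8 option.
utf8_mode = child_stdout_settings("-X", "utf8")
```

Calling `child_stdout_settings()` without arguments on the same machine shows the default for comparison. UTF-8 mode is documented to select the surrogateescape handler for stdout as well, but the parent still has to read the captured output with a matching encoding and error handler.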

@felixxm
Contributor

felixxm commented Sep 27, 2022

I'm not sure if this is related but we have a similar regression when running Django tests in parallel on Windows and Python 3.11 🤔 :

 Traceback (most recent call last):
  File "D:\a\django\django\tests\runtests.py", line 773, in <module>
    failures = django_tests(
               ^^^^^^^^^^^^^
  File "D:\a\django\django\tests\runtests.py", line 432, in django_tests
    failures = test_runner.run_tests(test_labels)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\a\django\django\django\test\runner.py", line 1057, in run_tests
    self.teardown_databases(old_config)
  File "D:\a\django\django\django\test\runner.py", line 980, in teardown_databases
    _teardown_databases(
  File "D:\a\django\django\django\test\utils.py", line 378, in teardown_databases
    connection.creation.destroy_test_db(
  File "D:\a\django\django\django\db\backends\base\creation.py", line 307, in destroy_test_db
    self._destroy_test_db(test_database_name, verbosity)
  File "D:\a\django\django\django\db\backends\sqlite3\creation.py", line 111, in _destroy_test_db
    os.remove(test_database_name)
PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'other_1.sqlite3'
Error: Process completed with exit code 1.

See logs.

@carltongibson
Contributor

carltongibson commented Sep 27, 2022

Via the discussion in #96669, I tried the Django test-case with the PYTHONIOENCODING env var...

$env:PYTHONIOENCODING='utf-8:surrogateescape'; python .\runtests.py --parallel

I don't know if that's quite sufficient. Same result as reported by @felixxm.
I will try to say more.

(This with 3.11b5 and 3.11rc2. Works without error with 3.10.)

@vstinner
Member

vstinner commented Sep 27, 2022

PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'other_1.sqlite3'

This bug tracker is for Python issues. Your issue is unrelated. Please open an issue at: https://github.com/django/django

@felixxm
Contributor

felixxm commented Sep 27, 2022

PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'other_1.sqlite3'

This bug tracker is for Python issues. Your issue is unrelated. Please open an issue at: https://github.com/django/django

Hi Victor,

We're maintaining Django 😄 The same tests work fine on Python 3.8, 3.9, and 3.10, so I assumed (finding this issue) that it might be related 🤔 We will try to provide more details later. Sorry for the lack of a preliminary investigation on our side.

@carltongibson
Contributor

carltongibson commented Sep 27, 2022

Hi Victor,

PermissionError: [WinError 32] The process cannot access the file because it is being used by another process:

This is the exact error from the opening pair of comments. As @felixxm says, we're reporting an apparent regression in a pre-release version, as we've been asked to do. Sorry if that wasn't clear from the comments.

@zooba
Member

zooba commented Sep 28, 2022

Thanks, but we've narrowed this one down to our own test suite. If you're reusing libregrtest in your tests, then you should get the fix automatically, but if you aren't (and I assume you're not, because pytest is miles better than libregrtest 😉) it'll be a different issue and should get a new bug.

@carltongibson
Contributor

carltongibson commented Sep 28, 2022

OK, thanks @zooba. I was hoping to get to it today, but it's on my list for tomorrow now to work out exactly which release introduced the change, and we'll open a new issue with at least reproduction steps for your consideration.

@vstinner
Member

vstinner commented Sep 28, 2022

I assumed (finding this issue) that it might be related 🤔

You report an issue about a PermissionError on a sqlite database, whereas this issue is about a UnicodeDecodeError. I don't see how they could be related. Moreover, Django uses its own test runner (django\test\runner.py), and Python uses its own test runner (libregrtest).

Since the error message is different, please open a separate issue. Yes, it's possible that it's a Python regression, but someone has to analyze the issue to make sure that it's a Python regression, and not something else.

@carltongibson
Contributor

carltongibson commented Sep 28, 2022

OK, it'll be with you in the morning. Thanks @vstinner. 🎁

@vstinner
Member

vstinner commented Oct 12, 2022

The test is called a 'SUCCESS'. This is followed by a traceback for SystemExit(0), followed by 1 or more tracebacks for PermissionError because a temporary test file is supposedly used by another process.

I created issue #98219 about this annoying PermissionError.

@vstinner
Member

vstinner commented Oct 20, 2022

I can reproduce this issue on Windows on the main branch with:

python -m test  -j1 test_distutils -m test_spawn 

Output:

vstinner@WIN C:\victor\python\main>python -m test  -j1 test_distutils -m test_spawn 
Running Debug|x64 interpreter...
0:00:00 Run tests in parallel using 1 child processes
Warning -- regrtest worker thread failed: Traceback (most recent call last):
Warning --   File "C:\victor\python\main\Lib\test\libregrtest\runtest_mp.py", line 334, in run
Warning --     mp_result = self._runtest(test_name)
Warning --                 ^^^^^^^^^^^^^^^^^^^^^^^^
Warning --   File "C:\victor\python\main\Lib\test\libregrtest\runtest_mp.py", line 293, in _runtest
Warning --     stdout = stdout_fh.read().strip()
Warning --              ^^^^^^^^^^^^^^^^
Warning --   File "C:\victor\python\main\Lib\tempfile.py", line 483, in func_wrapper
Warning --     return func(*args, **kwargs)
Warning --            ^^^^^^^^^^^^^^^^^^^^^
Warning --   File "C:\victor\python\main\Lib\codecs.py", line 322, in decode
Warning --     (result, consumed) = self._buffer_decode(data, self.errors, final)
Warning --                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Warning -- UnicodeDecodeError: 'utf-8' codec can't decode byte 0x91 in position 264: invalid start byte

== Tests result: NO TEST RUN ==

1 test omitted:
    test_distutils

Total duration: 954 ms
Tests result: NO TEST RUN

@vstinner
Member

vstinner commented Oct 20, 2022

@vstinner:

Regression introduced recently by #94253 if I followed correctly.

I confirm:

  • before commit 199ba23 (parent of this commit), I cannot reproduce the issue
  • at commit 199ba23, I reproduce the issue

@vstinner
Member

vstinner commented Oct 20, 2022

Encodings used by libregrtest on Windows.

Before commit 199ba23:

  • no -jN option: sys.stdout.encoding: UTF-8

  • -j1:

    • parent process: Popen.stdout (pipe) encoding: cp1252
    • child process: sys.stdout.encoding: cp1252

At commit 199ba23:

  • no -jN option: sys.stdout.encoding: UTF-8 (same)

  • -j1:

    • parent process: stdout temporary file: UTF-8 (was cp1252) <===== WHAT CHANGED
    • child process: sys.stdout.encoding: cp1252 (same)

libregrtest now uses sys.stdout.encoding for the stdout temporary file encoding:

  • On Linux, sys.stdout.encoding is the same if stdout is a terminal or not (if sys.stdout.isatty() is true or false).
  • On Windows, sys.stdout.encoding is UTF-8 if stdout is a terminal, or the ANSI code page (e.g. cp1252) otherwise (e.g. if stdout is redirected to a program or a file).
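
The values compared above can be collected on any machine with a small probe. This is a sketch with a hypothetical helper name, `describe_stdout`; it only reads attributes that text streams and the `locale` module already expose.

```python
import locale
import sys

def describe_stdout(stream=None):
    """Collect the facts that decide how text written to stdout is encoded."""
    stream = sys.stdout if stream is None else stream
    return {
        "isatty": stream.isatty(),
        "encoding": stream.encoding,
        "errors": stream.errors,
        # Locale encoding (the ANSI code page on Windows), ignoring the stream.
        "locale_encoding": locale.getpreferredencoding(False),
        "platform": sys.platform,
    }
```

Running it once in a terminal and once with stdout piped to another program should show the Windows difference described in the list above.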

@vstinner
Member

vstinner commented Oct 20, 2022

Another difference on Windows:

  • When stdout is a terminal (isatty() is true), WindowsConsoleIO is used.
  • When stdout is not a terminal (isatty() is false), FileIO is used.

C:\victor\python\main>python -c "import sys; out=sys.stdout; print(type(out.buffer.raw), out.isatty(), out.encoding)"
<class '_io._WindowsConsoleIO'> True utf-8

C:\victor\python\main>python -c "import sys; out=sys.stdout; print(type(out.buffer.raw), out.isatty(), out.encoding)"|more
<class '_io.FileIO'> False cp1252

@vstinner
Member

vstinner commented Oct 20, 2022

I'm working on a fix, but first I'm trying to add a test to test_regrtest which reproduces the issue ;-)

@vstinner
Member

vstinner commented Oct 20, 2022

I wrote PR #98492 to fix the issue.

vstinner added a commit that referenced this issue Oct 21, 2022
On Windows, when the Python test suite is run with the -jN option,
the ANSI code page is now used as the encoding for the stdout
temporary file, rather than using UTF-8 which can lead to decoding
errors.
miss-islington pushed a commit to miss-islington/cpython that referenced this issue Oct 21, 2022
On Windows, when the Python test suite is run with the -jN option,
the ANSI code page is now used as the encoding for the stdout
temporary file, rather than using UTF-8 which can lead to decoding
errors.
(cherry picked from commit ec1f6f5)

Co-authored-by: Victor Stinner <vstinner@python.org>
miss-islington added a commit that referenced this issue Oct 21, 2022
On Windows, when the Python test suite is run with the -jN option,
the ANSI code page is now used as the encoding for the stdout
temporary file, rather than using UTF-8 which can lead to decoding
errors.
(cherry picked from commit ec1f6f5)

Co-authored-by: Victor Stinner <vstinner@python.org>
@vstinner
Member

vstinner commented Oct 21, 2022

I would prefer a formal review of my PR, but I merged my PR just to unblock the 3.11.0 final release (scheduled next Monday). Maybe if something can be enhanced, it can be done later. IMO this fix is better than the current situation. In short, it just restores the old behavior: encodings used before 199ba23
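
Based only on the commit message above, the encoding choice could be sketched as follows. This is a paraphrase, not the merged patch: the function names are hypothetical, and `locale.getpreferredencoding(False)` is used here as one way to obtain the ANSI code page on Windows.

```python
import locale
import sys
import tempfile

def worker_output_encoding():
    """Pick the encoding a child with redirected stdout is expected to write in."""
    if sys.platform == "win32":
        # Redirected stdout on Windows uses the ANSI code page, not the
        # console's UTF-8, so decode with the locale encoding.
        return locale.getpreferredencoding(False)
    return sys.stdout.encoding

def open_worker_tempfile():
    """Text-mode temp file for capturing a worker's stdout and stderr."""
    return tempfile.TemporaryFile("w+", encoding=worker_output_encoding())
```

The point of the design is that the parent reads the file with the same encoding the child used to write it, which is what the pipe-based code did before commit 199ba23.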

pablogsal pushed a commit that referenced this issue Oct 24, 2022
On Windows, when the Python test suite is run with the -jN option,
the ANSI code page is now used as the encoding for the stdout
temporary file, rather than using UTF-8 which can lead to decoding
errors.
(cherry picked from commit ec1f6f5)

Co-authored-by: Victor Stinner <vstinner@python.org>
9 participants