tokenize.detect_encoding has issues with \r

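`tokenize.detect_encoding` does not detect the encoding correctly when a carriage return character is involved. In the program below, it misses the `# coding: big5` header because it does not treat the leading `\r` as a line break, so it falls back to `utf-8`. The interpreter itself (here via `runpy`) does honor the header and runs the file as Big5.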

```
import tokenize
import runpy

code = """\r# coding: big5

print('錯誤的答案')
"""

with open('file.py', 'w', encoding='big5') as file:
    file.write(code)
with open('file.py', 'rb') as file:
    print(tokenize.detect_encoding(file.readline))
    # actual result: ('utf-8', [b'\r# coding: big5\r\n', b'\r\n'])
    # expected encoding: 'big5'

runpy.run_path('file.py')
# prints 錯誤的答案 as required, so the interpreter does honor the coding header
```
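
As a possible workaround until this is resolved, here is a minimal sketch that splits the source on lone `\r` as well as `\n` before handing lines to `tokenize.detect_encoding`. The helper name `detect_encoding_cr_aware` is hypothetical and not part of `tokenize`. It relies on `bytes.splitlines`, which treats `\n`, `\r\n`, and a lone `\r` as line boundaries, and it reads the whole source into memory instead of streaming it.

```python
import tokenize


def detect_encoding_cr_aware(data: bytes):
    """Hypothetical helper: detect the source encoding, treating a lone
    carriage return as a line terminator too."""
    # bytes.splitlines(keepends=True) splits on \n, \r\n and lone \r,
    # so the coding header after the leading \r becomes its own line.
    lines = iter(data.splitlines(keepends=True))
    # detect_encoding expects a readline-like callable that returns
    # b"" once the input is exhausted.
    return tokenize.detect_encoding(lambda: next(lines, b""))


source = b"\r# coding: big5\n\nprint('x')\n"
encoding, consumed = detect_encoding_cr_aware(source)
print(encoding, consumed)
```

With this splitting, the `# coding: big5` comment lands on the second line, which is still within the two lines that `detect_encoding` inspects.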


tokenize.detect_encoding has issues with \r #93608

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Uh oh!

tokenize.detect_encoding has issues with \r #93608

Description

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions