Bug report
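In the _guess_quote_and_delimiter function in csv.py, the regular expressions that end with "(?:$|\n)" cannot handle the \r in Windows line endings (\r\n). The closing quote is followed by \r rather than by the end of the line or \n, so the pattern does not match: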
>>> import re
>>> re.search(r'(?P<delim>[^\w\n"\'])(?P<space> ?)(?P<quote>["\']).*?(?P=quote)(?:$|\n)', '2020-10-01 17:17:37+08:00,https://www.mozilla.org/en-US/firefox/welcome/2/,"Pocket - Save news, videos, stories and more"\r\n', re.M|re.S)
>>> # No match: the call above returns None, so the REPL prints nothing.
>>> re.search(r'(?P<delim>[^\w\n"\'])(?P<space> ?)(?P<quote>["\']).*?(?P=quote)(?:$|\r|\n)', '2020-10-01 17:17:37+08:00,https://www.mozilla.org/en-US/firefox/welcome/2/,"Pocket - Save news, videos, stories and more"\r\n', re.M|re.S)
<re.Match object; span=(74, 122), match=',"Pocket - Save news, videos, stories and more"\r>
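The comparison above can be sketched as a small script that runs both patterns against the same row with "\n" and "\r\n" endings. This is only an illustration using the `re` module directly, not the code in csv.py. The names `CURRENT`, `PROPOSED`, `ROW`, and `guessed_delimiter` are hypothetical and exist only for this example.

```python
import re

FLAGS = re.MULTILINE | re.DOTALL

# Pattern quoted in this report (ends with "(?:$|\n)").
CURRENT = r'(?P<delim>[^\w\n"\'])(?P<space> ?)(?P<quote>["\']).*?(?P=quote)(?:$|\n)'
# Variant proposed in this report: also accept "\r" after the closing quote.
PROPOSED = r'(?P<delim>[^\w\n"\'])(?P<space> ?)(?P<quote>["\']).*?(?P=quote)(?:$|\r|\n)'

# The data row from the report, without its line ending.
ROW = ('2020-10-01 17:17:37+08:00,'
       'https://www.mozilla.org/en-US/firefox/welcome/2/,'
       '"Pocket - Save news, videos, stories and more"')


def guessed_delimiter(pattern, sample):
    # Hypothetical helper: return the captured delimiter, or None if no match.
    match = re.search(pattern, sample, FLAGS)
    return match.group('delim') if match else None


for name, pattern in (('current', CURRENT), ('proposed', PROPOSED)):
    for ending in ('\n', '\r\n'):
        print(name, repr(ending), guessed_delimiter(pattern, ROW + ending))
```

Only the combination of the current pattern with a "\r\n" ending yields None. The other three combinations capture "," as the delimiter.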
import csv
a = 'Timestamp,URL,Title\r\n2020-10-01 17:17:37+08:00,https://www.mozilla.org/en-US/firefox/welcome/2/,"Pocket - Save news, videos, stories and more"\r\n'
sniffer = csv.Sniffer()
result = sniffer.sniff(a)
print(result.delimiter)
print(result.quotechar)
The delimiter is wrongly detected as "p".
With the regular expression changed as shown above, the delimiter is correctly detected as ",".
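Until the pattern is changed, one possible workaround is to normalize the line endings before sniffing. This is a minimal sketch, assuming you control the text before it reaches the sniffer. It is not the change proposed in this report, and `sniff_normalized` is a hypothetical helper, not part of the csv module.

```python
import csv


def sniff_normalized(sample):
    # Hypothetical helper: convert Windows line endings to "\n" so that the
    # existing "(?:$|\n)" patterns can match, then sniff as usual.
    return csv.Sniffer().sniff(sample.replace('\r\n', '\n'))


sample = (
    'Timestamp,URL,Title\r\n'
    '2020-10-01 17:17:37+08:00,'
    'https://www.mozilla.org/en-US/firefox/welcome/2/,'
    '"Pocket - Save news, videos, stories and more"\r\n'
)

dialect = sniff_normalized(sample)
print(dialect.delimiter)
print(dialect.quotechar)
```

With the line endings normalized, the quoted field is followed by "\n", so the quote-based detection finds "," as the delimiter and '"' as the quote character.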
Your environment
Windows
- CPython versions tested on: 3.11.2
- Operating system and architecture: Windows, 64-bit