Bug report
I want to read lines of text that are separated by null bytes, which decode to the NUL code point (U+0000) in UTF-8. This is widespread practice in command-line tools, for example find -print0, locate -0, xargs -0, and sort -z.
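For reference, here is a minimal sketch of the behaviour I am after, written by hand on top of an ordinary text stream. The helper name iter_null_separated and its chunk_size parameter are made up for illustration and are not part of any Python API.

```python
import io


def iter_null_separated(stream, chunk_size=8192):
    """Yield records from a text stream, splitting on the NUL character.

    Hypothetical helper for illustration; not a standard-library function.
    """
    buffer = ""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        buffer += chunk
        # Everything before the last NUL is a complete record;
        # the remainder may be a partial record, so keep it buffered.
        *records, buffer = buffer.split("\x00")
        yield from records
    # Emit a final record that was not terminated by NUL, if any.
    if buffer:
        yield buffer


# In-memory stand-in for the output of something like `find -print0`.
sample = io.StringIO("a.txt\x00dir/b c.txt\x00last")
records = list(iter_null_separated(sample))
print(records)
```

With a real file, the stream could be opened with newline="" so that any '\n' or '\r' inside a record is passed through untranslated.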
Python's open documentation states (emphasis mine):
When reading input from the stream, if newline is None, universal newlines mode is enabled. Lines in the input can end in '\n', '\r', or '\r\n', and these are translated into '\n' before being returned to the caller. If it is '', universal newlines mode is enabled, but line endings are returned to the caller untranslated. If it has any of the other legal values, input lines are only terminated by the given string, and the line ending is returned to the caller untranslated.
So this is legal: open(some_file, newline="\x00"), but I immediately get the exception ValueError: embedded null character.
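A self-contained reproduction might look like the sketch below. The temporary file and its contents are placeholders I made up for the example; any readable path should behave the same way, since the error is raised while the arguments are being parsed.

```python
import os
import tempfile

# Create a small NUL-delimited file (placeholder data for the reproduction).
with tempfile.NamedTemporaryFile("wb", delete=False) as tmp:
    tmp.write(b"a.txt\x00b.txt\x00")
    some_file = tmp.name

try:
    # On the CPython versions listed in this report, this call raises
    # ValueError before the file is read.
    open(some_file, newline="\x00")
except ValueError as exc:
    error_message = str(exc)
    print(f"ValueError: {error_message}")
finally:
    os.unlink(some_file)
```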
Your environment
- CPython versions tested on: 3.9, 3.10