bpo-39574: Improve str.__doc__ to clarify encoding and errors and their defaults. #18401
errors defaults to 'strict'.");
Create a new string object from the given object.\n\
\n\
If a single argument is given, returns the result of object.__str__()\n\
Suggested change:
- If a single argument is given, returns the result of object.__str__()\n\
+ If the object argument is given, returns the result of object.__str__()\n\
You can call str(encoding='spam'). A single argument is provided, but no __str__() is called.
The object argument will be made positional-only in 3.9, so this wording will likely change again.
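A minimal sketch of the distinction raised here, for anyone following along. The classes `WithStr` and `WithoutStr` are hypothetical names invented for illustration, and the keyword-only call uses 'utf-8' rather than 'spam' purely to keep the example conservative:

```python
class WithStr:
    """Hypothetical class that defines its own __str__."""

    def __str__(self):
        return "custom text"


class WithoutStr:
    """Hypothetical class that only defines __repr__."""

    def __repr__(self):
        return "WithoutStr()"


# The object argument is given: str() uses the object's __str__.
with_str = str(WithStr())

# No __str__ of its own: the inherited object.__str__ returns repr(object).
without_str = str(WithoutStr())

# A single argument is given, but it is not the object argument,
# so there is nothing to call __str__ on and the result is empty.
no_object = str(encoding="utf-8")

print(repr(with_str), repr(without_str), repr(no_object))
```

This is why "if the object argument is given" describes the behaviour more precisely than "if a single argument is given".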
If a single argument is given, returns the result of object.__str__()\n\
(if defined) or repr(object).\n\
\n\
If encoding or errors or both are specified, then the object must\n\
Is not "or both" redundant?
In English, a bare "or" is ambiguous, so 'or both' removes that ambiguity. Because the rest of the paragraph talks about using one option or the other, a reader might otherwise conclude that only one of the two is meant to be used.
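For reference, a short sketch showing that each of the three combinations is accepted and decodes the buffer. The variable names are illustrative only:

```python
# UTF-8 bytes for "café"; any object exposing a data buffer would do.
data = b"caf\xc3\xa9"

# Only encoding specified.
only_encoding = str(data, encoding="utf-8")

# Only errors specified.
only_errors = str(data, errors="strict")

# Both specified.
both = str(data, encoding="utf-8", errors="strict")

print(only_encoding, only_errors, both)
```

All three calls take the decoding path, which is the case the "or both" wording is meant to cover.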
If encoding or errors or both are specified, then the object must\n\
expose a data buffer that will be decoded using the given encoding\n\
and error handler. If errors is specified, encoding defaults to\n\
sys.getdefaultencoding(). If encoding is specified, errors defaults\n\
Use just 'utf-8' instead of sys.getdefaultencoding(). It is a constant in Python 3.
I think it may be useful to update the doc for getdefaultencoding() to say that?
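A quick check of the claim above, as a sketch under the assumption of a standard CPython 3 interpreter. It also shows that passing only errors decodes with that default encoding:

```python
import sys

# In Python 3 this reports the same value on every platform.
default_encoding = sys.getdefaultencoding()

# Only errors is given, so the encoding falls back to the default.
# These three bytes are the UTF-8 encoding of the euro sign.
decoded = str(b"\xe2\x82\xac", errors="strict")

print(default_encoding, decoded)
```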
@@ -15228,13 +15228,16 @@ PyDoc_STRVAR(unicode_doc,
"str(object='') -> str\n\
str(bytes_or_buffer[, encoding[, errors]]) -> str\n\
Suggested change:
- str(bytes_or_buffer[, encoding[, errors]]) -> str\n\
+ str(bytes_or_buffer, encoding='utf-8', errors='strict') -> str\n\
I would read it as the defaults applying in all cases, including when a single positional argument is given. Even though the docs below clarify that this is not the case, users are used to the idea that reading the signature is enough to understand what the default argument values will be.
It seems that in this case, where the defaults are applied in an unusual way, it's best to undersupply information in the signature. That forces users either to read the docs below or to supply the desired values for these arguments; in either case it will work as expected.
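To make the concern concrete, here is a small sketch of how the two call forms differ for the same bytes object (assuming the interpreter is run without the -b/-bb flags, which turn the first call into a warning or error):

```python
raw = b"abc"

# Single positional argument: no decoding happens, the "defaults"
# from the suggested signature are not applied, and the result is
# the repr-style text of the bytes object.
single_arg = str(raw)

# Encoding and errors supplied explicitly: the buffer is decoded.
explicit = str(raw, encoding="utf-8", errors="strict")

print(repr(single_arg), repr(explicit))
```

A signature that spells out encoding='utf-8' and errors='strict' could lead a reader to expect the second result from the first call.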
https://bugs.python.org/issue39574