New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gh-99308: Clarify re docs for byte pattern group names #99311
Conversation
Kentzo
commented
Nov 10, 2022
•
edited by bedevere-bot
edited by bedevere-bot
- Issue: Cryptic deprecation notice in the re module #99308
Doc/library/re.rst
Outdated
@@ -496,6 +496,8 @@ The special characters are: | |||
|
|||
.. versionchanged:: 3.12 | |||
Group *id* can only contain ASCII digits. | |||
In bytes replacement strings group names must contain only characters |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let’s avoid five nouns in a row
In bytes replacement strings group names must contain only characters | |
In bytes replacement strings, group names must contain only characters |
Doc/library/re.rst
Outdated
In bytes replacement strings group names must contain only characters | ||
in the ASCII range. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe its just me, but this was pretty confusing to me and hard to parse and understand. The comma @merwok suggest helps a lot in making the basic grammar more clear, but there still seems to be some problems with the actual semantics here.
As far as I can tell, this should refer to patterns, not replacements, and I was very confused why the latter were mentioned until I realized it was incorrectly copy/pasted from the original mention (that talks about replacement strings).
Also, "bytes strings" is both arguably an oxymoron in Python 3 (since this is presumably intended to specifically refer to bytes, not strings), seemingly ungrammatical (at least without being proper type references), and certainly confusing.
Finally, name should be formatted consistently with id in the previous line, :class:bytes
objects are composed of bytes, not characters (though perhaps that's being too pedantic), and it would be useful to explictly specify what the ASCII range is. Implementing all of that, and fixing the phrasing, we have:
In bytes replacement strings group names must contain only characters | |
in the ASCII range. | |
In :class:`bytes` patterns, group *name* can only contain bytes | |
in the ASCII range (``b'\x00'``-``b'\x7f'``). |
Or, with decimal values for the ASCII range,
In bytes replacement strings group names must contain only characters | |
in the ASCII range. | |
In :class:`bytes` patterns, group *name* can only contain bytes | |
in the ASCII range (values 0-127). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@CAM-Gerlach Maybe b'\x00'
-b'\x7f'
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess so; edited. I just didn't want to overcomplicate it.
Thanks @Kentzo for the PR, and @kumaraditya303 for merging it |
Sorry, @Kentzo and @kumaraditya303, I could not cleanly backport this to |
Skipping backport as it is for 3.12 only. |