Python 3.9.14: grammar/clarity improvements for str.encode, str.decode error-checking documentation #99991

Closed
jayaddison opened this issue Dec 4, 2022 · 7 comments
Labels
docs Documentation in the Doc dir

Comments

@jayaddison

jayaddison commented Dec 4, 2022

Documentation

There are a couple of paragraphs about error-checking behaviour for str.encode and str.decode that appear in a few versions of Python and could potentially be improved in future versions.

As found in the Python 3.9.14 documentation for str.encode:

By default, the errors argument is not checked for best performances, but only used at the first encoding error. Enable the Python Development Mode, or use a debug build to check errors.

The paragraph before that is fairly dense - there could be an opportunity to improve both of them.

@jayaddison jayaddison added the docs Documentation in the Doc dir label Dec 4, 2022
@bizzyvinci
Contributor

bizzyvinci commented Dec 9, 2022

Do you have recommendations on how to improve them?

@jayaddison
Author

jayaddison commented Dec 9, 2022

It's not easy, but I'll try - here's an attempt for str.encode:

str.encode(encoding="utf-8", errors="strict")

Return a bytes object containing the string in the requested encoding.

The default strict error handling raises a UnicodeError exception when an encoding error occurs. Other possible values include ignore, replace, xmlcharrefreplace, backslashreplace and any other name registered via codecs.register_error(), see section Error Handlers.

For a list of possible encodings, see section Standard Encodings.

For performance reasons, the value of the errors argument is not checked for validity unless Python Development Mode is enabled.
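As a rough illustration of the handlers named in the proposed text above, the sketch below encodes a string containing one character that ASCII cannot represent. The sample string, the `question_marks` function and the handler name `gh99991-question` are made up for this example; they are not part of the proposal.

```python
import codecs

text = "caf\u00e9"  # "café": the final character has no ASCII encoding

# Built-in handlers: each deals with the unencodable character differently.
ignored = text.encode("ascii", errors="ignore")
replaced = text.encode("ascii", errors="replace")
xml_refs = text.encode("ascii", errors="xmlcharrefreplace")
escaped = text.encode("ascii", errors="backslashreplace")

# The default "strict" handler raises UnicodeEncodeError, a UnicodeError subclass.
try:
    text.encode("ascii")
    strict_raised = False
except UnicodeError:
    strict_raised = True


def question_marks(exc):
    """Hypothetical custom handler: replace each bad character with '??'."""
    if not isinstance(exc, UnicodeEncodeError):
        raise exc
    # Return the replacement text and the position at which to resume encoding.
    return "??" * (exc.end - exc.start), exc.end


# Register under an illustrative name, then use it like a built-in handler.
codecs.register_error("gh99991-question", question_marks)
custom = text.encode("ascii", errors="gh99991-question")

print(ignored, replaced, xml_refs, escaped, custom, strict_raised)
```

Using ASCII as the target encoding is only a convenient way to force an encoding error; with the default utf-8 encoding, most strings never reach the error handler at all.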

Process used
Treating this like a code refactoring, the important properties to preserve (much as unit tests would) seemed to be:

  • That the documentation describes what the method does (encodes a string, handles errors)
  • That relevant entities are hyperlinked/emphasized
  • That the documentation is relatively concise (the time required to read and understand the description is similar or reduced)

So looking at the text, there seemed to be a few steps to refactor it:

  • Break the dense paragraph into individual statements (usually individual sentences)
  • Apply consistent emphasis to entities (types, arguments, ...) that appear in the statements
  • Remove redundant/self-explanatory statements
  • It's a 2-argument method -- and each argument requires explanation -- so consolidate the statements into one paragraph per argument:
    • Move sentences describing encoding into the first paragraph
    • Move sentences describing errors into the second paragraph
    • Exception: some of the commentary appears to be about non-functional behaviour (performance) -- OK, that could move into a third paragraph
  • Re-read and refactor to remove unnecessary words and statements
  • Nitpick: error handling is important to understand, so let's move the list of codecs -- useful though it is -- to after the errors argument, in the hope that people will read about error-handling first

Note:

@bizzyvinci
Contributor

bizzyvinci commented Dec 9, 2022

Thanks. I'll work on that.

@jayaddison
Author

jayaddison commented Dec 10, 2022

For performance reasons, the value of the errors argument is not checked for validity unless Python Development Mode is enabled.

Ah: I forgot to mention debug builds in this suggestion. Like Python Development Mode, debug builds also validate the errors argument.

A simple way to test whether error handler name validation is enabled is to attempt an encode with an invalid error handler name. For example, "".encode('utf-8', errors='gh-99991').
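That check could be wrapped up roughly as follows. This is only a sketch: the two function names are invented for illustration, and it assumes nothing has registered a handler called gh-99991.

```python
import sys


def errors_argument_is_validated():
    """Return True if str.encode() rejects an unknown handler name up front."""
    try:
        # The empty string has nothing to encode, so no encoding error can
        # occur; a LookupError here means the name itself was checked.
        "".encode("utf-8", errors="gh-99991")
    except LookupError:
        return True
    return False


def unknown_handler_fails_on_real_error():
    """Return True if the bad name is rejected once an encoding error occurs."""
    try:
        # "é" cannot be encoded as ASCII, so the handler has to be looked up.
        "\u00e9".encode("ascii", errors="gh-99991")
    except LookupError:
        return True
    return False


print("dev mode:", sys.flags.dev_mode)
print("errors validated eagerly:", errors_argument_is_validated())
print("rejected at first encoding error:", unknown_handler_fails_on_real_error())
```

Running the same script with and without `python -X dev` should show the difference in the second line of output, while the third line stays the same in both modes.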

@CAM-Gerlach
Member

CAM-Gerlach commented Dec 16, 2022

By the way, thanks for your comprehensive and insightful feedback, @jayaddison ! We'd love to have more of it elsewhere, and you're welcome to contribute the changes yourself if you like—you seem to have quite a knack for documentation writing, which makes a big positive impact on the community. Thanks again!

@jayaddison
Author

jayaddison commented Dec 16, 2022

Thanks for the kind words @CAM-Gerlach! There are more than a few similarities between good code and good documentation, I think 😄 I'll do my best to help out where and when I can.

JelleZijlstra pushed a commit that referenced this issue Dec 21, 2022
Co-authored-by: C.A.M. Gerlach <CAM.Gerlach@Gerlach.CAM>
miss-islington pushed a commit to miss-islington/cpython that referenced this issue Dec 21, 2022
…-100198)

(cherry picked from commit a2bb3b7)

Co-authored-by: Bisola Olasehinde <horlasehinde@gmail.com>
Co-authored-by: C.A.M. Gerlach <CAM.Gerlach@Gerlach.CAM>
miss-islington added a commit that referenced this issue Dec 21, 2022
(cherry picked from commit a2bb3b7)

Co-authored-by: Bisola Olasehinde <horlasehinde@gmail.com>
Co-authored-by: C.A.M. Gerlach <CAM.Gerlach@Gerlach.CAM>
@hauntsaninja
Contributor

hauntsaninja commented Dec 21, 2022

Thanks, looks like everything is merged!
