Skip to content

gh-103997: Automatically dedent the argument to "-c" #103998

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 49 commits into from
Apr 18, 2025

Conversation

Erotemic
Copy link
Contributor

@Erotemic Erotemic commented Apr 29, 2023

This is an implementation of the idea proposed in #103997.

It intercepts the argument passed to -c, and removes common leading whitespace from each line in the argument.

Given an input string, the algorithm overview is:

  • split the string into lines
  • count the number of leading whitespace characters for each line
  • keep track of the minimum leading spaces, but ignore lines that contain no non-whitespace characters.
  • if number of common whitespace charcters is non-zero, then loop over all lines again and remove that number of leading spaces (again ignoring the lines that are entirely whitespace).
  • rejoin the new lines into a new string and continue the pymain-run-command function with that.

Big thanks to @sunmy2019 who really helped clean this PR up.

@bedevere-bot
Copy link

Most changes to Python require a NEWS entry.

Please add it using the blurb_it web app or the blurb command-line tool.

@ghost
Copy link

ghost commented Apr 29, 2023

All commit authors signed the Contributor License Agreement.
CLA signed

@Erotemic Erotemic force-pushed the dedent_pymain_command branch from 71c6b63 to 51c1320 Compare April 29, 2023 23:20
@Erotemic
Copy link
Contributor Author

I was able to cobble my way through the C docs and write what I think is a reasonable attempt at a pure C dedent function (I forgot how fun -- albiet time consuming --- pointer logic can be).

To highlight some of the corner cases that need to be accounted for, this is the test case I'm using locally:

python -c "
import subprocess
# Use $ to note when a line will have all whitespace
lines = '''
         $

  $
    data = \"\"\"
    
    this data has newlines above and below  $
                            $

    \"\"\"
    if 1:         $
        print(123)$

    print(12345)
    print(repr(data))
'''.replace('$', '')
subprocess.run(['./python', '-c', lines])
"

I still haven't handled tabs, but I think my space logic is correct. I do need help vetting my C code and fixing the memory and security problems with it.

@sunmy2019
Copy link
Member

sunmy2019 commented Apr 30, 2023

Delete all your wchar_t stuff. It is error-prone and unnecessary.

Delete all your _unicode_dedent. It is lengthy. Do utf_8_bytes_dedent is much more concise and simpler.

@sunmy2019
Copy link
Member

You should act fast since 3.12 release window will soon close. No new feature after May 8.
https://peps.python.org/pep-0693/

Can you give me write access to your branch?

@Erotemic Erotemic force-pushed the dedent_pymain_command branch from 06667b7 to 9649590 Compare May 1, 2023 02:08
@Erotemic
Copy link
Contributor Author

Erotemic commented May 1, 2023

@sunmy2019 I gave you access to my cpython fork.

@sunmy2019
Copy link
Member

I got one thought:

textwrap.dedent remove space and tabs.
https://docs.python.org/3/library/textwrap.html#textwrap.dedent

textwrap.dedent("""
 \t\t1
   2""")
'\n\t\t1\n  2'

It requires remembering the exact "space and tab prefix". Should we implement this?

@Erotemic
Copy link
Contributor Author

Erotemic commented May 1, 2023

It requires remembering the exact "space and tab prefix". Should we implement this?

Probably, but off the top of my head I'm not sure what the most elegant way to do it would be. I suppose instead of maintaining the tab and space counts, it would be sufficient to maintain the current shortest whitespace sequence, and then check how much of that sequence new lines match, shortening the sequence as necessary. I can give that a try.

On a different note, I'm sure that new features will want tests associated with them, but I'm not sure where the other "-c" tests live. Do you know where the best spot to put tests would be?

@sunmy2019
Copy link
Member

sunmy2019 commented May 1, 2023

I got one thought:

textwrap.dedent remove space and tabs.

@Erotemic, take a look. I have implemented it with production quality at d336ac7.

Edit your #103998 (comment) to make it clear and concise, so that we can attract more reviewers.

@sunmy2019
Copy link
Member

On a different note, I'm sure that new features will want tests associated with them, but I'm not sure where the other "-c" tests live. Do you know where the best spot to put tests would be?

I guess Lib/test/test_cmd_line.py

@sunmy2019
Copy link
Member

sunmy2019 commented May 1, 2023

We need someone with write access to review this.

@vstinner as the recent maintainer of this file.
@hauntsaninja

@Erotemic
Copy link
Contributor Author

@AA-Turner CI issues have been resolved.

Since this patch was last updated the "Core and Builtins" news section had spaces in its path name removed, so the file for this patch had to be moved to the new correct location.

The other issue was a c compiler warning, which was elevated to an error by CI policy. The usage of _PyUnicode_Dedent did not have an explicit include in the c file that used it. Adding the explicit include of pycore_unicodeobject.h addressed the problem.

@Erotemic Erotemic requested a review from AA-Turner April 10, 2025 16:25
Copy link
Member

@AA-Turner AA-Turner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This broadly looks fine to me, a minor suggestion on the test cases. I'll leave @methane or someone else to merge, though.

A

sunmy2019 and others added 2 commits April 17, 2025 23:21
Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>
Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>
Copy link
Member

@picnixz picnixz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please update the refcount.dat file (don't remember if it's in Tools, Doc or somewhere else).

@bedevere-app
Copy link

bedevere-app bot commented Apr 17, 2025

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

@picnixz
Copy link
Member

picnixz commented Apr 17, 2025

Ah, I've deleted a comment by mistake. My comment was about the construction Py_INCREF(x); return x; which can be replaced by return Py_NewRef(x);.

@sunmy2019
Copy link
Member

I have made the requested changes; please review again

@AA-Turner AA-Turner requested a review from picnixz April 18, 2025 03:59
Copy link
Member

@methane methane left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-c option in idle and pdb should be dedented too. But it can be separated PR.

@methane methane merged commit fc0ec29 into python:main Apr 18, 2025
42 checks passed
@sunmy2019
Copy link
Member

Please update the refcount.dat file (don't remember if it's in Tools, Doc or somewhere else).

I overlooked this. I will check it later.

@picnixz
Copy link
Member

picnixz commented Apr 18, 2025

It's fine, this is just for internal knowledge

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants