Argparse add_argument treats - in flags differently to - in positional arguments. #95100

Open
CSDUMMI opened this issue Jul 21, 2022 · 17 comments
Labels
type-feature A feature request or enhancement

Comments

@CSDUMMI

CSDUMMI commented Jul 21, 2022

Bug report

If you add an optional argument to a parser with argparse containing dashes,
those are converted to _ automatically in the resulting Namespace object.

But if you add a positional argument containing a -, it is not converted, and the resulting error message suggests the argument name containing the - instead of the _. Accessing that name as an attribute is of course not possible (without getattr), because it's not a valid identifier in Python.

This behaviour is misleading and undocumented, and I'd suggest converting - to _ in positional arguments too.

Reproduction code:

import argparse

parser = argparse.ArgumentParser("example")

parser.add_argument("foo-bar", type=str)

args = parser.parse_args()

print("getattr", getattr(args, "foo-bar"))

print("- replaced by _", args.foo_bar)

Results in:

$ python3 main.py abc
getattr abc
Traceback (most recent call last):
  File "/tmp/main.py", line 11, in <module>
    print("- replaced by _", args.foo_bar)
AttributeError: 'Namespace' object has no attribute 'foo_bar'. Did you mean: 'foo-bar'?

Compounding this issue is the fact that you are prevented from using the dest option on add_argument to override the name in the Namespace.
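
A minimal sketch of that restriction, assuming the same parser as in the reproduction above. The exact exception type is not stated in this report, so the example catches both ValueError and TypeError rather than asserting one of them:

```python
import argparse

parser = argparse.ArgumentParser("example")

# A positional takes its dest from the name itself, so supplying dest= as
# well is rejected. Catching two exception types is an assumption made here
# because the type may differ between Python versions.
try:
    parser.add_argument("foo-bar", dest="foo_bar", type=str)
    dest_error = None
except (ValueError, TypeError) as exc:
    dest_error = exc

print("add_argument raised:", repr(dest_error))
```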

Your environment

  • CPython versions tested on: Python 3.10.5
  • Operating system and architecture: Linux
@CSDUMMI CSDUMMI added the type-bug An unexpected behavior, bug, or error label Jul 21, 2022
@ericvsmith
Member

I’ve been bitten by this before, and I think it would be a nice feature. But wouldn’t it break code currently using getattr?

@CSDUMMI
Author

CSDUMMI commented Jul 21, 2022

You are right.

Is that happening a lot?

In that case, a workaround could be added to make it possible to use getattr with the dash name.

But that's not very nice, right?

I'd expect that this is really not a lot of work, and that removing a workaround and replacing it with the convention should benefit the code quality of any project currently using it.

@ericvsmith
Member

I guess you could make it available under both names, but I'm not sure it's worth the hassle.

I don't think we should break existing code, even if it would be better off with the change.

@CSDUMMI
Author

CSDUMMI commented Jul 22, 2022

PEP 387 lays out the rules for backward compatibility in Python.

Given a set of arguments, the return value, side effects, and raised exceptions of a function. This does not preclude changes from reasonable bug fixes.

I would consider this a reasonable bug fix, because this behavior is undocumented and diverges from the convention explicitly documented for optional arguments in argparse.

@ericvsmith
Member

In my years working on Python, I've learned that every change will break something.

@CSDUMMI
Author

CSDUMMI commented Jul 22, 2022

In that case, breaking something should not be an argument against changing something.

@CSDUMMI
Author

CSDUMMI commented Jul 22, 2022

I fear that adding a workaround to support code using getattr and dash names could lead to other problems.

For example, if you iterate over the resulting namespace, or use vars or dir on it, a workaround that allows both the - and _ versions of a positional argument name would lead to the same argument being contained twice within the namespace.

With the workaround enabled, this would happen:

import argparse

parser = argparse.ArgumentParser("example")

parser.add_argument("foo-bar", type=str)

args = parser.parse_args(["abc"])

# Hypothetical result if both names were stored:
print(vars(args))  # {"foo-bar": "abc", "foo_bar": "abc"}

And that is yet another undocumented, unintuitive behavior with potential for breaking existing code.

@ericvsmith
Member

In that case, breaking something should not be an argument against changing something.

I don’t think that follows.

But my point is the bar for change is high, and I don’t think this meets the criteria.

@CSDUMMI
Author

CSDUMMI commented Jul 23, 2022

I consider it undocumented, unintuitive, and non-obvious behavior.

Behavior that is not documented is also behavior that is not guaranteed.

And I would also wager that it was not intentional.

Thus this should be considered a reasonable bug fix.

It must either be fixed or, at the very least, be documented in the argparse documentation. Not everybody will be familiar with getattr or know that it can be used to work around this bug.

@ericvsmith
Member

I agree it should at least be documented.

@CSDUMMI
Author

CSDUMMI commented Jul 23, 2022

So:

If your positional argument name contains a -, you must use getattr(args, "foo-bar") instead of args.foo_bar, because this might break backwards compatibility if fixed.

@CSDUMMI
Author

CSDUMMI commented Jul 23, 2022

Explaining that note in the documentation might be a little hard.

@CSDUMMI
Author

CSDUMMI commented Jul 23, 2022

The problem here is that the longer it is not fixed, the harder it will be to fix it eventually.

And who really wants to have to explain this behavior indefinitely to people new to argparse?

@ericvsmith
Member

The problem here is that the longer it is not fixed, the harder it will be to fix it eventually.

That's always true for all such changes.

I'm not saying it shouldn't be done. I'm saying we'll probably break working code, and that's a very high bar for a change. I occasionally make changes to argparse, but here I'll wait for other core devs to weigh in. In the meantime, a doc PR would be welcomed.

@CSDUMMI
Author

CSDUMMI commented Jul 24, 2022

I'm having trouble finding the right place to add a warning about this behavior.

@CSDUMMI
Author

CSDUMMI commented Jul 26, 2022

To be more specific:

Should the warning and workaround be added as inline documentation of the argparse module or as part of Doc/library/argparse.rst?

@erlend-aasland erlend-aasland added type-feature A feature request or enhancement and removed type-bug An unexpected behavior, bug, or error labels Jul 30, 2022
@hpaulj

hpaulj commented Nov 4, 2022

For optionals, there is a POSIX convention of accepting dashes in the flag strings, so conversion to underscore makes sense.

For positionals, there isn't any good reason to use dashes - unless you want to make life difficult for yourself. You are free to use any 'dest' string, even ones that start with numbers and contain odd characters. Internally, argparse uses the dest with setattr/getattr, so it isn't bothered by odd characters.

And if you must have dashes in the usage or help, use the metavar.
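
A minimal sketch of that suggestion: the positional is named with an underscore so it is a normal attribute, and metavar supplies the dashed spelling for usage and help. The names foo_bar and foo-bar are simply reused from the report above:

```python
import argparse

parser = argparse.ArgumentParser("example")

# dest is taken from the positional name "foo_bar"; metavar only changes how
# the argument is displayed in usage and help text.
parser.add_argument("foo_bar", metavar="foo-bar", type=str)

args = parser.parse_args(["abc"])
usage = parser.format_usage()

print(usage)
print(args.foo_bar)
```

With this arrangement, plain attribute access works and no getattr call is needed.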

During the debugging phase it's a good idea to include a 'print(args)' line, so you aren't surprised by changes, or non-changes, to the 'dest'.

In _get_optional_kwargs, the dash replacement is done only when the 'dest' is inferred from the long option string. It is not done when you provide an explicit 'dest' parameter.

For a positional, _get_positional_kwargs gets the first (and only) string as the 'dest'. It does not do any checking or replacement.
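
The three cases described above can be compared side by side. This is only an illustrative sketch; the argument names --baz-qux and spam-eggs are made up for the example and do not come from the report:

```python
import argparse

parser = argparse.ArgumentParser("example")

parser.add_argument("--foo-bar")                  # dest inferred from the flag
parser.add_argument("--baz-qux", dest="baz-qux")  # explicit dest, kept verbatim
parser.add_argument("spam-eggs")                  # positional name used as dest

args = parser.parse_args(["--foo-bar", "1", "--baz-qux", "2", "3"])

# Only the inferred optional dest has its dash replaced by an underscore.
dests = sorted(vars(args))
print(dests)
```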

I think the 'dest' documentation is clear enough: "For positional argument actions, dest is normally supplied as the first argument to add_argument()". The dash conversion is clearly identified as an optionals feature.
