New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
string formatting: normalize negative zero #90153
Comments
proposal: add a string formatting option to normalize negative 0 values to 0 use case: rounded display of a float that is nominally 0, where the distraction of a flashing minus sign from minute changes around 0 is unwanted example:
>>> '%~5.1f' % -.00001
' 0.0' format spec before: implementation: if '~' is present in the spec, add 0 to the value after applying precision |
To normalize negative 0.0 to 0.0 you can just add 0.0. It will work with any method of converting floats to string, there is no need to change all formatting specifications. >>> x = -0.0
>>> x
-0.0
>>> x + 0.0
0.0 |
yes, I'm aware-- see "implementation" in the original post
For adding 0 to work, it must be done after the rounding. That means if you want make use of anything in the current formatting spec regarding precision, normalizing negative zero would need to be a proper option of the formatting spec. |
It was decided long ago that % formatting would not be enhanced with new features. I think that it is supposed to match the standard C formatting codes, and nothing else. So this is unlikely to be approved for % formatting. It *might* be approved for the format method and f-strings. (I suspect that unless you get immediate and uncontroversial agreement from multiple core developers, you may need to take it for further discussion and perhaps even a PEP.) What you call a "distraction" I consider to be critical part of the display. The numbers you are displaying actually are negative, and rounding them for display does not change that. So they ought to show the minus sign. So I'm not really very sympathetic to this feature request. |
Here is the same proposal made for C++ http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p1496r2.pdf It makes fair arguments for the feature's use, and explains why the problem is hard to work around. It was withdrawn by the author for C++20, but a consensus proposal is promised for C++23. |
I'd support having this functionality available for It did take me a few minutes to get my head around the idea that I'm not 100% convinced by the particular spelling proposed, but I don't have anything better to suggest. If C++ might be going with a "z", would it make sense to do the same for Python? I don't forsee any implementation difficulties for float and complex types. For Decimal, we'd need to "own" the string formatting, taking that responsibility away from mpdecimal, but there are already other reasons to do that. Once we've done that, again the implementation doesn't seem onerous. |
Well, it makes sense for negative zero produced by rounding. But if we add a special support for this case, it would be useful to have some control on the type of rounding. Currently floats are rounded to the nearest decimal number, but in some cases it would be better to round up, down, toward zero or infinity (seed for example bpo-44884). You can round explicitly before formatting, but this solution is also applicable for this issue: >>> '%5.1f' % (round(-.00001, 1) + 0.0)
' 0.0' |
I'm concerned about treating %-formatting specially. As far as float/complex, the logical and efficient place to put this change seems to be PyOS_double_to_string(), which affects all three formatting options. For example, the dtoa case is as simple as this change to format_float_short():
|
Sorry John, I don't understand your comment about "treating %-formatting |
%-formatting already doesn't support some formats that float.__format__ does, for example ','. So I agree we shouldn't modify %-formatting. I don't have much of an opinion on whether changing __format__ is a good idea or not. |
I see now. PyOS_double_to_string() could gain the extra flag to coerce negative zero but, out of the three formatting methods, only format() and f-string would use the flag. |
PyOS_double_to_string is part of the stable ABI. I don't recall if we're allowed to add new bitfield flags to a stable ABI function. We'd use a new Py_DTSF_NORMALIZE_NEGATIVE_0 flag for this feature. I suspect we can't add a flag, due to comparability reasons (new code setting the flag being used in old versions of python without it), but we'd need to research. I saw a similar discussion within the last few years, but of course now I can't find it. Maybe old versions would correctly ignore the new bit being set. This proposal becomes less interesting if we'd need to add a new function to support it. Although I guess we could do something that's internal-only. |
I'll share a draft PR soon (excluding Decimal), so far it's still looking straightforward.
That's one of the benefits of using bit flags in an ABI: backward-compatible extensibility. (An implementation can defeat it by intentionally failing if unknown bits are encountered, but the code in question doesn't appear to be doing this.)
Yes, I have experience with it.
|
After some digging, I believe this is the background on forking pieces of mpdecimal (and why the existing source copy inside Python doesn't count as a fork): If I understand correctly, the PR for supporting underscore separators in Decimal formatting is only taking control of generating a mpd_spec_t from the spec string. Formatting itself is still done by mpd_qformat_spec(). So there's outstanding work to also pull the formatting code itself into _decimal.c. (And this is wanted anyway to reconcile existing libmpdec formatting modifications: 298131a) And this is all because vendors have the crazy practice of unbundling libmpdec from Python. (If a project is bundling the source of another, there may be some reason...?) |
potential short-term solution for Decimal: if negative zero option is set and sign is negative:
pre-round into a temp using mpd_qrescale()
if mpd_iszero(temp):
change sign to positive |
implemented float and Decimal-- PR is ready for review |
Thanks, John. I should have time to review within the next week or so. |
Mark, would you give it a review this month? (PR has been marked stale.) |
[John]
Apologies; my holiday-break free time was nobbled from unexpected quarters. I can't promise to find time this month, but I can promise to try. I did at least skim through the PR, and while there are still likely some iterations needed I'm satisfied that this is technically feasible. But I'm afraid that's the easy part. If this is to go in, the other problem we still have to solve is achieving some consensus among the core developers that this is worth doing. Right now, judging by comments on this issue, I think I'm the only core dev who thinks this is a good idea; others are lukewarm at best, and I'm not willing to unilaterally approve and merge these changes without something closer to a consensus. There are a couple of ways forward here:
The fact that you've already created a working implementation so that people can experiment is a bonus when it comes to trying to sell this to others. I don't have the bandwidth to write a PEP, but I would be happy to act as PEP sponsor. |
Wow, thanks, Mark! I'm generally in favor. The selling points to me are that it needs to happen post-rounding, and the C++ discussion. It would be better if this were already accepted in C++. I'll note that the paper is proposing a 'z' modifier to the sign, so I guess for us that would translate to: [sign[optional-z]] instead of just sign. I'd have to noodle through the differences between that the proposed [sign][~]. I guess this would all be worked out in a PEP. My only reservation is Mark's comment: """For Decimal, we'd need to "own" the string formatting, taking that responsibility away from mpdecimal, but there are already other reasons to do that.""" If Mark is okay with that (right back at you, Mark!), then I think a PEP is the next step. It doesn't need to be huge, sort of like PEP-378. |
Thank you Mark and Eric.
The C++ paper proposes [sign][z] (i.e. you can have the
In the PR I was able to avoid taking that on by preprocessing the format string before handing it to mpdecimal. The code was already doing such things to handle the NULL fill character. |
PEP at python/peps#2295 |
I forgot to update here:
For the record, PEP-682 has been accepted. |
…0049) Add "z" format specifier to coerce negative 0 to zero. See #90153 (originally https://bugs.python.org/issue45995) for discussion. This covers `str.format()` and f-strings. Old-style string interpolation is not supported. Co-authored-by: Mark Dickinson <dickinsm@gmail.com>
PR #30049 has been merged! |
…onGH-30049) Add "z" format specifier to coerce negative 0 to zero. See python#90153 (originally https://bugs.python.org/issue45995) for discussion. This covers `str.format()` and f-strings. Old-style string interpolation is not supported. Co-authored-by: Mark Dickinson <dickinsm@gmail.com>
Add what's new entry for PEP 682 in Python 3.11.
Add what's new entry for PEP 682 in Python 3.11. (cherry picked from commit 010284b) Co-authored-by: John Belmonte <john@neggie.net>
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: