string formatting: normalize negative zero #90153

belm0 · 2021-12-06T12:03:47Z

BPO	45995
Nosy	@mdickinson, @belm0, @ericvsmith, @stevendaprano, @serhiy-storchaka, @belm0
PRs	#30049

^{Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.}

Show more details

GitHub fields:

assignee = 'https://github.com/mdickinson'
closed_at = None
created_at = <Date 2021-12-06.12:03:46.943>
labels = ['type-feature', 'library', '3.11']
title = 'string formatting: normalize negative zero'
updated_at = <Date 2022-03-13.12:04:51.159>
user = 'https://github.com/belm0'

bugs.python.org fields:

activity = <Date 2022-03-13.12:04:51.159>
actor = 'mark.dickinson'
assignee = 'mark.dickinson'
closed = False
closed_date = None
closer = None
components = ['Library (Lib)']
creation = <Date 2021-12-06.12:03:46.943>
creator = 'John Belmonte'
dependencies = []
files = []
hgrepos = []
issue_num = 45995
keywords = ['patch']
message_count = 23.0
messages = ['407792', '407793', '407796', '407810', '407822', '407857', '407905', '407928', '407929', '407930', '407934', '407973', '407990', '408367', '408413', '408529', '408800', '410838', '411243', '411295', '411302', '412360', '415034']
nosy_count = 6.0
nosy_names = ['mark.dickinson', 'jbelmonte', 'eric.smith', 'steven.daprano', 'serhiy.storchaka', 'John Belmonte']
pr_nums = ['30049']
priority = 'normal'
resolution = None
stage = 'patch review'
status = 'open'
superseder = None
type = 'enhancement'
url = 'https://bugs.python.org/issue45995'
versions = ['Python 3.11']

belm0 · 2021-12-06T12:03:47Z

proposal: add a string formatting option to normalize negative 0 values to 0

use case: rounded display of a float that is nominally 0, where the distraction of a flashing minus sign from minute changes around 0 is unwanted

example:
>>> '%~5.1f' % -.00001
'  0.0'

format spec before:
format_spec ::= [[fill]align][sign][#][0][width][grouping_option][.precision][type]
after:
format_spec ::= [[fill]align][sign][~][#][0][width][grouping_option][.precision][type]
where '~' is only allowed for number types

implementation: if '~' is present in the spec, add 0 to the value after applying precision

serhiy-storchaka · 2021-12-06T12:15:52Z

To normalize negative 0.0 to 0.0 you can just add 0.0. It will work with any method of converting floats to string, there is no need to change all formatting specifications.

>>> x = -0.0
>>> x
-0.0
>>> x + 0.0
0.0

belm0 · 2021-12-06T12:30:06Z

To normalize negative 0.0 to 0.0 you can just add 0.0.

yes, I'm aware-- see "implementation" in the original post

there is no need to change all formatting specifications

For adding 0 to work, it must be done after the rounding. That means if you want make use of anything in the current formatting spec regarding precision, normalizing negative zero would need to be a proper option of the formatting spec.

stevendaprano · 2021-12-06T14:18:04Z

It was decided long ago that % formatting would not be enhanced with new features. I think that it is supposed to match the standard C formatting codes, and nothing else.

So this is unlikely to be approved for % formatting.

It *might* be approved for the format method and f-strings. (I suspect that unless you get immediate and uncontroversial agreement from multiple core developers, you may need to take it for further discussion and perhaps even a PEP.)

What you call a "distraction" I consider to be critical part of the display. The numbers you are displaying actually are negative, and rounding them for display does not change that. So they ought to show the minus sign. So I'm not really very sympathetic to this feature request.

belm0 · 2021-12-06T15:10:31Z

Here is the same proposal made for C++ std::format:

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p1496r2.pdf

It makes fair arguments for the feature's use, and explains why the problem is hard to work around.

It was withdrawn by the author for C++20, but a consensus proposal is promised for C++23.

mdickinson · 2021-12-06T19:03:36Z

I'd support having this functionality available for format and for f-strings. (As Steven says, changing %-formatting doesn't seem viable.) It really is awkward to do this in any other way, and I'm reliably informed that normal people don't expect to see negative zeros in formatted numeric output.

It did take me a few minutes to get my head around the idea that f"{-0.01:+.1f}" would return "+0.0" rather than "-0.0" or " 0.0" or just plain "0.0" under this proposal, but I agree that it seems like the only thing that can be consistent and make sense.

I'm not 100% convinced by the particular spelling proposed, but I don't have anything better to suggest. If C++ might be going with a "z", would it make sense to do the same for Python?

I don't forsee any implementation difficulties for float and complex types. For Decimal, we'd need to "own" the string formatting, taking that responsibility away from mpdecimal, but there are already other reasons to do that. Once we've done that, again the implementation doesn't seem onerous.

serhiy-storchaka · 2021-12-07T09:54:43Z

Well, it makes sense for negative zero produced by rounding.

But if we add a special support for this case, it would be useful to have some control on the type of rounding. Currently floats are rounded to the nearest decimal number, but in some cases it would be better to round up, down, toward zero or infinity (seed for example bpo-44884).

You can round explicitly before formatting, but this solution is also applicable for this issue:

>>> '%5.1f' % (round(-.00001, 1) + 0.0)
'  0.0'

belm0 · 2021-12-07T12:22:15Z

changing %-formatting doesn't seem viable

I'm concerned about treating %-formatting specially. As far as float/complex, the logical and efficient place to put this change seems to be PyOS_double_to_string(), which affects all three formatting options.

For example, the dtoa case is as simple as this change to format_float_short():

/* coerce negative zero to positive */
if (sign == 1 && ((digits_len == 0 && decpt == -1) ||
                  (digits_len == 1 && digits[0] == '0'))) {
    sign = 0;
}

stevendaprano · 2021-12-07T12:53:02Z

Sorry John, I don't understand your comment about "treating %-formatting
specifically". Isn't the point here not to change %-formatting at all?

ericvsmith · 2021-12-07T12:55:36Z

%-formatting already doesn't support some formats that float.__format__ does, for example ','.

So I agree we shouldn't modify %-formatting. I don't have much of an opinion on whether changing __format__ is a good idea or not.

belm0 · 2021-12-07T13:14:13Z

I see now. PyOS_double_to_string() could gain the extra flag to coerce negative zero but, out of the three formatting methods, only format() and f-string would use the flag.

ericvsmith · 2021-12-07T21:48:17Z

PyOS_double_to_string is part of the stable ABI. I don't recall if we're allowed to add new bitfield flags to a stable ABI function. We'd use a new Py_DTSF_NORMALIZE_NEGATIVE_0 flag for this feature.

I suspect we can't add a flag, due to comparability reasons (new code setting the flag being used in old versions of python without it), but we'd need to research. I saw a similar discussion within the last few years, but of course now I can't find it. Maybe old versions would correctly ignore the new bit being set.

This proposal becomes less interesting if we'd need to add a new function to support it. Although I guess we could do something that's internal-only.

belm0 · 2021-12-08T01:51:59Z

I'll share a draft PR soon (excluding Decimal), so far it's still looking straightforward.

Maybe old versions would correctly ignore the new bit being set.

That's one of the benefits of using bit flags in an ABI: backward-compatible extensibility. (An implementation can defeat it by intentionally failing if unknown bits are encountered, but the code in question doesn't appear to be doing this.)

You can round explicitly before formatting

>>> '%5.1f' % (round(-.00001, 1) + 0.0)

Yes, I have experience with it.

even as a one-off, it's questionable. If someone accidentally changes the precision in only one of the spec string or round call, that's a bug.
since applications and libraries may pass around format specs, and because of (1), you'll try to make a programmatic solution. Now you're parsing format spec strings.
while a programmatic solution can be done for a function API like format() that takes a separate spec and value, there is no sane way to wrap f-strings

belm0 · 2021-12-12T08:31:27Z

For Decimal, we'd need to "own" the string formatting, taking that responsibility away from mpdecimal, but there are already other reasons to do that.

After some digging, I believe this is the background on forking pieces of mpdecimal (and why the existing source copy inside Python doesn't count as a fork):
https://bugs.python.org/issue45708#msg405895
#29438

If I understand correctly, the PR for supporting underscore separators in Decimal formatting is only taking control of generating a mpd_spec_t from the spec string. Formatting itself is still done by mpd_qformat_spec().

So there's outstanding work to also pull the formatting code itself into _decimal.c. (And this is wanted anyway to reconcile existing libmpdec formatting modifications: 298131a)

And this is all because vendors have the crazy practice of unbundling libmpdec from Python. (If a project is bundling the source of another, there may be some reason...?)

belm0 · 2021-12-12T23:52:33Z

potential short-term solution for Decimal:

  if negative zero option is set and sign is negative:
      pre-round into a temp using mpd_qrescale()
      if mpd_iszero(temp):
          change sign to positive

belm0 · 2021-12-14T13:25:50Z

implemented float and Decimal-- PR is ready for review

mdickinson · 2021-12-17T16:49:46Z

Thanks, John. I should have time to review within the next week or so.

belm0 · 2022-01-18T00:45:27Z

Mark, would you give it a review this month? (PR has been marked stale.)

mdickinson · 2022-01-22T10:00:59Z

[John]

Mark, would you give it a review this month?

Apologies; my holiday-break free time was nobbled from unexpected quarters. I can't promise to find time this month, but I can promise to try. I did at least skim through the PR, and while there are still likely some iterations needed I'm satisfied that this is technically feasible.

But I'm afraid that's the easy part. If this is to go in, the other problem we still have to solve is achieving some consensus among the core developers that this is worth doing. Right now, judging by comments on this issue, I think I'm the only core dev who thinks this is a good idea; others are lukewarm at best, and I'm not willing to unilaterally approve and merge these changes without something closer to a consensus. There are a couple of ways forward here:

Post the proposal on python-ideas to get wider visibility and feedback. If everyone agrees this is a great idea (from experience, this seems an unlikely outcome), then we can go ahead and merge. Otherwise we'd likely need a PEP to move forward.
Bypass the python-ideas step, write the PEP, discuss in the appropriate forums, and then submit to the SC for approval / rejection.
Convince Eric Smith. :-) With apologies to Eric for singling him out: Eric could reasonably be described as the steward/maintainer of the formatting machinery, so if he's persuaded, that's good enough for me.

The fact that you've already created a working implementation so that people can experiment is a bonus when it comes to trying to sell this to others.

I don't have the bandwidth to write a PEP, but I would be happy to act as PEP sponsor.

ericvsmith · 2022-01-22T22:17:17Z

Wow, thanks, Mark!

I'm generally in favor. The selling points to me are that it needs to happen post-rounding, and the C++ discussion. It would be better if this were already accepted in C++. I'll note that the paper is proposing a 'z' modifier to the sign, so I guess for us that would translate to: [sign[optional-z]] instead of just sign. I'd have to noodle through the differences between that the proposed [sign][~]. I guess this would all be worked out in a PEP.

My only reservation is Mark's comment: """For Decimal, we'd need to "own" the string formatting, taking that responsibility away from mpdecimal, but there are already other reasons to do that."""

If Mark is okay with that (right back at you, Mark!), then I think a PEP is the next step. It doesn't need to be huge, sort of like PEP-378.

belm0 · 2022-01-22T23:18:51Z

Thank you Mark and Eric.

I'll note that the paper is proposing a 'z' modifier to the sign, so I guess for us that would translate to: [sign[optional-z]] instead of just sign. I'd have to noodle through the differences between that the proposed [sign][~].

The C++ paper proposes [sign][z] (i.e. you can have the z alone without an explicit +/-), and this is what I implemented in the Python PR. My original proposal with tilde was discarded.

My only reservation is Mark's comment: """For Decimal, we'd need to "own" the string formatting, taking that responsibility away from mpdecimal, but there are already other reasons to do that."""

In the PR I was able to avoid taking that on by preprocessing the format string before handing it to mpdecimal. The code was already doing such things to handle the NULL fill character.

belm0 · 2022-02-02T13:54:51Z

PEP at python/peps#2295

mdickinson · 2022-03-13T12:04:51Z

I forgot to update here:

PEP at python/peps#2295

For the record, PEP-682 has been accepted.

…0049) Add "z" format specifier to coerce negative 0 to zero. See #90153 (originally https://bugs.python.org/issue45995) for discussion. This covers `str.format()` and f-strings. Old-style string interpolation is not supported. Co-authored-by: Mark Dickinson <dickinsm@gmail.com>

mdickinson · 2022-04-11T15:49:31Z

PR #30049 has been merged!

…onGH-30049) Add "z" format specifier to coerce negative 0 to zero. See python#90153 (originally https://bugs.python.org/issue45995) for discussion. This covers `str.format()` and f-strings. Old-style string interpolation is not supported. Co-authored-by: Mark Dickinson <dickinsm@gmail.com>

Add what's new entry for PEP 682 in Python 3.11.

Add what's new entry for PEP 682 in Python 3.11. (cherry picked from commit 010284b) Co-authored-by: John Belmonte <john@neggie.net>

belm0 mannequin added stdlib type-feature labels Dec 6, 2021

stevendaprano added 3.11 labels Dec 6, 2021

mdickinson assigned mdickinson Dec 17, 2021

ezio-melotti transferred this issue from another repository Apr 10, 2022

mdickinson closed this as completed Apr 11, 2022

mdickinson pushed a commit that referenced this issue Jun 11, 2022

gh-90153: whatsnew: "z" option in format spec (GH-93624)

010284b

Add what's new entry for PEP 682 in Python 3.11.

miss-islington added a commit that referenced this issue Jun 11, 2022

gh-90153: whatsnew: "z" option in format spec (GH-93624)

36ece2e

Add what's new entry for PEP 682 in Python 3.11. (cherry picked from commit 010284b) Co-authored-by: John Belmonte <john@neggie.net>

string formatting: normalize negative zero #90153

string formatting: normalize negative zero #90153

belm0 mannequin commented Dec 6, 2021

belm0 mannequin commented Dec 6, 2021

serhiy-storchaka commented Dec 6, 2021

belm0 mannequin commented Dec 6, 2021

stevendaprano commented Dec 6, 2021

belm0 mannequin commented Dec 6, 2021

mdickinson commented Dec 6, 2021

serhiy-storchaka commented Dec 7, 2021

belm0 mannequin commented Dec 7, 2021

stevendaprano commented Dec 7, 2021

ericvsmith commented Dec 7, 2021

belm0 mannequin commented Dec 7, 2021

ericvsmith commented Dec 7, 2021

belm0 mannequin commented Dec 8, 2021

belm0 mannequin commented Dec 12, 2021

belm0 mannequin commented Dec 12, 2021

belm0 mannequin commented Dec 14, 2021

mdickinson commented Dec 17, 2021

belm0 mannequin commented Jan 18, 2022

mdickinson commented Jan 22, 2022

ericvsmith commented Jan 22, 2022

belm0 mannequin commented Jan 22, 2022

belm0 mannequin commented Feb 2, 2022

mdickinson commented Mar 13, 2022

mdickinson commented Apr 11, 2022

string formatting: normalize negative zero #90153

string formatting: normalize negative zero #90153

Comments

belm0 mannequin commented Dec 6, 2021

belm0 mannequin commented Dec 6, 2021

serhiy-storchaka commented Dec 6, 2021

belm0 mannequin commented Dec 6, 2021

stevendaprano commented Dec 6, 2021

belm0 mannequin commented Dec 6, 2021

mdickinson commented Dec 6, 2021

serhiy-storchaka commented Dec 7, 2021

belm0 mannequin commented Dec 7, 2021

stevendaprano commented Dec 7, 2021

ericvsmith commented Dec 7, 2021

belm0 mannequin commented Dec 7, 2021

ericvsmith commented Dec 7, 2021

belm0 mannequin commented Dec 8, 2021

belm0 mannequin commented Dec 12, 2021

belm0 mannequin commented Dec 12, 2021

belm0 mannequin commented Dec 14, 2021

mdickinson commented Dec 17, 2021

belm0 mannequin commented Jan 18, 2022

mdickinson commented Jan 22, 2022

ericvsmith commented Jan 22, 2022

belm0 mannequin commented Jan 22, 2022

belm0 mannequin commented Feb 2, 2022

mdickinson commented Mar 13, 2022

mdickinson commented Apr 11, 2022