In version 3.10, zip() has gained the strict option, which causes a ValueError to be raised when the iterables have different lengths (cf. #84816). The documentation describes this as follows:
Unlike the default behavior, it checks that the lengths of iterables are identical, raising a ValueError if they aren’t:
In my opinion, this is confusing at best. zip() in strict mode does not check the len() of the iterables, as that sentence might lead one to think. Rather – and exactly as I expected before reading the documentation – it checks that all iterables are exhausted at the same time (or more specifically, it checks that if next() on the iterator for one iterable raises StopIteration, the others do as well). This distinction is important for iterables that do not have a length, e.g. generators. It also makes clear that the error is only raised once one of the iterables reaches exhaustion, which matters e.g. in a for ... in zip(...) loop: the loop body is executed for all matching tuples before the error is raised. Depending on what the user is doing, they may therefore still want an explicit len() check before running zip(), for example to avoid having to roll back later.
Note that PEP 618 (which is not linked from the docs) does not contain this misleading language:
When enabled, a ValueError is raised if one of the arguments is exhausted before the others.
And likewise in zip()'s docstring:
If strict is true and one of the arguments is exhausted before the others, raise a ValueError.
I think this language of an 'exhausted iterable' should be used in the documentation as well. It is already used for other core functions such as map() and next(). Even the zip() documentation uses it, but only in the description of the default behaviour without strict. Compared to the two quotes above, though, I feel the timing of the exception deserves an extra explanation.
I will submit a PR with my proposed changes shortly.
JustAnotherArchivist commented on Dec 8, 2022