Description
The current version of takewhile() has a problem. The element that first fails the predicate condition is consumed from the iterator and there is no way to access it. This is the premise behind the existing recipe before_and_after().
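A minimal sketch of the problem, using the existing itertools.takewhile() and the same sample data as the examples further down. The variable names are only illustrative:

```python
from itertools import takewhile

input_it = iter([1, 4, 6, 4, 1])

# takewhile() has to pull the 6 from input_it to find out that it fails.
under_five = list(takewhile(lambda x: x < 5, input_it))

# Whatever is still in input_it comes after the element that failed.
remainder = list(input_it)

print(under_five)  # [1, 4]
print(remainder)   # [4, 1]; the 6 was consumed and cannot be recovered
```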
I propose to extend the current API to allow that element to be captured. This is fully backwards compatible but addresses use cases that need all of the data not returned by the takewhile iterator.
Option 0:
In pure Python, the new takewhile() could look like this:
def takewhile(predicate, iterable, *, transition=None):
    # takewhile(lambda x: x<5, [1,4,6,4,1]) --> 1 4
    for x in iterable:
        if predicate(x):
            yield x
        else:
            if transition is not None:   # <-- This is the new part
                transition.append(x)     # <-- This is the new part
            break
It could be used like this:
>>> from itertools import chain
>>> input_it = iter([1, 4, 6, 4, 1])
>>> transition_list = []
>>> takewhile_it = takewhile(lambda x: x<5, input_it, transition=transition_list)
>>> print('Under five:', list(takewhile_it))
Under five: [1, 4]
>>> remainder = chain(transition_list, input_it)
>>> print('Remainder:', list(remainder))
Remainder: [6, 4, 1]
The API is a bit funky. While passing in a container for the callee to fill is a common pattern in C programming, I rarely see something like it in Python. This may be the simplest solution for accessing the last value (if any) consumed from the input. The keyword argument transition accurately describes a list containing the transition element if there is one, but some other parameter name may be better.
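To illustrate the "if there is one" part, here is a rough sketch of both cases. It repeats the pure-Python Option 0 function so that it runs on its own; the name captured is just a placeholder:

```python
def takewhile(predicate, iterable, *, transition=None):
    # Pure-Python sketch of Option 0, repeated so this example is self-contained.
    for x in iterable:
        if predicate(x):
            yield x
        else:
            if transition is not None:
                transition.append(x)
            break

# Case 1: every element passes the predicate, so nothing is captured.
captured = []
all_small = list(takewhile(lambda x: x < 5, [1, 2, 3], transition=captured))
print(all_small, captured)   # [1, 2, 3] []

# Case 2: an element fails the predicate, so exactly that element is captured.
captured = []
some_small = list(takewhile(lambda x: x < 5, [1, 6, 2], transition=captured))
print(some_small, captured)  # [1] [6]
```

Under this sketch, an empty list after the iterator finishes means the input ran out before any element failed the predicate.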
Option 1:
We could have a conditional signature that returns two iterators if a flag is set:
true_iterator = takewhile(predicate, iterable, remainder=False)
true_iterator, remainder_iterator = takewhile(predicate, iterable, remainder=True)
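One possible pure-Python sketch of this conditional signature, assuming it is built the same way as the before_and_after() recipe (an inner generator plus chain()). This is an illustration of the idea only, not a proposed implementation:

```python
from itertools import chain

def takewhile(predicate, iterable, remainder=False):
    # Sketch of Option 1: the return type depends on the remainder flag.
    it = iter(iterable)
    transition = []  # holds the first failing element, if any

    def true_iterator():
        for x in it:
            if predicate(x):
                yield x
            else:
                transition.append(x)
                return

    if remainder:
        # The remainder is the captured element followed by the rest of the input.
        return true_iterator(), chain(transition, it)
    return true_iterator()

true_it, remainder_it = takewhile(lambda x: x < 5, [1, 4, 6, 4, 1], remainder=True)
print(list(true_it))       # [1, 4]
print(list(remainder_it))  # [6, 4, 1]
```

As with the recipe, this sketch only gives a correct remainder if the first iterator is fully consumed before the second one is read.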
Option 2:
Create a completely separate itertool by promoting the before_and_after() recipe to be a real itertool:
true_iterator, remainder_iterator = before_and_after(predicate, iterable)
I don't really like Option 2 because it substantially duplicates takewhile(), leaving a permanent tension between the two.