I keep thinking there should be a function for this, but I’ve searched the likely places (Google, the itertools docs, list methods, other SO questions) and nowhere found quite what I was looking for.
Naive and working implementation:
def split_at_first_false(pred, seq):
    first = []
    second = []
    true_so_far = True
    for item in seq:
        if true_so_far and pred(item):
            first.append(item)
        else:
            true_so_far = False
            second.append(item)
    return first, second

print(split_at_first_false(str.isalpha, "abc1a2b"))
# (['a', 'b', 'c'], ['1', 'a', '2', 'b'])
It works, but it doesn’t feel right. There should be a better way to do this!
EDIT: After reviewing the answers, I ended up using a slightly modified version of senderle’s final suggestion:
from itertools import chain

def split_at_pred(pred, seq):
    head = []
    it = iter(seq)
    for i in it:
        if not pred(i):
            head.append(i)
        else:
            return iter(head), chain([i], it)
    return iter(head), iter([])
It’s short and elegant, the output is two iterators no matter the input (strings, lists, iterators), and as a bonus, it even works with the following input:
from itertools import count
split_at_pred(lambda x: x == 5, count())
The other solutions, those that work at all with iterators, will run out of memory with this input. (Note that this is just a bonus. Infinite iterators were something I hadn’t even considered when I wrote this question.)
This seems like a job for itertools.
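A minimal sketch of how itertools could be applied here, assuming takewhile and dropwhile are the functions meant. The names bisect_seq and bisect_iter are hypothetical, and l stands for the input. The first version reads l twice, so it only suits re-iterable sequences. The second uses tee so that a one-shot iterator can feed both halves.

```python
from itertools import dropwhile, takewhile, tee

def bisect_seq(pred, l):
    # Hypothetical helper: l is iterated twice, so it must be a sequence.
    return list(takewhile(pred, l)), list(dropwhile(pred, l))

def bisect_iter(pred, l):
    # Hypothetical helper: tee yields two independent iterators over l,
    # buffering items until both copies have consumed them.
    i1, i2 = tee(l)
    return takewhile(pred, i1), dropwhile(pred, i2)

print(bisect_seq(str.isalpha, "abc1a2b"))
# (['a', 'b', 'c'], ['1', 'a', '2', 'b'])
```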
This needs to be altered if l is an iterator rather than a sequence.
The downside of tee is that the initial values are cached and tested twice (by both takewhile and dropwhile). That’s wasteful. But caching values is unavoidable if you want to both accept and return iterators.
However, if you can return lists from an iterator, I can think of one solution that doesn’t make extra copies or tests, and it’s very close to yours:
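A sketch of what such a list-returning function might look like. The name bisect_to_lists is hypothetical, and this is a reconstruction of the idea rather than the original code. Each item is tested at most once, and nothing is copied.

```python
def bisect_to_lists(pred, seq):
    # Hypothetical name; accepts any iterable and returns two lists.
    first, second = [], []
    it = iter(seq)
    for item in it:
        if pred(item):
            first.append(item)
        else:
            second.append(item)
            # Draining the iterator here leaves nothing for the
            # for loop to fetch, so the loop ends without a break.
            second.extend(it)
    return first, second

print(bisect_to_lists(str.isalpha, iter("abc1a2b")))
# (['a', 'b', 'c'], ['1', 'a', '2', 'b'])
```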
The only sneaky bit is that where there would normally be a break statement (i.e. after the else clause), I’ve simply consumed the iterator, causing the for loop to terminate early.
Finally, if you still want to return iterators, but don’t want to do extra tests, here’s a variation on the above that I believe is optimal.
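One way this variation might look, as a sketch rather than the original code. The name bisect_to_iters is hypothetical. Only the leading run is stored in a list, and the remainder is never read ahead, so it should also cope with an infinite input such as count().

```python
from itertools import chain, count, islice

def bisect_to_iters(pred, seq):
    # Hypothetical name; returns two iterators for any iterable input.
    head = []
    it = iter(seq)
    for item in it:
        if pred(item):
            head.append(item)
        else:
            # Put the failing item back in front of the untouched remainder.
            return iter(head), chain([item], it)
    # Every item satisfied pred, so the second half is empty.
    return iter(head), iter([])

head, tail = bisect_to_iters(lambda x: x < 5, count())
print(list(head))             # [0, 1, 2, 3, 4]
print(list(islice(tail, 3)))  # [5, 6, 7]
```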