Sign Up

Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.

Have an account? Sign In

Have an account? Sign In Now

Sign In

Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.

Sign Up Here

Forgot Password?

Don't have account, Sign Up Here

Forgot Password

Lost your password? Please enter your email address. You will receive a link and will create a new password via email.

Have an account? Sign In Now

You must login to ask a question.

Forgot Password?

Need An Account, Sign Up Here

Please briefly explain why you feel this question should be reported.

Please briefly explain why you feel this answer should be reported.

Please briefly explain why you feel this user should be reported.

Sign InSign Up

The Archive Base

The Archive Base Logo The Archive Base Logo

The Archive Base Navigation

  • Home
  • SEARCH
  • About Us
  • Blog
  • Contact Us
Search
Ask A Question

Mobile menu

Close
Ask a Question
  • Home
  • Add group
  • Groups page
  • Feed
  • User Profile
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Buy Points
  • Users
  • Help
  • Buy Theme
  • SEARCH
Home/ Questions/Q 8022771
In Process

The Archive Base Latest Questions

Editorial Team
  • 0
Editorial Team
Asked: June 4, 20262026-06-04T22:21:57+00:00 2026-06-04T22:21:57+00:00

Here are two functions that split iterable items to sub-lists. I believe that this

  • 0

Here are two functions that split iterable items to sub-lists. I believe that this type of task is programmed many times. I use them to parse log files that consist of repr lines like (‘result’, ‘case’, 123, 4.56) and (‘dump’, ..) and so on.

I would like to change these so that they will yield iterators rather than lists. Because the list may grow pretty large, but I may be able to decide to take it or skip it based on first few items. Also, if the iter version is available I would like to nest them, but with these list versions that would waste some memory by duplicating parts.

But deriving multiple generators from an iterable source wan’t easy for me, so I ask for help. If possible, I wish to avoid introducing new classes.

Also, if you know a better title for this question, please tell me.

Thank you!

def cleave_by_mark (stream, key_fn, end_with_mark=False):
    '''[f f t][t][f f] (true) [f f][t][t f f](false)'''
    buf = []
    for item in stream:
        if key_fn(item):
            if end_with_mark: buf.append(item)
            if buf: yield buf
            buf = []
            if end_with_mark: continue
        buf.append(item)
    if buf: yield buf

def cleave_by_change (stream, key_fn):
    '''[1 1 1][2 2][3][2 2 2 2]'''
    prev = None
    buf = []
    for item in stream:
        iden = key_fn(item)
        if prev is None: prev = iden
        if prev != iden:
            yield buf
            buf = []
            prev = iden
        buf.append(item)
    if buf: yield buf

edit: my own answer

Thanks to everyone’s answer, I could write what I asked for! Of course, as for the “cleave_for_change” function I could also use itertools.groupby.

def cleave_by_mark (stream, key_fn, end_with_mark=False):
    hand = []
    def gen ():
        key = key_fn(hand[0])
        yield hand.pop(0)
        while 1:
            if end_with_mark and key: break
            hand.append(stream.next())
            key = key_fn(hand[0])
            if (not end_with_mark) and key: break
            yield hand.pop(0)
    while 1:
        # allow StopIteration in the main loop
        if not hand: hand.append(stream.next())
        yield gen()

for cl in cleave_by_mark (iter((1,0,0,1,1,0)), lambda x:x):
    print list(cl),  # start with 1
# -> [1, 0, 0] [1] [1, 0]
for cl in cleave_by_mark (iter((0,1,0,0,1,1,0)), lambda x:x):
    print list(cl),
# -> [0] [1, 0, 0] [1] [1, 0]
for cl in cleave_by_mark (iter((1,0,0,1,1,0)), lambda x:x, True):
    print list(cl),  # end with 1
# -> [1] [0, 0, 1] [1] [0]
for cl in cleave_by_mark (iter((0,1,0,0,1,1,0)), lambda x:x, True):
    print list(cl),
# -> [0, 1] [0, 0, 1] [1] [0]

/

def cleave_by_change (stream, key_fn):
    '''[1 1 1][2 2][3][2 2 2 2]'''
    hand = []
    def gen ():
        headkey = key_fn(hand[0])
        yield hand.pop(0)
        while 1:
            hand.append(stream.next())
            key = key_fn(hand[0])
            if key != headkey: break
            yield hand.pop(0)
    while 1:
        # allow StopIteration in the main loop
        if not hand: hand.append(stream.next())
        yield gen()

for cl in cleave_by_change (iter((1,1,1,2,2,2,3,2)), lambda x:x):
    print list(cl),
# -> [1, 1, 1] [2, 2, 2] [3] [2]

CAUTION: If anyone’s going to use these, be sure to exhaust the generators at every level, as Andrew pointed out. Because otherwise the outer generator-yielding loop will restart right where the inner generator left instead of where the next “block” begins.

stream = itertools.product('abc','1234', 'ABCD')
for a in iters.cleave_by_change(stream, lambda x:x[0]):
    for b in iters.cleave_by_change(a, lambda x:x[1]):
        print b.next()
        for sink in b: pass
    for sink in a: pass

('a', '1', 'A')
('b', '1', 'A')
('c', '1', 'A')
  • 1 1 Answer
  • 0 Views
  • 0 Followers
  • 0
Share
  • Facebook
  • Report

Leave an answer
Cancel reply

You must login to add an answer.

Forgot Password?

Need An Account, Sign Up Here

1 Answer

  • Voted
  • Oldest
  • Recent
  • Random
  1. Editorial Team
    Editorial Team
    2026-06-04T22:21:58+00:00Added an answer on June 4, 2026 at 10:21 pm

    adam’s answer is good. this is just in case you’re curious how to do it by hand:

    def cleave_by_change(stream):
        def generator():
            head = stream[0]
            while stream and stream[0] == head:
                yield stream.pop(0)
        while stream:
            yield generator()
    
    for g in cleave_by_change([1,1,1,2,2,3,2,2,2,2]):
        print list(g)
    

    which gives:

    [1, 1, 1]
    [2, 2]
    [3]
    [2, 2, 2, 2]
    

    (previous version required a hack or, in python 3, nonlocal because i assigned to stream inside generator() which made (a second variable also called) stream local to generator() by default – credit to gnibbler in the comments).

    note that this approach is dangerous – if you don’t “consume” the generators that are returned then you will get more and more, because stream is not getting any smaller.

    • 0
    • Reply
    • Share
      Share
      • Share on Facebook
      • Share on Twitter
      • Share on LinkedIn
      • Share on WhatsApp
      • Report

Sidebar

Related Questions

Here's my send email function that sends the email however i have two different
here are my two functions: public void SetCompanies() //set the Companies table from Shret.net
Here are two example PHP functions: function myimage1() { global $post; $attachment_id = get_post_meta($post->ID,
Here's two screen shots, showing the effect with a small viewport that has to
I am trying to use the functions suggested here to split a string by
I have two functions that I have overloaded for my debug class: template<class IteratorT>
I've got a bit of code here that calculates the hours difference between two
Here is two variants. First: int n = 42; int* some_function(int* input) { int*
simonn helped me to code an ordered integer partition function here. He posted two
What am i doing wrong here? Java code for computing prefix function. Two input

Explore

  • Home
  • Add group
  • Groups page
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Users
  • Help
  • SEARCH

Footer

© 2021 The Archive Base. All Rights Reserved
With Love by The Archive Base

Insert/edit link

Enter the destination URL

Or link to existing content

    No search term specified. Showing recent items. Search or use up and down arrow keys to select an item.