The Archive Base

Editorial Team
Asked: June 2, 2026 (2026-06-02T05:08:44+00:00)

I am having a hard time formulating my question so I’ll just show by example.

x = ['abc', 'c', 'w', 't', '3']
a, b = random_split(x, 3)      # first list should be length 3
# e.g. a => ['abc', 'w', 't']
# e.g. b => ['c', '3']

Is there an easy way of splitting a list into two random samples while maintaining the original ordering?


Edit: I know that I could use random.sample and then reorder, but I was hoping for an easy, simple, one-line method.

Edit 2: Here’s another solution, see if you can improve it:

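import random

def random_split(l, a_size):
    a, b = [], []
    m = len(l)
    # One slot per item: a_size references to a, the rest to b.
    which = ([a] * a_size) + ([b] * (m - a_size))
    # Shuffling the slots decides each item's destination list.
    random.shuffle(which)

    # Walking l in order keeps the original ordering within a and b.
    for array, sample in zip(which, l):
        array.append(sample)

    return a, b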

Edit 3: My concern in avoiding sorting was that sorting is O(N*log(N)) in general. It should be possible to write a function that scales as O(N). Unfortunately, none of the solutions posted so far actually achieves O(N). After a little thought, though, I found one that works and is comparable in performance to @PedroWerneck’s answer. I’m not 100% sure that it is truly random, though.

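import random

def random_split(items, size):
  n = len(items)  # items not yet examined
  a, b = [], []
  for item in items:
    # Pick with probability (picks still needed) / (items still remaining).
    if size > 0 and random.random() < float(size)/n:
      a.append(item)  # a is the sample, so it ends up with the requested length
      size -= 1
    else:
      b.append(item)

    n -= 1

  return a, b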
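
The doubt about whether the loop in Edit 3 is "truly random" can be checked exactly for small inputs. The sketch below is not from the original post: `subset_probabilities` is a hypothetical helper that walks every take/skip decision the loop can make, using the same rule (pick with probability "picks still needed" over "items still remaining"), and accumulates the exact probability of each resulting index subset with `fractions.Fraction`. The rule appears to match what is usually called selection sampling (Knuth's Algorithm S), for which every subset of the requested size is equally likely.

```python
from fractions import Fraction


def subset_probabilities(n, size):
    """Exact probability of each index subset picked by the Edit 3 rule."""
    probs = {}

    def walk(i, remaining, chosen, p):
        if p == 0:
            return  # this branch can never happen
        if i == n:
            probs[chosen] = probs.get(chosen, 0) + p
            return
        # Same rule as the loop: picks still needed / items still remaining.
        take = Fraction(remaining, n - i)
        walk(i + 1, remaining - 1, chosen + (i,), p * take)
        walk(i + 1, remaining, chosen, p * (1 - take))

    walk(0, size, (), Fraction(1))
    return probs


if __name__ == "__main__":
    for subset, p in sorted(subset_probabilities(5, 3).items()):
        print(subset, p)
```

For 5 items and a sample of 3, all 10 possible subsets come out with probability 1/10, which is what a uniform sample requires. This only verifies small cases by enumeration; it is a sanity check, not a proof for every N.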
1 Answer
  1. Editorial Team
    Added an answer on June 2, 2026 at 5:08 am (2026-06-02T05:08:45+00:00)

    I believe there is no way to limit the first list’s size, avoid sorting after the split, and keep the randomness that is simpler than just sampling and reordering.

    If there were no limit, the split would be as random as the RNG allows if you simply iterate over the list and choose at random which destination list receives each value:

    >>> import random
    >>> x = range(20)
    >>> a = []
    >>> b = []
    >>> for v in x:
    ...     random.choice((a, b)).append(v)
    ... 
    >>> a
    [0, 2, 3, 4, 6, 7, 10, 12, 15, 17]
    >>> b
    [1, 5, 8, 9, 11, 13, 14, 16, 18, 19]
    

    If you can live with some bias, you can stop appending to the first list once it reaches the limit and still use the solution above. If you’re dealing with small lists like the one in your example, it shouldn’t be a big deal to retry until the first list comes out with the right length.
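
The retry idea can be sketched as follows. This is an illustration rather than code from the answer, and the name `random_split_retry` is hypothetical. It repeats the coin-flip split until the first list has exactly the requested length, so no sorting is needed and the original ordering is kept in both lists.

```python
import random


def random_split_retry(items, size):
    """Repeat the coin-flip split until the first list has `size` items."""
    if not 0 <= size <= len(items):
        raise ValueError("size must be between 0 and len(items)")
    while True:
        a, b = [], []
        for v in items:
            # Each item goes to a or b with equal probability.
            random.choice((a, b)).append(v)
        if len(a) == size:
            return a, b


if __name__ == "__main__":
    print(random_split_retry(['abc', 'c', 'w', 't', '3'], 3))
```

Because every coin-flip outcome is equally likely, keeping only the outcomes with the right length should not favor any particular subset. The cost is the number of retries, which grows quickly for long lists or for sizes far from half the list length, so this is only reasonable for small inputs.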

    If you want it to be really random and be able to limit the first list’s size, then you’ll have to give up and reorder at least one of the lists. The closest to a one-liner implementation I can think of is something like:

    >>> limit = 7
    >>> x = list(range(20))
    >>> b = x[:]
    >>> a = sorted([b.pop(b.index(random.choice(b))) for n in range(limit)])
    >>> a
    [0, 1, 5, 10, 15, 16, 17]
    >>> b
    [2, 3, 4, 6, 7, 8, 9, 11, 12, 13, 14, 18, 19]
    

    You have to sort a, but b has the order kept.

    edit

    Now, do you really have to avoid reordering at all costs? Many neat solutions were posted, and your second solution is very nice, but none of them is simpler, easier and shorter than:

    import random

    def random_split(items, size):
        # Caveat: sorted() orders by value, so items must already be sorted and unique.
        sample = set(random.sample(items, size))
        return sorted(sample), sorted(set(items) - sample)
    

    Even considering both sorting operations, I think it’s hard to beat that one for simplicity and efficiency. Consider how optimized Python’s Timsort is and how most other methods have to iterate over the n items at least once for each list.

    If you really must avoid reordering, I guess this one also works and is very easy and simple, but iterates twice:

    def random_split(items, size):
        # Assumes items are hashable and unique; a duplicate of a sampled
        # value would also land in a.
        sample = set(random.sample(items, size))
        a = [x for x in items if x in sample]
        b = [x for x in items if x not in sample]
        return a, b
    

    This is essentially the same as Hexparrot’s solution, with the set(sample) suggested by senderle to make membership tests O(1) and without the redundant index sample and enumerate calls. You don’t need those if you deal only with hashable, unique objects.
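
When the items may be unhashable or contain duplicates, sampling positions instead of values avoids both problems. The sketch below is an assumption about what such an index-based variant could look like; it is not Hexparrot's actual code, and the name `random_split_by_index` is hypothetical.

```python
import random


def random_split_by_index(items, size):
    """Split by sampled positions, so duplicates and unhashable items are fine."""
    picked = set(random.sample(range(len(items)), size))
    a = [x for i, x in enumerate(items) if i in picked]
    b = [x for i, x in enumerate(items) if i not in picked]
    return a, b


if __name__ == "__main__":
    # Duplicates and a list element (unhashable) are both handled.
    print(random_split_by_index(['x', 'x', 'y', [1], 'x'], 2))
```

The first list always has exactly `size` elements here, because membership is decided by position rather than by value.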
