Sign Up

Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.

Have an account? Sign In

Have an account? Sign In Now

Sign In

Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.

Sign Up Here

Forgot Password?

Don't have account, Sign Up Here

Forgot Password

Lost your password? Please enter your email address. You will receive a link and will create a new password via email.

Have an account? Sign In Now

You must login to ask a question.

Forgot Password?

Need An Account, Sign Up Here

Please briefly explain why you feel this question should be reported.

Please briefly explain why you feel this answer should be reported.

Please briefly explain why you feel this user should be reported.

Sign InSign Up

The Archive Base

The Archive Base Logo The Archive Base Logo

The Archive Base Navigation

  • Home
  • SEARCH
  • About Us
  • Blog
  • Contact Us
Search
Ask A Question

Mobile menu

Close
Ask a Question
  • Home
  • Add group
  • Groups page
  • Feed
  • User Profile
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Buy Points
  • Users
  • Help
  • Buy Theme
  • SEARCH
Home/ Questions/Q 487605
In Process

The Archive Base Latest Questions

Editorial Team
  • 0
Editorial Team
Asked: May 13, 20262026-05-13T01:37:20+00:00 2026-05-13T01:37:20+00:00

After reading this question I started to wonder: is it possible to have a

  • 0

After reading this question I started to wonder: is it possible to have a shuffling algorithm which does not modify or copy the original list?

To make it clear:

Imagine you are given a list of objects. The list size can be arbitrary, but assume it’s pretty large (say, 10,000,000 items). You need to print out the items of the list in random order, and you need to do it as fast as possible. However, you should not:

  • Copy the original list, because it’s very large and copying would waste a LOT of memory (probably hitting the limits of available RAM);
  • Modify the original list, because it’s sorted in some way and some other part later on depends on it being sorted.
  • Create an index list, because, again, the list is very large and copying takes all too much time and memory. (Clarification: this is meant any other list, which has the same number of elements as the original list).

Is this possible?

Added: More clarifications.

  1. I want the list to be shuffled in true random way with all permutations equally likely (of course, assuming we have a proper Rand() function to start with).
  2. Suggestions that I make a list of pointers, or a list of indices, or any other list that would have the same number of elements as the original list, is explicitly deemed as inefficient by the original question. You can create additional lists if you want, but they should be serious orders of magnitude smaller than the original list.
  3. The original list is like an array, and you can retrieve any item from it by its index in O(1). (So no doubly-linked list stuff, where you have to iterate through the list to get to your desired item.)

Added 2: OK, let’s put it this way: You have a 1TB HDD filled with data items, each 512 bytes large (a single sector). You want to copy all this data to another 1TB HDD while shuffling all the items. You want to do this as fast as possible (single pass over data, etc). You have 512MB of RAM available, and don’t count on swap. (This is a theoretical scenario, I don’t have anything like this in practice. I just want to find the perfect algorithm.item.)

  • 1 1 Answer
  • 0 Views
  • 0 Followers
  • 0
Share
  • Facebook
  • Report

Leave an answer
Cancel reply

You must login to add an answer.

Forgot Password?

Need An Account, Sign Up Here

1 Answer

  • Voted
  • Oldest
  • Recent
  • Random
  1. Editorial Team
    Editorial Team
    2026-05-13T01:37:21+00:00Added an answer on May 13, 2026 at 1:37 am

    Here is a very simple proof that no PRNG scheme can work:

    The PRNG idea has two phases: first, select a PRNG and its initial state; second, use the PRNG to shuffle the output. Well, there are N! possible permutations, so you need at least N! different possible start states, entering phase 2. This means that at the start of phase 2 you must have at least log2 N! bits of state, which isn’t allowed.

    However this does not rule out schemes where the algorithm receives new random bits from the environment as it goes. There might be, say, a PRNG that reads its initial state lazily and yet is guaranteed not to repeat. Can we prove there isn’t?

    Suppose we do have a perfect shuffling algorithm. Imagine we start running it, and when it’s halfway done, we put the computer to sleep. Now the full state of the program has been saved somewhere. Let S be the set of all possible states the program could be in at this halfway mark.

    Since the algorithm is correct and guaranteed to terminate, there is a function f which, given the saved program state plus any long enough string of bits, produces a valid sequence of disk reads and writes completing the shuffle. The computer itself implements this function. But consider it as a mathematical function:

    f : (S × bits) → sequence of reads and writes

    Then, trivially, there exists a function g which, given only the saved program state, produces the set of disk locations yet to be read and written. (Simply pass some arbitrary string of bits to f, then look at the results.)

    g : S → set of locations to read and write

    The remaining bit of the proof is to show that the domain of g contains at least NCN/2 different sets regardless of the choice of algorithm. If that’s true, there must be at least that many elements of S, and so the state of the program must contain at least log2 NCN/2 bits at the halfway mark, in violation of the requirements.

    I’m not sure how to prove that last bit, though, since either the set-of-locations-to-read or the set-of-locations-to-write can be low-entropy, depending on the algorithm. I suspect there’s some obvious principle of information theory that can cut the knot. Marking this community wiki in the hopes someone will supply it.

    • 0
    • Reply
    • Share
      Share
      • Share on Facebook
      • Share on Twitter
      • Share on LinkedIn
      • Share on WhatsApp
      • Report

Sidebar

Related Questions

After reading this question , I was reminded of when I was taught Java
After reading this on the question How do I uniquely identify computers visiting my
After reading the nice answers in this question , I watched the screencasts by
After posting this question and reading that one I realized that it is very
After reading this very informative (albeit somewhat argumentative) question I would like to know
After reading this description of late static binding (LSB) I see pretty clearly what
After reading this answer: best way to pick a random subset from a collection?
After reading this discussion and this discussion about using CrashRpt to generate a crash
I got a little curious after reading this /. article over hijacking HTTPS cookies.
I've tried to do this several times with no luck. After reading this post

Explore

  • Home
  • Add group
  • Groups page
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Users
  • Help
  • SEARCH

Footer

© 2021 The Archive Base. All Rights Reserved
With Love by The Archive Base

Insert/edit link

Enter the destination URL

Or link to existing content

    No search term specified. Showing recent items. Search or use up and down arrow keys to select an item.