The Archive Base

Editorial Team
Asked: May 16, 2026

I would like to load a random document out of a set of documents


I would like to load a random document out of a set of documents stored in a CouchDB database. The method for picking and loading the document should conform to the following requirements:

  • Efficiency: The lookup of the document should be efficient. Most importantly, the time to load the document must not grow linearly with the total number of documents. This means the skip query argument cannot be used.

  • Uniform distribution: The choice should be truly random (as far as possible, using standard random number generators); every document should have an equal chance of being chosen.

What is the best way to implement this in CouchDB?


1 Answer

  1. Editorial Team
    Added an answer on May 16, 2026 at 11:27 pm

    After giving this some more thought, I came up with a solution. For completeness' sake, I will first show two simple approaches and explain why they are flawed. The third solution is the one I’m going with.

    Approach 1: Skip

    This is the trivial solution: You have a simple view (let’s call it random) with a map function that emits all documents you want to choose from and the built-in _count reduce function. To pick a random document, follow these steps:

    • Find the total number of documents N in the view by calling:
      http://localhost:5984/db/_design/d/_view/random
    • Pick a random integer i with 0 <= i < N
    • Load the i-th document:
      http://localhost:5984/db/_design/d/_view/random?reduce=false&skip=i&limit=1

    This approach is bad because it doesn’t scale well for large numbers of documents. According to this section of “CouchDB – The Definitive Guide” the skip argument should only be used with single-digit values.

    The solution above would have to loop through i documents before returning the chosen one. In SQL terms it’s the equivalent of a full table scan as opposed to an index lookup.
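A minimal sketch of the client-side part of approach 1, assuming the view URL from the steps above. The function names `countFromReduce` and `skipUrl` are made up for illustration, and the HTTP requests themselves are left out; only the response parsing and URL construction are shown.

```javascript
// Base URL of the view from the text; adjust to your own database and design doc.
const VIEW_URL = "http://localhost:5984/db/_design/d/_view/random";

// Read N from a reduced view response such as {"rows":[{"key":null,"value":3}]}.
// An empty view reduces to {"rows":[]}, which is treated as N = 0 here.
function countFromReduce(response) {
  return response.rows.length > 0 ? response.rows[0].value : 0;
}

// Build the second request of approach 1 for a document count n.
// `random` is injectable so the function can be checked deterministically.
function skipUrl(n, random = Math.random) {
  if (n <= 0) return null;              // nothing to pick from
  const i = Math.floor(random() * n);   // 0 <= i < n
  return `${VIEW_URL}?reduce=false&skip=${i}&limit=1`;
}
```

The value of skip is on average about N/2, which is exactly the linear cost described above.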

    Approach 2: Random Number in Document

    With this approach, a random number is generated for each document at creation time and stored in the document. An example document:

    {
      _id: "4f12782c39474fd0a498126c0400708c",
      rand: 0.4591819887660398,
      // actual data...
    }
    

    The random view then has the following map function:

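    function(doc) {
      // Index every document that carries a numeric random value.
      // A plain truthiness test would skip a document whose rand is exactly 0.
      if (typeof doc.rand === "number") {
        emit(doc.rand, doc);
      }
    }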
    

    These are the steps to pick a random document:

    • Pick a random number 0 <= r < 1
    • Load the document:
      http://localhost:5984/db/_design/d/_view/random?startkey=r&limit=1
    • If no document is returned (because r is larger than the largest random number stored in the database), wrap around and load the first document.

    This is very fast and looks great at first sight. However, there’s a serious flaw: not all documents have the same chance of being picked.

    In the simplest example, there are two documents in the database. When I choose a random document a very large number of times, I want each document to come up half of the time. Let’s say the documents were assigned the random numbers 0.2 and 0.9 at creation time. So document A is picked when (r <= 0.2) or (r > 0.9) and document B is chosen when 0.2 < r <= 0.9. The chance of being picked is not 50% for each document, but 30% for A and 70% for B.

    You might think the situation improves when there are more documents in the database, but it really doesn’t. The intervals between documents get smaller, but the variation in interval size gets even worse. Imagine three documents A, B and C with the random numbers 0.30001057, 0.30002057 and 0.30002058 (no other documents are in between). B is picked for any r in an interval of width 0.00001, C only for an interval of width 0.00000001, so B is 1000 times more likely to be chosen than C. In the worst case, two documents are assigned the same random number. Then only one of them can be found (the one with the lower document id); the other is essentially invisible.
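The bias can be reproduced without a database. The sketch below models the view of approach 2 as a sorted array of the stored rand values and repeats the startkey lookup (with wrap-around) many times. The names `pickByStartkey` and `pickFrequencies` are hypothetical, and the array is only a stand-in for the real view.

```javascript
// In-memory stand-in for the view in approach 2: the stored `rand` keys, sorted.
// Returns the index of the first key >= r, wrapping to index 0 if there is none.
function pickByStartkey(sortedKeys, r) {
  const idx = sortedKeys.findIndex((key) => key >= r);
  return idx === -1 ? 0 : idx;
}

// Estimate how often each document is chosen over many trials.
function pickFrequencies(sortedKeys, trials, random = Math.random) {
  const counts = new Array(sortedKeys.length).fill(0);
  for (let t = 0; t < trials; t++) {
    counts[pickByStartkey(sortedKeys, random())] += 1;
  }
  return counts.map((c) => c / trials);
}

// Documents A and B from the example, with stored values 0.2 and 0.9.
const [freqA, freqB] = pickFrequencies([0.2, 0.9], 100000);
console.log(freqA.toFixed(2), freqB.toFixed(2)); // roughly 0.30 and 0.70
```

Each document is chosen in proportion to the gap below its stored value, not with probability 1/N.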

    Approach 3: A combination of 1 and 2

    The solution I came up with combines the speed of approach 2 with the fairness of approach 1. Here it is:

    As in approach 2, each document is assigned a random number at creation time, and the same map function is used for the view. As in approach 1, I also have a _count reduce function.

    These are the steps for loading a random document:

    • Find the total number of documents N in the view by calling:
      http://localhost:5984/db/_design/d/_view/random
    • Pick a random number 0 <= r < 1
    • Calculate the random index: i = floor(r*N)
      My goal is to load the i-th document (as in approach 1). Assuming the distribution of random numbers is more or less uniform, I’m guessing the i-th document has a random value of approximately r.
    • Find the number of documents L with a random value lower than r:
      http://localhost:5984/db/_design/d/_view/random?endkey=r
    • See how far off our guess is: s = i - L
    • If s >= 0, load:
      http://localhost:5984/db/_design/d/_view/random?startkey=r&skip=s&limit=1&reduce=false
    • If s < 0, load:
      http://localhost:5984/db/_design/d/_view/random?startkey=r&skip=-(s+1)&limit=1&descending=true&reduce=false
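The steps above can be sketched against an in-memory model to check that the skip arithmetic really lands on the i-th document. This is an illustration under assumptions, not a CouchDB client: the sorted array `keys` stands in for the view, and `queryView`, `countBelow` and `pickRandomKey` are hypothetical names that model only the query options used here.

```javascript
// Minimal in-memory model of the view: `keys` is the sorted list of stored
// `rand` values. Only startkey, skip and descending are modeled.
function queryView(keys, { startkey, skip, descending }) {
  const rows = descending
    ? keys.filter((k) => k <= startkey).reverse()
    : keys.filter((k) => k >= startkey);
  return rows[skip];
}

// Stands in for the reduced ?endkey=r request: number of keys lower than r.
function countBelow(keys, r) {
  return keys.filter((k) => k < r).length;
}

// Approach 3: guess that the i-th key is near r, then correct by skipping.
function pickRandomKey(keys, r) {
  const n = keys.length;
  const i = Math.floor(r * n);          // index we want, 0 <= i < n
  const s = i - countBelow(keys, r);    // how far the guess is off
  return s >= 0
    ? queryView(keys, { startkey: r, skip: s, descending: false })
    : queryView(keys, { startkey: r, skip: -(s + 1), descending: true });
}

const keys = [0.05, 0.1, 0.15, 0.8, 0.9];
console.log(pickRandomKey(keys, 0.5) === keys[2]); // true: i = 2, L = 3, s = -1
```

Two details matter when doing this against a real view. A reduced query over a range with no rows returns an empty rows array rather than a count of 0, so L has to default to 0. Also, endkey is inclusive by default; adding inclusive_end=false to the count request matches the strict "lower than r" in the steps, although with continuous random values an exact match on r should be very rare.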

    So, the trick is to guess the random number assigned to the i-th document, look that up, see how far we’re off and then skip the number of documents by which we missed.

    The number of documents skipped should remain small even for large databases, since the accuracy of the guess will increase with the number of documents. My guess is that s remains constant when the database grows, but I haven’t tried and I don’t feel qualified to prove it theoretically.
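A rough check of this guess, assuming the stored values are independent and uniformly distributed on [0, 1): for a fixed r, each of the N documents has a value below r with probability r, so L follows a binomial distribution and the spread of s can be written down directly.

```latex
L \sim \mathrm{Binomial}(N, r), \qquad
\mathbb{E}[L] = rN, \qquad
\operatorname{Var}(L) = N r (1 - r)

s = \lfloor rN \rfloor - L
\quad\Longrightarrow\quad
\sigma_s = \sqrt{N r (1 - r)} \le \tfrac{1}{2}\sqrt{N}
```

Under that assumption the typical skip grows in proportion to the square root of N rather than staying constant. For a million documents the standard deviation of s is at most 500, compared with an average skip of about N/2 = 500,000 in approach 1, so the cost still does not grow linearly with the number of documents.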

    If you have a better solution, I’d be very interested!
