The Archive Base

Editorial Team
Asked: June 7, 2026 (2026-06-07T11:39:54+00:00)

I wonder how to efficiently store a website URL in a database (MongoDB in my case).

The problem:
The URL should be indexed to achieve fast query performance, but MongoDB only allows index entries smaller than 1024 bytes.

I thought about hashing or base64 to shrink the URL, but since I use a single-threaded web server (Node.js), I don’t want to do heavy computation on it.

Are there any good ideas for other ways to achieve this? (The alternative representation should be unique.)

1 Answer

  1. Editorial Team
    Added an answer on June 7, 2026 at 11:39 am (2026-06-07T11:39:55+00:00)

    This very question comes up during 10gen’s MongoDB training, and hashing is presented as the viable solution. Generating an MD5 hash for a URL shouldn’t be computationally intensive, even on a single-threaded Node.js server. I definitely wouldn’t suggest base64 encoding, as that only expands the URL string (by roughly a third).
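
As a rough illustration of the point above, here is a minimal sketch using Node's built-in `crypto` module. The helper name `urlHash` and the example URL are assumptions made for this sketch, not part of the original answer. It shows that a digest has a fixed length regardless of URL length, while base64 makes the string longer.

```javascript
const crypto = require("crypto");

// Hypothetical helper (name is an assumption): hex digest of a URL.
// MD5 always yields 32 hex characters, no matter how long the input is.
function urlHash(url, algorithm = "md5") {
  return crypto.createHash(algorithm).update(url, "utf8").digest("hex");
}

// A made-up URL that is far longer than the 1024-byte index limit.
const longUrl = "http://example.com/" + "a".repeat(2000);

const digest = urlHash(longUrl);
const encoded = Buffer.from(longUrl, "utf8").toString("base64");

console.log(digest.length);                   // 32
console.log(encoded.length > longUrl.length); // true
```

The digest is short enough to index, and computing it is a single synchronous call per URL.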

    The goal is to create an index with high cardinality, but that doesn’t mean the hashes have to be unique. If you include both the hash and the URL in your query, MongoDB uses the highly selective hash index to narrow down the candidates and then matches the URL among the index hits. In the following example, let’s pretend the two URLs have colliding hashes:

    $ mongo --quiet
    > db.urls.insert({_id: 1, url: "http://google.com", hash: "c7b920f"});
    > db.urls.insert({_id: 2, url: "http://yahoo.com", hash: "c7b920f"});
    > db.urls.find({hash: "c7b920f"})
    { "_id" : 1, "url" : "http://google.com", "hash" : "c7b920f" }
    { "_id" : 2, "url" : "http://yahoo.com", "hash" : "c7b920f" }
    
    > db.urls.find({hash: "c7b920f", url: "http://google.com"})
    { "_id" : 1, "url" : "http://google.com", "hash" : "c7b920f" }
    
    > db.urls.ensureIndex({hash: 1})
    > db.urls.find({hash: "c7b920f", url: "http://google.com"}).explain()
    {
        "cursor" : "BtreeCursor hash_1",
        "nscanned" : 2,
        "nscannedObjects" : 2,
        "n" : 1,
        "millis" : 0,
        "nYields" : 0,
        "nChunkSkips" : 0,
        "isMultiKey" : false,
        "indexOnly" : false,
        "indexBounds" : {
            "hash" : [
                [
                    "c7b920f",
                    "c7b920f"
                ]
            ]
        },
        "server" : "localhost:27017"
    }
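
The two-step matching in the transcript above can be mimicked in plain JavaScript. This is only an in-memory stand-in, assuming the same two pretend documents; the function `findByHashAndUrl` is hypothetical and does not reflect how MongoDB is implemented internally.

```javascript
// Same pretend collision as in the shell example: both URLs share one hash.
const docs = [
  { _id: 1, url: "http://google.com", hash: "c7b920f" },
  { _id: 2, url: "http://yahoo.com", hash: "c7b920f" },
];

// Hypothetical stand-in for the query: narrow by hash first (the role the
// index plays), then compare the full URL among the remaining candidates.
function findByHashAndUrl(collection, hash, url) {
  const candidates = collection.filter((d) => d.hash === hash);
  const matches = candidates.filter((d) => d.url === url);
  return { scanned: candidates.length, matches };
}

const result = findByHashAndUrl(docs, "c7b920f", "http://google.com");
console.log(result.scanned);        // 2
console.log(result.matches.length); // 1
console.log(result.matches[0]._id); // 1
```

The counts line up with `nscanned: 2` and `n: 1` in the `explain()` output: a collision costs one extra document comparison, not a wrong result.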
    

    I’m not sure whether you have additional business requirements to guarantee URL uniqueness throughout the collection, but the example above shows that uniqueness isn’t necessary from a querying standpoint. Of course, any hash algorithm has some chance of collision, but there are better options than MD5 that still fit well within the 1024-byte limit (a SHA-256 digest, for example, is only 64 hex characters).
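
One way to wire this up on the Node.js side is sketched below, assuming SHA-256 as the stronger alternative to MD5. The helper names `urlHash`, `toDocument`, and `toQuery` are hypothetical; only `crypto.createHash` is a real Node API here.

```javascript
const crypto = require("crypto");

// Hypothetical helper: SHA-256 hex digest (64 characters) of a URL.
function urlHash(url) {
  return crypto.createHash("sha256").update(url, "utf8").digest("hex");
}

// Document to store: the full URL plus its short, indexable hash.
function toDocument(url) {
  return { url, hash: urlHash(url) };
}

// Query to run: the hash lets the index narrow the search,
// and the url field rules out any colliding documents.
function toQuery(url) {
  return { hash: urlHash(url), url };
}

const doc = toDocument("http://google.com");
const query = toQuery("http://google.com");

console.log(doc.hash.length);         // 64
console.log(query.hash === doc.hash); // true
```

The object returned by `toQuery` has the same shape as the `{hash: ..., url: ...}` filter passed to `db.urls.find` in the shell example, so only the `hash` field needs an index.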



