
The Archive Base

Editorial Team
Asked: May 23, 2026


TL;DR version: I want to avoid adding duplicate JavaScript objects to an array of similar objects, some of which might be really big. What’s the best approach?

I have an application where I’m loading large amounts of JSON data into a JavaScript data structure. While it’s a bit more complex than this, assume that I’m loading JSON into an array of JavaScript objects from a server through a series of AJAX requests, something like:

var myObjects = [];

function processObject(o) {
    myObjects.push(o);
}

for (var x=0; x<1000; x++) {
    $.getJSON('/new_object.json', processObject);
}

To complicate matters, the JSON:

  • is in an unknown schema
  • is of arbitrary length (probably not enormous, but could be in the 100-200 KB range)
  • might contain duplicates across different requests

My initial thought is to have an additional object to store a hash of each object (via JSON.stringify?) and check against it on each load, like this:

var myHashMap = {};

function processObject(o) {
    var hash = JSON.stringify(o);
    // is it in the hashmap?
    if (!(myHashMap[hash])) {
        myObjects.push(o);
        // set the hashmap key for future checks
        myHashMap[hash] = true;
    }
    // else ignore this object
}

but I’m worried about having property names in myHashMap that might be 200 KB in length. So my questions are:

  • Is there a better approach for this problem than the hashmap idea?
  • If not, is there a better way to make a hash function for a JSON object of arbitrary length and schema than JSON.stringify?
  • What are the possible issues with super-long property names in an object?
1 Answer

  1. Editorial Team
    Added an answer on May 23, 2026 at 5:24 pm

    I’d suggest computing an MD5 hash of JSON.stringify(o) and using that hash as the key in your hashmap, with a reference to your stored object as the value. Because JSON.stringify() emits keys in insertion order, two objects with the same data but a different key order produce different strings, so you first have to create a copy of the object with its keys sorted.
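To illustrate the key-order problem, here is a minimal sketch. The `sortedStringify` helper is hypothetical and only handles flat objects; it is a simplified stand-in, not the full canonical routine given further down.

```javascript
// Two objects with the same data but different key insertion order.
var a = { id: 1, name: "widget" };
var b = { name: "widget", id: 1 };

// JSON.stringify() emits keys in insertion order, so the two strings differ.
var plainMatch = JSON.stringify(a) === JSON.stringify(b);

// Hypothetical helper: copy a flat object with its keys sorted, then stringify.
function sortedStringify(obj) {
    var copy = {};
    Object.keys(obj).sort().forEach(function (key) {
        copy[key] = obj[key];
    });
    return JSON.stringify(copy);
}

// With sorted keys, both objects serialize to the same string.
var sortedMatch = sortedStringify(a) === sortedStringify(b);

console.log(plainMatch, sortedMatch); // false true
```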

    Then, when each new object comes in, you check its hash against the hash map. If the hash is already present, you compare the incoming object with the stored objects that share that hash to see if they are truly duplicates (since MD5 hashes can collide). That way, the hash table stays manageable: its keys are only fixed-length MD5 hashes rather than strings that might be 200 KB long.
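The collision check can be seen in isolation with a deliberately weak hash. The sketch below is an illustration under stated assumptions: `canonical`, `weakHash` and `createDeduper` are hypothetical names, the data is assumed to be plain JSON (no cycles, no undefined values), and the string-length hash exists only to force collisions. It is not a replacement for MD5.

```javascript
// Minimal canonical stringify for plain JSON data (no cycle detection).
function canonical(value) {
    if (value === null || typeof value !== "object") {
        return JSON.stringify(value);
    }
    if (Array.isArray(value)) {
        return "[" + value.map(canonical).join(",") + "]";
    }
    return "{" + Object.keys(value).sort().map(function (key) {
        return JSON.stringify(key) + ":" + canonical(value[key]);
    }).join(",") + "}";
}

// Deliberately weak stand-in hash (string length) so collisions are easy to trigger.
function weakHash(str) {
    return String(str.length);
}

function createDeduper(hashFn) {
    var objects = [];
    var buckets = Object.create(null); // hash -> indexes into objects
    return {
        objects: objects,
        add: function (o) {
            var text = canonical(o);
            var hash = hashFn(text);
            var bucket = buckets[hash] || (buckets[hash] = []);
            // A hash match is only a hint: confirm with an exact comparison.
            for (var i = 0; i < bucket.length; i++) {
                if (canonical(objects[bucket[i]]) === text) {
                    return false; // true duplicate, not stored
                }
            }
            objects.push(o);
            bucket.push(objects.length - 1);
            return true;
        }
    };
}

var dedupe = createDeduper(weakHash);
var r1 = dedupe.add({ a: 1, b: 2 }); // first occurrence: stored
var r2 = dedupe.add({ b: 2, a: 1 }); // same data, different key order: rejected
var r3 = dedupe.add({ a: 1, b: 3 }); // same hash (same length), different data: stored

console.log(r1, r2, r3, dedupe.objects.length); // true false true 2
```

Because the exact comparison only runs against objects in the same bucket, a stronger hash such as MD5 keeps those buckets to a single entry in almost every case.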

    Here’s code to create a canonical string representation of an object (including nested objects or objects within arrays) that handles object keys that might be in a different order if you just called JSON.stringify().

    // Code to do a canonical JSON.stringify() that puts object properties 
    // in a consistent order
    // Does not allow circular references (child containing reference to parent)
    JSON.stringifyCanonical = function(obj) {
        // compatible with browsers, web workers and node.js
        var root = typeof globalThis === "object" ? globalThis :
                   typeof window === "object" ? window : global;
        var Set = root.Set;
    
        // poor man's Set polyfill
        if (typeof Set !== "function") {
            Set = function(s) {
                if (s) {
                    this.data = s.data.slice();
                } else {
                    this.data = [];
                }
            };
            Set.prototype = {
                add: function(item) {
                    this.data.push(item);
                },
                has: function(item) {
                    return this.data.indexOf(item) !== -1;
                }
            };
        }
    
        function orderKeys(obj, parents) {
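            // null has typeof "object" but no keys to order, so pass it through
            // (otherwise Object.keys(null) below would throw a TypeError)
            if (obj === null) {
                return obj;
            }
            if (typeof obj !== "object") {
                throw new Error("orderKeys() expects object type");
            }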
            var set = new Set(parents);
            if (set.has(obj)) {
                throw new Error("circular object in stringifyCanonical()");
            }
            set.add(obj);
            var tempObj, item, i;
            if (Array.isArray(obj)) {
                // no need to re-order an array
                // but need to check it for embedded objects that need to be ordered
                tempObj = [];
                for (i = 0; i < obj.length; i++) {
                    item = obj[i];
                    if (typeof item === "object") {
                        tempObj[i] = orderKeys(item, set);
                    } else {
                        tempObj[i] = item;
                    }
                }
            } else {
                tempObj = {};
                // get keys, sort them and build new object
                Object.keys(obj).sort().forEach(function(item) {
                    if (typeof obj[item] === "object") {
                        tempObj[item] = orderKeys(obj[item], set);
                    } else {
                        tempObj[item] = obj[item];
                    }
                });
            }
            return tempObj;
        }
    
        return JSON.stringify(orderKeys(obj));
    }
    

    And here is the algorithm that uses it:

    var myHashMap = {};
    
    function processObject(o) {
        var stringifiedCandidate = JSON.stringifyCanonical(o);
        // CreateMD5() is not defined here; it is assumed to come from an MD5 library
        var hash = CreateMD5(stringifiedCandidate);
        var list = [], found = false;
        // is it in the hashmap?
        if (!myHashMap[hash]) {
            // not in the hash table, so it's a unique object
            myObjects.push(o);
            list.push(myObjects.length - 1);    // put a reference to the object with this hash value in the list
            myHashMap[hash] = list;             // store the list in the hash table for future comparisons
        } else {
            // the hash does exist in the hash table, check for an exact object match to see if it's really a duplicate
            list = myHashMap[hash];             // get the list of other object indexes with this hash value
            // loop through the list
            for (var i = 0; i < list.length; i++) {
                if (stringifiedCandidate === JSON.stringifyCanonical(myObjects[list[i]])) {
                    found = true;       // found an exact object match
                    break;
                }
            }
            // if not found, it's not an exact duplicate, even though there was a hash match
            if (!found) {
                myObjects.push(o);
                myHashMap[hash].push(myObjects.length - 1);
            }
        }
    }
    

    A test case for JSON.stringifyCanonical() is here: https://jsfiddle.net/jfriend00/zfrtpqcL/

