Sign Up

Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.

Have an account? Sign In

Have an account? Sign In Now

Sign In

Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.

Sign Up Here

Forgot Password?

Don't have account, Sign Up Here

Forgot Password

Lost your password? Please enter your email address. You will receive a link and will create a new password via email.

Have an account? Sign In Now

You must login to ask a question.

Forgot Password?

Need An Account, Sign Up Here

Please briefly explain why you feel this question should be reported.

Please briefly explain why you feel this answer should be reported.

Please briefly explain why you feel this user should be reported.

Sign InSign Up

The Archive Base

The Archive Base Logo The Archive Base Logo

The Archive Base Navigation

  • Home
  • SEARCH
  • About Us
  • Blog
  • Contact Us
Search
Ask A Question

Mobile menu

Close
Ask a Question
  • Home
  • Add group
  • Groups page
  • Feed
  • User Profile
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Buy Points
  • Users
  • Help
  • Buy Theme
  • SEARCH
Home/ Questions/Q 6094647
In Process

The Archive Base Latest Questions

Editorial Team
  • 0
Editorial Team
Asked: May 23, 20262026-05-23T12:42:14+00:00 2026-05-23T12:42:14+00:00

In one of my Java 6 projects I have an array of LinkedHashMap instances

  • 0

In one of my Java 6 projects I have an array of LinkedHashMap instances as input to a method which has to iterate through all keys (i.e. through the union of the key sets of all maps) and work with the associated values. Not all keys exist in all maps and the method should not go through each key more than once or alter the input maps.

My current implementation looks like this:

Set<Object> keyset = new HashSet<Object>();

for (Map<Object, Object> map : input) {
    for (Object key : map.keySet()) {
        if (keyset.add(key)) {
            ...
        }
    }
}

The HashSet instance ensures that no key will be acted upon more than once.

Unfortunately this part of the code is rather critical performance-wise, as it is called very frequently. In fact, according to the profiler over 10% of the CPU time is spent in the HashSet.add() method.

I am trying to optimise this code us much as possible. The use of LinkedHashMap with its more efficient iterators (in comparison to the plain HashMap) was a significant boost, but I was hoping to reduce what is essentially book-keeping time to the minimum.

Putting all the keys in the HashSet before-hand, by using addAll() proved to be less efficient, due to the cost of calling HashSet.contains() afterwards.
At the moment I am looking at whether I can use a bitmap (well, a boolean[] to be exact) to avoid the HashSet completely, but it may not be possible at all, depending on my key range.

Is there a more efficient way to do this? Preferrably something that will not pose restrictions on the keys?

EDIT:

A few clarifications and comments:

  • I do need all the values from the maps – I cannot drop any of them.

  • I also need to know which map each value came from. The missing part (...) in my code would be something like this:

    for (Map<Object, Object> m : input) {
        Object v = m.get(key);
    
        // Do something with v
    }
    

    A simple example to get an idea of what I need to do with the maps would be to print all maps in parallel like this:

    Key Map0 Map1 Map2
    F   1    null 2
    B   2    3    null
    C   null null 5
    ...
    

    That’s not what I am actually doing, but you should get the idea.

  • The input maps are extremely variable. In fact, each call of this method uses a different set of them. Therefore I would not gain anything by caching the union of their keys.

  • My keys are all String instances. They are sort-of-interned on the heap using a separate HashMap, since they are pretty repetitive, therefore their hash code is already cached and most hash validations (when the HashMap implementation is checking whether two keys are actually equal, after their hash codes match) boil down to an identity comparison (==). The profiler confirms that only 0.5% of the CPU time is spent on String.equals() and String.hashCode().

EDIT 2:

Based on the suggestions in the answers, I made a few tests, profiling and benchmarking along the way. I ended up with roughly a 7% increase in performance. What I did:

  • I set the initial capacity of the HashSet to double the collective size of all input maps. This gained me something in the region of 1-2%, by eliminating most (all?) resize() calls in the HashSet.

  • I used Map.entrySet() for the map I am currently iterating. I had originally avoided this approach due to the additional code and the fear that the extra checks and Map.Entry getter method calls would outweigh any advantages. It turned out that the overall code was slightly faster.

  • I am sure that some people will start screaming at me, but here it is: Raw types. More specifically I used the raw form of HashSet in the code above. Since I was already using Object as its content type, I do not lose any type safety. The cost of that useless checkcast operation when calling HashSet.add() was apparently important enough to produce a 4% increase in performance when removed. Why the JVM insists on checking casts to Object is beyond me…

  • 1 1 Answer
  • 0 Views
  • 0 Followers
  • 0
Share
  • Facebook
  • Report

Leave an answer
Cancel reply

You must login to add an answer.

Forgot Password?

Need An Account, Sign Up Here

1 Answer

  • Voted
  • Oldest
  • Recent
  • Random
  1. Editorial Team
    Editorial Team
    2026-05-23T12:42:15+00:00Added an answer on May 23, 2026 at 12:42 pm

    Can’t provide a replacement for your approach but a few suggestions to (slightly) optimize the existing code.

    1. Consider initializing the hash set with a capacity (the sum of the sizes of all maps). This avoids/reduces resizing of the set during an add operation
    2. Consider not using the keySet() as it will always create a new set in the background. Use the entrySet(), that should be much faster
    3. Have a look at the implementations of equals() and hashCode() – if they are “expensive”, then you have a negative impact on the add method.
    • 0
    • Reply
    • Share
      Share
      • Share on Facebook
      • Share on Twitter
      • Share on LinkedIn
      • Share on WhatsApp
      • Report

Sidebar

Related Questions

I have 4 different Java Projects created in Eclipse. One of them contains all
I have an ear app with two EJB and one WEB (war) projects all
Hypothetically, you have two products, one written in Java, the other in C#. You
In Java, one can declare a variable parameterised by an unknown generic type, which
I want to be able to have two projects, one that contains production code
My company creates multiple BlackBerry apps that all use a few shared Java projects.
I have written two simple Java classes (one of them containing main(), and the
I have an eclipse workspace containing two projects, which I'll call foo-lib and foo-app.
I have a mainActivity which is Customer.java with listview of 5 diff. activities. I
Env: Maven 2.2.1 I have two java projects under svn (projectA, projectB). My maven

Explore

  • Home
  • Add group
  • Groups page
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Users
  • Help
  • SEARCH

Footer

© 2021 The Archive Base. All Rights Reserved
With Love by The Archive Base

Insert/edit link

Enter the destination URL

Or link to existing content

    No search term specified. Showing recent items. Search or use up and down arrow keys to select an item.