Editorial Team
Asked: May 12, 2026

I have a large (more than 100K objects) collection of Java objects like below.


public class User
{
   //declared as public in this example for brevity...
   public String first_name;
   public String last_name;
   public String ssn;
   public String email;
   public String blog_url;
   ...
}

Now, I need to search this list for an object where at least 3 (any 3 or more) attributes match those of the object being searched.

For example, if I am searching for an object that has

 first_name="John",
 last_name="Gault",
 ssn="000-00-0000",
 email="xyz@abc.com", 
 blog_url="http://myblog.wordpress.com" 

The search should return all objects where first_name, last_name and ssn match, or those where last_name, ssn, email and blog_url match. Other combinations are possible as well.

I would like to know what’s the best data-structure/algorithm to use in this case. For an exact search, I could have used a hashset or binary search with a custom comparator, but I am not sure what’s the most efficient way to perform this type of search.

P.S.

  • This is not a homework exercise.

  • I am not sure if the question title is appropriate. Please feel free to edit.

EDIT
Some of you have pointed out that I could search on ssn (for example), since it is more or less unique. The example above is only illustrative of the real scenario. In reality, I have several objects where some of the fields are null, so I need to be able to search on the other fields.

1 Answer

  1. Editorial Team
    Added an answer on May 12, 2026 at 8:27 pm

    I don’t think that there are any specific data structures to make this kind of matching / comparison fast.

    At the simple level of comparing two objects, you might implement a method like this:

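    public boolean closeEnough(User other) {
        // A null field never counts as a match; the null checks also avoid a
        // NullPointerException when a field on this object is null.
        int count = 0;
        count += (firstName != null && firstName.equals(other.firstName)) ? 1 : 0;
        count += (lastName != null && lastName.equals(other.lastName)) ? 1 : 0;
        count += (ssn != null && ssn.equals(other.ssn)) ? 1 : 0;
        count += (email != null && email.equals(other.email)) ? 1 : 0;
        ...
        return count >= 3;
    }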
    

    To do a large scale search, the only way I can think of that would improve on a simple linear scan (using the method above) would be to:

    1. create one multimap per property, mapping each property value to the User records that have that value,
    2. populate them with the User records.

    Then each time you want to do a query:

    1. query each multimap to get a set of possible candidates,
    2. iterate all of the sets using closeEnough() to find the matches.
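
    The indexing and query steps above could be sketched roughly as follows. This is only an illustration under assumptions: the class name UserIndex, the stand-in User class with the five fields from the question (in camelCase, as in closeEnough()), and the use of a plain HashMap of lists as the multimap are all choices made for this example. Null fields are neither indexed nor counted as matches.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

class UserIndex {

    // Minimal stand-in for the User class from the question (illustrative only).
    static class User {
        final String firstName, lastName, ssn, email, blogUrl;

        User(String firstName, String lastName, String ssn, String email, String blogUrl) {
            this.firstName = firstName;
            this.lastName = lastName;
            this.ssn = ssn;
            this.email = email;
            this.blogUrl = blogUrl;
        }
    }

    // One accessor per indexed property.
    static final List<Function<User, String>> FIELDS = Arrays.asList(
            u -> u.firstName, u -> u.lastName, u -> u.ssn, u -> u.email, u -> u.blogUrl);

    // multimaps.get(i) maps a value of field i to every user holding that value.
    private final List<Map<String, List<User>>> multimaps = new ArrayList<>();

    UserIndex(List<User> users) {
        for (Function<User, String> field : FIELDS) {
            Map<String, List<User>> map = new HashMap<>();
            for (User u : users) {
                String value = field.apply(u);
                if (value != null) { // null fields are not indexed
                    map.computeIfAbsent(value, k -> new ArrayList<>()).add(u);
                }
            }
            multimaps.add(map);
        }
    }

    // Counts the fields that are non-null and equal in both objects.
    static int matchCount(User a, User b) {
        int count = 0;
        for (Function<User, String> field : FIELDS) {
            String value = field.apply(a);
            if (value != null && value.equals(field.apply(b))) {
                count++;
            }
        }
        return count;
    }

    // Returns every indexed user sharing at least minMatches fields with probe.
    List<User> search(User probe, int minMatches) {
        // User does not override equals(), so the set de-duplicates by identity.
        Set<User> candidates = new LinkedHashSet<>();
        for (int i = 0; i < FIELDS.size(); i++) {
            String value = FIELDS.get(i).apply(probe);
            if (value != null) {
                candidates.addAll(multimaps.get(i).getOrDefault(value, Collections.emptyList()));
            }
        }
        List<User> result = new ArrayList<>();
        for (User candidate : candidates) {
            if (matchCount(probe, candidate) >= minMatches) {
                result.add(candidate);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<User> users = Arrays.asList(
                new User("John", "Gault", "000-00-0000", null, null),
                new User("John", "Smith", "111-11-1111", "js@example.com", null),
                new User("Jane", "Gault", null, "xyz@abc.com", "http://myblog.wordpress.com"));
        User probe = new User("John", "Gault", "000-00-0000", "xyz@abc.com",
                "http://myblog.wordpress.com");
        System.out.println(new UserIndex(users).search(probe, 3).size()); // prints 2
    }
}
```

    Only users that share at least one property value with the probe are ever compared in full, so the cost of a query depends on the size of the candidate set rather than on the size of the whole collection. A common value such as a popular first name can still produce a large candidate set.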

    You could improve on this by treating the SSN, email address and blog URL properties differently from the name properties. Multiple users matching on any of the first three should be a rare occurrence, compared with (say) finding multiple users called “John”. With the five properties shown in the question, at least 1 of SSN, email or URL has to match to reach 3 matches (the two name properties alone give at most 2), so maybe you could not bother indexing the name properties at all.
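
    A sketch of that refinement, again under assumptions: the class name SelectiveUserIndex, the stand-in User class and the "position:value" key format are hypothetical choices for this example. Only SSN, email and blog URL are indexed; the name properties are compared only when a candidate is verified. This is safe only while User has exactly the five properties shown. If the elided fields in the question add more non-unique properties, three of those could match without any indexed field matching, and such records would be missed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class SelectiveUserIndex {

    // Illustrative stand-in for the question's User class.
    static class User {
        final String firstName, lastName, ssn, email, blogUrl;

        User(String firstName, String lastName, String ssn, String email, String blogUrl) {
            this.firstName = firstName;
            this.lastName = lastName;
            this.ssn = ssn;
            this.email = email;
            this.blogUrl = blogUrl;
        }

        // All five fields in a fixed order, used for counting matches.
        String[] allFields() {
            return new String[] {firstName, lastName, ssn, email, blogUrl};
        }

        // Only the near-unique fields; these are the ones that get indexed.
        String[] selectiveFields() {
            return new String[] {ssn, email, blogUrl};
        }
    }

    // Single multimap. Keys are "<field position>:<value>" so that the same
    // string stored in two different fields does not collide.
    private final Map<String, List<User>> index = new HashMap<>();

    SelectiveUserIndex(List<User> users) {
        for (User u : users) {
            String[] keys = u.selectiveFields();
            for (int i = 0; i < keys.length; i++) {
                if (keys[i] != null) { // null fields are not indexed
                    index.computeIfAbsent(i + ":" + keys[i], k -> new ArrayList<>()).add(u);
                }
            }
        }
    }

    // Counts the fields that are non-null and equal in both objects.
    static int matchCount(User a, User b) {
        String[] x = a.allFields();
        String[] y = b.allFields();
        int count = 0;
        for (int i = 0; i < x.length; i++) {
            if (x[i] != null && x[i].equals(y[i])) {
                count++;
            }
        }
        return count;
    }

    // Returns every indexed user sharing at least 3 fields with probe.
    List<User> search(User probe) {
        Set<User> candidates = new LinkedHashSet<>(); // de-duplicates by identity
        String[] keys = probe.selectiveFields();
        for (int i = 0; i < keys.length; i++) {
            if (keys[i] != null) {
                candidates.addAll(index.getOrDefault(i + ":" + keys[i], Collections.emptyList()));
            }
        }
        List<User> result = new ArrayList<>();
        for (User candidate : candidates) {
            if (matchCount(probe, candidate) >= 3) {
                result.add(candidate);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<User> users = Arrays.asList(
                new User("John", "Gault", "000-00-0000", null, null),
                new User("John", "Gault", null, null, null), // names only: never a candidate
                new User("Jane", "Gault", null, "xyz@abc.com", "http://myblog.wordpress.com"));
        User probe = new User("John", "Gault", "000-00-0000", "xyz@abc.com",
                "http://myblog.wordpress.com");
        System.out.println(new SelectiveUserIndex(users).search(probe).size()); // prints 2
    }
}
```

    Compared with indexing every property, a probe for “John” no longer pulls in every John in the collection; the candidate set stays close to the handful of records that share an SSN, email or URL with the probe.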
