Suppose now you have a group of data: Data 1: (1, 2); Data 2:


Asked: June 12, 2026


Suppose now you have a group of data:

Data 1: (1, 2);
Data 2: (1, 3);
Data 3: (7, 8);
Data 4: (8, 20);

The task is to merge any data set that shares an element with another data set. In this example, Data 1 is merged with Data 2 because both contain the number 1, and Data 3 is merged with Data 4 because both contain 8. My question is how to implement this efficiently in C++. For the time being my implementation is based on a std::vector<std::set<int> > data structure, as shown in the following code:

#include <iostream>
#include <map>
#include <set>
#include <algorithm>
#include <vector>


using namespace std;
bool find_the_element(const set<int> &mysets, const vector<int> &myvector)
{
    for(int i=0; i<myvector.size(); i++)
    {
        set<int>::iterator it;
        it = mysets.find(myvector[i]);
        if (it != mysets.end())
            return true;
    }
    return false;
}

int main()
{
    set<vector<int> > myset;
    vector<int> a;
    a.push_back(1);
    a.push_back(2);

    vector<int> b;
    b.push_back(1);
    b.push_back(3);

    vector<int> c;
    c.push_back(7);
    c.push_back(8);

    vector<int> d;
    d.push_back(8);
    d.push_back(20);
    vector<vector<int> > my_vector_array;
    my_vector_array.push_back(a);
    my_vector_array.push_back(b);
    my_vector_array.push_back(c);
    my_vector_array.push_back(d);


    vector<set<int> > my_sets;
    for(int i=0; i<my_vector_array.size(); i++)
    {
        vector<int> temp_vector = my_vector_array[i];

        if (my_sets.empty())
        {
            set<int> temp_set;
            for(int j=0; j<temp_vector.size(); j++)
                temp_set.insert(temp_vector[j]);

            my_sets.push_back(temp_set);
        }
        else
        {
            bool b_find = false;
            for(int j=0; j<my_sets.size(); j++)
            {
                set<int>temp_set;
                temp_set = my_sets[j];
                if (find_the_element(temp_set,temp_vector))
                {
                    b_find = true;
                    my_sets[j].insert(temp_vector.begin(), temp_vector.end());

                    break;
                }

            }
            if (b_find)
            {
                // something already done
            }
            else
            {
                set<int> temp_set;
                for(int j=0; j<temp_vector.size(); j++)
                    temp_set.insert(temp_vector[j]);

                my_sets.push_back(temp_set);
            }

        }
    }
}

I was wondering whether there is a more effective data structure in C++, or a more efficient algorithm, for this job. Thanks!


1 Answer

Editorial Team · Answer 1 · 2026-06-12T09:22:55+00:00

One of the most efficient ways to implement sets that can be quickly merged is the disjoint-set data structure (also known as union-find).

The idea is to represent each set initially as a linked list, with the head of the list serving as the identifier for the entire set. As sets get merged, nodes are re-pointed to the head to speed up further searches.

The linked article has pseudocode; a C++ implementation should not be too difficult.
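As a rough illustration of the idea, here is a minimal sketch of a disjoint-set forest in C++11. It is not taken from the linked article: the type name DisjointSet and its members are hypothetical, it stores parent links in a vector indexed by 0..n-1 rather than in linked-list nodes, and it adds union by size, which the description above does not mention.

```cpp
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical helper type for illustration; not from the original answer.
struct DisjointSet {
    std::vector<int> parent;  // parent[i] == i means i identifies its set
    std::vector<int> size;    // number of nodes under each representative

    explicit DisjointSet(int n) : parent(n), size(n, 1) {
        std::iota(parent.begin(), parent.end(), 0);  // every node starts alone
    }

    // Return the representative of x and re-point every node visited on the
    // way directly at it, so later searches are shorter.
    int find(int x) {
        int root = x;
        while (parent[root] != root) root = parent[root];
        while (parent[x] != root) {
            int next = parent[x];
            parent[x] = root;
            x = next;
        }
        return root;
    }

    // Merge the sets containing a and b. Returns false if they were
    // already the same set.
    bool unite(int a, int b) {
        a = find(a);
        b = find(b);
        if (a == b) return false;
        if (size[a] < size[b]) std::swap(a, b);  // attach smaller under larger
        parent[b] = a;
        size[a] += size[b];
        return true;
    }
};
```

Two items belong to the same set exactly when find returns the same representative for both.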

You would need to keep a separate map that connects the integers that you have seen so far with their node within the disjoint-set forest. You would go through your data sets, take their items one by one, look up the item in the map, and either follow the link to its set, or create a new “singleton” disjoint set with the item that you are adding.
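The steps above could look roughly like the following C++11 sketch. The function names find_root and merge_groups are hypothetical, and as a simplification the map itself stores each item's parent link instead of pointing to a separate node object. Unlike the loop in the question, which stops at the first matching set, this also handles a data set that shares elements with two earlier groups by joining them.

```cpp
#include <map>
#include <set>
#include <unordered_map>
#include <vector>

// Hypothetical helper: return the representative of x, re-pointing the
// items visited along the way. Assumes x is already a key in parent.
int find_root(std::unordered_map<int, int>& parent, int x) {
    int root = x;
    while (parent[root] != root) root = parent[root];
    while (parent[x] != root) {
        int next = parent[x];
        parent[x] = root;
        x = next;
    }
    return root;
}

// Merge every data set that shares an element with another data set.
std::vector<std::set<int>> merge_groups(
        const std::vector<std::vector<int>>& data) {
    std::unordered_map<int, int> parent;  // item -> parent item in the forest

    for (const std::vector<int>& group : data) {
        for (int item : group) {
            // A new item starts as a singleton; a seen item is left alone.
            parent.emplace(item, item);
            // Join the item's set with the set of the group's first item.
            int a = find_root(parent, group.front());
            int b = find_root(parent, item);
            if (a != b) parent[b] = a;
        }
    }

    // Collect the items under their representatives.
    std::map<int, std::set<int>> by_root;
    for (const std::vector<int>& group : data)
        for (int item : group)
            by_root[find_root(parent, item)].insert(item);

    std::vector<std::set<int>> result;
    for (const auto& entry : by_root) result.push_back(entry.second);
    return result;
}
```

For the four data sets in the question, merge_groups returns two groups: {1, 2, 3} and {7, 8, 20}. The sketch omits union by size or rank to stay short; adding it would tighten the worst-case bounds.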
