I would like to store equivalences from a connected-component labeling algorithm. It’s basically a kind of map from one value (one label’s ID) to multiple values (the IDs of labels that are equivalent to the former).
I have already done something like this but it does not work really well:
std::map<unsigned short, std::list<unsigned int>> equivalences;
for(int i = 0; i < MAX_NUMBER_OF_LABELS; ++i )
{
    std::list<unsigned int> temp;
    temp.push_back(i);
    // note that a label is equivalent to itself
    equivalences.insert( std::pair< int, std::list<unsigned int>>(i, temp) );
}
Then I add each actual equivalence with:
equivalences.at( i ).push_back( equivalent_labels_int );
The main drawback of this method is that I have to declare the map's size up front (it has to be big enough), and for large sizes (e.g. 9999) the initialization takes approximately 2.5 s.
Does anyone have a better idea?
You do not need to size the map up front in C++ (or in most languages, for that matter). Maps grow dynamically as new elements are added to them, so whenever you find a new key, you can simply add it to the map. For example, equivalences[i].push_back(equivalent_labels_int); works without any prior initialization, because the map's square brackets operator (operator[]) automatically adds a new key/value pair to the map with the given key and a default value if one doesn't already exist.
Additionally, I would advise against using list as the container for storing the sequence of connected blobs. list is good when you don't need random access and are frequently removing elements in the middle of the sequence, which I don't think you're actually doing here. Instead, I would suggest using vector or deque, since those structures are more space-efficient and have better locality.
Finally, depending on your particular needs, you may want to switch data structures entirely. If your algorithm works by running a depth-first search out from some starting point and then storing all of the results it encounters, the approach you have now may be quite good. However, if your algorithm instead works by finding pairs of points that are similar and then merging together the blobs they contain, you may be interested in the disjoint-set forest data structure, which has a simple implementation but extremely good performance. That said, using this structure loses you the ability to list which points are connected to a given point, but the boost in efficiency is pretty remarkable.
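
Here is a rough sketch of both ideas, in case it makes them more concrete. The names EquivalenceMap, add_equivalence and DisjointSet are made up for illustration and do not come from any library. The first part shows a map of vectors that grows on demand through operator[]. The second part is a minimal disjoint-set forest, assuming labels are dense integers in the range [0, count). Treat it as a starting point under those assumptions rather than a definitive implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <numeric>
#include <utility>
#include <vector>

// Part 1: a map that grows on demand (hypothetical names, for illustration).
using EquivalenceMap = std::map<unsigned int, std::vector<unsigned int>>;

// Records that `other` is equivalent to `label`. operator[] default-constructs
// an empty vector the first time `label` is seen, so no up-front
// initialization loop over all possible labels is needed.
void add_equivalence(EquivalenceMap& equivalences,
                     unsigned int label, unsigned int other) {
    std::vector<unsigned int>& labels = equivalences[label];
    if (labels.empty()) {
        labels.push_back(label);  // a label is equivalent to itself
    }
    labels.push_back(other);
}

// Part 2: a minimal disjoint-set forest (union by size, path halving).
// Assumes labels are integers in [0, count).
class DisjointSet {
public:
    explicit DisjointSet(std::size_t count) : parent_(count), size_(count, 1) {
        // Every label starts out as the root of its own one-element set.
        std::iota(parent_.begin(), parent_.end(), 0u);
    }

    // Returns the representative label of the set containing x.
    unsigned int find(unsigned int x) {
        assert(x < parent_.size());
        while (parent_[x] != x) {
            parent_[x] = parent_[parent_[x]];  // path halving
            x = parent_[x];
        }
        return x;
    }

    // Merges the sets containing a and b (no-op if already merged).
    void unite(unsigned int a, unsigned int b) {
        a = find(a);
        b = find(b);
        if (a == b) return;
        if (size_[a] < size_[b]) std::swap(a, b);
        parent_[b] = a;  // attach the smaller tree under the larger one
        size_[a] += size_[b];
    }

    // True if a and b have been recorded as equivalent, directly or not.
    bool same(unsigned int a, unsigned int b) { return find(a) == find(b); }

private:
    std::vector<unsigned int> parent_;
    std::vector<std::size_t> size_;
};
```

With the map version you call add_equivalence(equivalences, i, equivalent_labels_int) whenever you discover an equivalence. With the disjoint-set version you call unite(a, b) for each pair of equivalent labels and, in the relabeling pass, replace each label by find(label). The DisjointSet constructor does take a size, but filling two vectors of a few thousand integers should be far cheaper than building thousands of list nodes inside a map.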
Hope this helps!