Suppose we have a graph with bidirectional edges and no weights. How can I store it so that I don’t waste tons of memory and still have fast access to every vertex’s neighbors? Until now, for something like {(1,2)(1,5)(1,3)(2,4)(2,3)} I have been using an array: array[1][2]=1 means that there is a connection between 1 and 2. There are three problems with that:
a) as the graph is bidirectional, (1,2) means (2,1) exists as well. If I want to have easy access to 2’s neighbors later, I have to make two changes per iteration: array[1][2]=1, array[2][1]=1
b) when I know some vertex (say 5) has only one neighbor left, I have to search through the whole array[5][x], checking every possible x
c) for a graph of a million vertexes, this monster becomes too vast to be used in any competition
Could you please help me and point me to a solution to my problems?
Looks like you want a map of sets.
std::map< int, std::set< int > >
So for an int you can store a collection of all its neighbours in the set. You will want functions to manipulate this collection.
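As a rough sketch of what those manipulation functions might look like (the names Graph, add_edge, remove_edge, has_edge and only_neighbour are made up for illustration, not part of any library), something along these lines keeps both directions of each edge in sync and answers the "one neighbour left" case without scanning a whole row:

```cpp
#include <cassert>
#include <map>
#include <set>

// Hypothetical wrapper around a map of sets; all names are illustrative.
struct Graph {
    std::map<int, std::set<int>> adj;

    // Record an undirected edge once per endpoint so either lookup works.
    void add_edge(int u, int v) {
        adj[u].insert(v);
        adj[v].insert(u);
    }

    // Remove the edge from both endpoints; a missing edge is ignored.
    void remove_edge(int u, int v) {
        auto iu = adj.find(u);
        if (iu != adj.end()) iu->second.erase(v);
        auto iv = adj.find(v);
        if (iv != adj.end()) iv->second.erase(u);
    }

    bool has_edge(int u, int v) const {
        auto it = adj.find(u);
        return it != adj.end() && it->second.count(v) > 0;
    }

    // For a vertex known to have exactly one neighbour left, return it
    // directly instead of scanning every possible vertex.
    int only_neighbour(int u) const {
        auto it = adj.find(u);
        assert(it != adj.end() && it->second.size() == 1);
        return *it->second.begin();
    }
};
```

With the edges from the question, only_neighbour(5) looks up vertex 5 and reads the single element of its set, rather than checking every x in array[5][x].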
If the node numbers are contiguous, i.e. they range from 0 to N and include all these numbers, then you can use
std::vector< std::set<int> >
and it would be more efficient to do so. You could also use std::vector< std::bitset<N> > or std::vector< boost::dynamic_bitset<> > if you have, say, 20,000 nodes and can therefore afford 20,000 bitsets of 2500 bytes (plus a bit of overhead) each = 50MB of memory (approx). This is a slightly more compact model than the one you have, but not by a lot. If you have a million vertices it will be about 125GB, so obviously you can’t use this model and should use the set instead. Also, for a sparse graph, iterating over a vertex’s neighbours is a much faster operation with a set than with a bitset, because the set holds only the neighbours while the bitset has to be scanned bit by bit.
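A minimal sketch of the vector-of-sets variant, assuming the vertices are numbered 0..n-1 (the type name DenseIdGraph and its member functions are hypothetical, chosen only for this example):

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical type; assumes vertex ids are exactly 0..n-1.
struct DenseIdGraph {
    std::vector<std::set<int>> adj;

    explicit DenseIdGraph(std::size_t n) : adj(n) {}

    // True when u is a valid index into adj.
    bool valid(int u) const {
        return u >= 0 && static_cast<std::size_t>(u) < adj.size();
    }

    // Record an undirected edge in both endpoints' sets.
    void add_edge(int u, int v) {
        assert(valid(u) && valid(v));
        adj[u].insert(v);
        adj[v].insert(u);
    }

    // Copies out only the stored neighbours of u, in ascending order;
    // no scan over all n possible vertices is needed.
    std::vector<int> neighbours(int u) const {
        assert(valid(u));
        return std::vector<int>(adj[u].begin(), adj[u].end());
    }
};
```

Indexing the vector is a direct array access, whereas the map needs a tree lookup per vertex, which is one reason the vector form tends to be the cheaper of the two when the numbering allows it.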
Provided the vertices are sequentially numbered, there is no advantage of map over vector unless there are many vertices with no neighbours at all.
Not sure how much memory you call “tons”. The vector-of-bitsets model I just outlined uses a fixed amount of memory for a given number of vertices, however many edges there are, whereas the map of sets uses memory proportional to the number of neighbourhood-relationships you have. As the graph gets full, though, the sets will be far less compact than the vector of bitsets and so will consume more.
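To make the bitset figures above easy to re-check for other graph sizes, here is a small sketch of the arithmetic (the function name bitset_matrix_bytes is made up for this example, and per-object overhead is deliberately ignored):

```cpp
#include <cstdint>

// Bytes needed for n bitsets of n bits each, rounding each bitset up
// to whole bytes and ignoring any per-object overhead (an assumption).
constexpr std::uint64_t bitset_matrix_bytes(std::uint64_t n) {
    return n * ((n + 7) / 8);
}

// 20,000 vertices: 20,000 bitsets of 2,500 bytes each.
static_assert(bitset_matrix_bytes(20000) == 50000000ULL,
              "about 50MB for 20,000 vertices");
```

The same function gives 125,000,000,000 bytes for a million vertices, which is the 125GB figure. No comparable formula is given for the sets, since the per-element overhead of std::set depends on the implementation.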