What data structure supports the following set operations efficiently both in time and space?
1. union
2. difference
3. ismemberof
4. add
5. delete
I can think of three ways to implement these operations. Suppose we have two sets, each of size N:
Bit Array:
1. O(N)  2. O(N)  3. O(1)  4. O(1)  5. O(1)
HashTable:
1. O(N)  2. O(N)  3. O(1)  4. O(1)  5. O(1)
Ordered Tree:
1. O(NlogN)  2. O(NlogN)  3. O(logN)  4. O(logN)  5. O(logN)
Bit arrays and hash tables are fast but use too much memory; an ordered tree is slower but consumes less memory.
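To make the comparison concrete, here is a minimal Python sketch of the three candidates. It is only an illustration under some assumptions: a Python int stands in for the bit array, the built-in set stands in for the hash table, and a sorted list searched with bisect stands in for the ordered tree (a real balanced tree would make add and delete O(logN), whereas list insertion shifts elements). All helper names are made up for this example.

```python
import bisect

# 1) Bit array: a Python int used as a bitmask. This only works for small
#    non-negative integers, and its size depends on the largest element.
def bits_from(items):
    mask = 0
    for x in items:
        mask |= 1 << x
    return mask

def bits_contains(mask, x):
    return (mask >> x) & 1 == 1

a_bits, b_bits = bits_from([1, 3, 5]), bits_from([3, 4])
bits_union = a_bits | b_bits          # union is a bitwise OR
bits_difference = a_bits & ~b_bits    # difference clears b's bits from a

# 2) Hash table: the built-in set handles any hashable type.
a_hash, b_hash = {1.5, "x", 3}, {3, "y"}
hash_union = a_hash | b_hash
hash_difference = a_hash - b_hash

# 3) Ordered structure: a sorted list as a stand-in for an ordered tree.
#    Lookup is a binary search; elements only need to be comparable.
def sorted_contains(items, x):
    i = bisect.bisect_left(items, x)
    return i < len(items) and items[i] == x

def sorted_add(items, x):
    i = bisect.bisect_left(items, x)
    if i == len(items) or items[i] != x:
        items.insert(i, x)

def sorted_delete(items, x):
    i = bisect.bisect_left(items, x)
    if i < len(items) and items[i] == x:
        del items[i]
```

Note that only the bit array is restricted to integers; the other two accept floats and strings as long as the elements are hashable or mutually comparable, respectively.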
Please note: the sets may contain types other than integers, such as floating-point numbers or strings.
What other data structures are fast, general, and space efficient?
One option is to augment your ordered tree with a Bloom filter to speed up the
ismemberof-type tests. I think that the overall behaviour would be something like:
However, the exact details will depend on the size of the filter, the size of your sets and the size of your domain.
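A rough sketch of the Bloom filter idea in Python, in case it helps. Everything here is an assumption made for illustration: the class names, the filter size (1024 bits, 3 hashes), the use of SHA-256 to derive bit positions, the restriction to string elements, and the sorted list that stands in for the ordered tree.

```python
import bisect
import hashlib

class BloomFilter:
    """Tiny Bloom filter over strings (sizes are arbitrary example values)."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as the bit array

    def _positions(self, item):
        data = item.encode("utf-8")
        for seed in range(self.num_hashes):
            # Prefix a seed byte to get several different hash values.
            digest = hashlib.sha256(bytes([seed]) + data).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item):
        # False means "definitely absent"; True means "possibly present".
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


class FilteredSortedSet:
    """Sorted list (stand-in for an ordered tree) guarded by a Bloom filter."""

    def __init__(self):
        self.items = []
        self.filter = BloomFilter()

    def add(self, item):
        i = bisect.bisect_left(self.items, item)
        if i == len(self.items) or self.items[i] != item:
            self.items.insert(i, item)
            self.filter.add(item)

    def delete(self, item):
        # A plain Bloom filter cannot unset bits, so only the list changes.
        # Lookups stay correct; deleted items just become false positives.
        i = bisect.bisect_left(self.items, item)
        if i < len(self.items) and self.items[i] == item:
            del self.items[i]

    def __contains__(self, item):
        if not self.filter.might_contain(item):
            return False  # definite miss, no search needed
        i = bisect.bisect_left(self.items, item)
        return i < len(self.items) and self.items[i] == item
```

The filter only helps for lookups that miss: a negative answer skips the search entirely, while a positive answer still has to be confirmed in the ordered structure. If deletes are frequent, the filter would need to be rebuilt occasionally or replaced by a counting variant.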
Another option may be Judy Arrays. I’ve heard good things about them for this kind of use, but have no personal experience.
Yet another option is a forest approach (rather than a pure binary tree).