I’m using C# but even if you don’t know it, it should be pretty easy to follow along with this question.
Here’s my problem: I have some objects that I’d like to keep in a hashset-like data structure so that I can look them up based on an int ID. These objects have mutable properties, so hashing them is not an option (I would need something constant about them to hash, yes?).
What I’ve done is develop the following interface:
public interface IUniqueIDCollection
{
// Can return any int that hasn't been requested yet.
public int RequestUniqueID();
// Undoes the requesting of an int
public int ReleaseUniqueID(int uniqueID);
}
My initial thought is to just store an internal counter in the IUniqueIDCollection that increments as IDs are requested. However, once IDs are released, I would have to keep track of the ranges or individual IDs that have been removed. I think the latter would be better. But if I used a counter (or any cyclic function) to generate the IDs, then once the counter wraps around I would have to check my way through runs of IDs that have been successively requested but not released.
The heuristics are these: Let’s say a maximum of 5,000 IDs will be requested at once. HOWEVER, very often IDs will be requested and then released. Releasing will tend to happen in ranges, i.e. maybe 100 will be requested all at once, and then all 100 will be released in a short time interval.
I know I could use a GUID or something instead of an int, but I’d like to save on the space, bandwidth, and processing time the IDs take up.
So my question is: What should the request and release methods look like in the interface I gave above, in terms of pseudo code, given the heuristics?
If you’re sure that released IDs are safe to be reused immediately (i.e., there won’t be stale references to old IDs hanging around that would be confused if a new object were assigned a recently released ID), you can use the released IDs first. So when an ID is released, you put it at the end of a queue. When a new ID is requested, you use the first one in the queue. If the queue is empty, you increment the internal counter and give out the new number.
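The queue-plus-counter idea above could be sketched roughly like this in C#. The class name `QueueBackedIdPool` is made up for illustration. The `HashSet<int>` that rejects double releases is an extra assumption, not part of the description above. This sketch also returns `void` from the release method, since the question doesn't say what the `int` return value would mean.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical name; a sketch of the "reuse released IDs first" approach.
public sealed class QueueBackedIdPool
{
    // Released IDs waiting to be handed out again, oldest first.
    private readonly Queue<int> _released = new Queue<int>();

    // Assumption (not in the description above): track live IDs so that
    // releasing an unknown or already-released ID is caught early.
    private readonly HashSet<int> _inUse = new HashSet<int>();

    // Next ID that has never been issued.
    private int _next;

    public int RequestUniqueID()
    {
        // Prefer a recycled ID; only mint a new one when the queue is empty.
        int id = _released.Count > 0 ? _released.Dequeue() : checked(_next++);
        _inUse.Add(id);
        return id;
    }

    public void ReleaseUniqueID(int uniqueID)
    {
        if (!_inUse.Remove(uniqueID))
            throw new ArgumentException("ID is not currently in use.", nameof(uniqueID));
        _released.Enqueue(uniqueID);
    }
}
```

Because a fresh number is only minted when the queue is empty, the counter never climbs past the peak number of IDs in use at once. With the 5,000 figure from the question, the largest ID handed out would be 4,999, so wrap-around should not come up in practice.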
Advantage of this implementation:
Disadvantages: