Situation:
overview:
I have something like this:
std::vector<SomeType> values;
std::vector<int> indexes;
struct Range{
    int firstElement; // first element to be used in the indexes array
    int numElements;  // number of elements to be used from the indexes array
    int minIndex;     /* minimum index encountered between firstElement
                         and firstElement+numElements */
    int maxIndex;     /* maximum index encountered between firstElement
                         and firstElement+numElements */
    Range()
    :firstElement(0), numElements(0), minIndex(0), maxIndex(0){
    }
};
std::vector<Range> ranges;
I need to sort values, remap indexes, and recalculate ranges to minimize maxIndex-minIndex for each range.
details:
values is an array (okay, “vector”) of some type (irrelevant which one). Elements in values may be unique, but this is not guaranteed.
indexes is a vector of ints. Each element in indexes is an index that corresponds to some element in values. Elements in indexes are not unique; one value may repeat multiple times. And indexes.size() >= values.size().
Now, ranges correspond to a “chunk” of data from indexes. firstElement is the index of the first element to be used from indexes (i.e. used like this: indexes[range.firstElement]), numElements is (obviously) the number of elements to be used, minIndex is the minimum in (indexes[firstElement]…indexes[firstElement+numElements-1]) and maxIndex is the maximum in (indexes[firstElement]…indexes[firstElement+numElements-1]). Ranges never overlap.
I.e. for every two distinct ranges a, b
((a.firstElement >= b.firstElement) && (a.firstElement < (b.firstElement+b.numElements))) == false
Obviously, when I do any operation on values (swap two elements, etc.), I need to update indexes (so they keep pointing to the same value) and recalculate the corresponding ranges, so that each range’s minIndex and maxIndex are correct.
Now, I need to rearrange values in a way that minimizes Range.maxIndex - Range.minIndex. I do not need the “best” result after packing; having a “probably the best” or “good” packing will be enough.
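To make the bookkeeping concrete, here is a minimal sketch of swapping two values while keeping indexes and ranges consistent. The helper names recalcRange and swapValues are made up for illustration, the code assumes C++11, and it assumes SomeType can be swapped with std::swap. It rescans everything on each swap, so it is meant to show the invariant rather than to be fast.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

struct Range {
    int firstElement = 0;
    int numElements = 0;
    int minIndex = 0;
    int maxIndex = 0;
};

// Hypothetical helper: recompute minIndex/maxIndex from the slice of
// indexes that the range covers.
void recalcRange(Range& r, const std::vector<int>& indexes) {
    if (r.numElements <= 0) {
        r.minIndex = r.maxIndex = 0;
        return;
    }
    assert(r.firstElement >= 0 &&
           r.firstElement + r.numElements <= static_cast<int>(indexes.size()));
    auto first = indexes.begin() + r.firstElement;
    auto mm = std::minmax_element(first, first + r.numElements);
    r.minIndex = *mm.first;
    r.maxIndex = *mm.second;
}

// Hypothetical helper: swap values[a] and values[b], then patch every
// index so it still refers to the same value, then refresh all ranges.
template <typename T>
void swapValues(std::vector<T>& values, std::vector<int>& indexes,
                std::vector<Range>& ranges, int a, int b) {
    if (a == b) return;
    std::swap(values[a], values[b]);
    for (int& idx : indexes) {
        if (idx == a) idx = b;
        else if (idx == b) idx = a;
    }
    for (Range& r : ranges) recalcRange(r, indexes);
}
```

Calling swapValues(values, indexes, ranges, 1, 3) exchanges values[1] and values[3]; the sequence produced by values[indexes[i]] stays the same, only minIndex and maxIndex of the affected ranges change.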
problem:
Remapping indexes and recalculating ranges is easy. The problem is that I’m not sure how to sort the elements in values, because the same index may be encountered in multiple ranges.
Any ideas about how to proceed?
restrictions:
Changing the container type is not allowed. Containers should be array-like. No maps, no lists.
But you’re free to use whatever container you want during the sorting.
Also, no boost or external libraries – pure C++/STL. I really need only an algorithm.
additional info:
There is no greater/less comparison defined for SomeType – only equality/non-equality.
But there should be no need to ever compare two values, only indexes.
The goal of algorithm is to make sure that output of
for (size_t i = 0; i < indexes.size(); i++){
print(values[indexes[i]]); //hypothetical print function
}
Will be identical before and after sorting, while also making sure that for each range
Range.maxIndex-Range.minIndex (after sorting) is as small as possible to achieve with reasonable effort.
I’m not looking for a “perfect” or “most optimal” solution, having a “probably perfect” or “probably most optimal” solution should be enough.
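The two requirements above can be checked mechanically. The following is a small sketch (C++11 assumed; dereference and totalSpan are hypothetical helper names, not part of the original code): one function captures the sequence values[indexes[i]] so it can be compared before and after sorting using only equality, and the other sums maxIndex-minIndex over all ranges as a rough quality measure.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Range {
    int firstElement = 0;
    int numElements = 0;
    int minIndex = 0;
    int maxIndex = 0;
};

// Hypothetical helper: the sequence that the print loop would output.
// Comparing two such sequences needs only operator== on the element type.
template <typename T>
std::vector<T> dereference(const std::vector<T>& values,
                           const std::vector<int>& indexes) {
    std::vector<T> out;
    out.reserve(indexes.size());
    for (int idx : indexes) {
        assert(idx >= 0 && static_cast<std::size_t>(idx) < values.size());
        out.push_back(values[idx]);
    }
    return out;
}

// Hypothetical helper: sum of (maxIndex - minIndex) over all ranges;
// smaller means tighter packing.
long totalSpan(const std::vector<Range>& ranges) {
    long sum = 0;
    for (const Range& r : ranges) sum += r.maxIndex - r.minIndex;
    return sum;
}
```

A sorting attempt is acceptable if dereference(values, indexes) compares equal before and after, and it is an improvement if totalSpan(ranges) went down.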
P.S. This is NOT a homework.
Okay, it looks like there is only one way to reliably solve this problem:
Make sure that no index is ever used by two ranges at once by duplicating values.
I.e. scan the entire array of indexes, and when you find an index (of a value) that is being used in more than one range, add a copy of that value for each range, each with a unique index. After that the problem becomes trivial: you simply sort values so that the values array first contains the values used only by the first range, then the values for the 2nd range, and so on. I.e. this gives maximum packing.
Because in my app it is more important to minimize sum(ranges[i].maxIndex-ranges[i].minIndex) than to minimize the number of values, this approach works for me.
I do not think that there is another reliable way to solve the problem. It is quite easy to get a situation where some indexes are used by every range, and in this case it will not be possible to “pack” the data no matter what you do. Even allowing an index to be used by only two ranges at once will lead to problems: you can get ranges a, b and c where a and b, b and c, and a and c each have common indexes. In this case it also won’t be possible to pack the data.
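A minimal sketch of this duplicate-and-pack idea follows. It is one possible way to write it, not necessarily how the original app does it: duplicateAndPack is a hypothetical name, C++11 is assumed, and SomeType is assumed to be copyable. Instead of first detecting shared indexes, it simply rebuilds values range by range, so a value used by several ranges gets one copy per range. Positions in indexes that belong to no range share copies appended at the end, and values that no index refers to are dropped.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Range {
    int firstElement = 0;
    int numElements = 0;
    int minIndex = 0;
    int maxIndex = 0;
};

template <typename T>
void duplicateAndPack(std::vector<T>& values, std::vector<int>& indexes,
                      std::vector<Range>& ranges) {
    const int kUnmapped = -1;
    std::vector<T> packed;                                  // new values array
    std::vector<int> newIndexes(indexes.size(), kUnmapped); // new indexes array
    std::vector<int> remap(values.size(), kUnmapped);       // old index -> new index

    for (Range& r : ranges) {
        if (r.numElements <= 0) continue;
        const int end = r.firstElement + r.numElements;
        assert(r.firstElement >= 0 && end <= static_cast<int>(indexes.size()));
        const int base = static_cast<int>(packed.size());
        for (int i = r.firstElement; i < end; ++i) {
            const int old = indexes[i];
            if (remap[old] == kUnmapped) {      // first use inside this range
                remap[old] = static_cast<int>(packed.size());
                packed.push_back(values[old]);  // private copy for this range
            }
            newIndexes[i] = remap[old];
        }
        // This range now owns the contiguous block [base, packed.size() - 1].
        r.minIndex = base;
        r.maxIndex = static_cast<int>(packed.size()) - 1;
        // Forget the mapping so the next range gets its own copies.
        for (int i = r.firstElement; i < end; ++i) remap[indexes[i]] = kUnmapped;
    }

    // Positions not covered by any range: one shared copy per old index.
    for (std::size_t i = 0; i < indexes.size(); ++i) {
        if (newIndexes[i] != kUnmapped) continue;
        const int old = indexes[i];
        if (remap[old] == kUnmapped) {
            remap[old] = static_cast<int>(packed.size());
            packed.push_back(values[old]);
        }
        newIndexes[i] = remap[old];
    }

    values.swap(packed);
    indexes.swap(newIndexes);
}
```

After the call, each non-empty range spans exactly as many indexes as it has distinct values (maxIndex-minIndex+1), which is the smallest span possible for that range, at the cost of a larger values array.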