I have a C# BitArray that is fairly large (500,000 bits in length), and I am trying to get the indexes of all the set bits in the array. Currently I am achieving this with:
    public int[] GetIndexesForPositives()
    {
        var idIndexes = new int[GetPositiveCount + 1];
        var idx = 0;
        for (var i = 0; i < Length; i++)
        {
            if (Get(i))
            {
                idIndexes[idx++] = i;
            }
        }
        return idIndexes;
    }
I create an empty array sized to the known number of positive bits, then I loop over the BitArray and add each set index to the return array.
This means I have to perform 500,000 iterations over the array, and it's not exactly fast (it takes around 15 ms).
I know the BitArray uses an integer array under the covers (I used it to write the GetPositiveCount function, via an algorithm I got off Stack Overflow). I wonder if there is an algorithm to do this as well?
If you are able to get at the int array underlying the BitArray, this should provide much better performance:
Assuming you don’t know the number of bits that are set:
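A minimal sketch of this case, not necessarily the exact code that was benchmarked. It assumes the words are obtained by copying the BitArray into an int[] with BitArray.CopyTo; the class name BitArrayScan is hypothetical. Words that are zero are skipped, so 32 clear bits cost a single comparison.

```csharp
using System.Collections;
using System.Collections.Generic;

public static class BitArrayScan // hypothetical helper class
{
    // Returns the indexes of all set bits when the count is not known up front.
    public static int[] GetIndexesForPositives(BitArray bits)
    {
        // Copy the bits into 32-bit words; bit j of word i is bit (i * 32 + j).
        var words = new int[(bits.Length + 31) / 32];
        bits.CopyTo(words, 0);

        // Defensively clear any padding bits beyond bits.Length in the last word.
        var extra = bits.Length % 32;
        if (extra != 0) words[words.Length - 1] &= (1 << extra) - 1;

        var indexes = new List<int>();
        for (var i = 0; i < words.Length; i++)
        {
            var word = words[i];
            if (word == 0) continue; // 32 clear bits skipped at once

            for (var j = 0; j < 32; j++)
            {
                if ((word & (1 << j)) != 0)
                    indexes.Add(i * 32 + j);
            }
        }
        return indexes.ToArray();
    }
}
```

Usage would be something like var indexes = BitArrayScan.GetIndexesForPositives(myBits);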
If you do know the number of bits that are set you can do this instead:
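A sketch of the known-count variant under the same assumptions (words obtained via BitArray.CopyTo). The class name BitArrayScanKnownCount and the positiveCount parameter are hypothetical; the count is assumed to come from something like the asker's GetPositiveCount. The result array is allocated once at its exact size, so no list growth or final copy is needed.

```csharp
using System.Collections;

public static class BitArrayScanKnownCount // hypothetical helper class
{
    // Returns the indexes of all set bits, given the exact number of set bits.
    // If positiveCount is too small, the write below throws IndexOutOfRangeException.
    public static int[] GetIndexesForPositives(BitArray bits, int positiveCount)
    {
        // Copy the bits into 32-bit words; bit j of word i is bit (i * 32 + j).
        var words = new int[(bits.Length + 31) / 32];
        bits.CopyTo(words, 0);

        // Defensively clear any padding bits beyond bits.Length in the last word.
        var extra = bits.Length % 32;
        if (extra != 0) words[words.Length - 1] &= (1 << extra) - 1;

        var indexes = new int[positiveCount];
        var idx = 0;
        for (var i = 0; i < words.Length; i++)
        {
            var word = words[i];
            if (word == 0) continue; // 32 clear bits skipped at once

            for (var j = 0; j < 32; j++)
            {
                if ((word & (1 << j)) != 0)
                    indexes[idx++] = i * 32 + j;
            }
        }
        return indexes;
    }
}
```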
My tests have these two working faster than your method, even the one that doesn’t know how large the return array will be in the first place.
My results, tested using a random BitArray of 50 million records:
Edit: Furthermore, it is worth pointing out that this is a problem that could really benefit from being made multi-threaded. Break the BitArray up into 4 parts and you have 4 threads that can check the data at once.
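One way the split could look, as a rough sketch rather than a measured implementation. It assumes Parallel.For from System.Threading.Tasks is acceptable and that the words are obtained via BitArray.CopyTo; the class name BitArrayScanParallel and the parts parameter are hypothetical. Each part scans its own range of words into its own list, and the lists are concatenated in order, so the result stays sorted.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class BitArrayScanParallel // hypothetical helper class
{
    // Scans the words in `parts` independent ranges (parts must be >= 1).
    public static int[] GetIndexesForPositives(BitArray bits, int parts = 4)
    {
        var words = new int[(bits.Length + 31) / 32];
        bits.CopyTo(words, 0);

        // Defensively clear any padding bits beyond bits.Length in the last word.
        var extra = bits.Length % 32;
        if (extra != 0) words[words.Length - 1] &= (1 << extra) - 1;

        var chunk = (words.Length + parts - 1) / parts; // words per part, rounded up
        var partial = new List<int>[parts];

        Parallel.For(0, parts, p =>
        {
            var list = new List<int>();
            var start = p * chunk;
            var end = Math.Min(words.Length, start + chunk);
            for (var i = start; i < end; i++)
            {
                var word = words[i];
                if (word == 0) continue;
                for (var j = 0; j < 32; j++)
                {
                    if ((word & (1 << j)) != 0)
                        list.Add(i * 32 + j);
                }
            }
            partial[p] = list; // each part writes only its own slot
        });

        // Concatenate in part order so the indexes remain ascending.
        var result = new List<int>();
        foreach (var list in partial) result.AddRange(list);
        return result.ToArray();
    }
}
```

Whether this pays off depends on the array size; for small inputs the scheduling overhead may outweigh the gain.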
Edit: I know this is already accepted, but here's another thing you can do to improve performance if you know that most of the time your list will be very sparse:
It is slightly slower when the list is more than 40% populated; however, if you know the list is always going to be 10% 1s and 90% 0s, then this will run even faster for you.
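One way to exploit sparsity, which is not necessarily the variant measured above, is to visit only the set bits of each word instead of testing all 32 positions. The sketch below assumes .NET Core 3.0 or later for System.Numerics.BitOperations, and the class name BitArrayScanSparse is hypothetical. It repeatedly takes the position of the lowest set bit and then clears that bit, so the work per word is proportional to the number of 1s in it.

```csharp
using System.Collections;
using System.Collections.Generic;
using System.Numerics;

public static class BitArrayScanSparse // hypothetical helper class
{
    // Returns the indexes of all set bits, touching only the bits that are set.
    public static int[] GetIndexesForPositives(BitArray bits)
    {
        var words = new int[(bits.Length + 31) / 32];
        bits.CopyTo(words, 0);

        // Defensively clear any padding bits beyond bits.Length in the last word.
        var extra = bits.Length % 32;
        if (extra != 0) words[words.Length - 1] &= (1 << extra) - 1;

        var indexes = new List<int>();
        for (var i = 0; i < words.Length; i++)
        {
            var word = (uint)words[i];
            while (word != 0)
            {
                // Position of the lowest set bit within this word.
                var j = BitOperations.TrailingZeroCount(word);
                indexes.Add(i * 32 + j);
                word &= word - 1; // clear the lowest set bit
            }
        }
        return indexes.ToArray();
    }
}
```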