Okay, this problem is indeed a challenge!
Background
I am working on an arithmetic-based project involving larger than normal numbers. I new I was going to be working with a worst-case-senario of 4 GB-caped file sizes (I was hopping to even extend that to a 5GB cap as I have seen file sizes greater than 4 GB before – specifically image *.iso files)
The Question In General
Now, the algorithm(s) to which I will apply computation to do not matter at the moment, but the loading and handling of such large quantities of data – the numbers – do.
- A
System.IO.File.ReadAllBytes(String)can only read a cap of 2 GB worth of file data, so this is my first problem – How will I go about loading and/or configuring to access memory, such file sizes – twice as much, if not more? - Next, I was writing my own class to treat a ‘stream’ or array of bytes as a big number and add multiple operator methods to perform hex arithmetic until I read about the
System.Numerics.BigInteger()class online – but being that there is noBigInteger.MaxValueand that I can only load a max of 2 GB of data at a time, I don’t know what the potential ofBigIntegerwould be – even compared to the object I was writing calledNumber()(which does have my desired minimum potential). There were also issues with available memory and performance, though I do not care so much about speed, but rather completing this experimental process successfully.
Summary
- How should I load 4-5 gigabytes of data?
- How should I store and handle the data after having been loaded? Stick with
BigIntegeror finish my ownNumberclass? - How should I handle such large quantities of memory during runtime without running out of memory? I’ll be treating the 4-5 GB of data like any other number instead of an array of bytes – performing such arithmetic as division and multiplication.
PS I cannot reveal too much information about this project under a non-discloser agreement. 😉
For those who would like to see a sample operator from my Number object for a per-byte array adder(C#):
public static Number operator +(Number n1, Number n2)
{
// GB5_ARRAY is a cap constant for 5 GB - 5368709120L
byte[] data = new byte[GB5_ARRAY];
byte rem = 0x00, bA, bB, rm, dt;
// Iterate through all bytes until the second to last
// The last byte is the remainder if any
// I tested this algorithm on smaller arrays provided by the `BitConverter` class,
// then I made a few tweeks to satisfy the larger arrays and the Number object
for (long iDx = 0; iDx <= GB5_ARRAY-1; iDx++)
{
// bData is a byte[] with GB5_ARRAY number of bytes
// Perform a check - solves for unequal (or jagged) arrays
if (iDx < GB5_ARRAY - 1) { bA = n1.bData[iDx]; bB = n2.bData[iDx]; } else { bA = 0x00; bB = 0x00; }
Add(bA, bB, rem, out dt, out rm);
// set data and prepare for the next interval
rem = rm; data[iDx] = dt;
}
return new Number(data);
}
private static void Add(byte a, byte b, byte r, out byte result, out byte remainder)
{
int i = a + b + r;
result = (byte)(i % 256); // find the byte amount through modulus arithmetic
remainder = (byte)((i - result) / 256); // find remainder
}
Use
FileStream: http://msdn.microsoft.com/en-us/library/system.io.filestream.aspx