I am looking for an efficient way to convert a large int[] into a string[] of CSV strings, where each CSV string is limited to a maximum of 4000 characters. The values in the array can be anything between 1 and int.MaxValue.
Here is my final code:
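    // Requires: using System.Collections.Generic; using System.Text;
    public static string[] GetCSVsFromArray(int[] array, int csvLimit)
    {
        List<string> parts = new List<string>();
        StringBuilder sb = new StringBuilder();
        foreach (int id in array)
        {
            string intId = id.ToString();
            // Flush the current part if adding this id would reach the limit.
            if (sb.Length > 0 && sb.Length + intId.Length >= csvLimit)
            {
                sb.Length--; // drop the trailing comma
                parts.Add(sb.ToString());
                sb.Length = 0;
            }
            // Always append the id, so the one that triggered a flush is not lost.
            sb.Append(intId).Append(',');
        }
        if (sb.Length > 0)
        {
            sb.Length--; // drop the trailing comma
            parts.Add(sb.ToString());
        }
        return parts.ToArray();
    }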
Is there a more efficient way to do this?
So here is what I am now using (I was able to change the return type to List<string>, which saves the ToArray() call at the end):
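    // Requires: using System.Collections.Generic; using System.Text;
    public static List<string> GetCSVsFromArray(int[] array, int csvLimit)
    {
        List<string> parts = new List<string>();
        StringBuilder sb = new StringBuilder();
        foreach (int id in array)
        {
            string intId = id.ToString();
            // Flush the current part if adding this id would reach the limit.
            if (sb.Length > 0 && sb.Length + intId.Length >= csvLimit)
            {
                sb.Length--; // drop the trailing comma
                parts.Add(sb.ToString());
                sb.Length = 0;
            }
            // Always append the id, so the one that triggered a flush is not lost.
            sb.Append(intId).Append(',');
        }
        if (sb.Length > 0)
        {
            sb.Length--; // drop the trailing comma
            parts.Add(sb.ToString());
        }
        return parts;
    }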
Performance results (10,000,000 items, CSV limit of 4000 characters):
- Original: 2,887.488 ms
- GetIntegerDigitCount: 3,105.355 ms
- Final: 2,883.587 ms
Whilst I only saved 4 ms by removing the ToArray() call on my developer machine, this seems to make a significant difference on a much slower machine (it saved over 200 ms on a Dell D620).
You are doing a lot of heap allocations by creating a new string for each number just to calculate its number of digits. Use the following method to calculate the number of digits instead (see method below).
So instead of
Just use:
Results:
EDIT: More results on large csv limit
Code I’ve used to measure time:
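A minimal sketch of how such a measurement could be set up, assuming System.Diagnostics.Stopwatch. The class name Timing and both helper methods are hypothetical, and this is not necessarily the code that produced the numbers quoted in this thread:

```csharp
using System;
using System.Diagnostics;

public static class Timing
{
    // Hypothetical helper: runs the action once and returns the elapsed
    // wall-clock time in milliseconds.
    public static double MeasureMilliseconds(Action action)
    {
        Stopwatch watch = Stopwatch.StartNew();
        action();
        watch.Stop();
        return watch.Elapsed.TotalMilliseconds;
    }

    // Hypothetical helper: builds test input containing the ids 1..count.
    public static int[] BuildSequentialIds(int count)
    {
        int[] ids = new int[count];
        for (int i = 0; i < count; i++)
        {
            ids[i] = i + 1;
        }
        return ids;
    }
}
```

It could be called as Timing.MeasureMilliseconds(() => GetCSVsFromArray(ids, 4000)), ideally after one untimed warm-up call so that JIT compilation does not skew the first result.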
GetIntegerDigitCount:
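A minimal sketch of what such a digit counter could look like, assuming non-negative input (the question states that values lie between 1 and int.MaxValue). The class name DigitCounter and the comparison-chain approach are assumptions, not necessarily the original implementation:

```csharp
using System;

public static class DigitCounter
{
    // Returns the number of decimal digits in a non-negative int
    // by comparing against powers of ten, without allocating a string.
    public static int GetIntegerDigitCount(int value)
    {
        if (value < 0) throw new ArgumentOutOfRangeException(nameof(value));
        if (value < 10) return 1;
        if (value < 100) return 2;
        if (value < 1000) return 3;
        if (value < 10000) return 4;
        if (value < 100000) return 5;
        if (value < 1000000) return 6;
        if (value < 10000000) return 7;
        if (value < 100000000) return 8;
        if (value < 1000000000) return 9;
        return 10; // int.MaxValue (2147483647) has 10 digits
    }
}
```

In the loop, the length check would then read sb.Length + DigitCounter.GetIntegerDigitCount(id) < csvLimit, and the number could be appended with sb.Append(id). Note that StringBuilder still has to format the integer when appending, which may be why this variant did not come out faster in the question's measurements.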