I have a large (10 MB) CSV file. I parsed it and loaded it into memory as a generic list.
I created a class to represent each line. The class has only a few fields (of types IPAddress and String).
I thought that since the file is only 10 megabytes, I could expect a similar size in memory.
I was quite surprised to find that the method that creates the list allocates 300 MB and does not free it.
Is this normal, and what could be causing it?
Note that the CSV file has many lines (100,000+), which could be a factor.
    Imports System
    Imports System.Collections.Generic
    Imports System.IO
    Imports System.Linq
    Imports System.Net
    Imports System.Web

    Namespace Geo
        ' One parsed line of the CSV file.
        Public Class CountryMarker
            Public StartAddress As IPAddress
            Public EndAddress As IPAddress
            Public Country As String
            Public CountryCode As String
        End Class

        Public Class Markers
            Private Const DatabasePath = "~/App_Data/ip.csv" '10 MB file
            ' Filled once by the shared initializer and referenced from then on.
            Public Shared List As List(Of CountryMarker) = LoadData()

            Shared Function LoadData() As List(Of CountryMarker)
                Dim Markers As New List(Of CountryMarker)
                Using Stream = New IO.FileStream(Hosting.HostingEnvironment.MapPath(DatabasePath), FileMode.Open)
                    Dim Reader = New StreamReader(Stream)
                    Do While Reader.Peek > -1
                        Dim Line = Reader.ReadLine()
                        ' Split on commas and strip the double quotes from each field.
                        Dim Values = Line.Split(",").Select(Function(i) i.Replace("""", ""))
                        Markers.Add(New CountryMarker With {.Country = Values(5), .CountryCode = Values(4), .StartAddress = IPAddress.Parse(Values(0)), .EndAddress = IPAddress.Parse(Values(1))})
                    Loop
                End Using
                Return Markers
            End Function
        End Class
    End Namespace
First, if the file is ASCII text, or UTF-8 with predominantly Western European characters (like English), then the in-memory size of the text will be at least double the file’s size on disk. .NET stores strings as 16-bit (UTF-16) code units, so “A”, for example, takes one byte in a text file but two bytes in memory.
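To make the doubling concrete, here is a minimal sketch that compares the UTF-8 byte count of one line with the size of its character data in memory. The sample line is hypothetical (I am guessing at a six-column, quoted layout from the indexes used in the question), so treat the numbers it prints as illustrative only.

```vbnet
Imports System
Imports System.Text

Module StringSizeSketch
    Sub Main()
        ' Hypothetical sample line; the real file's layout is an assumption.
        Dim line As String = """1.0.0.0"",""1.0.0.255"",""16777216"",""16777471"",""AU"",""Australia"""

        ' Bytes this text occupies in a UTF-8 (or ASCII) file.
        Dim bytesOnDisk As Integer = Encoding.UTF8.GetByteCount(line)

        ' Bytes of character data once loaded: each Char is 2 bytes.
        Dim bytesInMemory As Integer = line.Length * 2

        Console.WriteLine("On disk:   {0} bytes", bytesOnDisk)
        Console.WriteLine("In memory: {0} bytes of character data, before any object overhead", bytesInMemory)
    End Sub
End Module
```

For pure ASCII text the second figure is exactly twice the first, and that is before counting the per-string object overhead.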
Each class instance that you create is going to require at least 24 bytes (16 bytes for the allocation, plus 8 bytes for the reference). If your file is 100,000 lines, that’s 2.4 megabytes, minimum. In addition, every string that you allocate will require 24 bytes, plus whatever is required for the string’s characters. Things add up quickly.
(Note that my 24 bytes number is for a 64-bit system. It’s 16 bytes per allocation in the 32-bit runtime.)
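As a rough back-of-the-envelope sketch, the arithmetic can be extended to all the objects created per line. The count of five objects per line (one CountryMarker, two IPAddress, two String) is my assumption based on the class in the question, and the 24-byte figure is simply the 64-bit number quoted above; character data and the fields inside IPAddress are not counted.

```vbnet
Imports System

Module OverheadEstimateSketch
    Sub Main()
        Const Lines As Long = 100000        ' line count from the question
        Const BytesPerObject As Long = 24   ' the 64-bit figure quoted above
        Const ObjectsPerLine As Long = 5    ' assumed: 1 CountryMarker + 2 IPAddress + 2 String

        ' Overhead of the CountryMarker instances alone.
        Dim markerOnly As Long = Lines * BytesPerObject

        ' Overhead if every object on a line costs the same minimum.
        Dim allObjects As Long = Lines * ObjectsPerLine * BytesPerObject

        Console.WriteLine("CountryMarker instances alone: {0:N0} bytes", markerOnly)
        Console.WriteLine("All five objects per line:     {0:N0} bytes", allObjects)
    End Sub
End Module
```

That works out to 2.4 million bytes for the markers alone and 12 million bytes for all five objects, so the bare minimum overhead already exceeds the size of the file. It still does not account for 300 MB by itself.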
As others have commented, it’s impossible to give you any more detail unless you post some code, including your class definition.
As for not freeing up any memory: that’s kind of difficult to prove. Maybe the garbage collector just hasn’t gotten around to doing a collection yet. If it sees no memory pressure (i.e. there’s plenty of memory available and no other process is begging for memory), the GC might decide it doesn’t need to collect yet.
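One way to separate memory that is still in use from garbage that simply hasn’t been collected yet is to ask the runtime to collect before measuring. The sketch below uses `GC.GetTotalMemory(True)` for that. The workload is a hypothetical stand-in, so swap in your own loader to get meaningful numbers.

```vbnet
Imports System
Imports System.Collections.Generic

Module GcMeasureSketch
    Sub Main()
        ' Passing True asks the runtime to collect before reporting.
        Dim before As Long = GC.GetTotalMemory(True)

        ' Stand-in workload (hypothetical): replace with the real loader.
        Dim data As New List(Of String)
        For i As Integer = 0 To 99999
            data.Add("row " & i.ToString())
        Next

        ' Measure again after a forced collection; temporaries are gone by now.
        Dim afterLoad As Long = GC.GetTotalMemory(True)
        Console.WriteLine("Retained by the list: about {0:N0} bytes", afterLoad - before)

        ' Keep the list reachable until after the measurement.
        GC.KeepAlive(data)
    End Sub
End Module
```

If the figure after a forced collection is far below what the process reports, most of the difference was temporary garbage rather than data the list is holding on to.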