I have the following piece of code (.NET 4) that consumes a lot of memory:
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;

struct Data
{
    private readonly List<Dictionary<string,string>> _list;

    public Data(List<Dictionary<string,string>> List)
    {
        _list = List;
    }

    // Sums every value in every dictionary and prints the total.
    public void DoWork()
    {
        int num = 0;
        foreach (Dictionary<string, string> d in _list)
        {
            foreach (KeyValuePair<string, string> kvp in d)
                num += Convert.ToInt32(kvp.Value);
        }
        Console.Write(num);
        //_list = null;
    }
}

class Test1
{
    // Bounded to 10 queued items; Add blocks while the collection is full.
    BlockingCollection<Data> collection = new BlockingCollection<Data>(10);
    Thread th;

    public Test1()
    {
        th = new Thread(Work);
        th.Start();
    }

    // Producer: builds one large list and queues it.
    public void Read()
    {
        List<Dictionary<string, string>> l = new List<Dictionary<string, string>>();
        Random r = new Random();
        for (int i=0; i<100000; i++)
        {
            Dictionary<string, string> d = new Dictionary<string,string>();
            d["1"] = r.Next().ToString();
            d["2"] = r.Next().ToString();
            d["3"] = r.Next().ToString();
            d["4"] = r.Next().ToString();
            l.Add(d);
        }
        collection.Add(new Data(l));
    }

    // Consumer: takes items forever and processes each one.
    private void Work()
    {
        while (true)
        {
            collection.Take().DoWork();
        }
    }
}

class Program
{
    Test1 t = new Test1();

    static void Main(string[] args)
    {
        Program p = new Program();
        for (int i = 0; i < 1000; i++)
        {
            p.t.Read();
        }
    }
}
The bounded capacity of the blocking collection is 10. As far as I know, the GC should be able to collect the list referenced by a ‘Data’ struct as soon as its DoWork method completes. However, memory keeps increasing rapidly until the program crashes or the usage comes down on its own. This happens more often on low-end machines (on some machines memory does not increase at all). Further, when I add the line “_list = null;” at the end of the DoWork method and convert ‘Data’ from a struct into a class, memory does not increase.
What could be happening here? Any suggestions would be appreciated.
Update: the issue occurs on machines with .NET Framework 4 installed (4.5 not installed).
If you read Stephen Toub’s explanation of how ConcurrentQueue works, the behavior makes sense. BlockingCollection uses ConcurrentQueue by default, which stores its elements in linked lists of 32-element segments.
For the purposes of concurrent access, elements in the linked list are never overwritten, so they don’t get unreferenced until the last of a whole segment of 32 is consumed. Since you have a bounded capacity of 10 elements, let’s say that you have produced 41 elements and consumed 31. That means you will have one segment of 31 consumed elements plus one queued element, and another segment with the remaining 9 elements. At this point all 41 elements are referenced, so if each element is 25MB, your collection will be taking up about 1GB! Once the next item is consumed, all 32 of the elements in the head segment will be unreferenced and can be collected.
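The arithmetic above can be sketched with a deliberately simplified model. This is not the real ConcurrentQueue source: the class name SegmentRetentionSketch and the method ReferencedCount are hypothetical, and the model only assumes what the answer states, namely 32-slot segments whose items stay referenced until the whole segment has been consumed.

```csharp
using System;

// Simplified model (an assumption, not the actual ConcurrentQueue code):
// items live in fixed 32-slot segments, and a segment's items remain
// referenced until every slot in that segment has been consumed.
static class SegmentRetentionSketch
{
    const int SegmentSize = 32;

    // Number of items still reachable after `produced` items were added
    // and `consumed` items were taken (consumed <= produced).
    public static int ReferencedCount(int produced, int consumed)
    {
        // Only segments that were consumed in full can be released.
        int fullyConsumedSegments = consumed / SegmentSize;
        return produced - fullyConsumedSegments * SegmentSize;
    }

    static void Main()
    {
        // 41 produced, 31 consumed: head segment not yet fully consumed.
        Console.WriteLine(ReferencedCount(41, 31)); // prints 41
        // One more produced and consumed: the head segment is released.
        Console.WriteLine(ReferencedCount(42, 32)); // prints 10
    }
}
```

Under this model the number of live items swings between roughly 10 and roughly 41, which would match memory that climbs sharply and then drops back on its own.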
You may think there should only ever need to be 10 elements in the queue, and that would be the case for a non-concurrent queue, but that would not allow one thread to enumerate the elements in the queue while another thread was producing or consuming elements.
The reason that the .NET 4.5 framework doesn’t leak is that they changed the behavior to null out elements as soon as they’re consumed, as long as nobody is enumerating the queue. If you start enumerating collection, you should see the memory leak even with the .NET 4.5 framework.
The reason that setting _list = null works when you have a class is that you are creating a “box” wrapper that allows you to unreference the list in every place that it’s used. Setting the value through your local variable changes the same object that the queue has a reference to.
The reason that setting _list = null doesn’t work when you have a struct is that you can only ever change copies of a struct. The “original” version of it sitting in that queue segment is effectively immutable because ConcurrentQueue doesn’t provide a way to change it. In other words, you’re changing only the copy of the value in your local variable rather than changing the copy in the queue.
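The struct versus class difference can be reproduced without any queue at all. In this minimal sketch a one-element array stands in for a slot in a queue segment; StructHolder, ClassHolder and CopySemanticsSketch are hypothetical names chosen for illustration, not types from the question or the framework.

```csharp
using System;
using System.Collections.Generic;

struct StructHolder
{
    public List<int> Items;
    public void Clear() { Items = null; }
}

class ClassHolder
{
    public List<int> Items;
    public void Clear() { Items = null; }
}

static class CopySemanticsSketch
{
    // The array element plays the role of the slot inside a queue segment.
    public static bool StructSlotStillReferencesList()
    {
        var slot = new StructHolder[1];
        slot[0].Items = new List<int> { 1, 2, 3 };
        StructHolder taken = slot[0]; // copies the whole struct
        taken.Clear();                // clears only the local copy
        return slot[0].Items != null;
    }

    public static bool ClassSlotStillReferencesList()
    {
        var slot = new ClassHolder[1];
        slot[0] = new ClassHolder { Items = new List<int> { 1, 2, 3 } };
        ClassHolder taken = slot[0];  // copies only the reference
        taken.Clear();                // clears the shared object
        return slot[0].Items != null;
    }

    static void Main()
    {
        Console.WriteLine(StructSlotStillReferencesList()); // prints True
        Console.WriteLine(ClassSlotStillReferencesList());  // prints False
    }
}
```

With the struct, the slot keeps the list alive after the local copy is cleared. With the class, clearing through the local reference releases the list even though the slot still points at the holder object.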