I have a live Lucene index that is updated throughout the day. When several successive batches of updates for the index come through, I want those updates to be available for searching as quickly as possible. Therefore I have to recreate the IndexSearcher.
The problem is that an IndexSearcher can take around 100 MB of memory. When a lot of updates are coming through, it is recreated relatively often, and I’ve noticed the .NET garbage collector seems slow to reclaim the old IndexSearcher objects. As a result, the memory usage of the process climbs out of control, because the collector seems to free memory from old IndexSearchers more slowly than new ones are created.
I’ve found this problem is mitigated by crossing the line into taboo territory and calling GC.Collect(), which frees up the memory immediately. The performance impact doesn’t seem to be noticeable, but as I’m doing something that many advise against, I’d be curious whether anyone else has experience of objects being created and released faster than the garbage collector cleans them up. I’d be particularly interested if anyone has had this problem with the Lucene IndexSearcher.
I should note that the IndexSearcher is being recreated at peak times around once every 10-20 seconds.
I consider it acceptable to call GC.Collect if you just released a ton of memory, and the memory can and should be freed now to reduce memory pressure. The GC does not know this memory is now available until it runs again, and you don’t know when that will be.
In your case, you said “it can be recreated relatively often”. If so, calling GC.Collect when you recreate it sounds reasonable.
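As a rough illustration of that suggestion, here is a minimal sketch of swapping in a new searcher and collecting right afterwards. SearcherHolder is a hypothetical helper, not part of Lucene, and TSearcher stands in for whatever searcher type you use. The sketch assumes the old instance can be released through IDisposable, which depends on your Lucene.NET version (some versions expose Close() or require disposing the underlying reader instead), and that no search is still running against the old instance when it is replaced.

```csharp
using System;
using System.Threading;

// Hypothetical helper: holds the current searcher and swaps in a fresh one.
// TSearcher is a stand-in for the searcher type; IDisposable is an assumption.
public sealed class SearcherHolder<TSearcher> where TSearcher : class, IDisposable
{
    private TSearcher _current;

    public SearcherHolder(TSearcher initial)
    {
        _current = initial;
    }

    // Readers always see the most recently published searcher.
    public TSearcher Current => Volatile.Read(ref _current);

    public void Replace(TSearcher fresh)
    {
        // Publish the new searcher atomically and take ownership of the old one.
        TSearcher old = Interlocked.Exchange(ref _current, fresh);

        // Release the old searcher's resources (assumes no in-flight searches).
        old?.Dispose();
        old = null;

        // A large object graph has just become unreachable, so collect now
        // rather than waiting for the GC to notice on its own schedule.
        GC.Collect();
        GC.WaitForPendingFinalizers();
    }
}
```

With an update arriving every 10-20 seconds, Replace would be called once per batch, so the cost of a full collection is paid at most that often. If searches can overlap with the swap, the old instance would need reference counting or a similar scheme before being disposed.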