FILE: I’m working with a refined csv version of a searchlog file which contains

Question

Asked: June 6, 2026 (2026-06-06T09:13:28+00:00)

FILE:
I’m working with a refined CSV version of a search log file that contains 3.3 million lines of data. Each line represents a single query and holds various data about that query.
The entries in the file are sorted in ascending order by session / user ID.

GOAL:
Group together entries that submitted the same query term and belong to the same user ID.

APPROACH:
I’m reading the CSV file line by line, storing the data from each line in a self-made ‘Entry’ object, and adding these objects to an ArrayList. Once this is done, I’ll sort the list by two criteria with a custom comparator.

PROBLEM:

While reading the lines and adding the Entry objects to the list (which takes very long), the program terminates with an OutOfMemoryError (“Java heap space”).

So it seems that my approach is too demanding on memory (and runtime).
Any ideas for a better approach?


1 Answer

Editorial Team · Answer 1 · 2026-06-06T09:13:30+00:00

Your approach itself may be valid, and perhaps the simplest solution is to increase the memory available to the JVM.

The JVM limits its heap to a maximum size, and you can raise this limit with the -Xmx command-line option (for example, -Xmx2g for a 2 GB heap).
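To check which heap limit the JVM is actually running with, a small program along these lines can help. It is only a sketch: the class name HeapInfo and the helper maxHeapMiB are made up for illustration, while Runtime.maxMemory() is the standard call that reports the maximum amount of memory the JVM will attempt to use.

```java
public class HeapInfo {

    /** Returns the maximum heap size the JVM will attempt to use, in MiB. */
    static long maxHeapMiB() {
        // maxMemory() reports bytes; it returns Long.MAX_VALUE if there is no inherent limit.
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap: " + maxHeapMiB() + " MiB");
    }
}
```

Running it once without options and once with, say, java -Xmx2g HeapInfo should show whether the larger limit has taken effect. The reported value may differ slightly from the number passed to -Xmx.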

Obviously this solution doesn’t scale: if you later want to read much bigger files, you’ll likely need an approach that doesn’t hold the whole file in memory.
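Since the question states that the file is already sorted by session / user ID, one possible alternative is to stream the file and keep only one user's entries in memory at a time. The following is a minimal sketch under assumptions, not a definitive implementation: it assumes the user ID is the first column and the query term the second, it splits on plain commas (no quoted fields), and the names SessionGrouper and groupByUser are hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiConsumer;

public class SessionGrouper {

    /**
     * Streams lines that are sorted by user ID and passes each user's lines,
     * grouped by query term, to the callback.
     * Assumed column layout (not taken from the question): userId,queryTerm,...
     */
    static void groupByUser(Reader source,
                            BiConsumer<String, Map<String, List<String>>> onUser)
            throws IOException {
        BufferedReader in = new BufferedReader(source);
        String currentUser = null;
        Map<String, List<String>> byTerm = new LinkedHashMap<>();
        String line;
        while ((line = in.readLine()) != null) {
            String[] cols = line.split(",", 3); // naive split, ignores CSV quoting
            if (cols.length < 2) {
                continue; // skip blank or malformed lines
            }
            String user = cols[0];
            String term = cols[1];
            if (!user.equals(currentUser)) {
                // The user ID changed: the previous user's block is complete.
                if (currentUser != null) {
                    onUser.accept(currentUser, byTerm);
                }
                currentUser = user;
                byTerm = new LinkedHashMap<>(); // fresh map, old one can be collected
            }
            byTerm.computeIfAbsent(term, k -> new ArrayList<>()).add(line);
        }
        if (currentUser != null) {
            onUser.accept(currentUser, byTerm); // flush the last user
        }
    }

    public static void main(String[] args) throws IOException {
        // Tiny in-memory sample; a real run would pass a FileReader instead.
        String sample = "u1,java,a\nu1,csv,b\nu1,java,c\nu2,java,d\n";
        groupByUser(new StringReader(sample), (user, groups) ->
                groups.forEach((term, rows) ->
                        System.out.println(user + " / " + term + ": " + rows.size() + " line(s)")));
    }
}
```

With this shape, memory use depends on the largest single user block rather than on the total file size, and no global sort is needed because the existing ordering by user ID already keeps each user's lines together.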
