I am working on a mapping utility that requires a moderate amount of data to be loaded into the app in CSV format. These CSV files may contain 100000+ records, with each record containing roughly 50 items. I may need to open several of these files at a time. The data needs double precision, but not for every item in a record; some items may be cast to int or have toString called on them.
My question is this: My first thought was to create an ArrayList of double[]. My second thought was to create a custom data object (an ArrayList of MyDataClass) to hold this data in the forms I require. This would have me create a class with roughly 45-50 instance variables. I’ve never done anything on this scale and could use a little guidance on best practice for such a task!
Both approaches are fine. It all depends on what you’ll do with the data. If it’s only data and has no methods other than getters, creating a class to hold it might be overkill. If you want to add some behavior to a row, then create a class. That said, 50 fields in a single class is a bit much. You might split the class into logical groups, but that depends on what the data represents.
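To make the trade-off concrete, here is a minimal sketch of a middle ground: keep each record as a double[] in an ArrayList, and wrap a row in a small class only where you want behavior. The column names (LATITUDE, LONGITUDE, ELEVATION), the MapPoint class, and the sample line are all hypothetical, since the question doesn’t say what the 50 items are. The parser also assumes every field is numeric and that there are no quoted commas.

```java
import java.util.ArrayList;
import java.util.List;

public class RowStorageSketch {
    // Hypothetical column positions; the real CSV layout is not given in the question.
    static final int LATITUDE = 0;
    static final int LONGITUDE = 1;
    static final int ELEVATION = 2;

    /** Parses one CSV line into a double[]. Assumes numeric fields and no quoting. */
    static double[] parseLine(String line) {
        String[] parts = line.split(",");
        double[] row = new double[parts.length];
        for (int i = 0; i < parts.length; i++) {
            row[i] = Double.parseDouble(parts[i].trim());
        }
        return row;
    }

    /** Thin view over a row: adds named accessors without copying into 50 fields. */
    static final class MapPoint {
        private final double[] values;

        MapPoint(double[] values) {
            this.values = values;
        }

        double latitude() {
            return values[LATITUDE];
        }

        double longitude() {
            return values[LONGITUDE];
        }

        // Items that don't need double precision can be cast on access.
        int elevationMeters() {
            return (int) values[ELEVATION];
        }
    }

    public static void main(String[] args) {
        List<double[]> rows = new ArrayList<>();
        rows.add(parseLine("48.8566, 2.3522, 35.7"));

        MapPoint point = new MapPoint(rows.get(0));
        System.out.println(point.latitude() + ", " + point.longitude()
                + " at " + point.elevationMeters() + " m");
    }
}
```

The storage stays a plain list of arrays, and the wrapper can be created on demand for the rows you actually work with, so you don’t pay for 100000 extra objects per file unless you need them.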
Suppose you have 10 files open, each with 100000 * 50 doubles. At 8 bytes per double, that is 400 million bytes, or around 380 MB. On top of that, you must add the overhead of every array of doubles and of the ArrayLists themselves. Such an amount of memory might be too much, or might be fine. It all depends on the heap available to your JVM (the maximum is set with -Xmx). If you can’t hold everything in memory, consider reading files as needed, or storing the data in a database.
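The estimate above can be reproduced with a few lines of arithmetic. This is only a rough sketch: the 24 bytes of per-row overhead (array header plus the reference held by the list) is an assumption that is plausible for a typical 64-bit JVM, not a measured figure, and the class and method names are made up for illustration.

```java
public class MemoryEstimate {
    /** Raw payload in bytes: files * records * fields * 8 bytes per double. */
    static long rawBytes(int files, int recordsPerFile, int fieldsPerRecord) {
        return (long) files * recordsPerFile * fieldsPerRecord * Double.BYTES;
    }

    /** Raw payload plus an assumed fixed overhead for each row's double[]. */
    static long estimatedBytes(int files, int recordsPerFile, int fieldsPerRecord,
                               int overheadPerRow) {
        return rawBytes(files, recordsPerFile, fieldsPerRecord)
                + (long) files * recordsPerFile * overheadPerRow;
    }

    static double toMiB(long bytes) {
        return bytes / (1024.0 * 1024.0);
    }

    public static void main(String[] args) {
        long raw = rawBytes(10, 100_000, 50);
        // 24 bytes per row is an assumption, not a measurement.
        long withOverhead = estimatedBytes(10, 100_000, 50, 24);

        System.out.printf("raw doubles:     %.1f MiB%n", toMiB(raw));
        System.out.printf("with row arrays: %.1f MiB%n", toMiB(withOverhead));
        // Compare against the maximum heap this JVM is allowed to use.
        System.out.printf("JVM max heap:    %.1f MiB%n",
                toMiB(Runtime.getRuntime().maxMemory()));
    }
}
```

If the estimate comes close to the reported max heap, that is the point at which loading files on demand or moving the data to a database starts to look attractive.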