I have a file of arbitrary length with the following format:
@HEADER1
//arbitrary lines of data
@HEADER2
//arbitrary lines of data
....
I will extract every header and save it in a HashMap, and then I will sequentially parse another file that is a superset of file1, i.e. has the following format:
@HEADER1
//arbitrary lines of data
//extended information
@HEADER2
//arbitrary lines of data
//extended information
So my idea is to build a hashmap of headers by going through file 1 once, then go through file 2 and, for every header in it, check whether it is in the hashmap; if it is, I will do something with the data. Is this an optimal solution? By my back-of-the-envelope calculation this is O(n), whereas keeping the headers in an ArrayList and checking every header of file 2 against it would be O(n^2), where n is the number of headers in the ArrayList. Am I correct?
If there is an even more efficient way, I’d be glad to hear it.
EDIT:
I can’t guarantee that the order of headers is going to be the same, only that whatever is in file1 also exists in file2. Also, I don’t really need to save anything for the VALUE; in this case I just need quick access to the key.
A HashMap is a perfectly good choice here.
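To make the two passes concrete, here is a minimal sketch. It assumes a header is any line starting with "@" and takes the lines as an Iterable so it stays independent of how you read the files; the class and method names (HeaderMatcher, collectHeaders, findKnownHeaders) are made up for illustration. Since you said you don't need a value, it uses a HashSet, which in Java is backed by a HashMap anyway. With n headers in file 1 and m in file 2, the expected cost is O(n + m) hash operations, versus O(n * m) comparisons if each lookup scanned an ArrayList.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class HeaderMatcher {

    // Assumption: a header is any line that starts with '@' once trimmed.
    static boolean isHeader(String trimmedLine) {
        return trimmedLine.startsWith("@");
    }

    // Pass 1: go through file 1 once and remember every header.
    static Set<String> collectHeaders(Iterable<String> file1Lines) {
        Set<String> headers = new HashSet<>();
        for (String line : file1Lines) {
            String trimmed = line.trim();
            if (isHeader(trimmed)) {
                headers.add(trimmed);
            }
        }
        return headers;
    }

    // Pass 2: go through file 2 once; each contains() is O(1) on average,
    // and the order of headers in the two files does not matter.
    static List<String> findKnownHeaders(Iterable<String> file2Lines, Set<String> known) {
        List<String> matched = new ArrayList<>();
        for (String line : file2Lines) {
            String trimmed = line.trim();
            if (isHeader(trimmed) && known.contains(trimmed)) {
                matched.add(trimmed); // placeholder for "do something with the data"
            }
        }
        return matched;
    }
}
```

In real use the lines would come from a BufferedReader or similar, so neither file has to fit in memory; only the set of headers from file 1 does.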
So the next thing to think about is what that HashMap will store. The key could probably be a String such as "@HEADER###". But what about the data? You have a few options for the value in the HashMap. You could use a String, but take some time to think through what your data is. Is it the original lines of data AND the extended information you’re adding? Does that data represent something structured, like a List of items?
If you find yourself getting a String value from the map and doing additional processing on it, consider replacing that String with a class that better represents your data, so you have something like
HashMap<String, DoskiasData>.
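As a rough sketch of what such a value class could look like: the fields below (dataLines, extendedInfo) and the HeaderIndex wrapper are assumptions for illustration only, since I don't know what your data actually contains. The point is just that the map hands back a typed object instead of a String you have to re-parse.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical value type; the two fields are assumptions about the data.
final class DoskiasData {
    private final List<String> dataLines = new ArrayList<>();
    private final List<String> extendedInfo = new ArrayList<>();

    void addDataLine(String line) { dataLines.add(line); }
    void addExtendedInfo(String line) { extendedInfo.add(line); }

    // Read-only views so callers cannot modify the record by accident.
    List<String> getDataLines() { return Collections.unmodifiableList(dataLines); }
    List<String> getExtendedInfo() { return Collections.unmodifiableList(extendedInfo); }
}

// Hypothetical wrapper around HashMap<String, DoskiasData>.
final class HeaderIndex {
    private final Map<String, DoskiasData> byHeader = new HashMap<>();

    // Returns the record for a header, creating an empty one on first use.
    DoskiasData recordFor(String header) {
        return byHeader.computeIfAbsent(header, h -> new DoskiasData());
    }

    boolean contains(String header) {
        return byHeader.containsKey(header);
    }
}
```

While parsing file 1 you would call recordFor(header).addDataLine(line); while parsing file 2 you would check contains(header) and, if it is there, call addExtendedInfo(line) on the same record.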