Given a 1GB(very large) file containing words (some repeated), we need to read the file and output how many times each word is repeated. Please let me know if my solution is high performant or not.
(For simplicity lets assume we have already captured the words in an arraylist<string>)
I think the big O(n) is “n”. Am I correct??
public static void main(String[] args) {
ArrayList al = new ArrayList();
al.add("math1");
al.add("raj1");
al.add("raj2");
al.add("math");
al.add("rj2");
al.add("math");
al.add("rj3");
al.add("math2");
al.add("rj1");
al.add("is");
Map<String,Integer> map= new HashMap<String,Integer>();
for (int i=0;i<al.size();i++)
{
String s= (String)al.get(i);
map.put(s,null);
}
for (int i=0;i<al.size();i++)
{
String s= (String)al.get(i);
if(map.get(s)==null)
map.put(s,1);
else
{
int count =(int)map.get(s);
count=count+1;
map.put(s,count);
}
}
System.out.println("");
}
Theoretically , since HashMap access is generally O(1), I guess your algorithm is O(n), but in reality has several inefficiencies. Ideally you would iterate over the contents of the file just once, processing (i.e. counting) the words while you read them in. There’s no need to store the entire file contents in memory (your ArrayList). You loop over the contents three times – once to read them, and the second and third times in the two loops in your code above. In particular, the first loop in your code above is completely unnecessary. Finally, your use of HashMap will be slower than needed because the default size at construction is very small, and it will have to grow internally a number of times, forcing a rebuilding of the hash table each time. Better to start it off a size appropriate for what you expect it to hold. You also have to consider the load factor into that.