I have written a search-and-replace program, using regular expressions, that runs over a large number of files; I am developing it in the Eclipse IDE. The program is given the name of a directory in which the search and replace is to be performed (the directory may also contain subdirectories). For a small number of files it runs smoothly, but for directories containing thousands of files it hangs partway through and does nothing (even after increasing the JVM memory size).
I have used BufferedReader to read each file line by line, used a regex to match the pattern, and then replaced each match with some other text.
Can anybody suggest a possible solution (algorithm, library, trick, hack) for this?
BufferedReader br = new BufferedReader(new FileReader(fileName));
BufferedWriter bw = new BufferedWriter(new FileWriter(changedFile));
// Read the whole file into memory, one line at a time
StringBuilder sb = new StringBuilder();
for (String line = br.readLine(); line != null; line = br.readLine()) {
    sb.append(line).append("\n");
}
br.close();
sb.trimToSize();
String code = sb.toString();
// Apply the regex replacement to the complete file contents
code = code.replaceAll("System", "PrintWriter");
// Write the result to the output file
bw.write(code);
bw.flush();
bw.close();
I suspect the write buffer in your OS is filling up, so the program has to wait for the data to be flushed to disk, unless you can determine that it really is hanging because of a bug. Running it in the debugger is a simple way to test this, or you can use jstack to take a stack trace.
I suspect the problem is the speed of your hard drive. If you have an HDD with a seek time of 8 ms, each file costs several seeks, for example to read the original and to create and write the changed file.
The total time taken is about 32 – 48 ms per file, which means you can update about 20 – 30 files per second.
For under $50 you can buy a 32 GB SSD with an access time of 0.1 ms, and double the capacity does not cost much more.
The total time per file might then be 0.5 ms, allowing you to process up to 2000 files per second.
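The arithmetic behind these rates is just 1000 ms divided by the time spent per file. A minimal sketch, where the class and method names are made up for illustration and the per-file times are the estimates above rather than measurements:

```java
public class SeekBudget {
    // Files per second if each file costs perFileMillis of drive time.
    static double filesPerSecond(double perFileMillis) {
        return 1000.0 / perFileMillis;
    }

    public static void main(String[] args) {
        // Estimated per-file costs from the text, not measured values.
        System.out.printf("HDD, 48 ms per file:  %.1f files/s%n", filesPerSecond(48));
        System.out.printf("HDD, 32 ms per file:  %.1f files/s%n", filesPerSecond(32));
        System.out.printf("SSD, 0.5 ms per file: %.1f files/s%n", filesPerSecond(0.5));
    }
}
```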
The only reason it appears you can do more is that the OS caches reads and buffers writes, up to a point. When these are exhausted (which seems to happen fairly quickly on Windows), you are limited by the speed of the drive.
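One way to check whether you really are drive-limited is to measure how many files per second the program manages. The sketch below is an assumption-laden illustration, not a fix for the drive speed: it rewrites each file line by line (so memory use does not grow with file size), compiles the pattern once, and prints the overall rate. The class name, the temp-file approach, and the UTF-8 encoding are my assumptions; the pattern and replacement are taken from the question.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class StreamingReplace {
    // Compiled once and reused for every line of every file.
    private static final Pattern PATTERN = Pattern.compile("System");
    private static final String REPLACEMENT = "PrintWriter";

    // Rewrites one file line by line via a temporary file in the same directory.
    // Assumes the file is valid UTF-8; like the original code, it ends every line with "\n".
    static void replaceInFile(Path file) throws IOException {
        Path tmp = Files.createTempFile(file.toAbsolutePath().getParent(), "replace", ".tmp");
        try (BufferedReader in = Files.newBufferedReader(file, StandardCharsets.UTF_8);
             BufferedWriter out = Files.newBufferedWriter(tmp, StandardCharsets.UTF_8)) {
            for (String line = in.readLine(); line != null; line = in.readLine()) {
                out.write(PATTERN.matcher(line).replaceAll(REPLACEMENT));
                out.write('\n');
            }
        }
        // Replace the original only after the new content is completely written.
        Files.move(tmp, file, StandardCopyOption.REPLACE_EXISTING);
    }

    // Rewrites every regular file under root (including subdirectories) and returns the count.
    static int replaceInTree(Path root) throws IOException {
        List<Path> files;
        // Collect the paths first so the temporary files are never visited by the walk.
        try (Stream<Path> walk = Files.walk(root)) {
            files = walk.filter(Files::isRegularFile).collect(Collectors.toList());
        }
        for (Path f : files) {
            replaceInFile(f);
        }
        return files.size();
    }

    public static void main(String[] args) throws IOException {
        Path root = Paths.get(args[0]);
        long start = System.nanoTime();
        int count = replaceInTree(root);
        double seconds = (System.nanoTime() - start) / 1e9;
        System.out.printf("%d files in %.2f s (%.1f files/s)%n", count, seconds, count / seconds);
    }
}
```

If the reported rate starts high and then drops towards a few tens of files per second, that is consistent with the caches running out and the drive becoming the limit. Note that in this sketch the rewritten file takes on the temporary file's permissions, which may differ from the original's.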