I’m trying to learn Java/Android and right now I’m doing some experiments with the replaceAll function. But I’ve found that with large text files the process gets sluggish, so I was wondering if there is a way to skip the “useless” parts of a file to get better performance. (Note: just skip them, not delete them.)
Note: I am not trying to “count lines” or “println” or “system.out”, I’m just replacing strings and saving the changes in the same file.
Example
AAAA
CCCC- 9234802394819102948102948104981209381’238901’2309’129831’2381’2381’23081’23081’284091824098304982390482304981’20841’948023984129048’1489039842039481’204891’29031’923481290381’20391’294872385710239841’20391’20931’20853029573098341’290831’20893’12894093274019799919208310293810293810293810293810298’120931¿2093¿12039¿120931¿203912¿0391¿203912¿039¿12093¿12093¿12093¿12093¿12093¿1209312¿0390¿… DDDD
AAAA
CCCC- 9234802394819102948102948104981209381’238901’2309’129831’2381’2381’23081’23081’284091824098304982390482304981’20841’948023984129048’1489039842039481’204891’29031’923481290381’20391’294872385710239841’20391’20931’20853029573098341’290831’20893’12894093274019799919208310293810293810293810293810298’120931¿2093¿12039¿120931¿203912¿0391¿203912¿039¿12093¿12093¿12093¿12093¿12093¿1209312¿0390¿… DDDD
and so on….like a zillion times
I want to replace all “AAAA” with “BBBB”, but there are large portions of data between the strings I am replacing. Also, these portions always begin with “CCCC” and end with “DDDD”.
Here’s the code I am using to replace the string.
File file = new File("my_file.txt");
BufferedReader reader = new BufferedReader(new FileReader(file));
String line = "", oldtext = "";
while((line = reader.readLine()) != null) {
oldtext += line + "\r\n";
}
reader.close();
// Replacing "AAAA" strings
String newtext= oldtext.replaceAll("AAAA", "BBBB");
FileWriter writer = new FileWriter("my_file.txt");
writer.write(newtext);
writer.close();
I think reading all lines is inefficient, especially when you won’t be modifying these parts (and they make up 90% of the file).
Does anyone know a solution???
You are wasting a lot of time on this line:
oldtext += line + "\r\n";
In Java, String is immutable, which means you can’t modify one once it has been created. Therefore, when you do the concatenation, Java is actually making a complete copy of oldtext. So, for every line in your file, you are recopying every line that came before into your new String. Take a look at StringBuilder for a way to build a String while avoiding these copies.
However, in your case, you do not need the whole file in memory, because you can process it line by line. By moving your replaceAll and write into your loop, you can operate on each line as you read it. This will keep the memory footprint of the routine down, because you are only keeping a single line in memory.
Note that since the FileWriter is opened before you read the input file, you need to have a different name for the output file. If you want to keep the same name, you can do a renameTo on the File after you close it.
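Here is a rough sketch of what that could look like. The class name LineReplacer, the method replaceInFile, and the ".tmp" suffix are made up for illustration. It also assumes each big data portion sits on a single line starting with "CCCC", as in your example, so those lines are copied through untouched; drop that check if "AAAA" can show up inside them. I used replace instead of replaceAll because the search text is a literal, not a regex.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;

public class LineReplacer {

    // Streams the input one line at a time, writes the result to a temporary
    // file, then renames the temporary file over the original.
    public static void replaceInFile(File input, String target, String replacement)
            throws IOException {
        // Hypothetical temp file name: the original path plus ".tmp"
        File temp = new File(input.getPath() + ".tmp");

        try (BufferedReader reader = new BufferedReader(new FileReader(input));
             BufferedWriter writer = new BufferedWriter(new FileWriter(temp))) {
            String line;
            while ((line = reader.readLine()) != null) {
                // Assumption: lines starting with "CCCC" are bulk data that
                // never needs replacing, so they are copied as-is.
                if (!line.startsWith("CCCC")) {
                    line = line.replace(target, replacement);
                }
                writer.write(line);
                writer.write("\r\n"); // same line ending as the original code
            }
        }

        // Both streams are closed here, so it is safe to swap the files.
        if (!input.delete() || !temp.renameTo(input)) {
            throw new IOException("Could not replace " + input + " with " + temp);
        }
    }
}
```

You would call it as LineReplacer.replaceInFile(new File("my_file.txt"), "AAAA", "BBBB"). Only one line is held in memory at a time, although the long "CCCC" lines still have to be read and written once, since the file is being copied.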