I want to parse a text file that represents a log. I want the parser to be robust enough to handle all the errors that might occur, but I am clueless about the best practices and about which errors I should account for. I will be using Java to implement this.
Sample log:
2012-07-16 10:23:40,558 - 127.0.0.1 - Paremter array[param1=1,param2=1,param3=0,] - 383
I already wrote parsing code that works as follows:
public Parser(String log) {
    this.log = log;
    this.parse();
}

public void parse() {
    String[] temp = new String[10];
    String[] temp2 = new String[10];
    temp = log.split(" - ");
    key = temp[3];
    id = Integer.parseInt(key);
    String IP = temp[1];
    String str;
    String temp3 = temp[2].substring(temp[2].indexOf("g"), temp[2].indexOf("]"));
    temp = temp3.split(",");
    str = "param1";
    boolean ordered = CheckOrder(temp);
    if (ordered) {
        for (int q = 0; q < temp.length; q++) {
            temp[q] = temp[q].substring(temp[q].indexOf("=") + 1);
        }
        if (temp[0].equals("q")) {
            param = 0;
        } else if (temp[0].equals("k")) {
            param = 1;
        } else {
            param = 2;
        }
        // Same way for all parameters
    }
}
Check the javadoc of all the methods you use, and make sure to handle all the nominal and exceptional cases:
String.indexOf() doesn't find what it looks for. It returns -1. Handle this case correctly.
String.split() doesn't return an array of the length I expect. Handle this case correctly.
Split your big method into several sub-methods, each doing only one thing.
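As a rough sketch of the points above: two small methods, each doing one thing and each checking the result of the String method it relies on. It assumes that a valid line has exactly four fields separated by " - " and that the parameters sit between "[" and "]", as in the sample line. The names LogLineSplitter, splitFields and extractParamList are made up for illustration.

```java
import java.util.Arrays;

final class LogLineSplitter {

    // Assumption: a valid line has exactly 4 fields separated by " - ".
    private static final int EXPECTED_FIELDS = 4;

    // Does one thing: cuts the line into fields and checks how many came back.
    static String[] splitFields(String line) {
        String[] fields = line.split(" - ");
        if (fields.length != EXPECTED_FIELDS) {
            throw new IllegalArgumentException(
                "Expected " + EXPECTED_FIELDS + " fields but got " + fields.length + " in: " + line);
        }
        return fields;
    }

    // Does one thing: returns the text between '[' and ']'.
    static String extractParamList(String field) {
        int open = field.indexOf('[');
        int close = field.indexOf(']', open + 1);
        // indexOf returns -1 when the character is missing, so check before calling substring.
        if (open == -1 || close == -1) {
            throw new IllegalArgumentException("Expected name[...] but got: " + field);
        }
        return field.substring(open + 1, close);
    }

    public static void main(String[] args) {
        String line = "2012-07-16 10:23:40,558 - 127.0.0.1 - "
            + "Paremter array[param1=1,param2=1,param3=0,] - 383";
        String[] fields = splitFields(line);
        System.out.println(Arrays.toString(extractParamList(fields[2]).split(",")));
    }
}
```

Note that split(",") drops the empty string after the trailing comma, so the sample line yields three parameters rather than four.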
Write unit tests to check that your methods do what they’re supposed to do, with all the possible inputs.
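A minimal sketch of what such tests could look like. In a real project you would normally use a test framework such as JUnit; to keep the example self-contained, it uses a tiny hand-written check method instead. parseId is a hypothetical method under test, not something from the question's code.

```java
final class IdParserTest {

    // Hypothetical method under test: turns the last field of a line into an int id.
    static int parseId(String field) {
        try {
            return Integer.parseInt(field.trim());
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("Expected a numeric id but got: " + field, e);
        }
    }

    // Tiny stand-in for an assertion library.
    static void check(boolean condition, String description) {
        if (!condition) {
            throw new AssertionError("Failed: " + description);
        }
    }

    // Returns true if parseId refuses the given input.
    static boolean rejects(String field) {
        try {
            parseId(field);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Nominal cases
        check(parseId("383") == 383, "nominal id");
        check(parseId(" 383 ") == 383, "id surrounded by spaces");
        // Exceptional cases
        check(rejects(""), "empty field");
        check(rejects("abc"), "non-numeric field");
        check(rejects("99999999999"), "id too large for an int");
        System.out.println("All checks passed");
    }
}
```

The point is less the mechanics than the list of inputs: for each small method, write down the nominal cases and every malformed input you can think of, and decide what should happen for each.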
Note that “handling things correctly” might very well mean: throw an exception because the input is incorrect, if the contract is that the logs follow a well-defined format. In this case, it’s the code generating the logs that is incorrect. But it’s better to have an exception telling which format you expected and which format you got instead, rather than an obscure NullPointerException or ArrayIndexOutOfBoundsException.
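To illustrate the difference, here is a sketch that contrasts the obscure exception with a descriptive one. LogFormatException, StrictLineCheck and the EXPECTED_FORMAT string are assumptions made for this example; the format description is only inferred from the sample line.

```java
// Hypothetical exception type that says what was expected and what was received.
final class LogFormatException extends RuntimeException {
    LogFormatException(String expected, String actual) {
        super("Malformed log line. Expected: " + expected + " but got: " + actual);
    }
}

final class StrictLineCheck {

    // Assumption: format description inferred from the sample line.
    static final String EXPECTED_FORMAT = "<timestamp> - <ip> - <name>[k=v,...] - <id>";

    static String[] requireFields(String line) {
        if (line == null) {
            throw new LogFormatException(EXPECTED_FORMAT, "null");
        }
        String[] fields = line.split(" - ");
        if (fields.length != 4) {
            throw new LogFormatException(EXPECTED_FORMAT, "\"" + line + "\"");
        }
        return fields;
    }

    public static void main(String[] args) {
        String bad = "2012-07-16 10:23:40,558 127.0.0.1";

        // Unchecked access: fails with an exception that says nothing about the format.
        try {
            String id = bad.split(" - ")[3];
            System.out.println(id);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("Obscure: " + e);
        }

        // Checked access: the message names the expected format and the offending line.
        try {
            requireFields(bad);
        } catch (LogFormatException e) {
            System.out.println("Helpful: " + e.getMessage());
        }
    }
}
```

Whether the exception should be checked or unchecked is a design choice; an unchecked one is used here only to keep the sketch short.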
The above applies to any kind of code you write, and not just to file parsing.
Side note:
What's the point in creating an array of 10 elements only to discard it right away and replace it with another array (the one returned by log.split(" - "))?