I’m calling importtsv from within a Java class and for some reason it’s not loading all of the records. Not sure if this is an actual importtsv problem or something I’m doing.
Details:
I’m trying to load data in batches as it’s being captured. So I’m writing to a temp file in a directory with no other data, and every X records I call importtsv to load the data into HBase.
I’ve changed the program to call “wc -l” to double-check that I’m putting the correct number of records in the temp file and everything looks good. So, either there’s an error with importtsv or for some reason there are invalid records not making it in, but I don’t know where this is being logged.
Thoughts? Even if I can just find where any errors are being logged, that would be great.
Never mind… it turns out I had duplicate records in my data.
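In case anyone else hits the same thing: HBase keeps one row per row key, so lines that share a key collapse into a single row and the row count ends up lower than what “wc -l” reports. Below is a rough sketch of the kind of check that would have caught it before the load. It assumes the row key is the first tab-separated field on each line (i.e. HBASE_ROW_KEY comes first in importtsv.columns); the class and method names are made up for illustration and are not part of HBase.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of HBase: scans a TSV batch file for repeated row keys.
public class DuplicateKeyCheck {

    // Returns every row key that occurs more than once, mapped to its occurrence count.
    // Assumption: the row key is the first tab-separated field of each line.
    static Map<String, Integer> findDuplicateKeys(Path tsvFile) throws IOException {
        Map<String, Integer> counts = new LinkedHashMap<>();
        try (BufferedReader reader = Files.newBufferedReader(tsvFile, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.isEmpty()) {
                    continue; // skip blank lines
                }
                int tab = line.indexOf('\t');
                String key = tab < 0 ? line : line.substring(0, tab);
                counts.merge(key, 1, Integer::sum);
            }
        }
        // Keep only the keys that were seen at least twice.
        counts.values().removeIf(n -> n < 2);
        return counts;
    }

    public static void main(String[] args) throws IOException {
        Map<String, Integer> duplicates = findDuplicateKeys(Paths.get(args[0]));
        if (duplicates.isEmpty()) {
            System.out.println("No duplicate row keys found.");
        } else {
            duplicates.forEach((key, n) -> System.out.println(key + " appears " + n + " times"));
        }
    }
}
```

Running it against the temp file just before calling importtsv (for example, java DuplicateKeyCheck followed by the path to the batch file) lists any keys that would overwrite each other.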