I have a large set of text files (tab-delimited data) that I need to parse. They are mostly well formatted. However, some rows, scattered at random, contain erroneous characters, as shown below. The location of the bad rows differs from file to file, but the added characters are always the same.
1 3
2 873
3 46
23 99798
23 1
353 79
"23 ," 967
35 8028
253 615
"235 ," 3924
345 188
345 579
345 419
56 16835
23 449
importdata(filename) imports all of the data up to the first badly formatted line, then ignores the rest of the file. I think I could do what I am trying to do with a combination of fopen and textscan, but I can’t seem to get the right combination of arguments to make it work.
Have a go at using the textread function with the %q format string. Assuming the test data in the question is saved as test.txt:
Then you can use str2double to remove the trailing columns in a. For example:
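Something along these lines should work. Treat it as a minimal sketch rather than the exact code: it writes a few rows in the same shape as the question's data to test.txt so that it runs on its own, and the regexp step is just one assumed way of discarding the stray characters before str2double does the conversion.

```matlab
% Write a small sample file shaped like the data in the question
% (tab-delimited, with some first fields wrapped in quotes).
fid = fopen('test.txt', 'w');
fprintf(fid, '353\t79\n"23 ,"\t967\n35\t8028\n"235 ,"\t3924\n');
fclose(fid);

% %q reads a double-quoted field as a single string and drops the quotes;
% unquoted fields are read as ordinary strings. %f reads the second column.
[a, b] = textread('test.txt', '%q %f');

% a is a cell array of strings such as '353' and '23 ,'.
% Keep only the leading run of digits, then convert to numbers.
% (Assumption: the first column holds non-negative integers.)
a = str2double(regexp(a, '^\d+', 'match', 'once'));

% Combine both columns into one numeric matrix.
data = [a, b];
```

If the first column can contain signs or decimals, the pattern passed to regexp would need to be widened accordingly.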