I have a .txt file with about 500k entries, one per line. The file is about 13 MB, and each line has the following format:
SomeText<tab>Value<tab>AnotherValue<tab>
Given an input string from the program, I need to find it in the first column of the file and get the corresponding Value and AnotherValue from the other two columns.
The first column is not sorted. The second and third columns are sorted, but that ordering is of no use to me.
The file is static and does not change. I was thinking of using Regex.IsMatch() here, but I am not sure that going line by line is the best approach.
If the lookup time would otherwise increase drastically, I could probably sort the file by the first column (and hence un-sort the second and third columns). Any suggestions on how to implement this approach, or the one above if required?
After locating the string, how should I fetch those two column values?
EDIT
I realized that there will be quite a few searches in the file for at least one request by the user. If I have an array of values to find, how can I return some kind of dictionary holding the corresponding values for the matches found?
How many times do you need to do this search?
Is the cost of some pre-processing on startup worth it if you save time on each search?
Is loading all the data into memory at startup feasible?
Parse the file into objects and stick the results into a hashtable?
I don’t think Regex will help you more than any of the standard string options. You are looking for a fixed string value, not a pattern, but I stand to be corrected on that.
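For a one-off search, a plain line-by-line scan with an ordinal string comparison is enough. This is only a rough sketch: LineScan and FindFirst are made-up names, and it assumes the key itself contains no tab character.

```csharp
using System;
using System.IO;

public static class LineScan
{
    // Returns the tab-separated columns of the first line whose first
    // column equals key, or null when no line matches.
    public static string[] FindFirst(string path, string key)
    {
        // Appending the tab makes this an exact first-column match
        // rather than a prefix match.
        string prefix = key + "\t";
        foreach (string line in File.ReadLines(path)) // streams the file lazily
        {
            if (line.StartsWith(prefix, StringComparison.Ordinal))
                return line.Split('\t'); // [0] SomeText, [1] Value, [2] AnotherValue
        }
        return null;
    }
}
```

Every call rereads the file from the top, so this only makes sense if searches are rare.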
Update
Presuming that the “SomeText” is unique, you can use a dictionary like this
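A minimal sketch of such a dictionary, assuming the three columns are kept as strings and the file is read once at startup. The property names on MyData, the Lookup class, BuildIndex, and the file name in the usage comment are all illustrative, not taken from your code.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Holds the two value columns for one line (property names are assumed).
public class MyData
{
    public string Value { get; set; }
    public string AnotherValue { get; set; }
}

public static class Lookup
{
    // Builds the index once; data is the sequence of raw lines from the file.
    public static Dictionary<string, MyData> BuildIndex(IEnumerable<string> data)
    {
        var index = new Dictionary<string, MyData>(StringComparer.Ordinal);
        foreach (string line in data)
        {
            string[] parts = line.Split('\t');
            if (parts.Length < 3) continue; // skip blank or malformed lines
            // If SomeText turns out not to be unique, the last line wins here.
            index[parts[0]] = new MyData { Value = parts[1], AnotherValue = parts[2] };
        }
        return index;
    }
}

// Usage (hypothetical file name):
//   var index = Lookup.BuildIndex(File.ReadLines("data.txt"));
//   MyData hit;
//   if (index.TryGetValue("SomeText", out hit))
//       Console.WriteLine(hit.Value + " " + hit.AnotherValue);
```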
Data represents the values coming in from the file.
MyData is a class to hold them in memory.
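For the array of values in your edit, one option is to probe the dictionary once per key and collect only the hits. Again just a sketch: BatchLookup and FindAll are hypothetical names, and the method is generic so it works with whatever value type you store.

```csharp
using System.Collections.Generic;

public static class BatchLookup
{
    // Returns a dictionary containing only the keys that were found in
    // index, each mapped to its stored value. Missing keys are omitted.
    public static Dictionary<string, TValue> FindAll<TValue>(
        IDictionary<string, TValue> index, IEnumerable<string> keys)
    {
        var found = new Dictionary<string, TValue>();
        foreach (string key in keys)
        {
            TValue value;
            if (index.TryGetValue(key, out value))
                found[key] = value; // a repeated key simply overwrites itself
        }
        return found;
    }
}
```

Each probe is a hash lookup, so the cost grows with the number of keys you search for, not with the 500k lines in the file.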
hth,
Alan