I have a sizeable txt file (3.5 MB) structured like so:
sweep#1 expanse#1 0.375
loftiness#1 highness#2 0.375
lockstep#1 0.25
laziness#2 0.25
treponema#1 0.25
rhizopodan#1 rhizopod#1 0.25
plumy#3 feathery#3 feathered#1 -0.125
ruffled#2 frilly#1 frilled#1 -0.125
fringed#2 -0.125
inflamed#3 -0.125
inlaid#1 -0.125
Each word is followed by a # and an integer, and each line ends with its “score.” A tab separates the words from the score. Right now, the text file is loaded as a string using file_get_contents().
From an array of strings made up of individual, lower-case, character-stripped words, I need to look up each value, find its corresponding score and add it to a running total.
I imagine I would need some form of regex to first find the word, continue to the next \t and then add the score that follows to a running total. What’s the best way of going about this?
Yes, there are probably better ways of doing this. But this is oh-so-simple:
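As a minimal sketch of one simple way to do it (not necessarily the approach the answer originally had in mind), the file contents can be parsed once into a word => score array and each word then looked up directly, with no per-word regex. The function names buildScoreTable() and totalScore() are hypothetical. The sketch assumes the last whitespace-separated token on each line is the score, that the #N suffix can be dropped, that the first score seen for a word wins, and that words missing from the file count as 0.

```php
<?php
// Hypothetical helper: build a word => score map from the raw file contents
// (for example, the string returned by file_get_contents()).
function buildScoreTable(string $contents): array
{
    $table = [];
    // Split into lines, skipping empty ones.
    foreach (preg_split('/\R/', $contents, -1, PREG_SPLIT_NO_EMPTY) as $line) {
        // Split on any whitespace so both tabs and spaces work as separators.
        $tokens = preg_split('/\s+/', trim($line), -1, PREG_SPLIT_NO_EMPTY);
        if (count($tokens) < 2) {
            continue; // need at least one word and a score
        }
        // Assumption: the last token on the line is the score.
        $score = array_pop($tokens);
        if (!is_numeric($score)) {
            continue; // skip malformed lines
        }
        foreach ($tokens as $token) {
            // Drop the "#N" suffix: "sweep#1" becomes "sweep".
            $word = explode('#', $token, 2)[0];
            // Assumption: if a word appears more than once, the first score wins.
            if (!isset($table[$word])) {
                $table[$word] = (float) $score;
            }
        }
    }
    return $table;
}

// Hypothetical helper: sum the scores of the given words; unknown words add 0.
function totalScore(array $words, array $table): float
{
    $total = 0.0;
    foreach ($words as $word) {
        $total += $table[$word] ?? 0.0;
    }
    return $total;
}

// Small inline sample in place of the real 3.5 MB file.
$contents = "sweep#1 expanse#1\t0.375\nlockstep#1\t0.25\nplumy#3 feathery#3 feathered#1\t-0.125\n";
$table = buildScoreTable($contents);
echo totalScore(['sweep', 'feathery', 'unknown'], $table), PHP_EOL; // prints 0.25
```

Building the array costs one pass over the file, after which every lookup is a plain array access. If the same word can legitimately carry different scores under different #N senses, the first-score-wins rule above would need to be replaced with whatever rule suits the data.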