I have a string like this:
// string1
horse|cow|goat|zebra|
and another string like this:
// string2
horse:a,pig:b,cow:z,monkey:g,goat:a,
My goal is to split string1, then pick out any occurrences of its entries in string2, to build a histogram. I am currently doing this:
var histogram = {};
var animals = string1.split("|");
for (var i = 0; i < animals.length; i++) {
    var animal = animals[i];
    // The trailing "|" yields an empty entry; skip it, or ":" would match every pair.
    if (animal === "") {
        continue;
    }
    var animalColon = animal + ":";
    var index = string2.indexOf(animalColon);
    while (index != -1) {
        // The letter sits between the colon and the next comma.
        var indexColon = index + animalColon.length;
        var indexFinal = string2.indexOf(",", indexColon);
        var letter = string2.substring(indexColon, indexFinal);
        if (histogram[letter] == null) {
            histogram[letter] = 1;
        }
        else {
            histogram[letter] = histogram[letter] + 1;
        }
        index = string2.indexOf(animalColon, index + 1);
    }
}
At the end, it might print something like:
// histogram:
a: 2 instances // from { horse, goat }
z: 1 instance // from { cow }
The above will work, but I have to do animals.length passes through string2 to check every one.
Is there a way to use regular expressions to do this parsing – essentially run all tests in parallel, instead of doing multiple passes through? Since string2 is const, it seems that all checks could be done simultaneously (not sure if regexes are implemented like this).
I increased the number of elements in string1 and string2 to the order of thousands and it still runs quite fast, but I am worried about slower machines, maintainability, and so on.
Thanks
I’d start by pre-processing your string2, which you say is constant. Working with an object is better than repeatedly searching the string.
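A minimal sketch of that pre-processing step could look like the following. The names `buildAnimalLetters` and `animalLetters` are mine, not from the question, and I am assuming each animal appears at most once in string2 (a later duplicate would overwrite an earlier one):

```javascript
// Sample input from the question; string2 is assumed to be fixed.
var string2 = "horse:a,pig:b,cow:z,monkey:g,goat:a,";

// Build a lookup table once: animal name -> letter.
// buildAnimalLetters is a hypothetical helper name used for illustration.
function buildAnimalLetters(source) {
    // No prototype, so names like "constructor" cannot collide with built-ins.
    var lookup = Object.create(null);
    var pairs = source.split(",");
    for (var i = 0; i < pairs.length; i++) {
        var parts = pairs[i].split(":");
        // Skip the empty piece left by the trailing comma and any malformed entry.
        if (parts.length === 2 && parts[0] !== "") {
            lookup[parts[0]] = parts[1];
        }
    }
    return lookup;
}

var animalLetters = buildAnimalLetters(string2);
// animalLetters maps horse -> "a", pig -> "b", cow -> "z", monkey -> "g", goat -> "a"
```

This costs one pass over string2, done once, and every later lookup is a plain property access.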
Next, when you get string1, you have an easier time looking up the letters (you may also want to check if (letter), in case string1 contains a new animal that is missing from string2).
As for your question, you could probably abuse a regular expression to count the letters, but it wouldn’t be parallel, only linear at best, and it would probably be complex enough not to be worthwhile.
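For the lookup step described above, something along these lines should do. It is only a sketch: `buildHistogram` and `animalLetters` are illustrative names, the sample strings are the ones from the question, and I am assuming one entry per animal in string2 and non-empty letters:

```javascript
var string1 = "horse|cow|goat|zebra|";
var string2 = "horse:a,pig:b,cow:z,monkey:g,goat:a,";

// One pass over string2: animal -> letter (assumes one entry per animal).
var animalLetters = Object.create(null);
string2.split(",").forEach(function (pair) {
    var parts = pair.split(":");
    if (parts.length === 2 && parts[0] !== "") {
        animalLetters[parts[0]] = parts[1];
    }
});

// One pass over string1: count the letter of each known animal.
function buildHistogram(animalList, lookup) {
    var histogram = Object.create(null);
    var animals = animalList.split("|");
    for (var i = 0; i < animals.length; i++) {
        var letter = lookup[animals[i]];
        // Unknown animals (zebra) and the empty piece after the trailing "|"
        // have no letter, so they are skipped.
        if (letter) {
            histogram[letter] = (histogram[letter] || 0) + 1;
        }
    }
    return histogram;
}

var histogram = buildHistogram(string1, animalLetters);
// histogram: a -> 2 (horse, goat), z -> 1 (cow)
```

The total work is one pass over each string instead of animals.length passes over string2.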