I am using PHP and have 11 million domains in text files, which I load into an array and then search with a regex.
To do this I have to raise the memory limit to 2 GB, and each search takes about 10 seconds. I will soon have 100 million domains and plan to move to a database, but even so, how do you get good performance when searching a list of 100 million domains?
I search using regex like this:
// Keep only the entries that contain "store." anywhere in the string.
$domains = preg_grep('/store\./', $array);
foreach ($domains as $domain) { echo $domain; }
How about a search engine like Lucene:
http://lucene.apache.org/java/docs/index.html
It is built for exactly this purpose: it indexes the text once, so a query does not have to scan every entry.
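To give a rough idea of why an index helps, here is a toy inverted index in PHP. It is only a sketch of the general idea, not Lucene's API, and the function names buildLabelIndex and findByLabel are made up for illustration. It also assumes you want to match whole dot-separated labels, so unlike the regex /store\./ it will not match "mystore.example.org". It still keeps everything in memory, so it does not solve the memory problem; it just shows how a lookup can avoid scanning the whole list.

```php
<?php
// Toy inverted index: maps each dot-separated label to the domains that contain it.
// Illustration only; a real search engine stores its index on disk and does far more.

function buildLabelIndex(iterable $domains): array
{
    $index = [];
    foreach ($domains as $domain) {
        $domain = strtolower(trim($domain));
        if ($domain === '') {
            continue; // skip blank lines
        }
        // array_unique avoids listing a domain twice under a repeated label.
        foreach (array_unique(explode('.', $domain)) as $label) {
            $index[$label][] = $domain;
        }
    }
    return $index;
}

function findByLabel(array $index, string $label): array
{
    // A single hash lookup instead of a scan over every domain.
    return $index[strtolower($label)] ?? [];
}

// Hypothetical sample data.
$index = buildLabelIndex(['store.example.com', 'mystore.example.org', 'example.store']);
foreach (findByLabel($index, 'store') as $domain) {
    echo $domain, PHP_EOL;
}
```

The index is built once, and after that each search costs one array lookup no matter how many domains there are. Substring matches such as "mystore" would need a different index (for example one built on n-grams), which is the kind of thing a search engine handles for you.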