I’m trying to separate the hosts from this file, but my regex selects both hosts together:
timestamps|||scan_start|Tue May 1 23:00:29 2012|timestamps||foo.com|host_start|Tue May 1 23:16:51 2012|results|-0017\ntimestamps||foo.com|host_end|Tue May 1 23:19:17 2012|timestamps||bar.com|host_start|Tue May 1 23:24:31 2012|results|general/tcp|Sendmail 8.13.8\n\n\ntimestamps||bar.com|host_end|Tue May 1 23:29:11 2012|timestamps|||scan_end|Wed May 2 00:19:40 2012|
regex:
timestamps\|\|[\w,\.]*\|host_start.*host_end
Make the star lazy:
timestamps\|\|[\w,\.]*\|host_start.*?host_end
.* is “greedy”, matching as much as it can. .*? is “lazy” and matches as little as possible to achieve a match, so it will match only up to the closest host_end and not up to the last one.
Also, there is no need to escape the dot inside a character class. And do you really want to allow a comma in the character class, or did you mean this instead:
[\w.]*
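To see the difference between the greedy and the lazy quantifier on the sample data, here is a minimal Python sketch. It assumes the `\n` sequences in the file are literal backslash-n text (if they are real newlines, `.` would need the `re.DOTALL` flag to cross them), and it uses the character class without the comma, which is also an assumption about what was intended. The variable names are illustrative only.

```python
import re

# Sample data from the question, kept as raw strings so that \n stays
# a literal backslash followed by "n" (an assumption about the file).
data = (
    r"timestamps|||scan_start|Tue May 1 23:00:29 2012|"
    r"timestamps||foo.com|host_start|Tue May 1 23:16:51 2012|results|-0017\n"
    r"timestamps||foo.com|host_end|Tue May 1 23:19:17 2012|"
    r"timestamps||bar.com|host_start|Tue May 1 23:24:31 2012|results|general/tcp|Sendmail 8.13.8\n\n\n"
    r"timestamps||bar.com|host_end|Tue May 1 23:29:11 2012|"
    r"timestamps|||scan_end|Wed May 2 00:19:40 2012|"
)

# Greedy: .* runs to the end of the input and backtracks to the LAST host_end.
greedy = re.compile(r"timestamps\|\|[\w.]*\|host_start.*host_end")

# Lazy: .*? stops at the FIRST host_end after each host_start.
lazy = re.compile(r"timestamps\|\|[\w.]*\|host_start.*?host_end")

greedy_matches = greedy.findall(data)
lazy_matches = lazy.findall(data)

print(len(greedy_matches))  # 1: both hosts swallowed into a single match
print(len(lazy_matches))    # 2: one match per host

for block in lazy_matches:
    print(block)
```

With the lazy version, each element of `lazy_matches` covers exactly one host, from its host_start record to its own host_end record.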