I haven’t been using regex for very long, and I’m struggling to define the right pattern. I’ve searched this site and many others without quite finding what I need.
Here’s the sub-string from a file I need to parse:
As of 10 AM on:
9/7/2012 227,134 mmcf.
9/9/2011 1,224,376 mmcf.
9/10/2010 424 mmcf.
What I need to extract is any number that is not part of a date from one or more lines inside the file. Each line in the example above is on its own line in the file, with the date being the first word on the line (as you’d probably expect). The whitespace following the date is actually two tabs and a single space. I need to extract only the value 227,134, and I need to be able to grab that value for any number 1 – 999,999,999. As you can see, the commas are included in the value.
I’ve been able to create a pattern that matches any of the values (123,456; 123,224,376; and 424), but it also matches each of the date properties (month, day, year). I have a pattern that grabs the date & white space, but I’m not sure how to grab the value after that.
Here is the current pattern I am using:
^(?:3[01]|[12][0-9]|[1-9])[/.-](?:1[0-2]|[1-9])[/.-][0-9]{4} [,0-9]+\b
This matches the following:
9/7/2012 227,134
9/9/2011 1,224,376
9/10/2010 424
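One possible way to keep the current pattern but pull out only the value is to wrap the value part in a capturing group. The following is a minimal Python sketch, not a definitive solution: the names DATE_VALUE_RE and sample are made up for illustration, the sample text assumes the two tabs and a space described above, and \s+ is swapped in for the literal space so that any run of whitespace after the date is accepted.

```python
import re

# The date part is unchanged from the pattern above; only the value is captured.
# Assumption: \s+ replaces the literal space so the tabs after the date also match.
DATE_VALUE_RE = re.compile(
    r"^(?:3[01]|[12][0-9]|[1-9])[/.-](?:1[0-2]|[1-9])[/.-][0-9]{4}\s+([,0-9]+)\b",
    re.MULTILINE,  # let ^ match at the start of every line, not just the string
)

# Hypothetical sample mirroring the excerpt (two tabs and a space after each date).
sample = (
    "As of 10 AM on:\n"
    "9/7/2012\t\t 227,134 mmcf.\n"
    "9/9/2011\t\t 1,224,376 mmcf.\n"
    "9/10/2010\t\t 424 mmcf.\n"
)

# search() stops at the first matching line; group(1) is the value without the date.
match = DATE_VALUE_RE.search(sample)
first_value = match.group(1) if match else None
print(first_value)

# findall() returns only the captured group for every matching line.
all_values = DATE_VALUE_RE.findall(sample)
print(all_values)
```

The date still has to be present for a line to match, but it is no longer part of what is returned, because only the text inside the parentheses ends up in group 1.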
Is there a way to match part of a pattern and exclude it from the remainder of the pattern?
What is the best approach for this?
I’m really only concerned with finding the first value (in this case, 227,134) in the list.
Thanks in advance for your help.
One or more digits or commas, followed by a space and “mmcf.”
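That description can be sketched in Python with a lookahead, so that the trailing “ mmcf.” must be present but is excluded from the match itself. This is only an illustration under assumptions: the names MMCF_RE, first_mmcf_value and sample are hypothetical, and the text is assumed to be already read into a string with every value followed by exactly one space and “mmcf.”.

```python
import re

# One or more digits or commas, but only when followed by a space and "mmcf."
# The lookahead (?= ...) checks for the suffix without including it in the match.
MMCF_RE = re.compile(r"[0-9,]+(?= mmcf\.)")

def first_mmcf_value(text):
    """Return the first value that precedes ' mmcf.' as an int, or None."""
    match = MMCF_RE.search(text)
    if match is None:
        return None
    # Drop the thousands separators before converting to a number.
    return int(match.group(0).replace(",", ""))

# Hypothetical sample mirroring the excerpt (two tabs and a space after each date).
sample = (
    "As of 10 AM on:\n"
    "9/7/2012\t\t 227,134 mmcf.\n"
    "9/9/2011\t\t 1,224,376 mmcf.\n"
    "9/10/2010\t\t 424 mmcf.\n"
)

print(first_mmcf_value(sample))
```

Because the date digits are never followed by “ mmcf.”, they are skipped without the pattern having to describe the date at all.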