I’m trying to split a chunk of HTML code by the “table” tag and its contents.
So I tried:
my $html = 'aaa<table>test</table>bbb<table>test2</table>ccc';
my @values = split(/<table*.*\/table>/, $html);
After this, I want the @values array to look like this:
array('aaa', 'bbb', 'ccc').
But it returns this array:
array('aaa', 'ccc').
Can anyone tell me how I can specify to the split function that each table should be parsed separately?
Thank you!
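Your regex is greedy: .* matches as much as it can, so a single match runs from the first <table all the way to the last /table> and swallows 'bbb' along with it. Change it to
/<table.*?\/table>/ and it will do what you want. But you should really look into a proper HTML parser if you are going to be doing any serious work. A search of CPAN should find one that is suited to your needs.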
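For reference, here is a minimal sketch of the non-greedy version applied to the string from the question. The /s modifier is my own addition, not part of the suggestion above: it lets . match newlines, on the assumption that real tables will span several lines.

```perl
use strict;
use warnings;

my $html = 'aaa<table>test</table>bbb<table>test2</table>ccc';

# .*? is non-greedy: each match stops at the first /table> it reaches,
# so every table is consumed as its own separator.
# /s (an assumption, see above) lets . also match newline characters.
my @values = split /<table.*?\/table>/s, $html;

print join(',', @values), "\n";   # prints aaa,bbb,ccc
```

This is only good enough for simple, well-behaved input. Nested tables, for example, would still trip it up, which is one more reason to reach for a parser.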