Using Perl, I need to empty a string which contains several spaces.
I can't come up with the correct regex.
Here is my text:
<sentence="I am walking on the street and it is raining" >
</sentence>
I want to empty this string to get:
<sentence="" >
</sentence>
Here is my code (it only replaces a string that contains no spaces):
sub empty_it {
    print "\nSTART replacing WO info !!!\n";
    my $find = "\<sentence\=\"\\S*\"";
    my $replace = "\<sentence\=\"\"";
    {
        local @ARGV = ("$_[0]");
        local $^I = '.baz';
        while ( <> ) {
            if (s/$find/$replace/ig) {
                print;
            }
            else {
                print;
            }
        }
    }
}
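What you are looking for is probably a way to match all content between the two quotes. Your \S* stops at the first space, which is why only values without spaces get replaced. A negated character class (i.e. /"[^"]*"/) matches everything up to the next double quote, spaces included.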
So this would probably work:
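A minimal sketch of that substitution, assuming the value is always wrapped in double quotes and never contains one itself. The sub name empty_sentence is hypothetical and only there to keep the example self-contained:

```perl
use strict;
use warnings;

# Hypothetical helper: blank out the quoted value that follows <sentence=.
# Assumes the value is double-quoted and contains no double quote itself.
sub empty_sentence {
    my ($line) = @_;
    # [^"]* matches any run of characters except a double quote, spaces included
    $line =~ s/<sentence="[^"]*"/<sentence=""/ig;
    return $line;
}

print empty_sentence('<sentence="I am walking on the street and it is raining" >'), "\n";
# prints: <sentence="" >
```

In your empty_it sub, the equivalent change should be to set $find to '<sentence="[^"]*"' (single-quoted, so no backslashes are needed) and leave the rest as it is.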
But in general I wouldn't recommend using regular expressions for mangling XML. They are often too fragile and will break if your input changes even slightly, for example if it starts using single quotes because it suddenly has to include double quotes inside the content.
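To make that fragility concrete, here is a small sketch with two made-up input lines. The helper name empty_double_quoted and both sample strings are assumptions for illustration only. The double-quote-only pattern empties the first line but silently leaves the single-quoted one untouched:

```perl
use strict;
use warnings;

# Hypothetical helper: a substitution that only knows about double-quoted values.
sub empty_double_quoted {
    my ($line) = @_;
    $line =~ s/<sentence="[^"]*"/<sentence=""/ig;
    return $line;
}

# Made-up inputs: one double-quoted, one single-quoted with double quotes inside
my $double = q{<sentence="it is raining" >};
my $single = q{<sentence='he said "it is raining"' >};

for my $input ($double, $single) {
    my $output = empty_double_quoted($input);
    my $status = $output eq $input ? 'unchanged' : 'emptied';
    print "$status: $output\n";
}
```

The second line comes back unchanged and no error is raised, so a change like this in the input can go unnoticed.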