I need to parse some strings from an input file. These strings are needed in Hadoop.
The problem is that these strings are inside markup tags.
Can someone suggest a pattern to match and store them?
<id>INIcE89C561</id> <id>INIcE89C560</id> <id>Q1S5WLipQW2</id>
I need the string between the id tags. The tags come from different input files.
I need to use each one as the value in a key-value pair.
To get the text between the id tags you can use a regex with one capturing group, something like <id>(.+?)</id>,
and then extract the first captured group (which is your value).
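As a minimal sketch of the capturing-group approach in Java: the pattern <id>(.+?)</id> is an assumption about what the tags look like (no attributes, no nested tags), and the class and method names are hypothetical, not part of any Hadoop API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
class IdExtractor {
    // Assumed pattern: lazily capture whatever sits between <id> and </id>.
    private static final Pattern ID_PATTERN = Pattern.compile("<id>(.+?)</id>");

    // Returns the text of every <id>...</id> pair found in the given line.
    static List<String> extractIds(String line) {
        List<String> ids = new ArrayList<>();
        Matcher matcher = ID_PATTERN.matcher(line);
        while (matcher.find()) {
            ids.add(matcher.group(1)); // group 1 is the first captured group
        }
        return ids;
    }

    public static void main(String[] args) {
        String line = "<id>INIcE89C561</id> <id>INIcE89C560</id> <id>Q1S5WLipQW2</id>";
        // prints [INIcE89C561, INIcE89C560, Q1S5WLipQW2]
        System.out.println(extractIds(line));
    }
}
```

In a mapper you would call something like extractIds on each input line and emit every returned string as the value of your key-value pair.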
In general, however, regex is not the best option for parsing XML.
A dedicated XML parser is much better suited to the job, and I recommend using one.
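For comparison, here is a rough sketch of the parser route using the DOM parser that ships with the JDK (javax.xml.parsers). It assumes the input is well-formed XML with a single root element; the ids wrapper element and the class name are made up for the example, since the question only shows bare id tags.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper, for illustration only.
class XmlIdExtractor {
    // Parses the XML string and returns the text content of every <id> element.
    static List<String> extractIds(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            NodeList nodes = doc.getElementsByTagName("id");
            List<String> ids = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                ids.add(nodes.item(i).getTextContent());
            }
            return ids;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            // Malformed input ends up here instead of being silently skipped.
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Assumed wrapper element so that the document has one root.
        String xml = "<ids><id>INIcE89C561</id><id>INIcE89C560</id></ids>";
        // prints [INIcE89C561, INIcE89C560]
        System.out.println(extractIds(xml));
    }
}
```

Unlike the regex, this fails loudly on malformed markup and keeps working if the id tags later gain attributes or extra whitespace.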
Take a look at this tutorial for example.