I am trying to parse the keywords from Google Suggest. This is the URL:
http://google.com/complete/search?output=toolbar&q=test
I’ve done it in PHP using:
'|<CompleteSuggestion><suggestion data="(.*?)"/><num_queries int="(.*?)"/></CompleteSuggestion>|is'
But that won't work with Python's re.match(pattern, string). I tried a few variations, but some raise an error and some return None.
How can I parse that info? I don't want to use minidom because I think a regex will be less code.
You could use etree. It is more code than a regex, but it also does more. Specifically, it will fetch the entire list of matches in one go and unescape any special characters, such as double quotes, in the data attribute. It also won’t get confused if additional elements start appearing in the XML.
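
A minimal sketch of the etree approach, assuming the standard library's xml.etree.ElementTree and a response shaped like the elements in the PHP pattern above. The SAMPLE string, the toplevel wrapper element, and the parse_suggestions helper are made up for illustration; the live response may differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample modelled on the elements named in the question;
# the <toplevel> wrapper is an assumption about the real response.
SAMPLE = (
    '<toplevel>'
    '<CompleteSuggestion><suggestion data="test"/>'
    '<num_queries int="100"/></CompleteSuggestion>'
    '<CompleteSuggestion><suggestion data="say &quot;test&quot;"/>'
    '<num_queries int="50"/></CompleteSuggestion>'
    '</toplevel>'
)


def parse_suggestions(xml_text):
    """Return a list of (suggestion, query_count) tuples."""
    root = ET.fromstring(xml_text)
    results = []
    # iter() walks the whole tree, so unrelated elements are simply skipped.
    for item in root.iter("CompleteSuggestion"):
        suggestion = item.find("suggestion")
        if suggestion is None:
            continue
        num = item.find("num_queries")
        raw = num.get("int") if num is not None else None
        count = int(raw) if raw is not None else None
        # Attribute values come back already unescaped (&quot; becomes ").
        results.append((suggestion.get("data"), count))
    return results


print(parse_suggestions(SAMPLE))
# [('test', 100), ('say "test"', 50)]
```

To use it on the live feed, pass the body of the response from the URL in the question to parse_suggestions; ET.fromstring accepts the raw bytes as well as a string.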