I need to extract the text between anchor tags (<a href="..."> ... </a>) in Python using the re module.
I’ve tried numerous patterns such as:
patFinderLink = re.compile('\>"(CVE.*)"\<\/a>')
Example: I need to pull what is between the tags (in this case “CVE-2010-3718”) from:
<pre>
<a href="https://www.redhat.com/security/data/cve/CVE-2010-3718.html">CVE-2010-3718</a>
</pre>
What am I doing wrong here? Any advice is greatly appreciated. Thank you in advance.
Sun
I am surprised no one suggested using BeautifulSoup.
Here is how I would do it:
Result:
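A minimal sketch of the BeautifulSoup approach, assuming the third-party `beautifulsoup4` package is installed and using the anchor from the question as input. The variable names (`html`, `soup`, `cve_ids`) are illustrative and not taken from the original answer:

```python
from bs4 import BeautifulSoup

# The anchor from the question, used here as sample input.
html = '<a href="https://www.redhat.com/security/data/cve/CVE-2010-3718.html">CVE-2010-3718</a>'

# "html.parser" is the parser bundled with Python, so no extra parser is needed.
soup = BeautifulSoup(html, "html.parser")

# Collect the visible text of every <a> element that carries an href attribute.
cve_ids = [a.get_text(strip=True) for a in soup.find_all("a", href=True)]

print(cve_ids)  # ['CVE-2010-3718']
```

As for the regular expression in the question, it requires literal double quotes around the CVE text (`>"CVE...."<`), and the markup has none between `>` and `</a>`, so it cannot match this input.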