I’m trying to read a PDF file and get all of the hyperlinks from it.
I’m using iTextSharp for C# .NET.
PdfReader reader = new PdfReader("test.pdf");
List<PdfAnnotation.PdfImportedLink> list = reader.GetLinks(36);
The GetLinks method returns a list with a lot of information about the links, but it does not return the value I want: the hyperlink string. I know for certain that there are hyperlinks on page 36.
PdfReader.GetLinks() is only meant to be used with links internal to the document, not external hyperlinks. Why? I don’t know.
The code below is based off of code I wrote earlier, but I’ve limited it to links stored in the PDF as a PdfName.URI. It’s possible to store the link as JavaScript that ultimately does the same thing, and there are probably other types, but you’ll need to detect those yourself. I don’t believe there’s anything in the spec that says a link actually needs to be a URI; it’s just implied, so the code below returns a string that you can (probably) convert to a URI on your own.
And call it:
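A minimal sketch of this approach together with a call to it, assuming iTextSharp 5.x. The class name PdfLinkExtractor and the method name GetPdfLinks are hypothetical; test.pdf and page 36 come from the question. It walks the page's annotations and keeps only link annotations whose action is a URI action, so JavaScript and other action types are skipped.

```csharp
using System;
using System.Collections.Generic;
using iTextSharp.text.pdf;

// Hypothetical helper class; the name is illustrative only.
public static class PdfLinkExtractor
{
    // Returns the target string of every URI-action link annotation
    // on the given page (page numbers are 1-based).
    public static List<string> GetPdfLinks(string file, int page)
    {
        var links = new List<string>();
        PdfReader reader = new PdfReader(file);
        try
        {
            // Page dictionary; null if the page number is out of range.
            PdfDictionary pageDict = reader.GetPageN(page);
            if (pageDict == null) return links;

            // All annotations on the page; a page may have none.
            PdfArray annots = pageDict.GetAsArray(PdfName.ANNOTS);
            if (annots == null) return links;

            for (int i = 0; i < annots.Size; i++)
            {
                // GetAsDict resolves indirect references for us.
                PdfDictionary annot = annots.GetAsDict(i);
                if (annot == null) continue;

                // Only link annotations are of interest.
                if (!PdfName.LINK.Equals(annot.GetAsName(PdfName.SUBTYPE))) continue;

                // The link must carry an action dictionary (/A).
                PdfDictionary action = annot.GetAsDict(PdfName.A);
                if (action == null) continue;

                // Keep URI actions only; JavaScript and other action
                // types would need separate handling.
                if (!PdfName.URI.Equals(action.GetAsName(PdfName.S))) continue;

                PdfString target = action.GetAsString(PdfName.URI);
                if (target != null) links.Add(target.ToString());
            }
        }
        finally
        {
            reader.Close();
        }
        return links;
    }

    public static void Main()
    {
        foreach (string link in GetPdfLinks("test.pdf", 36))
        {
            // The stored value is only a string; converting it may fail.
            Uri uri;
            if (Uri.TryCreate(link, UriKind.Absolute, out uri))
                Console.WriteLine(uri);
            else
                Console.WriteLine("Not an absolute URI: " + link);
        }
    }
}
```

Returning an empty list rather than null when a page has no annotations keeps the calling loop free of null checks. Uri.TryCreate is used instead of the Uri constructor because nothing guarantees the stored string is a well-formed URI.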