The Archive Base

Editorial Team
Asked: May 26, 2026

I need to post several (read: a lot of) PDF files to the web, but many of them have hard-coded file:// links and links to non-public locations. I need to read through these PDFs and update the links to point to the proper locations. I’ve started writing an app using iTextSharp to walk the directories, find the PDFs, and iterate through each page. What I need to do next is find the links and then update the incorrect ones.

string path = "c:\\html";
DirectoryInfo rootFolder = new DirectoryInfo(path);

foreach (DirectoryInfo di in rootFolder.GetDirectories())
{
    // get pdf
    foreach (FileInfo pdf in di.GetFiles("*.pdf"))
    {
        string contents = string.Empty;
        Document doc = new Document();
        PdfReader reader = new PdfReader(pdf.FullName);

        using (MemoryStream ms = new MemoryStream())
        {
            PdfWriter writer = PdfWriter.GetInstance(doc, ms);
            doc.Open();

            for (int p = 1; p <= reader.NumberOfPages; p++)
            {
                byte[] bt = reader.GetPageContent(p);

            }
        }
    }
}

Quite frankly, once I get the page content I’m rather lost as to what to do next with iTextSharp. I’ve read through the iTextSharp examples on SourceForge but didn’t find what I was looking for.

Any help would be greatly appreciated.

Thanks.

1 Answer

  1. Editorial Team
    Added an answer on May 26, 2026 at 8:50 pm

    This one is a little complicated if you don’t know the internals of the PDF format and iText/iTextSharp’s abstraction of it. You need to understand how to use PdfDictionary objects and look things up by their PdfName key. Once you get that, you can read through the official PDF spec and poke around a document pretty easily. In case you care, I’ve cited the relevant sections of the PDF spec in parentheses where applicable.

    Anyway, a link within a PDF is stored as an annotation (PDF Ref 12.5). Annotations are page-based, so you first need to get each page’s annotation array individually. There are a bunch of different annotation types, so you need to check each one’s SUBTYPE and see if it’s set to LINK (12.5.6.5). Every link should have an ACTION dictionary associated with it (12.6.2), stored under the A key, and you want to check the action’s S key to see what type of action it is. There are a bunch of possible types here; links specifically could be internal links, open-file links, play-sound links or something else (12.6.4.1). You are looking only for links of type URI (note the letter I and not the letter L). URI actions (12.6.4.7) have a URI key that holds the actual address to navigate to. (There’s also an IsMap property for image maps that I can’t actually imagine anyone using.)

    Whew. Still reading? Below is a full working VS 2010 C# WinForms app, based on an earlier post of mine, targeting iTextSharp 5.1.1.0. The code does two main things: 1) it creates a sample PDF with a link pointing to google.com, and 2) it replaces that link with a link to bing.com. The code should be pretty well commented, but feel free to ask any questions that you might have.

    using System;
    using System.Text;
    using System.Windows.Forms;
    using iTextSharp.text;
    using iTextSharp.text.pdf;
    using System.IO;
    
    namespace WindowsFormsApplication1
    {
        public partial class Form1 : Form
        {
    
            //Folder that we are working in
            private static readonly string WorkingFolder = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop), "Hyperlinked PDFs");
            //Sample PDF
            private static readonly string BaseFile = Path.Combine(WorkingFolder, "OldFile.pdf");
            //Final file
            private static readonly string OutputFile = Path.Combine(WorkingFolder, "NewFile.pdf");
    
            public Form1()
            {
                InitializeComponent();
            }
    
            private void Form1_Load(object sender, EventArgs e)
            {
                CreateSamplePdf();
                UpdatePdfLinks();
                this.Close();
            }
    
            private static void CreateSamplePdf()
            {
                //Create our output directory if it does not exist
                Directory.CreateDirectory(WorkingFolder);
    
                //Create our sample PDF
                using (iTextSharp.text.Document Doc = new iTextSharp.text.Document(PageSize.LETTER))
                {
                    using (FileStream FS = new FileStream(BaseFile, FileMode.Create, FileAccess.Write, FileShare.Read))
                    {
                        using (PdfWriter writer = PdfWriter.GetInstance(Doc, FS))
                        {
                            Doc.Open();
    
                            //Turn our hyperlink blue
                            iTextSharp.text.Font BlueFont = FontFactory.GetFont("Arial", 12, iTextSharp.text.Font.NORMAL, iTextSharp.text.BaseColor.BLUE);
    
                            Doc.Add(new Paragraph(new Chunk("Go to URL", BlueFont).SetAction(new PdfAction("http://www.google.com/", false))));
    
                            Doc.Close();
                        }
                    }
                }
            }
    
            private static void UpdatePdfLinks()
            {
                //Setup some variables to be used later
                PdfReader R = default(PdfReader);
                int PageCount = 0;
                PdfDictionary PageDictionary = default(PdfDictionary);
                PdfArray Annots = default(PdfArray);
    
                //Open our reader
                R = new PdfReader(BaseFile);
                //Get the page count
                PageCount = R.NumberOfPages;
    
                //Loop through each page
                for (int i = 1; i <= PageCount; i++)
                {
                    //Get the current page
                    PageDictionary = R.GetPageN(i);
    
                    //Get all of the annotations for the current page
                    Annots = PageDictionary.GetAsArray(PdfName.ANNOTS);
    
                    //Make sure we have something
                    if ((Annots == null) || (Annots.Length == 0))
                        continue;
    
                    //Loop through each annotation
    
                    foreach (PdfObject A in Annots.ArrayList)
                    {
                        //Resolve the array entry (usually an indirect reference) to the actual annotation dictionary
                        PdfDictionary AnnotationDictionary = (PdfDictionary)PdfReader.GetPdfObject(A);
    
                        //Make sure this annotation is a link (null-safe if SUBTYPE is missing)
                        if (!PdfName.LINK.Equals(AnnotationDictionary.Get(PdfName.SUBTYPE)))
                            continue;
    
                        //Get the ACTION for the current annotation; GetAsDict also resolves indirect references
                        PdfDictionary AnnotationAction = AnnotationDictionary.GetAsDict(PdfName.A);
    
                        //Make sure this annotation has an ACTION
                        if (AnnotationAction == null)
                            continue;
    
                        //Test if it is a URI action
                        if (PdfName.URI.Equals(AnnotationAction.Get(PdfName.S)))
                        {
                            //Change the URI to something else
                            AnnotationAction.Put(PdfName.URI, new PdfString("http://www.bing.com/"));
                        }
                    }
                }
    
                //Next we create a new document and import each page from the reader above
                using (FileStream FS = new FileStream(OutputFile, FileMode.Create, FileAccess.Write, FileShare.None))
                {
                    using (Document Doc = new Document())
                    {
                        using (PdfCopy writer = new PdfCopy(Doc, FS))
                        {
                            Doc.Open();
                            for (int i = 1; i <= R.NumberOfPages; i++)
                            {
                                writer.AddPage(writer.GetImportedPage(R, i));
                            }
                            Doc.Close();
                        }
                    }
                }
    
                //Release the reader now that all pages have been copied
                R.Close();
            }
        }
    }
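
The sample above hard-codes a single replacement address. For the original problem, each URI would need to be mapped to its public location instead. Below is a minimal sketch of one way to do that, using plain string prefix rules and no iTextSharp calls. The class name `LinkRewriter`, the `Rewrite` method, and the example prefixes in the comments are hypothetical and only for illustration; the real mapping depends on where the files end up being hosted.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of iTextSharp): decides what a link target should become.
public static class LinkRewriter
{
    // prefixMap pairs an old prefix with its replacement, for example
    // "file:///c:/html/" -> "http://www.example.com/docs/" (both assumed values).
    // Returns the rewritten address, or null when no rule matches.
    public static string Rewrite(string original, IEnumerable<KeyValuePair<string, string>> prefixMap)
    {
        if (string.IsNullOrEmpty(original))
            return null;

        foreach (KeyValuePair<string, string> rule in prefixMap)
        {
            // Drive letters and schemes vary in case, so compare case-insensitively
            if (original.StartsWith(rule.Key, StringComparison.OrdinalIgnoreCase))
            {
                // Keep the part after the old prefix and normalize Windows separators
                string rest = original.Substring(rule.Key.Length).Replace('\\', '/');
                return rule.Value + rest;
            }
        }

        // No rule matched: the caller should leave the link untouched
        return null;
    }
}
```

Inside the URI branch of the loop, the current target can presumably be read with `AnnotationAction.GetAsString(PdfName.URI)` and, when `Rewrite` returns a non-null value, written back with the same `Put` call used above. A null result means no rule matched and the link is left alone.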
    

    EDIT

    I should note that this only changes the actual link target. Any visible text within the document won’t get updated. Annotations are drawn on top of the text but aren’t really tied to the text underneath in any way. That’s another topic completely.
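
To run this over the whole directory tree from the question rather than a single file, the PDFs can be collected recursively first. The loop in the question only looks one level below the root folder, so files in the root itself or in deeper folders are skipped. A small sketch follows, assuming .NET 4 or later for `Directory.EnumerateFiles`. `PdfFinder`, `FindPdfs` and `MirrorPath` are hypothetical names, and writing results to a separate output folder is my own assumption so that the originals stay untouched.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper for walking the source tree; not part of iTextSharp.
public static class PdfFinder
{
    // Lazily yields every PDF under root, including files in nested folders.
    public static IEnumerable<string> FindPdfs(string root)
    {
        return Directory.EnumerateFiles(root, "*.pdf", SearchOption.AllDirectories);
    }

    // Maps a file found under root to the same relative location under outputRoot.
    public static string MirrorPath(string root, string outputRoot, string file)
    {
        // Paths returned by FindPdfs start with the root string that was passed in
        string relative = file.Substring(root.Length)
            .TrimStart(Path.DirectorySeparatorChar, Path.AltDirectorySeparatorChar);
        return Path.Combine(outputRoot, relative);
    }
}
```

Each path from `FindPdfs` could then be opened with `new PdfReader(path)` and processed as in `UpdatePdfLinks`, with the result written to the path returned by `MirrorPath`.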
