I want to extract images from a PDF. I’m using iTextSharp right now. Some

Question

Asked: June 1, 2026 (2026-06-01T15:23:59+00:00)

I want to extract images from a PDF. I’m using iTextSharp right now.
Some images are extracted correctly, but most of them have the wrong colors and are distorted.
I experimented with different PixelFormats, but I didn’t find a solution to my problem…

This is the code that distinguishes the image types:

if (filter == "/FlateDecode")
{
   // ...
   int w = int.Parse(width);
   int h = int.Parse(height);
   int bpp = tg.GetAsNumber(PdfName.BITSPERCOMPONENT).IntValue;

   byte[] rawBytes = PdfReader.GetStreamBytesRaw((PRStream)tg);
   byte[] decodedBytes = PdfReader.FlateDecode(rawBytes);
   byte[] streamBytes = PdfReader.DecodePredictor(decodedBytes, tg.GetAsDict(PdfName.DECODEPARMS));

   PixelFormat[] pixFormats = new PixelFormat[23] { 
         PixelFormat.Format24bppRgb,
         // ... all Pixel Formats
    };
    for (int i = 0; i < pixFormats.Length; i++)
    {
        Program.ToPixelFormat(w, h, pixFormats[i], streamBytes, bpp, images);
    }
}

This is the code that saves the image to a MemoryStream. Saving the image to a folder will be implemented later.

private static void ToPixelFormat(int width, int height, PixelFormat pixelformat, byte[] bytes, int bpp, IList<Image> images)
{
    Bitmap bmp = new Bitmap(width, height, pixelformat);
    BitmapData bmd = bmp.LockBits(new Rectangle(0, 0, width, height),
       ImageLockMode.WriteOnly, pixelformat);
    Marshal.Copy(bytes, 0, bmd.Scan0, bytes.Length);
    bmp.UnlockBits(bmd);
    using (var ms = new MemoryStream())
    {
       bmp.Save(ms, System.Drawing.Imaging.ImageFormat.Tiff);
       bytes = ms.GetBuffer();
    }
    images.Add(bmp);
}

Please help me.

1 Answer

Editorial Team · Answer 1 · 2026-06-01T15:24:00+00:00

I found a solution to my own problem.
To extract all images on all pages, it is not necessary to handle each filter separately.
iTextSharp has an image renderer that saves every image in its original image type.

Just follow the example found here: http://kuujinbo.info/iTextSharp/CCITTFaxDecodeExtract.aspx
You don’t need to implement the HttpHandler…
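
For readers who cannot reach the link, here is a minimal sketch of what such a render listener might look like. It assumes iTextSharp 5.x and its iTextSharp.text.pdf.parser namespace. The names ImageSavingListener and ExtractImages are made up for illustration and are not taken from the linked page. Images that iTextSharp cannot decode are simply skipped here, and that is a choice of this sketch.

```csharp
using System;
using System.IO;
using iTextSharp.text.pdf;
using iTextSharp.text.pdf.parser;

// Hypothetical listener: writes every image the content parser reports to disk.
public class ImageSavingListener : IRenderListener
{
    private readonly string outputDir;
    private int counter;

    public ImageSavingListener(string outputDir)
    {
        this.outputDir = outputDir;
    }

    // Text callbacks are required by the interface but unused here.
    public void BeginTextBlock() { }
    public void EndTextBlock() { }
    public void RenderText(TextRenderInfo renderInfo) { }

    public void RenderImage(ImageRenderInfo renderInfo)
    {
        try
        {
            PdfImageObject image = renderInfo.GetImage();
            if (image == null) return;

            // iTextSharp picks the file type (for example "png" or "jpg"),
            // so no per-filter PixelFormat handling is needed.
            string extension = image.GetFileType();
            byte[] data = image.GetImageAsBytes();

            string name = string.Format("image{0}.{1}", ++counter, extension);
            File.WriteAllBytes(Path.Combine(outputDir, name), data);
        }
        catch (Exception ex)
        {
            // Assumption of this sketch: skip images that cannot be decoded.
            Console.Error.WriteLine("Skipped an image: " + ex.Message);
        }
    }
}

public static class ExtractImages
{
    // Usage (hypothetical): ExtractImages.exe input.pdf outputFolder
    public static void Main(string[] args)
    {
        string pdfPath = args[0];
        string outputDir = args[1];
        Directory.CreateDirectory(outputDir);

        PdfReader reader = new PdfReader(pdfPath);
        try
        {
            var parser = new PdfReaderContentParser(reader);
            var listener = new ImageSavingListener(outputDir);

            // PDF page numbers start at 1.
            for (int page = 1; page <= reader.NumberOfPages; page++)
            {
                parser.ProcessContent(page, listener);
            }
        }
        finally
        {
            reader.Close();
        }
    }
}
```

If a System.Drawing.Image is needed instead of a file, PdfImageObject also offers GetDrawingImage(), which could replace the GetImageAsBytes() call in the sketch.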
