I am successfully able to extract images from a pdf using pdfsharp. The image

Asked: May 27, 2026 (2026-05-27T02:49:05+00:00)

I am able to extract images from a PDF using PDFsharp. The images use the CCITTFaxDecode filter. But in the TIFF image I create, the image comes out rotated. Any idea what might be going wrong?

This is the code I'm using:

byte[] data = xObject.Stream.Value;
Tiff tiff = BitMiracle.LibTiff.Classic.Tiff.Open("D:\\clip_TIFF.tif", "w");
tiff.SetField(TiffTag.IMAGEWIDTH, (uint)(width));
tiff.SetField(TiffTag.IMAGELENGTH, (uint)(height));
tiff.SetField(TiffTag.COMPRESSION, (uint)BitMiracle.LibTiff.Classic.Compression.CCITTFAX4);
tiff.SetField(TiffTag.BITSPERSAMPLE, (uint)(bpp));
tiff.WriteRawStrip(0,data,data.Length);
tiff.Close();

1 Answer

Editorial Team · Answer 1 · 2026-05-27T02:49:06+00:00

Since the question is still tagged with iTextSharp, I might as well add some code, even though it doesn’t look like you’re using the library here. PDF parsing support was added starting in iText[Sharp] 5.

I didn’t have a test PDF with the image type you’re using, but found one here (see the attachment). Here’s a very simple working example in ASP.NET (an HTTP handler, .ashx) that uses that test PDF document to get you going:

<%@ WebHandler Language="C#" Class="CCITTFaxDecodeExtract" %>
using System;
using System.Collections.Generic;
using System.IO;
using System.Web;
using iTextSharp.text;
using iTextSharp.text.pdf;
using iTextSharp.text.pdf.parser;
using Dotnet = System.Drawing.Image;
using System.Drawing.Imaging;

public class CCITTFaxDecodeExtract : IHttpHandler {
  public void ProcessRequest (HttpContext context) {
    HttpServerUtility Server = context.Server;
    HttpResponse Response = context.Response;
    string file = Server.MapPath("~/app_data/CCITTFaxDecode.pdf");
    PdfReader reader = new PdfReader(file);
    PdfReaderContentParser parser = new PdfReaderContentParser(reader);
    MyImageRenderListener listener = new MyImageRenderListener();
    for (int i = 1; i <= reader.NumberOfPages; i++) {
      parser.ProcessContent(i, listener);
    }
    // Release the file handle once all pages have been parsed.
    reader.Close();
    for (int i = 0; i < listener.Images.Count; ++i) {
      string path = Server.MapPath("~/app_data/" + listener.ImageNames[i]);
      using (FileStream fs = new FileStream(
        path, FileMode.Create, FileAccess.Write
      ))
      {
        fs.Write(listener.Images[i], 0, listener.Images[i].Length);
      }
    }         
  }
  public bool IsReusable { get { return false; } }
/*
 * see: TextRenderInfo & RenderListener classes here:
 * http://api.itextpdf.com/itext/
 * 
 * and Google "itextsharp extract images"
 */
  public class MyImageRenderListener : IRenderListener {
    public void RenderText(TextRenderInfo renderInfo) { }
    public void BeginTextBlock() { }
    public void EndTextBlock() { }

    public List<byte[]> Images = new List<byte[]>();
    public List<string> ImageNames = new List<string>();
    public void RenderImage(ImageRenderInfo renderInfo) {
      PdfImageObject image = renderInfo.GetImage();
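      PdfName filter = image.Get(PdfName.FILTER) as PdfName;
      if (filter == null) {
        // /Filter can also be an array of names, or be missing entirely.
        // As in the original loop, use the last entry of the array.
        PdfArray pa = image.Get(PdfName.FILTER) as PdfArray;
        if (pa != null && pa.Size > 0) {
          filter = pa[pa.Size - 1] as PdfName;
        }
      }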
      if (PdfName.CCITTFAXDECODE.Equals(filter)) {
        using (Dotnet dotnetImg = image.GetDrawingImage()) {
          if (dotnetImg != null) {
            ImageNames.Add(string.Format(
              "{0}.tiff", renderInfo.GetRef().Number)
            );
            using (MemoryStream ms = new MemoryStream()) {
              dotnetImg.Save(
              ms, ImageFormat.Tiff);
              Images.Add(ms.ToArray());
            }
          }
        }
      }
    }
  }
}

If the extracted images come out rotated, see this thread on the iText mailing list; perhaps some of the pages in the PDF document have been rotated.
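One way to test the page-rotation theory is to read each page's rotation and apply the same turn to the extracted image before saving it. The following is only a minimal sketch, assuming the rotation is a multiple of 90 degrees and that the extracted bitmap needs the same clockwise turn as the page. RotationHelper and ToRotateFlip are made-up names for illustration, not part of iTextSharp:

```csharp
using System;
using System.Drawing;

// Hypothetical helper, not part of iTextSharp or System.Drawing.
public static class RotationHelper {
  // Maps a page rotation in degrees (clockwise, e.g. the value
  // returned by PdfReader.GetPageRotation) to a GDI+ rotation.
  public static RotateFlipType ToRotateFlip(int pageRotation) {
    // Normalize so negative values and values >= 360 also work.
    int normalized = ((pageRotation % 360) + 360) % 360;
    switch (normalized) {
      case 90:  return RotateFlipType.Rotate90FlipNone;
      case 180: return RotateFlipType.Rotate180FlipNone;
      case 270: return RotateFlipType.Rotate270FlipNone;
      // 0, or a value that is not a multiple of 90: leave the image as-is.
      default:  return RotateFlipType.RotateNoneFlipNone;
    }
  }
}
```

To use it with the listener, you would need to track the current page number yourself (for example by setting a field on the listener before each parser.ProcessContent(i, listener) call), then call dotnetImg.RotateFlip(RotationHelper.ToRotateFlip(reader.GetPageRotation(page))) just before dotnetImg.Save. Whether the turn should be applied as-is or inverted depends on how the PDF was produced, so compare the result against a viewer.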
