The Archive Base

Editorial Team
Asked: June 8, 2026 (2026-06-08T19:57:45+00:00)

1.I need to convert a PDF File into a txt.file. My Command seems to

1. I need to convert a PDF file into a .txt file. My command seems to work, since I get the converted text on the screen, but somehow I'm unable to direct the output into a text file.

public static string[] GetArgs(string inputPath, string outputPath)
{ 
    return new[] {
                "-q", "-dNODISPLAY", "-dSAFER",
                "-dDELAYBIND", "-dWRITESYSTEMDICT", "-dSIMPLE",
                "-c", "save", "-f",
                "ps2ascii.ps", inputPath, "-sDEVICE=txtwrite",
                String.Format("-sOutputFile={0}", outputPath),
                "-c", "quit"
    }; 
}

2. Is there a Unicode-specific .ps?

Update:
Posting my complete code; maybe the error is somewhere else.

public static string[] GetArgs(string inputPath, string outputPath)
{
    return new[]    
    {   "-o c:/test.txt",    
        "-dSIMPLE",
        "-sFONTPATH=c:/windows/fonts",
        "-dNODISPLAY",
        "-dDELAYBIND",
        "-dWRITESYSTEMDICT",
        "-f",
        "C:/Program Files/gs/gs9.05/lib/ps2ascii.ps",               
        inputPath,
    };
}

[DllImport("gsdll64.dll", EntryPoint = "gsapi_new_instance")]
private static extern int CreateAPIInstance(out IntPtr pinstance, IntPtr caller_handle);

[DllImport("gsdll64.dll", EntryPoint = "gsapi_init_with_args")]
private static extern int InitAPI(IntPtr instance, int argc, string[] argv);

[DllImport("gsdll64.dll", EntryPoint = "gsapi_exit")]
private static extern int ExitAPI(IntPtr instance);

[DllImport("gsdll64.dll", EntryPoint = "gsapi_delete_instance")]
private static extern void DeleteAPIInstance(IntPtr instance);

private static object resourceLock = new object();

private static void Cleanup(IntPtr gsInstancePtr)
{
    ExitAPI(gsInstancePtr);
    DeleteAPIInstance(gsInstancePtr);
}

public static void ConvertPdfToText(string inputPath, string outputPath) 
{ 
    CallAPI(GetArgs(inputPath, outputPath));
}

private static void CallAPI(string[] args)      
{       
    // Get a pointer to an instance of the Ghostscript API and run the API with the current arguments       
    IntPtr gsInstancePtr;   
    lock (resourceLock)     
    {           
        CreateAPIInstance(out gsInstancePtr, IntPtr.Zero);      
        try
        {
            int result = InitAPI(gsInstancePtr, args.Length, args);                    
            if (result < 0)     
            {
                throw new ExternalException("Ghostscript conversion error", result);        
            }       
        }           
        finally     
        {               
            Cleanup(gsInstancePtr);     
        }       
    }   
}
1 Answer

  1. Editorial Team
    Added an answer on June 8, 2026 at 7:57 pm

    2 questions, 2 answers:

    1. To get output to a file, use -sOutputFile=/path/to/file on the command line, or add the line

       "-sOutputFile=/where/it/should/go",

      to your C# code (it can be the first argument, but it should come before your first "-c"). But first get rid of the other -sOutputFile stuff you already have in there… 🙂

    2. No, PostScript isn’t aware of Unicode.


    Update

    (Remark: Extracting text from PDF reliably is (for various technical reasons) notoriously difficult. And it may not work at all, whichever tool you try…)

    On the command line, the following two should work for recent releases of Ghostscript (the current version at the time of writing is v9.05). It would be your own job…

    • …to test which command works better for your use case, and
    • …to translate these into c# code.

    1. txtwrite device:

    gswin32c.exe ^
       -o c:/path/to/output.txt ^
       -dTextFormat=3 ^
       -sDEVICE=txtwrite ^
        input.pdf
    

    Notes:

    1. You may want to use gswin64c.exe (if available) if your system is 64-bit.
    2. The -o syntax for the output works only with recent versions of Ghostscript.
    3. The -o syntax also implicitly sets the -dBATCH and -dNOPAUSE parameters.
    4. If your Ghostscript is too old and the -o shorthand doesn’t work, replace it with -dBATCH -dNOPAUSE -sOutputFile=....
    5. Ghostscript can handle forward slashes inside path arguments even on Windows.
    6. The -dTextFormat is by default set to 3 anyway, so it is not required here. ‘Legal’ values for it are:
      • 0 : This outputs XML-escaped Unicode along with info related to the format of the text (position, font name, point size, etc). Intended for developers only.
      • 1 : Same as 0, but will output blocks of text.
      • 2 : This outputs Unicode (UCS2) text with a BOM (Byte Order Mark); tries to approximate the layout of the text in the original document.
      • 3 : (default) Same as 2, but the text is encoded in UTF-8.
    7. The txtwrite device with this -dTextFormat modifier is a rather new addition to Ghostscript, so please report bugs if you find any.
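
As a rough C# translation of the txtwrite command above, the argument array might look like the following sketch. The names TxtWriteArgs and Build are hypothetical. The sketch assumes the array is handed to gsapi_init_with_args the way the question's CallAPI does; in that case the first element only fills the program-name slot, which Ghostscript ignores. It uses the -dBATCH -dNOPAUSE -sOutputFile=... form from note 4 instead of -o, and keeps every switch in its own array element.

```csharp
using System;

public static class TxtWriteArgs
{
    // Hypothetical helper: builds the argv array for the txtwrite method.
    public static string[] Build(string inputPath, string outputPath)
    {
        return new[]
        {
            "gs",                          // argv[0]: placeholder program name, ignored by Ghostscript
            "-dBATCH",                     // together with -dNOPAUSE: what -o would set implicitly
            "-dNOPAUSE",
            "-sDEVICE=txtwrite",
            "-dTextFormat=3",              // 3 is the default (UTF-8); listed only for clarity
            "-sOutputFile=" + outputPath,  // forward slashes are fine on Windows
            inputPath                      // the PDF to read goes last
        };
    }
}
```

Calling TxtWriteArgs.Build("input.pdf", "c:/path/to/output.txt") and passing the result to CallAPI would be the intended use; whether the output is good enough for a given PDF still has to be tested.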

    2. Using ps2ascii.ps

    gswin32c.exe ^
       -sstdout=c:/path/to/output.txt ^
       -dSIMPLE ^
       -sFONTPATH=c:/windows/fonts ^
       -dNODISPLAY ^
       -dDELAYBIND ^
       -dWRITESYSTEMDICT ^
       -f /path/to/ps2ascii.ps ^
        input.pdf
    

    Notes:

    1. This is a completely different method from the txtwrite device one and cannot be mixed with it!
    2. ps2ascii.ps is a file, a PostScript program that Ghostscript invokes to extract the text. It is usually located in the Ghostscript installdir’s /lib subdirectory. Go and see if it is really there.
    3. -dSIMPLE may be replaced by -dCOMPLEX in order to print out extra info lines (current color, presence of an image, rectangular fills).
    4. -sstdout=... is required because the ps2ascii.ps PostScript program does print to stdout only and can’t be told to write to a file. So -sstdout=... tells Ghostscript to redirect its stdout to a file.
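
The ps2ascii.ps command can be carried over to C# in the same way. The sketch below mirrors the switches of the command above one to one; the names Ps2AsciiArgs and Build are hypothetical, the font path is the one from the command, and the location of ps2ascii.ps is left as a parameter because it depends on the installation. As before, it assumes the array goes to gsapi_init_with_args, so the first element is only a placeholder that Ghostscript ignores.

```csharp
using System;

public static class Ps2AsciiArgs
{
    // Hypothetical helper: builds the argv array for the ps2ascii.ps method.
    public static string[] Build(string ps2AsciiPath, string inputPath, string outputPath)
    {
        return new[]
        {
            "gs",                           // argv[0]: placeholder program name, ignored by Ghostscript
            "-sstdout=" + outputPath,       // redirect Ghostscript's stdout into the text file
            "-dSIMPLE",
            "-sFONTPATH=c:/windows/fonts",
            "-dNODISPLAY",
            "-dDELAYBIND",
            "-dWRITESYSTEMDICT",
            "-f", ps2AsciiPath,             // the PostScript program that extracts the text
            inputPath                       // the PDF to read goes last
        };
    }
}
```

Note that no -sDEVICE=txtwrite or -sOutputFile switch appears here, in line with note 1: the two methods are not meant to be mixed.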

    3. Non-Ghostscript methods

    Do not ignore other, non-Ghostscript methods that may be easier to work with. All of the following are cross-platform and should be available on Windows too:

    • mudraw -t
      GPL licensed (or commercial, if you need that). Command-line utility from MuPDF (which is developed by the same group of developers that do Ghostscript) to extract text from PDF.
    • pdftotext
      GPL licensed. Commandline utility from Poppler (which is a fork from XPDF, that also provides a pdftotext).
    • podofotxtextract
      GPL licensed. Command-line utility based on the PoDoFo PDF processing library.
    • TET
      The Text Extraction Toolkit from PDFlib.com (commercial, but may be gratis for personal use — I didn’t check recent news). Probably the most powerful text extraction tool of them all…
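
For the external command-line tools, calling the executable from C# is usually enough. The following is a minimal sketch for pdftotext, assuming the executable can be found on the PATH; the names PdfToTextRunner, BuildStartInfo and Run are hypothetical, and -enc UTF-8 is just one possible choice of output encoding.

```csharp
using System.Diagnostics;

public static class PdfToTextRunner
{
    // Hypothetical helper: describes a "pdftotext -enc UTF-8 <input> <output>" call.
    public static ProcessStartInfo BuildStartInfo(string inputPath, string outputPath)
    {
        return new ProcessStartInfo
        {
            FileName = "pdftotext",        // assumption: pdftotext is on the PATH
            Arguments = "-enc UTF-8 \"" + inputPath + "\" \"" + outputPath + "\"",
            UseShellExecute = false,       // start the executable directly
            CreateNoWindow = true          // no console window pops up
        };
    }

    // Runs the tool, waits for it to finish and returns its exit code.
    public static int Run(string inputPath, string outputPath)
    {
        using (Process process = Process.Start(BuildStartInfo(inputPath, outputPath)))
        {
            process.WaitForExit();
            return process.ExitCode;       // pdftotext reports 0 when there was no error
        }
    }
}
```

The paths are wrapped in double quotes so that directories containing spaces, such as C:/Program Files, survive argument parsing.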