I am using iText 5.3.3.
I am trying to extract text from a PDF file.
My code looks something like this:
File f (....)
FileInputStream fis = new FileInputStream(f);
PdfReader r = new PdfReader(fis);
// Extract the text of page 1 (iText page numbers start at 1)
String s = PdfTextExtractor.getTextFromPage(r, 1);
System.out.print(s);
I get this:
“(…)Singapore Airlines to the crisis caused by the ?rst fatal crash in the history(…)”
for the text:
“(…)Singapore Airlines to the crisis caused by the first fatal crash in the history(…)”
or:
“(…)national carriers and ?nal conclusions suggest the need for(…)”
for the text:
“(…)national carriers and final conclusions suggest the need for(…)”
As you can see, I get “?” instead of “fi”.
Problem solved.
I changed the default encoding for .txt files.
In Eclipse:
Window>>Preferences>>General>>Content Types>>Text
Default encoding: UTF-8
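For anyone who lands here with the same symptom, here is a minimal sketch of what was probably going on. It assumes the PDF stores "fi" as the single ligature character U+FB01, which a single-byte encoding such as ISO-8859-1 cannot represent, so it gets written out as "?". The class and method names (LigatureDemo, roundTrip, expandLigatures) are made up for illustration and are not part of iText; the sample string only stands in for what PdfTextExtractor returned.

```java
import java.io.PrintStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.text.Normalizer;

class LigatureDemo {
    // Encode the text with the given charset and decode it again.
    // This mimics what ends up in a console or file that uses that charset.
    static String roundTrip(String text, Charset cs) {
        return new String(text.getBytes(cs), cs);
    }

    // Optional: replace compatibility characters such as the "fi" ligature
    // (U+FB01) with their plain equivalents ("f" + "i") via NFKC normalization.
    static String expandLigatures(String text) {
        return Normalizer.normalize(text, Normalizer.Form.NFKC);
    }

    public static void main(String[] args) throws Exception {
        // Assumption: the extracted text contains the ligature U+FB01.
        String extracted = "the \uFB01rst fatal crash";

        // ISO-8859-1 has no code for U+FB01, so it is replaced by '?'.
        System.out.println(roundTrip(extracted, StandardCharsets.ISO_8859_1));

        // UTF-8 can represent the ligature, so the text survives unchanged.
        System.out.println(roundTrip(extracted, StandardCharsets.UTF_8).equals(extracted));

        // Printing through an explicit UTF-8 stream avoids depending on the
        // platform default encoding (the console must also display UTF-8).
        PrintStream out = new PrintStream(System.out, true, "UTF-8");
        out.println(expandLigatures(extracted));
    }
}
```

If you want a plain "fi" in the output rather than the ligature, NFKC normalization as in expandLigatures is one option, but keep in mind that it also rewrites other compatibility characters.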