The Archive Base

Editorial Team
Asked: June 12, 2026 (2026-06-12T23:17:26+00:00)


I am using PDFKitten to search for strings within PDF documents and highlight the results. FastPDFKit or any other commercial library is not an option, so I stuck with the library that comes closest to my requirements.

[Screenshot: wrong coordinate]

As you can see in the screenshot, I searched for the string “in”, which is always highlighted correctly except for the last occurrence. I also have a more complex PDF document in which the highlight box for “in” is off by nearly 40%.

I read through the whole source and checked the issue tracker, but apart from line-height problems I found nothing regarding the width calculation. So far I don't see any pattern showing where the calculation goes wrong, and I hope that someone else has run into a similar problem.

My current guess is that the coordinates and character widths are calculated incorrectly somewhere in the font classes or in RenderingState.m. The project is very complex, so maybe one of you has had a similar problem with PDFKitten in the past.

I have used the original sample PDF document from PDFKitten for my screenshot.

1 Answer

  • Voted
  • Oldest
  • Recent
  • Random
  1. Editorial Team
    Added an answer on June 12, 2026 at 11:17 pm

    This might be a bug in how PDFKitten calculates the width of characters whose character identifier (CID) does not coincide with their Unicode code point.

    appendPDFString in StringDetector works with two strings when processing some string data:

    // Use CID string for font-related computations.
    NSString *cidString = [font stringWithPDFString:string];
    
    // Use Unicode string to compare with user input.
    NSString *unicodeString = [[font stringWithPDFString:string] lowercaseString];
    

    stringWithPDFString in Font transforms the sequence of character identifiers in its argument into a Unicode string.

    Thus, in spite of its name, cidString is not a sequence of character identifiers but a sequence of Unicode characters. Nonetheless, its entries are passed to didScanCharacter, which Scanner implements to advance the position by the character width: it hands the value to widthOfCharacter in Font to determine the character width, and that method (according to its comment, “Width of the given character (CID) scaled to fontsize”) expects its argument to be a character identifier.

    So, if the CID and the Unicode code point don’t coincide, the wrong character width is determined and the position of every following character cannot be trusted. In the case at hand, the /fi ligature has a CID of 12, which is very different from its Unicode code point 0xfb01.
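
To make the drift concrete, here is a minimal sketch in plain C (which also compiles as Objective-C). It does not use PDFKitten's classes: the Glyph record, the widths, the helper names, and the assumption that 'n' has a CID equal to its Unicode value are all made up for illustration. Only the /fi values (CID 12, U+FB01) come from the case described above.

```c
/* Hypothetical glyph record; the widths (1/1000 text-space units) are invented. */
typedef struct {
    unsigned cid;      /* character identifier used inside the PDF font */
    unsigned unicode;  /* code point the CID maps to */
    int width;
} Glyph;

/* "fi" ligature (CID 12, U+FB01, from the case above) followed by 'n',
   whose CID is assumed to equal its Unicode value. */
static const Glyph kRun[] = {
    { 12,  0xFB01, 556 },
    { 110, 0x006E, 500 },
};
enum { kRunLength = sizeof kRun / sizeof kRun[0] };

/* Width lookup keyed by CID, as the widthOfCharacter comment describes.
   Returning 0 for an unknown key is an assumption of this sketch. */
static int width_for_cid(unsigned cid) {
    for (int i = 0; i < kRunLength; i++) {
        if (kRun[i].cid == cid) return kRun[i].width;
    }
    return 0;
}

/* Cursor advance when the lookup receives CIDs (the intended key). */
static int advance_using_cids(void) {
    int x = 0;
    for (int i = 0; i < kRunLength; i++) x += width_for_cid(kRun[i].cid);
    return x;
}

/* Cursor advance when the lookup receives Unicode values instead. */
static int advance_using_unicode(void) {
    int x = 0;
    for (int i = 0; i < kRunLength; i++) x += width_for_cid(kRun[i].unicode);
    return x;
}
```

With these made-up numbers, advance_using_cids() returns 1056 while advance_using_unicode() returns 500. The 'n' is measured correctly either way, but the ligature's width is lost, so everything after it ends up shifted.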

    I would propose enhancing PDFKitten so that StringDetector also defines a didScanCID method, which appendPDFString should call alongside didScanCharacter for each processed character, passing its CID. Scanner should then use this new method, instead of didScanCharacter, to calculate the width by which it advances its cursor.
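
The shape of that proposal could look roughly like the following sketch, written in plain C (which also compiles as Objective-C) rather than against PDFKitten's real classes. Every name here (DetectorDelegate, ScannerState, append_pdf_string, width_of_cid) and every width is hypothetical; the point is only that the Unicode value and the CID travel through separate callbacks.

```c
/* Hypothetical delegate with two callbacks, mirroring the proposal:
   one receives the Unicode character (for matching the search term),
   the other receives the CID (for width computation). */
typedef struct {
    void (*did_scan_character)(void *ctx, unsigned unicode);
    void (*did_scan_cid)(void *ctx, unsigned cid);
    void *ctx;
} DetectorDelegate;

/* Minimal stand-in for the scanner's state. */
typedef struct {
    int cursor;             /* horizontal position in 1/1000 units */
    unsigned last_unicode;  /* last character seen, for matching only */
} ScannerState;

/* Invented widths keyed by CID: 556 for CID 12, 500 for anything else. */
static int width_of_cid(unsigned cid) {
    return cid == 12 ? 556 : 500;
}

/* The character callback no longer moves the cursor. */
static void scanner_did_scan_character(void *ctx, unsigned unicode) {
    ((ScannerState *)ctx)->last_unicode = unicode;
}

/* The CID callback advances the cursor by the CID-keyed width. */
static void scanner_did_scan_cid(void *ctx, unsigned cid) {
    ((ScannerState *)ctx)->cursor += width_of_cid(cid);
}

/* Stand-in for appendPDFString: reports both values for each character. */
static void append_pdf_string(const DetectorDelegate *d,
                              const unsigned *cids,
                              const unsigned *unicodes,
                              int count) {
    for (int i = 0; i < count; i++) {
        d->did_scan_character(d->ctx, unicodes[i]);
        d->did_scan_cid(d->ctx, cids[i]);
    }
}
```

Splitting the callbacks keeps the search comparison on Unicode text while the cursor arithmetic only ever sees CIDs, so a mapping such as CID 12 to U+FB01 can no longer leak into the width lookup.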

    This should be triple-checked first, though. Maybe some widthOfCharacter implementations (there are different ones for different font types) expect the argument to be a Unicode code point after all, in spite of the comment…

    (Sorry if I used the wrong vocabulary here or there, I’m a Java guy… :))
