I copied and pasted text from a PDF file but it didn’t extract the

Question

Asked: June 17, 2026

I copied and pasted text from a PDF file, but the numbers were not extracted. If I view the exported txt file with less or more, I see the following:

"Christina, daughter of David Brodie, on <U+F735> November <U+F731><U+F736><U+F736><U+F735>. She was the sister of"

It should read:

“Christina, daughter of David Brodie, on 5 November 1665. She was the sister of”

Initially I thought it would be a simple search and replace, but the <U+F73n> numbers are encoded, and I’m not sure how to extract them or even how they’re encoded, although I did save the file as UTF-8 originally. I tried to use PHP’s mbstring functions to see if I could extract the codes in some way, but I haven’t been successful.

Has anyone else come across this problem and is there a simple solution that has eluded me?

1 Answer

Editorial Team · Answer 1 · 2026-06-17T18:10:08+00:00

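Unfortunately, the code points in your sample (U+F731, U+F735, U+F736) are in the Private Use Area of Unicode (U+E000 to U+F8FF), so the standard assigns them no meaning. There is no automatic way to fix this, short of knowing the mapping ahead of time. Based on the code points in your sample, where U+F731 stands for 1, U+F735 for 5 and U+F736 for 6, I would venture to say that you could subtract 0xF730 from the character values and then add 0x30 to convert them to ASCII digits.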
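
As a rough illustration, here is a minimal Python sketch of that conversion. It takes the sample at face value: since U+F735 should read 5 and U+F731 should read 1, it assumes the ten digits sit at U+F730 through U+F739. That range is a guess from three observed code points, not a documented mapping, so check it against your own file. The names `PUA_DIGIT_BASE` and `fix_pua_digits` are made up for this example.

```python
# Assumption (inferred from the sample only): U+F730..U+F739 stand for '0'..'9'.
PUA_DIGIT_BASE = 0xF730

# Translation table for str.translate: code point -> replacement code point.
PUA_TO_ASCII = {PUA_DIGIT_BASE + d: ord("0") + d for d in range(10)}


def fix_pua_digits(text: str) -> str:
    """Replace the assumed private-use digit code points with ASCII digits."""
    return text.translate(PUA_TO_ASCII)


if __name__ == "__main__":
    sample = (
        "Christina, daughter of David Brodie, on \uf735 November "
        "\uf731\uf736\uf736\uf735. She was the sister of"
    )
    print(fix_pua_digits(sample))
```

To run this over the exported file, read it with `open(path, encoding="utf-8")`, pass the contents through `fix_pua_digits`, and write the result back out. Characters outside the assumed range are left untouched, so any other private-use code points will still need their own mapping.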
