The Archive Base Latest Questions

Editorial Team
Asked: May 28, 2026

Interview Question

I was asked this question in an interview. The answer doesn’t have to be specific to any programming language, platform, or tool.

The question was phrased as follows:

How would you get the instance count of a given word in a PDF? The answer doesn’t have to be programming-language-, platform-, or tool-specific. Just let me know how you would do it in a memory- and speed-efficient way.

I am posting this question for the following reasons:

  1. To better understand the context – I still fail to understand the context of this question. What might the interviewer be looking for by asking it?
  2. To get diverse opinions – I tend to answer such questions based on my skills in one programming language (C#), but there might be other valid ways to get this done.

Thanks for your interest.

1 Answer

  1. Editorial Team
    Added an answer on May 28, 2026 at 11:09 am

    If I had to write a program to do it, I’d find a PDF rendering library capable of extracting text from PDF files, such as Xpdf, and then count the words.
    If this was a one-off task, or something that needed to be automated for a non-production-quality task, I’d just feed the file into the pdftotext program and then parse the output with Python, splitting it into words, putting them in a dictionary, and counting the number of occurrences.
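
The one-off approach could look roughly like the sketch below. It assumes the pdftotext command-line tool is installed and on the PATH, and it streams the tool's output line by line so the full text is never held in memory. The function names and the simple letters-only definition of a "word" are illustrative assumptions, not part of the original answer; words hyphenated across line breaks are not rejoined here.

```python
import re
import subprocess
from collections import Counter

# Assumption: a "word" is a run of letters; digits and punctuation split words.
WORD_RE = re.compile(r"[^\W\d_]+")


def count_words(lines):
    """Count case-insensitive word occurrences in an iterable of text lines."""
    counts = Counter()
    for line in lines:
        counts.update(m.group(0).lower() for m in WORD_RE.finditer(line))
    return counts


def count_words_in_pdf(pdf_path):
    """Run pdftotext and count words from its output as it streams in."""
    # "-" as the output file asks pdftotext to write the text to stdout.
    proc = subprocess.Popen(
        ["pdftotext", "-enc", "UTF-8", pdf_path, "-"],
        stdout=subprocess.PIPE,
        encoding="utf-8",
        errors="replace",
    )
    with proc.stdout:
        counts = count_words(proc.stdout)
    if proc.wait() != 0:
        raise RuntimeError(f"pdftotext failed for {pdf_path}")
    return counts
```

To get the count for a single word, look it up in lowercase, for example `count_words_in_pdf("report.pdf")["invoice"]` (the file name is a placeholder). A `Counter` returns 0 for words that never appear.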

    If I were asking this interview question, I’d be looking for a couple of things:

    1. understanding how the setting changes the task:
      a one-off script vs. production code
    2. not attempting to
      implement a PDF renderer yourself, and finding a library
      instead.

    Now, I wouldn’t expect this from any random candidate with no PDF experience, but you can have a very meaningful discussion about what a PDF is and what a “word” is. You see, a PDF stores text as a bunch of strings with coordinates. Each string is not necessarily a word. Oftentimes, a word is split into several completely separate strings that are absolutely positioned in the document so that together they render as a single word. This is why searching for words in a PDF document sometimes gives strange-looking results. So to implement word searching in a document, you’d have to glue these strings back together (pdftotext takes care of that for you).
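
To make the gluing step concrete, here is a simplified sketch of the idea. It is not how pdftotext actually does it. The `Fragment` type, the tolerance values, and the assumption that each fragment carries a left edge, baseline, and width are all hypothetical; real PDF text extraction also has to deal with fonts, rotation, and columns.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    """Hypothetical positioned string, as a PDF library might report it."""
    x: float      # left edge
    y: float      # baseline (PDF y grows upward, so larger y is higher on the page)
    width: float  # rendered width of the text
    text: str


def glue_fragments(fragments, gap_tolerance=1.0, line_tolerance=2.0):
    """Rebuild lines of text from fragments, top to bottom, left to right."""
    # Group fragments whose baselines are close enough to be the same line.
    rows = []
    for frag in sorted(fragments, key=lambda f: -f.y):
        if rows and abs(rows[-1][0].y - frag.y) <= line_tolerance:
            rows[-1].append(frag)
        else:
            rows.append([frag])

    lines = []
    for row in rows:
        row.sort(key=lambda f: f.x)
        parts = [row[0].text]
        for prev, cur in zip(row, row[1:]):
            gap = cur.x - (prev.x + prev.width)
            # A tiny gap means the same word continues; a larger one is a space.
            parts.append(cur.text if gap <= gap_tolerance else " " + cur.text)
        lines.append("".join(parts))
    return lines
```

The resulting lines can then be split into words and counted like any other text. The tolerances would need tuning per document, which is one reason to lean on an existing library for production use.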

    It’s not a bad question at all.

