The Archive Base

Editorial Team
Asked: June 5, 2026 (2026-06-05T12:13:41+00:00)

I was recently asked a theoretical C question and I was wondering what the best way to approach it would be:

If I had a document with 10 words in it, what would be the best way to determine whether it contains duplicate words and, if it does, how would I keep track of how many there are?

Any insight on how you would approach this would be great.


1 Answer

  1. Editorial Team
    Added an answer on June 5, 2026 at 12:13 pm

    Theoretical interview questions like this usually deal in small numbers (like 10 words). However, the number means nothing; it is there to separate the candidates who can think about the problem in its general form from those who simply regurgitate fixed answers to fixed interview questions they found on the internet.

    The best software houses favour solutions that scale. You will therefore gain top marks in an interview if your answer is simple but also scales to any size of problem (or, in this case, document). That rules out loops inside loops that compare every word with every other word, which take O(n^2) time. Sorting the words and then scanning for adjacent duplicates is better at O(n log n), but it still does more work than the problem needs. Presenting only a quadratic solution at a leading-edge software company is unlikely to get you through the interview.

    This particular question checks whether you know about hash tables. An efficient solution to the problem can be written in pseudo-code as follows:

    1. Initialise a new hash table.
       For each word in the document...
    2.     Generate a hash key for the word.
    3.     Lookup the word in the hash table using the key. If it is found,
    4.         Increment the count for the word.
           Otherwise,
    5.         Store the new word in the table and set its count to one.
    

    The most important benefit of the above solution is that only a single scan of the document is required. There is no need to read every word into a list and process it afterwards (two scans), no loops inside loops (many scans), and no sorting (O(n log n) comparisons). After exactly one pass over the document, reading out the entries in the hash table gives the number of times each word appeared. Any word with a count greater than one is a duplicate. The table stores each distinct word once, so memory use grows with the number of distinct words rather than with the length of the document.
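
The pseudo-code above could be sketched in C roughly as follows. This is only an illustration under stated assumptions, not the answer's own implementation: the names (`struct table`, `table_add`, `table_count`, `count_words`, `print_duplicates`), the fixed bucket count, the 63-character word limit, the case-insensitive splitting on non-alphanumeric characters, and the choice of FNV-1a as the hash function are all hypothetical choices made for the example.

```c
#include <assert.h>
#include <ctype.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define NBUCKETS 1024 /* assumed fixed size; a real table would grow with load */

/* One entry: a word, its count, and the next entry in the same bucket. */
struct entry {
    char *word;
    unsigned count;
    struct entry *next;
};

/* Step 1: a zero-initialised table is an empty table. */
struct table {
    struct entry *buckets[NBUCKETS];
};

/* Step 2: hash the word (32-bit FNV-1a, chosen only for illustration). */
uint32_t hash_word(const char *word)
{
    uint32_t h = 2166136261u;
    for (; *word; word++) {
        h ^= (unsigned char)*word;
        h *= 16777619u;
    }
    return h;
}

/* Steps 3-5: look the word up, then increment its count or insert it.
   Returns the word's count after the update, or 0 if allocation failed. */
unsigned table_add(struct table *t, const char *word)
{
    assert(t != NULL && word != NULL);
    struct entry **slot = &t->buckets[hash_word(word) % NBUCKETS];
    for (struct entry *e = *slot; e != NULL; e = e->next)
        if (strcmp(e->word, word) == 0)
            return ++e->count;              /* step 4 */
    struct entry *e = malloc(sizeof *e);
    if (e == NULL)
        return 0;
    e->word = malloc(strlen(word) + 1);
    if (e->word == NULL) {
        free(e);
        return 0;
    }
    strcpy(e->word, word);
    e->count = 1;                           /* step 5 */
    e->next = *slot;
    *slot = e;
    return 1;
}

/* Returns the stored count for word, or 0 if it was never added. */
unsigned table_count(const struct table *t, const char *word)
{
    const struct entry *e = t->buckets[hash_word(word) % NBUCKETS];
    for (; e != NULL; e = e->next)
        if (strcmp(e->word, word) == 0)
            return e->count;
    return 0;
}

/* Single pass over the text: lower-case each run of alphanumeric
   characters and add it to the table as one word. */
void count_words(struct table *t, const char *text)
{
    char word[64]; /* assumed limit; longer words are truncated */
    size_t len = 0;
    for (;; text++) {
        if (isalnum((unsigned char)*text)) {
            if (len < sizeof word - 1)
                word[len++] = (char)tolower((unsigned char)*text);
        } else {
            if (len > 0) {
                word[len] = '\0';
                table_add(t, word);
                len = 0;
            }
            if (*text == '\0')
                break;
        }
    }
}

/* Print every word seen more than once; return how many there are. */
unsigned print_duplicates(const struct table *t)
{
    unsigned dups = 0;
    for (size_t i = 0; i < NBUCKETS; i++)
        for (const struct entry *e = t->buckets[i]; e != NULL; e = e->next)
            if (e->count > 1) {
                printf("%s: %u\n", e->word, e->count);
                dups++;
            }
    return dups;
}
```

To use it, declare `struct table t = {0};`, call `count_words(&t, text)` on the document text, and then call `print_duplicates(&t)`. For brevity the sketch never frees its entries and silently skips a word if allocation fails; a fuller version would handle both.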

    The secret to this solution is its use of a hash table. Generating the hash key (step 2), looking up the key (step 3), and storing the key (step 5) can all be implemented as near constant-time operations on average. Hashing a word takes time proportional to the word's length, but none of these steps depends noticeably on how many words the table already holds, provided the hash function spreads words evenly and the table is not allowed to overfill. Whether it is the 10th word in a document or the 10 millionth, inserting that word into the hash table (or looking it up) takes roughly the same very small amount of time. We additionally keep a count of each word's frequency, set in step 5 and updated in step 4; incrementing a value is a very cheap fixed-time operation.
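
Lookups only stay near constant-time if the chains in each bucket stay short as the number of stored words grows. One common way to arrange that is to grow the bucket array whenever the table fills up. The sketch below shows the idea. Every name in it (`struct map`, `fnv1a`, `map_grow`, `map_increment`) is hypothetical, and the doubling policy and the FNV-1a hash are assumptions made for the example, not something the answer prescribes.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct node {
    char *key;          /* the word, owned by the node */
    unsigned count;
    struct node *next;
};

struct map {
    struct node **buckets;
    size_t nbuckets;    /* zero or a power of two in this sketch */
    size_t nkeys;       /* number of distinct words stored */
};

/* 32-bit FNV-1a: the work is proportional to the word's length,
   not to the number of words already in the map. */
uint32_t fnv1a(const char *s)
{
    uint32_t h = 2166136261u;
    for (; *s; s++) {
        h ^= (unsigned char)*s;
        h *= 16777619u;
    }
    return h;
}

/* Double the bucket array and relink every node into its new bucket.
   Returns 1 on success, 0 if allocation failed (map left unchanged). */
int map_grow(struct map *m)
{
    size_t n = m->nbuckets ? m->nbuckets * 2 : 8;
    struct node **fresh = calloc(n, sizeof *fresh);
    if (fresh == NULL)
        return 0;
    for (size_t i = 0; i < m->nbuckets; i++) {
        struct node *e = m->buckets[i];
        while (e != NULL) {
            struct node *next = e->next;
            size_t j = fnv1a(e->key) & (n - 1);
            e->next = fresh[j];
            fresh[j] = e;
            e = next;
        }
    }
    free(m->buckets);
    m->buckets = fresh;
    m->nbuckets = n;
    return 1;
}

/* Increment the word's count, inserting it first if needed.
   Returns the new count, or 0 if allocation failed. */
unsigned map_increment(struct map *m, const char *word)
{
    assert(m != NULL && word != NULL);
    /* Grow before the average chain length (nkeys / nbuckets) exceeds 1. */
    if (m->nkeys >= m->nbuckets && !map_grow(m))
        return 0;
    size_t i = fnv1a(word) & (m->nbuckets - 1);
    for (struct node *e = m->buckets[i]; e != NULL; e = e->next)
        if (strcmp(e->key, word) == 0)
            return ++e->count;
    struct node *e = malloc(sizeof *e);
    if (e == NULL)
        return 0;
    e->key = malloc(strlen(word) + 1);
    if (e->key == NULL) {
        free(e);
        return 0;
    }
    strcpy(e->key, word);
    e->count = 1;
    e->next = m->buckets[i];
    m->buckets[i] = e;
    m->nkeys++;
    return 1;
}
```

A zero-initialised `struct map` is an empty map, so `struct map m = {0};` followed by repeated calls to `map_increment(&m, word)` is enough to count words. Each doubling relinks every stored word, but because the array doubles, that cost averages out to a constant amount per insertion. Freeing the map is omitted here for brevity.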

    Any solution to this problem must scan all words in the document at least once. As our solution processes each word exactly once, and each word takes approximately the same very small constant time to process on average, the solution is asymptotically optimal and scales linearly, giving expected O(n) performance (put simply, processing 1,000,000 words will take around 1000 times longer than processing 1000 words). In all, it is a scalable and efficient solution to the problem.

© 2021 The Archive Base. All Rights Reserved