The Archive Base

Editorial Team
Asked: June 9, 2026

I was tasked with creating a word frequency analysis program that reads the content


I was tasked with creating a word frequency analysis program that reads the content from a text file, and produces the following example output:

SUMMARY:

27340 words
2572 unique words

WORD FREQUENCIES (TOP 10):

the 1644
and  872
to  729
a  632
it  595
she  553
i 545
of  514
said 462
you 411

I attempted to create a program to achieve such an output. I’m very new to C programming, so although it works to a certain extent, there are probably a lot of efficiency issues / flaws. Here is what I wrote so far:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
#define MAX_WORD 32
#define MAX_TEXT_LENGTH 10000

// ===========================================
//                 STRUCTURE
//============================================


typedef struct word {
char *str;              /* Stores the word */
int freq;               /* Stores the frequency */
struct word *pNext;     /* Pointer to the next word counter in the list */
} Word;

// ===========================================
//             FUNCTION PROTOTYPES
//============================================

int getNextWord(FILE *fp, char *buf, int bufsize);   /* Given function to get words */
void addWord(char *pWord);                          /* Adds a word to the list or updates an existing word */
void show(Word *pWordcounter);        /* Outputs a word and its count of occurrences */
Word* createWordCounter(char *word);  /* Creates a new WordCounter structure */

// ===========================================
//             GLOBAL VARIABLES
//============================================

Word *pStart = NULL;                  /* Pointer to first word counter in the list */

int totalcount = 0;                  /* Total amount of words */
int uniquecount = 0;                /* Amount of unique words */



// ===========================================
//                 MAIN
//============================================      


int main () {

    /* File pointer */
    FILE * fp;
    /* Read text from here */
    fp = fopen("./test.txt","r");
    if (fp == NULL) {
        perror("./test.txt");
        return EXIT_FAILURE;
    }

    /* buf to hold the words */
    char buf[MAX_WORD];

    /* Size of the word buffer (must match buf, not the length of the text) */
    int size = MAX_WORD;


    /* Pointer to Word counter */
    Word *pCounter = NULL;


    /* Read all words from text file */

    while (getNextWord(fp, buf, size)) {

        /* Add the word to the list */
        addWord(buf); 

        /* Increment the total words counter */
        totalcount++;
    }


    /* Loop through list and figure out the number of unique words */
    pCounter = pStart;
    while(pCounter != NULL)
    {
        uniquecount++;
        pCounter = pCounter->pNext;
    }

    /* Print Summary */

    printf("\nSUMMARY:\n\n");
    printf("   %d words\n", totalcount); /* Print total words */
    printf("   %d unique words\n", uniquecount); /* Print unique words */




    /* List the words and their counts */
    pCounter = pStart;
    while(pCounter != NULL)
    {
        show(pCounter);
        pCounter = pCounter->pNext;
    }
    printf("\n");


    /* Free the allocated  memory*/
    pCounter = pStart;
    while(pCounter != NULL)
    {
        free(pCounter->str);        
        pStart = pCounter;           
        pCounter = pCounter->pNext;  
        free(pStart);                  
    }

    /* Close file */
    fclose(fp);

    return 0;

}


// ===========================================
//                 FUNCTIONS
//============================================


void show(Word *pWordcounter)
{
  /* output the word and its count */
  printf("\n%-30s   %5d", pWordcounter->str,pWordcounter->freq);

}

void addWord(char *word)
{
  Word *pCounter = NULL;
  Word *pLast = NULL;

  if(pStart == NULL)
  {
    pStart = createWordCounter(word);
    return;
  }

  /* If the word is in the list, increment its count */
  pCounter = pStart;
  while(pCounter != NULL)
  {
    if(strcmp(word, pCounter->str) == 0)
    {
      ++pCounter->freq;

      return;
    }
    pLast = pCounter;            
    pCounter = pCounter->pNext;  
  }

  /* Word is not in the list, add it */
  pLast->pNext = createWordCounter(word);
}

Word* createWordCounter(char *word)
{
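  Word *pCounter = malloc(sizeof *pCounter);
  if(pCounter == NULL)
  {
    fprintf(stderr, "Out of memory\n");
    exit(EXIT_FAILURE);
  }
  pCounter->str = malloc(strlen(word)+1);
  if(pCounter->str == NULL)
  {
    fprintf(stderr, "Out of memory\n");
    exit(EXIT_FAILURE);
  }
  strcpy(pCounter->str, word);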
  pCounter->freq = 1;
  pCounter->pNext = NULL;
  return pCounter;
}

int getNextWord(FILE *fp, char *buf, int bufsize) {
    char *p = buf;
    int c;   /* int, not char, so EOF can be told apart from valid characters */

    //skip all non-word characters
    do {
        c = fgetc(fp);
        if (c == EOF)
            return 0;
    } while (!isalpha(c));

    //read word chars
    do {
        if (p - buf < bufsize - 1)
            *p++ = tolower(c);
        c = fgetc(fp);
    } while (isalpha(c));

    //finalize word
    *p = '\0';
    return 1;
}

It displays the summary correctly. The amount of words and unique words is completely correct. It then lists every single unique word found in the file and displays the correct number of occurrences.

What I need to do now (and what I’m having a lot of trouble with) is sort my linked list by number of occurrences in descending order. On top of that, it should display only the top 10 words and not all of them (this should be doable once I have the linked list sorted).

I know the code itself is very inefficient, but my primary concern right now is just to get the correct output.

If anybody can help me out with a sorting algorithm, or at least point me in the right direction it would be greatly appreciated.

Thank you.

1 Answer

  1. Editorial Team
    Added an answer on June 9, 2026 at 3:29 am

    This idea might be a little ambitious for a beginning C programmer, but it is always worth knowing what the standard library offers. You already know how big your linked list is (uniquecount), so you can use malloc to allocate an array with that many elements and copy the list's data into it. Then you can call qsort with a comparison function that orders entries by freq, highest first, and print only the first 10 elements of the sorted array.

    Both malloc and qsort are declared in <stdlib.h>, which your program already includes.
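The approach above could be sketched roughly as follows. This is only one way to do it, under a few assumptions: the Word struct is the one from the question, the helper names sortedByFreq, showTop and compareByFreqDesc are made up for illustration, and the array holds pointers to the existing list nodes rather than copies of them (a small variation that avoids duplicating the strings). Note that qsort is not guaranteed to be stable, so words with equal counts may come out in any order.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Same node layout as the Word struct in the question. */
typedef struct word {
    char *str;
    int freq;
    struct word *pNext;
} Word;

/* qsort comparator (hypothetical name): the array elements are Word
   pointers, and a higher freq sorts first (descending order). */
static int compareByFreqDesc(const void *a, const void *b)
{
    const Word *wa = *(const Word *const *)a;
    const Word *wb = *(const Word *const *)b;
    return (wb->freq > wa->freq) - (wb->freq < wa->freq);
}

/* Hypothetical helper: returns a malloc'd array of pointers to the list
   nodes, sorted by descending freq. Returns NULL for an empty list or if
   allocation fails. The caller frees the array, not the nodes. */
Word **sortedByFreq(Word *head, size_t count)
{
    if (count == 0)
        return NULL;

    Word **arr = malloc(count * sizeof *arr);
    if (arr == NULL)
        return NULL;

    size_t i = 0;
    for (Word *p = head; p != NULL && i < count; p = p->pNext)
        arr[i++] = p;
    assert(i == count);   /* count must match the list length */

    qsort(arr, count, sizeof *arr, compareByFreqDesc);
    return arr;
}

/* Hypothetical helper: prints at most `limit` of the most frequent words. */
void showTop(Word *head, size_t count, size_t limit)
{
    Word **arr = sortedByFreq(head, count);
    if (arr == NULL)
        return;

    if (limit > count)
        limit = count;
    for (size_t i = 0; i < limit; i++)
        printf("%-30s   %5d\n", arr[i]->str, arr[i]->freq);

    free(arr);
}
```

In the question's main, the loop that calls show on every node could then be replaced by something like showTop(pStart, uniquecount, 10). The linked list itself is left untouched, so the existing cleanup loop that frees the nodes still works as before.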

© 2021 The Archive Base. All Rights Reserved