My program reads the file named by its first command-line argument and prints each word together with how often it occurs in the file.
The program works for this file: http://www.cse.yorku.ca/course/3221/dataset1.txt
but not for this file: http://www.cse.yorku.ca/course/3221/dataset2.txt.
For the second file it fails with a "Segmentation fault (core dumped)" error.
What could be wrong? Please help!
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

typedef struct {
    char word[101];
    int freq;
} WordArray;

int main(int argc, char *argv[])
{
    WordArray *array = malloc(sizeof(WordArray));
    FILE *file;
    int i = 0;
    file = fopen(argv[1], "r");
    char *str = (char*) malloc (108);
    while(fgets(str, 100, file) != NULL)
    {
        int pos = 0;
        char *word = malloc (100);
        while (sscanf(str, "%s%n", word, &pos ) == 1)
        {
            int j;
            for (j = 0; j < i; j++)
            {
                if (strcmp(array[j].word, word) == 0)
                {
                    array[j].freq = array[j].freq + 1;
                    break;
                }
            }
            if (j==i)
            {
                array = (WordArray *) realloc (array, sizeof(WordArray) * (i+1));
                strcpy(array[i].word, word);
                array[i].freq = 1;
                i++;
            }
            str += pos;
        }
    }
    fclose(file);
    int k;
    for (k=0; k<i; k++)
    {
        printf("%s %d\n", array[k].word, array[k].freq);
    }
    return 0;
}
Several problems:
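You advance str inside the inner loop (str += pos) and never reset it to the start of the buffer. Each fgets call therefore writes further into the 108-byte allocation and eventually past its end, which is undefined behaviour and would explain why a longer input crashes while a short one appears to work. Keep the original pointer and walk the line with a separate cursor instead.
You never free word, and a new 100-byte block is allocated for every line. It is better to declare it once outside the loop, ideally as an array on the stack. The leak by itself won't cause a crash unless your input is huge and you run out of memory.
You don't need to cast the result of malloc or realloc in C: void * converts implicitly to any object pointer type (the cast was only needed with pre-ANSI compilers).
You may want to check the results of malloc and realloc for safety, and likewise fopen, which returns NULL if the file cannot be opened.
I suspect the first item is what causes the crash.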
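To make the points above concrete, here is a minimal sketch of the per-line counting step with those changes applied. The helper name count_words and its signature are my own invention for illustration, not part of the original program; it assumes words of at most 100 characters (longer tokens are split) and that *array starts out as NULL or as a pointer previously returned by malloc/realloc.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    char word[101];
    int freq;
} WordArray;

/* Hypothetical helper (not in the original program): counts every
 * whitespace-separated token in `line`, growing *array as needed.
 * Returns 0 on success, -1 if realloc fails. */
int count_words(const char *line, WordArray **array, int *count)
{
    char word[101];             /* stack buffer, so there is nothing to free */
    const char *cursor = line;  /* advance this, never the buffer pointer */
    int pos = 0;

    /* %100s caps each token at 100 characters plus the terminator. */
    while (sscanf(cursor, "%100s%n", word, &pos) == 1) {
        int j;
        for (j = 0; j < *count; j++) {
            if (strcmp((*array)[j].word, word) == 0) {
                (*array)[j].freq++;
                break;
            }
        }
        if (j == *count) {
            /* Keep the old pointer until realloc is known to have succeeded. */
            WordArray *grown = realloc(*array, sizeof **array * (size_t)(*count + 1));
            if (grown == NULL)
                return -1;
            *array = grown;
            strcpy(grown[*count].word, word);
            grown[*count].freq = 1;
            (*count)++;
        }
        cursor += pos;  /* skip past the token just consumed */
    }
    return 0;
}
```

In main you would then read each line into a fixed buffer, for example char line[256] filled by fgets(line, sizeof line, file), and pass it to count_words with array initialised to NULL and the count to 0. Because only the local cursor moves, every fgets call writes to the start of the same buffer.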