I’m reading a file in my C program and comparing every word in it with my word, which is entered via command line argument. But I get crashes, and I can’t understand what’s wrong. How do I track such errors? What is wrong in my case?
My compiler is clang. The code compiles fine. When running it says ‘segmentation fault’.
Here is the code.
#include <stdio.h>
#include <string.h>
int main(int argc, char* argv[])
{
char* temp = argv[1];
char* word = strcat(temp, "\n");
char* c = "abc";
FILE *input = fopen("/usr/share/dict/words", "r");
while (strcmp(word, c))
{
char* duh = fgets(c, 20, input);
printf("%s", duh);
}
if (!strcmp (word, c))
{
printf("FOUND IT!\n");
printf("%s\n%s", word, c);
}
fclose(input);
}
The issue here is that you are trying to treat strings in C as you might in another language (like C++ or Java), in which they are resizable vectors that you can easily append or read an arbitrary amount of data into.
C strings are much lower level. They are simply an array of characters (or a pointer to such an array; arrays can be treated like pointers to their first element in C anyhow), and the string is treated as all of the characters within that array up to the first null character. These arrays are fixed size; if you want a string of an arbitrary size, you need to allocate it yourself using
malloc(), or allocate it on the stack with the size that you would like.One thing here that is a little confusing is you are using a non-standard type
string. Given the context, I’m assuming that’s coming from yourcs50.h, and is just a typedef tochar *. It will probably reduce confusion if you actually usechar *instead ofstring; using a typedef obscures what’s really going on.Let’s start with the first problem.
strcat()appends the second string onto the first; it starts from the null terminator of the first string, and replaces that with the first character of the second string, and so on, until it reaches a null in the second string. In order for this to work, the buffer containing the first string needs to have enough room to fit the second one. If it does not, you may overwrite arbitrary other memory, which could cause your program to crash or have all kinds of other unexpected behavior.Here’s an illustration. Let’s say that
argv[1]contains the wordhello, and the buffer has exactly as much space as it needs for this. After it is some other data; I’ve filled inotherfor the sake of example, though it won’t actually be that, it could be anything, and it may or may not be important:Now if you use
strcat()to append"\n", you will get:You can see that we’ve overwritten the
otherdata that was afterhello. This may cause all kinds of problems. To fix this, you need to copy yourargv[1]into a new string, that has enough room for it plus one more character (and don’t forget the trailing null). You can callstrlen()to get the length of the string, then add 1 for the\n, and one for the trailing null, to get the length that you need.Actually, instead of trying to add a
\nto the word you get in from the command line, I would recommend stripping off the\nfrom your input words, or usingstrncmp()to compare all but the last character (the\n). In general, it’s best in C to avoid appending strings, as appending strings means you need to allocate memory and copy things around, and it can be easy to make mistakes doing so, as well as being inefficient. Higher level languages usually take care of the details for you, making it easier to append strings, though still just as inefficient.After your edit, you changed this to:
However, this has the same problem. A
char *is a pointer to a character array. Yourtempvariable is just copying the pointer, not the actual value; it is still pointing to the same buffer. Here’s an illustration; I’m making up addresses for the purposes of demonstration, in the real machine there will be more objects in between these things, but this should suffice for the purpose of demonstration.Now when you create your
tempvariable, all you are doing is copyingargv[1]into a newchar *:As a side note, you also shouldn’t ever try to access
argv[1]without checking thatargcis greater than 1. If someone doesn’t pass any arguments in, thenargv[1]itself is invalid to access.I’ll move on to the next problem.
Here, you are referring to the static string
"abc". A string that appears literally in the source, like"abc", goes into a special, read-only part of the memory of the program. Remember what I said;stringhere is just a way of sayingchar *. Socis actually just a pointer into this read-only section of memory; and it has only enough room to store the characters that you provided in the text (4, forabcand the null character terminating the string).fgets()takes as its first argument a place to store the string that it is reading, and its second the amount of space that it has. So you are trying to read up to 20 bytes, into a read-only buffer that only has room for 4.You need to either allocate space for reading on the stack, using, for example:
Or dynamically, using
malloc():