Okay, so my simple project is supposed to find every occurrence of a particular string in a .txt file. The search is case-sensitive, and occurrences inside other words count too.
For example, if the word is “the”:
valid finds include:
the apple = 1;
the thespian = 2;
invalid finds include:
fourth elephant (there is a space between “th” and “e”)
The apple (capitalization differs)
And if the word IS found in a line of the file, I’m supposed to print out the line ONCE.
If it is not found, I’m not supposed to print it at all.
So, for example, one run of my program should output:
Searching for 'the' in file 'test.txt'
2 : that they do not use permeates the [C++] language. Another example
3 : will further illustrate this influence. Imagine that an integer
5 : What bit value should be moved into the topmost position? If we
6 : look at the machine level, architectural designers are divided on
8 : the most significant bit position, while on other machines the sign
9 : bit (which, in the case of a negative number, will be 1) is extended.
10 : Either case can be simulated by the other, using software, by means
# occurrences of 'the' = 13
Unfortunately, I’m getting
Searching for 'the' in the file 'test.txt'
2: that they do not use permeates the [C++] language. Another example
3: will further illustrate this influence. Imagine that an integer
5: What bit value should be moved into the topmost position? If we
6: look at the machine level, architectural designers are divided on
8: the most significant bit position, while on other machines the sign
9: bit (which, in the case of a negative number, will be 1) is extended.
10: Either case can be simulated by the other, using software, by means
11: of a combination of tests and masks.
12:
# occurrences of 'the' = 15
I am NOT understanding why it thinks it found a “the” in lines 11 and 12.
Here is my code:
#include <iostream>
#include <fstream>
#include <string>
#include <cstring>
using namespace std;
int main(int argc, char* argv[]){
    //a char pointer is a c-string
    //argv is just an array of char pointers
    //argv[0] = pointer to the program name
    //argv[1] = pointer to the word to search for
    //argv[2] = pointer to the file name, so three args in total
    if (argc == 3){
        int wordCounter = 0;
        ifstream myFile(argv[2]);
        if (!myFile){
            cout << "File '" << argv[2] << "' could not be opened" << endl;
            return 1;
        }
        else {
            //counts the number of lines in file
            int counter = 0;
            //holds the new line in the file
            char line[100];
            //points to the word to search for
            const char * word = argv[1];
            //holds whether found word
            bool found = false;
            cout << "Searching for '" << word << "' in the file '" << argv[2] << "'" << endl;
            //number of chars in a line
            int numChar = 0;
            //saves every line
            while (!(myFile.getline(line, 100)).eof()) {
                //starts every new line at not having found the word
                found = false;
                //read in new line, so increases line counter
                counter ++;
                numChar = 0;
                //find length of line
                for (int i = 0; line[i] != '\n' && i < 101; i++){
                    numChar++;
                }
                //finds how many times the key word appears in one line
                //checks up to a few before the end of the line for the word
                if (numChar >= strlen(argv[1])){
                    for (int i = 0; i < numChar - strlen(argv[1]); i++){
                        //if the current line letter equals the first letter of the key word
                        if (line[i] == word[0]){
                            //continue looking forward to see if the rest of it match
                            for (int j = 0; j < strlen(argv[1]); j++){
                                //if word doesn't match break
                                if (word[j] != line [i+j]){
                                    break;
                                }
                                //if matches all the way to end, add counter
                                if(j == strlen(argv[1]) - 1){
                                    wordCounter++;
                                    found = true;
                                }
                            }//end 2ndfor
                        }
                    }//end 1stfor
                    //if the key word has been found, print the line
                    if (found){
                        cout << counter << ": " << line << endl;
                    }
                }
            }//endwhile
            cout << "# occurrences of '" << word << "' = " << wordCounter << endl;
            myFile.close();
        }//end else
    }//end if
    return 0;
}//end main
The reason your programme believes there is a "the" in lines 11 and 12 is that you check for a newline (which isn’t in the buffer, by the way, since getline discards it), but not for the terminating 0. So you scan the entire 100 characters, one more actually, since you also read the nonexistent line[100], and count the "the"s left over from previous lines. Adding a test for the terminating 0 to the loop condition, and keeping the index within the valid range 0 to 99, should fix that.
Check the validity of the index first to avoid undefined behaviour due to invalid memory accesses.
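As a minimal sketch of that fix (the helper names lineLength and countOccurrences are my own and do not appear in your code), the length loop can test the index before it touches the buffer and stop at the terminating 0. The search then only looks at characters that belong to the current line:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: length of the text stored in `line`, never reading
// past `capacity` characters. The index is tested first, so line[i] is only
// evaluated while i is a valid position in the buffer.
std::size_t lineLength(const char* line, std::size_t capacity) {
    std::size_t i = 0;
    while (i < capacity && line[i] != '\0' && line[i] != '\n') {
        ++i;
    }
    return i;
}

// Hypothetical helper: counts case-sensitive occurrences of `word` within
// the first `length` characters of `line`, including occurrences inside
// other words and one that ends exactly at the end of the line.
std::size_t countOccurrences(const char* line, std::size_t length,
                             const char* word) {
    assert(line != nullptr && word != nullptr);
    const std::size_t wordLength = std::strlen(word);
    if (wordLength == 0 || wordLength > length) {
        return 0;
    }
    std::size_t count = 0;
    // i + wordLength <= length keeps every comparison inside the line.
    for (std::size_t i = 0; i + wordLength <= length; ++i) {
        if (std::strncmp(line + i, word, wordLength) == 0) {
            ++count;
        }
    }
    return count;
}
```

In your loop you could call lineLength(line, sizeof line) after each getline and add countOccurrences(line, numChar, word) to wordCounter. std::strlen(line) would give the same length, because getline always null-terminates the buffer.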