I want to read some text files in a folder line by line. For example, one of the .txt files contains:
Fast and Effective Text Mining Using Linear-time Document Clustering
Bjornar Larsen WORD2 Chinatsu Aone
SRA International AK, Inc.
4300 Fair Lakes Cow-l Fairfax, VA 22033
{bjornar-larsen, WORD1
I want to remove every line that does not contain any of the words (word1, word2, word3) and does not end with a dot (.).
So, from the example, the result will be:
Bjornar Larsen WORD2 Chinatsu Aone
SRA International, Inc.
{bjornar-larsen, WORD1
I am confused about how to remove the line. Is that possible? Or can we replace it with a space?
Here's the code:
$url = glob($savePath.'*.txt');
foreach ($url as $file => $files) {
    $handle = fopen($files, "r") or die ('can not open file');
    $ori_content = file_get_contents($files);
    foreach (preg_split("/((\r?\n)|(\r\n?))/", $ori_content) as $buffer) {
        $pos1 = stripos($buffer, $word1);
        $pos2 = stripos($buffer, $word2);
        $pos3 = stripos($buffer, $word3);
        $last = $buffer[strlen($buffer)-1]; // read the last character
        if (true !== $pos1 OR true !== $pos2 OR true !== $pos3 && $last != '.') {
            // how to remove
        }
    }
}
please help me, thank you so much 🙂
You’re using a !== true comparison to test the return value of stripos. !== true means “is not strictly equal to the boolean value true”. The return value of stripos is numeric, unless the word doesn’t exist, in which case it’s false. It is never the boolean true, so each of those comparisons is always true and your condition passes for every line.
Try updating it to use === false instead. Also, you’re using OR in between each; your example shows that a line only needs to contain one of the words, so if you’re checking that “none of them were found”, you’ll need to use && for everything.
Regarding “how to remove the line”, you’ll need to keep a list of all lines you want to keep. This means we’ll actually want to flip the condition above to use !== false and an || between everything (because we want to keep all lines that match any rule), and append each matching line to an array such as $linesToKeep.
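As a rough illustration of the "keep list" idea, here is a minimal sketch. The function name filterLines and the keyword values passed to it are assumptions made for this example, not part of the original code; it keeps a line when it contains any keyword or ends with a dot, and drops everything else.

```php
<?php
// Hypothetical helper (name chosen for this sketch): returns the lines that
// contain at least one keyword (case-insensitive) or end with a dot.
function filterLines(string $content, array $words): array
{
    $linesToKeep = [];

    // Split on Windows (\r\n), old Mac (\r) and Unix (\n) line endings.
    foreach (preg_split('/\r\n|\r|\n/', $content) as $line) {
        $trimmed = rtrim($line);

        // substr() avoids an out-of-range index on empty lines.
        $endsWithDot = substr($trimmed, -1) === '.';

        $hasWord = false;
        foreach ($words as $word) {
            // stripos() returns an int position or false, so test against false.
            if (stripos($trimmed, $word) !== false) {
                $hasWord = true;
                break;
            }
        }

        // Keep the line if ANY rule matches.
        if ($hasWord || $endsWithDot) {
            $linesToKeep[] = $line;
        }
    }

    return $linesToKeep;
}
```

Inside the loop over glob() results, this could be called as filterLines(file_get_contents($files), [$word1, $word2, $word3]), assuming the three keyword variables hold non-empty strings.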
Now you’ll have every line that matches your ruleset in the $linesToKeep array. You can convert this back to a string with $lines = join("\r\n", $linesToKeep); or you can iterate through it and process it however you’d like.
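If the goal is to save the filtered text, one possible follow-up is sketched below. The sample array and the temporary output file are assumptions for illustration only; in the real loop you would probably write to a path of your own choosing instead.

```php
<?php
// Sample data standing in for the array built while filtering (assumed values).
$linesToKeep = [
    'Bjornar Larsen WORD2 Chinatsu Aone',
    'SRA International AK, Inc.',
    '{bjornar-larsen, WORD1',
];

// implode() is the function that join() aliases; both glue the array
// elements together with the given separator.
$lines = implode("\r\n", $linesToKeep);

// Write the result to a temporary file (hypothetical target for this sketch).
$outFile = tempnam(sys_get_temp_dir(), 'filtered_');
file_put_contents($outFile, $lines);
```

Writing to a separate file rather than overwriting the original keeps the source text intact while you check that the filter behaves as expected.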