I’m trying to create a batch that creates a fileC.txt containing all lines in fileA.txt except for those that contains the strings in the lines in fileB.txt:
Pseudo:
foreach(line L in fileA.txt)
excluded = false
foreach(string str in fileB.txt)
if L contains str
exclude = true
if !excluded
add L to fileC.txt
if L !contains
For example
fileA.txt: (all)
this\here\is\a\line.wav
and\this\is\another.wav
i\am\a\chocolate.wav
peanut\butter\jelly\time.wav
fileB.txt: (those to be excluded)
another.wav
time.wav
fileC.txt: (wanted result)
this\here\is\a\line.wav
i\am\a\chocolate.wav
I’ve been fiddling around with FINDSTR but I just can’t seem to puzzle it together.. any help or pointers greatly appreciated!
Cheers!
/ Fredde
The answer should be this simple:
And with your example, the above does give the correct results.
But there is a nasty FINDSTR bug that makes it unreliable when using multiple case sensitive literal search strings. See Why doesn't this FINDSTR example with multiple literal search strings find a match?, as well as the answer that goes with it. For a “complete” list of undocumented FINDSTR features and bugs, see What are the undocumented features and limitations of the Windows FINDSTR command?.
So the simple code above can fail depending on the content of the files. If you can get away with using a case insensitive search, then the solution is simple.
Edit: Both versions above will fail if fileB.txt contains
\\or\". In order to work properly, those strings must be escaped as\\\and\\"But if you must use a case sensitive search, then there is no simple solution. Your best bet for a pure batch solution might be to use the /R regular expression option. But then you will have to create a modified version of fileB.txt where all regex meta-characters are escaped so that the strings give the correct literal search. That is a mini project in and of itself.
Perhaps your best option for a case sensitive solution is to get a 3rd party tool like grep or sed for Windows.
Edit: Here is a reasonably performing pure batch solution that is nearly bullet proof
I looked into doing something like the proposed logic in your question. But using batch to read all lines in a file is relatively slow. This solution only reads the exclude file line by line. It uses FINDSTR to read the lines in “fileA.txt” repeatedly, once per search string. This is a much faster algorithm for a batch file.
The traditional method to read a file is to use a FOR /F loop, but there is another technique using SET /P that is faster, and it is safe to use with delayed expansion. The only limitations to this method are:
<CR><LF>as is the Windows standard. It will not work with unix style lines terminated by<LF>The search strings must have each
\and"escaped as\\and\"when they are used with the /C option.