Task in CMD.
1) How can I compare if string is in string? I checked manual here for “Boolean Test “does string exist ?”” But I can’t understand the example or it does not work for me. This piece of code, it is just a try. I try to make a string compare of filter some sting if there is a tag <a> in a line.
FOR /f "tokens=* delims= usebackq" %%c in ("%source%") DO (
echo %%c
IF %%c == "<a" (pause)
)
So while I read a file, it should be paused if there is a link on a line.
2) I have one more ask. I would need to filter the line if there is a specific file in the link, and get content of the link. My original idea was to try to use findstr with regex, but it seems not to use sub-patterns. And next problem would be how to get the result to variable.
set "pdf=0_1_en.pdf"
type "%source%" | grep "%pdf%" | findstr /r /c:"%pdf%.*>(.*).*</a>"
So in summary, I want to go through file and if there is a link like this: REPAIRED: *
<a href="/Dokumenter/dsweb/Get/Document-408/EK_GEN_0_1_en.pdf" class="uline"><b>GEN 0.1 Preface</b></a>
- I forgot to style this as a code, so the inside of code was not displayed. Sorry.
- Warnning: we don’t know the path, only the basic filename.
Get the title GEN 0.1 Preface. But you should know, that there are also similar links with same link, which contain image, not a text inside a tag.
Code according Aacini to be changed a little bit:
@echo off
setlocal EnableDelayedExpansion
set "source=GEN 0 GENERAL.html"
set "pdf=0_1_en.pdf"
echo In file:%source%
echo Look for anchor:%pdf%
rem Process each line in %source% file:
for /F "usebackq delims=" %%c in ("%source%") do (
set "line=%%c"
rem Test if the line contain a "tag" that start with "<a" string:
set "tag=!line:*<a=!"
if not "!tag!" == "!line!" (
rem Take the string in tag that end in ">"
for /F "delims=^>" %%a in ("!tag!") do set "link=%%a"
echo Link found: !link!
if "!link!" == "GEN 0.1 Preface" echo Seeked link found
)
)
pause
Still not finished
Although your question is extensive it does not provide to much details, so I assumed several points because I don’t know too much about .PDF files, tags, etc.
Any missing point can be added or fixed, if you give me precise details about them.
EDIT: I fixed the program above after last indications from the OP. I used $ character to get the Title value; if this character may exist in original Tag, it must be changed by another unused one.
I tested this program with this “GEN 0 GENERAL.html” example file:
and get this result:
EDIT: New faster method added
There is a simpler and faster method to solve this problem that, however, may fail if a line contains more than one tag: