I have a text file of catalog names (one per line) and I need to loop through that list, taking one name at a time, downloading the corresponding HTML page, and extracting the “item_id” that appears on the page.
The item ID is displayed like this in the HTML: ?item_id=55963573">.
This is what I have so far:
#!/bin/sh
for productID in (catIDs.txt) #I know this part is not correct
do
wget -q -U Mozilla "http://www.example.com/$productID/" -O - \
| tr '"' '\n' | grep "^item_id" | cut -d ' ' -f 4 >> itemIDs.txt
sleep 15
done
This should work:
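A minimal sketch, assuming the IDs are numeric and that `grep -o` is available (it is in GNU and BSD grep, though POSIX does not require it). A `while read` loop takes one name per line from catIDs.txt, which is what the `for ... in (catIDs.txt)` line was reaching for. The original `grep "^item_id"` would likely match nothing, because the value appears as `?item_id=...` and so does not start a line after the `tr` split; here the ID is matched directly instead. The function names `extract_item_ids` and `fetch_item_ids` are made up for illustration.

```sh
#!/bin/sh
# Sketch: read catalog names from a file, fetch each page, collect item IDs.
# Assumption: IDs are numeric and appear as item_id=<digits> in the HTML.

# Print every item_id value found in the HTML read from stdin.
extract_item_ids() {
    grep -o 'item_id=[0-9][0-9]*' | cut -d '=' -f 2
}

# Fetch the page for each catalog name listed in file $1 (one per line).
fetch_item_ids() {
    # The "|| [ -n ... ]" part keeps a last line that lacks a trailing newline.
    while IFS= read -r productID || [ -n "$productID" ]; do
        [ -n "$productID" ] || continue   # skip blank lines
        wget -q -U Mozilla "http://www.example.com/$productID/" -O - | extract_item_ids
        sleep 15                          # pause between requests, as in the question
    done < "$1"
}

# Run only when the input file is present.
if [ -r catIDs.txt ]; then
    fetch_item_ids catIDs.txt >> itemIDs.txt
fi
```

If a page contains the same ID more than once, it will be written more than once; piping the result through `sort -u` afterwards is one way to deduplicate.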