I'm trying to loop through a text file containing a list of URLs and have my Python script parse each URL in the file.
The code only processes the LAST line in the file, when it should process every line and append the results to the file.
I have no idea what to do; I'd appreciate your help.
Thanks!
import feedparser # pip install feedparser
from BeautifulSoup import BeautifulStoneSoup
from BeautifulSoup import BeautifulSoup
import re
urls = open("c:/a2.txt", "r") # file with rss urls
for lines in urls:
    d = feedparser.parse(lines) # feedparser is supposed to process every url in the file(urls)
statusupdate = d.entries[0].description
soup = BeautifulStoneSoup(statusupdate)
for e in d.entries:
    print(e.title)
    print(e.link)
    print(soup.find("img")["src"])
    print("\n") # 2 newlines
    # writes title,link,image to a file and adds some characters
    f = open(r'c:\a.txt', 'a')
    f.writelines('"')
    f.writelines(e.title)
    f.writelines('"')
    f.writelines(",")
    f.writelines('"')
    f.writelines(e.link)
    f.writelines('"')
    f.writelines(",")
    f.writelines('"')
    f.writelines(soup.find("img")["src"])
    f.writelines('"')
    f.writelines(",")
    f.writelines("\n")
    f.close()
The `for lines in urls:` loop simply keeps going, and each pass reassigns the variable `d`. That means that when the loop is finished, `d` holds only the values associated with the last line.
If you wish to process every line, you need to do something with every value of `d` inside the loop. For example, you could put every `d.entries[0].description` in a list and then iterate over that list to process it.
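Here is a minimal sketch of the list approach. The helper name `collect_descriptions` and the stand-in `fake_parse` are made up for illustration so the example runs without a network connection; they are not part of feedparser. The sketch assumes the parser returns an object with an `.entries` list, as `feedparser.parse` does.

```python
from types import SimpleNamespace


def collect_descriptions(urls, parse):
    """Parse every URL and keep the first entry's description from each feed.

    `parse` is any callable returning an object with an `.entries` list,
    for example `feedparser.parse`. It is passed in here only so that the
    sketch can be run with a stand-in parser.
    """
    descriptions = []
    for line in urls:
        url = line.strip()          # drop the trailing newline read from the file
        if not url:
            continue                # skip blank lines
        d = parse(url)
        if d.entries:               # a feed may have no entries
            # Use d inside the loop, before the next iteration overwrites it.
            descriptions.append(d.entries[0].description)
    return descriptions


# Stand-in parser for demonstration only (hypothetical, not part of feedparser).
def fake_parse(url):
    entry = SimpleNamespace(description="description for " + url)
    return SimpleNamespace(entries=[entry])


lines = ["http://example.com/a.xml\n", "\n", "http://example.com/b.xml\n"]
result = collect_descriptions(lines, fake_parse)
for description in result:
    print(description)
```

With the real library you would presumably call something like `collect_descriptions(open("c:/a2.txt"), feedparser.parse)`. Alternatively, you can skip the list and indent the rest of your script (the `statusupdate`, `soup`, and `for e in d.entries:` parts) under `for lines in urls:` so that it runs once per feed.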