So I’ve written a small script to download pictures from a website. It goes through a 7-character alphanumeric value, where the first character is always a digit. The problem is that if I stop the script and start it up again, I have to start all over.
Can I seed itertools.product somehow with the last value I got, so I don’t have to go through them all again?
Thanks for any input.
Here is part of the code:
numbers = '0123456789'
alnum = numbers + 'abcdefghijklmnopqrstuvwxyz'
len7 = itertools.product(numbers, alnum, alnum, alnum, alnum, alnum, alnum)  # length 7
for p in itertools.chain(len7):
    currentid = ''.join(p)
    # semi-static vars
    url = 'http://mysite.com/images/'
    url += currentid
    # Need to get the real URL because of the redirect
    print "Trying " + url
    req = urllib2.Request(url)
    res = openaurl(req)
    if res == "continue":
        continue
    finalurl = res.geturl()
    # OK, we have the full URL; now check whether it is real
    try:
        file = urllib2.urlopen(finalurl)
    except urllib2.HTTPError, e:
        print e.code
        continue  # nothing to read if the request failed
    im = cStringIO.StringIO(file.read())
    img = Image.open(im)
    writeimage(img)
here’s a solution based on pypy’s library code (thanks to agf’s suggestion in the comments).
the state is available via the .state attribute and can be reset via .goto(state) where state is an index into the sequence (starting at 0). there’s a demo at the end (you need to scroll down, i’m afraid).
this is way faster than discarding values.
you should test it more – i may have made a dumb mistake – but the idea is quite simple, so you should be able to fix it :o) you’re free to use my changes; no idea what the original pypy licence is.
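to make the interface concrete, here is a minimal sketch of an iterator with a .state index and a .goto(state) method. it is written from scratch for illustration and is not the pypy-derived code; the class name ResumableProduct and the 3-character ids in the demo are assumptions chosen to keep the example small.

```python
class ResumableProduct(object):
    """Cartesian product of `pools` that can jump to any position.

    Hypothetical illustration only, not the pypy-derived implementation.
    """

    def __init__(self, *pools):
        self.pools = [tuple(pool) for pool in pools]
        self.total = 1
        for pool in self.pools:
            self.total *= len(pool)
        self.state = 0  # index of the next tuple to be produced

    def goto(self, state):
        """Make tuple number `state` (0-based) the next one produced."""
        if not 0 <= state <= self.total:
            raise ValueError("state out of range")
        self.state = state

    def __iter__(self):
        return self

    def __next__(self):
        if self.state >= self.total:
            raise StopIteration
        index = self.state
        self.state += 1
        # Decode the index as a mixed-radix number. The last pool varies
        # fastest, which matches the order used by itertools.product.
        result = []
        for pool in reversed(self.pools):
            index, digit = divmod(index, len(pool))
            result.append(pool[digit])
        return tuple(reversed(result))

    next = __next__  # Python 2 spelling of the iterator protocol


numbers = '0123456789'
alnum = numbers + 'abcdefghijklmnopqrstuvwxyz'

# Shortened to 3 characters for the demo.
ids = ResumableProduct(numbers, alnum, alnum)
first = [''.join(next(ids)) for _ in range(3)]  # '000', '001', '002'
saved = ids.state                               # 3

# Later (for example after a restart): jump straight to the saved position.
fresh = ResumableProduct(numbers, alnum, alnum)
fresh.goto(saved)
resumed = ''.join(next(fresh))                  # '003'
```

jumping costs the same as producing a single value, because nothing before the saved index is ever generated.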
also state isn’t really the full state – it doesn’t include the original arguments – it’s just an index into the sequence. maybe it would have been better to call it index, but there are already indici[sic]es in the code…
update
here’s a simpler version that is the same idea but works by transforming a sequence of numbers. so you just imap it over count(n) to get the sequence offset by n.
(the downside here is that if you want to stop and restart you need to have kept track yourself of how many you have used)
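a rough sketch of that idea, assuming nothing beyond the standard library: nth_product turns a number into the matching tuple, and product_from applies it to count(start). the names nth_product, product_from and product_index are made up for this example. a generator is used instead of imap so the same code runs on python 2 and 3. product_index is an extra, not part of the description above: it goes the other way, from the last id you saw back to its number, which is one way around having to keep count yourself.

```python
from itertools import count


def nth_product(index, *pools):
    """Return element number `index` (0-based) of itertools.product(*pools)."""
    if index < 0:
        raise IndexError("index must be non-negative")
    result = []
    # The last pool varies fastest, so peel digits off from the right.
    for pool in reversed(pools):
        index, digit = divmod(index, len(pool))
        result.append(pool[digit])
    if index:
        raise IndexError("index past the end of the product")
    return tuple(reversed(result))


def product_from(start, *pools):
    """Yield (index, tuple) pairs of the product, beginning at `start`."""
    total = 1
    for pool in pools:
        total *= len(pool)
    for index in count(start):
        if index >= total:
            return
        yield index, nth_product(index, *pools)


def product_index(item, *pools):
    """Inverse of nth_product: the position of `item` in the product."""
    index = 0
    for value, pool in zip(item, pools):
        index = index * len(pool) + pool.index(value)
    return index


numbers = '0123456789'
alnum = numbers + 'abcdefghijklmnopqrstuvwxyz'

# Shortened to 3 characters for the demo; suppose '0zz' was the last id tried.
last_id = '0zz'
start = product_index(last_id, numbers, alnum, alnum) + 1
index, p = next(product_from(start, numbers, alnum, alnum))
next_id = ''.join(p)  # '100'
```

since product_from hands back the index with each tuple, the script could write that number to a file after every download and pass it back in as start on the next run.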