I’ve been teaching myself Python at my new job, and really enjoying the language. I’ve written a short class to do some basic data manipulation, and I’m pretty confident about it.
But old habits from my structured/modular programming days are hard to break, and I know there must be a better way to write this. So, I was wondering if anyone would like to take a look at the following, and suggest some possible improvements, or put me on to a resource that could help me discover those for myself.
A quick note: The RandomItems root class was written by someone else, and I’m still wrapping my head around the itertools library. Also, this isn’t the entire module – just the class I’m working on, and its prerequisites.
What do you think?
import itertools
import urllib2
import random
import string
class RandomItems(object):
    """This is the root class for the randomizer subclasses. These
    are used to generate arbitrary content for each of the fields
    in a csv file data row. The purpose is to automatically generate
    content that can be used as functional testing fixture data.
    """
    def __iter__(self):
        while True:
            yield self.next()

    def slice(self, times):
        return itertools.islice(self, times)
class RandomWords(RandomItems):
    """Obtain a list of random real words from the internet, place them
    in an iterable list object, and provide a method for retrieving
    a subset of length 1-n, of random words from the root list.
    """
    def __init__(self):
        urls = [
            "http://dictionary-thesaurus.com/wordlists/Nouns%285,449%29.txt",
            "http://dictionary-thesaurus.com/wordlists/Verbs%284,874%29.txt",
            "http://dictionary-thesaurus.com/wordlists/Adjectives%2850%29.txt",
            "http://dictionary-thesaurus.com/wordlists/Adjectives%28929%29.txt",
            "http://dictionary-thesaurus.com/wordlists/DescriptiveActionWords%2835%29.txt",
            "http://dictionary-thesaurus.com/wordlists/WordsThatDescribe%2886%29.txt",
            "http://dictionary-thesaurus.com/wordlists/DescriptiveWords%2886%29.txt",
            "http://dictionary-thesaurus.com/wordlists/WordsFunToUse%28100%29.txt",
            "http://dictionary-thesaurus.com/wordlists/Materials%2847%29.txt",
            "http://dictionary-thesaurus.com/wordlists/NewsSubjects%28197%29.txt",
            "http://dictionary-thesaurus.com/wordlists/Skills%28341%29.txt",
            "http://dictionary-thesaurus.com/wordlists/TechnicalManualWords%281495%29.txt",
            "http://dictionary-thesaurus.com/wordlists/GRE_WordList%281264%29.txt"
        ]
        self._words = []
        for url in urls:
            urlresp = urllib2.urlopen(urllib2.Request(url))
            self._words.extend([word for word in urlresp.read().split("\r\n")])
        self._words = list(set(self._words))  # Removes duplicates
        self._words.sort()  # sorts the list

    def next(self):
        """Return a single random word from the list
        """
        return random.choice(self._words)

    def get(self):
        """Return the entire list, if needed.
        """
        return self._words

    def wordcount(self):
        """Return the total number of words in the list
        """
        return len(self._words)

    def sublist(self, size=3):
        """Return a random segment of _size_ length. The default is 3 words.
        """
        segment = []
        for i in range(size):
            segment.append(self.next())
        #printable = " ".join(segment)
        return segment

    def random_name(self):
        """Return a string-formatted list of 3 random words.
        """
        words = self.sublist()
        return "%s %s %s" % (words[0], words[1], words[2])
def main():
    """Just to see it work...
    """
    wl = RandomWords()
    print wl.wordcount()
    print wl.next()
    print wl.sublist()
    print 'Three Word Name = %s' % wl.random_name()
    #print wl.get()

if __name__ == "__main__":
    main()
Here are my five cents:
- [...] __init__.
- There's random.sample, it does what your next() and sublist() do, but it's prepackaged. Note that it samples without replacement, so a word never repeats within one result.
- Implement __iter__ (define the method in your class) and you can get rid of RandomItems. You can read more about it in the docs (note: they are for Py3K, so some of it may not be relevant for lower versions). You could use yield for this, which, as you may or may not know, creates a generator, thus wasting little to no memory.
- random_name could use str.join instead. Note that you may need to convert the values if they are not guaranteed to be strings. This can be done with [str(x) for x in iterable] or the built-in map.
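To make these suggestions concrete, here is a rough sketch of how the class might look with them applied. It is only an illustration, not a drop-in replacement: the words argument to the constructor is my own assumption (it keeps the example offline instead of downloading the lists), and the sample words are made up. Also keep in mind that random.sample never repeats a word within one result, unlike the original sublist().

```python
import itertools
import random


class RandomWords(object):
    """Simplified stand-in for the reviewed class.

    Assumption: the word list is passed in rather than downloaded,
    so this sketch runs without network access.
    """

    def __init__(self, words):
        # Deduplicate and sort, as the original constructor does.
        self._words = sorted(set(words))

    def __iter__(self):
        # Generator: yields one random word at a time, indefinitely,
        # so no base class is needed to make the object iterable.
        while True:
            yield random.choice(self._words)

    def sublist(self, size=3):
        # random.sample picks `size` distinct words (no repeats).
        return random.sample(self._words, size)

    def random_name(self, size=3):
        # str.join replaces the hard-coded "%s %s %s" format string;
        # map(str, ...) guards against non-string items.
        return " ".join(map(str, self.sublist(size)))


# Hypothetical sample data, for illustration only.
wl = RandomWords(["apple", "brisk", "cobalt", "drift", "ember"])
first_four = list(itertools.islice(wl, 4))  # bound the endless iterator
name = wl.random_name()
```

Because the iterator never ends, it always needs to be bounded with something like itertools.islice, which is what the original slice() method does.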