So this is one of the first programs I’ve written in Python. I’m trying to take a string and output every arrangement of its letters that is a real word. It works (though I still need to find a reference file that contains more words), but it doesn’t scale: with more than 8 characters, Python takes a really long time to return anything.
import itertools

def lower_and_remove_spaces(fill_string):
    '''
    function takes a string and returns only its alphabetic characters,
    lowercased.
    '''
    lower_string = ''
    for i in fill_string:
        if i.isalpha():
            lower_string += i.lower()
    return lower_string

def fill_list(input_string):
    '''
    function takes a string of 2 or more characters and returns all the
    permutations (of length 2 and up) that the characters can make.
    '''
    iter_list = []
    string_list = []
    this_string = lower_and_remove_spaces(input_string)
    for num in range(2, len(this_string) + 1):
        iter_list.append(itertools.permutations(this_string, num))
    for iters in iter_list:
        for lists in iters:
            string_list.append(list(lists))
    return string_list

def word_list(string):
    string_list = fill_list(string)
    a_word_list = []
    for i in string_list:
        a_string = ''
        for y in i:
            a_string += y
        # append inside the loop so the last permutation is not dropped
        a_word_list.append(a_string)
    return a_word_list
I understand this jumps around a lot, but I’m wondering: what’s a better way to do this so that it scales?
Some quick ideas: generating all permutations is going to be O(n!); there’s no way around this. Even if you optimize your code, you’ll still hit a wall as n grows. If you have a dictionary of valid words, the problem is a bit different. For a pathological input (your dictionary contains every permutation) you still can’t do any better than this.
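To put a number on that wall, here is a small sketch that counts how many candidate strings the brute-force approach has to produce for n letters (every ordering of every length from 2 up to n). The helper name `candidate_count` is made up for this illustration.

```python
from math import factorial

def candidate_count(n, min_len=2):
    """Count the tuples itertools.permutations yields for lengths min_len..n."""
    # permutations(seq, k) over n items yields n! / (n - k)! tuples
    return sum(factorial(n) // factorial(n - k) for k in range(min_len, n + 1))

for n in (4, 8, 10, 12):
    print(n, candidate_count(n))
```

For 8 letters this is already 109,592 candidates, and every additional letter multiplies the total by roughly the new length.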
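However, you can do the following: load the dictionary into a prefix tree, then build candidate strings one letter at a time and abandon any partial string that is not a prefix of a real word.
In practice this performs much better than O(n!), because most branches are cut off after only a few letters.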
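One way to get that kind of improvement is to prune against a prefix tree of the valid words. The following is only a sketch under some assumptions: the tree is a nested dict, the key `'$'` is an arbitrary end-of-word marker, and `sample_words` is a hypothetical stand-in for a real word list. The function names are invented for this example.

```python
from collections import Counter

def build_trie(words):
    """Build a nested-dict prefix tree; the key '$' marks a complete word."""
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node['$'] = True
    return root

def find_words(letters, trie, min_len=2):
    """Return every word in the trie that can be spelled from `letters`."""
    found = set()
    available = Counter(letters)  # how many of each letter are still unused

    def walk(node, prefix):
        if '$' in node and len(prefix) >= min_len:
            found.add(prefix)
        # Only follow branches that exist in the trie AND that we have letters for
        for ch, child in node.items():
            if ch != '$' and available[ch] > 0:
                available[ch] -= 1
                walk(child, prefix + ch)
                available[ch] += 1

    walk(trie, '')
    return found

sample_words = ['at', 'cat', 'act', 'tack', 'dog']  # hypothetical word list
print(sorted(find_words('tca', build_trie(sample_words))))
# prints ['act', 'at', 'cat']
```

The search only ever visits strings that are prefixes of real words, so the work is bounded by the size of the dictionary rather than by n!.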
If you’re unfamiliar with prefix trees, here’s a way to simulate the same thing with a Python hash
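A minimal sketch of that simulation, assuming a plain dict that maps every prefix of every word to a flag saying whether the prefix is itself a complete word. The names `build_prefix_map` and `find_words_with_prefix_map` are made up for this example, and the short word list is a placeholder for a real one.

```python
def build_prefix_map(words):
    """Map each prefix of each word to True if it is a whole word, else False."""
    prefixes = {}
    for word in words:
        for end in range(1, len(word)):
            prefixes.setdefault(word[:end], False)  # keep True if already a word
        prefixes[word] = True
    return prefixes

def find_words_with_prefix_map(letters, prefixes, min_len=2):
    """Grow candidates one letter at a time, dropping any non-prefix at once."""
    found = set()

    def extend(prefix, remaining):
        for i, ch in enumerate(remaining):
            if ch in remaining[:i]:
                continue  # same letter already tried at this position
            candidate = prefix + ch
            if candidate not in prefixes:
                continue  # no word starts this way, so prune the whole branch
            if prefixes[candidate] and len(candidate) >= min_len:
                found.add(candidate)
            extend(candidate, remaining[:i] + remaining[i + 1:])

    extend('', letters)
    return found

prefix_map = build_prefix_map(['at', 'cat', 'act', 'tack', 'dog'])  # placeholder words
print(sorted(find_words_with_prefix_map('tca', prefix_map)))
# prints ['act', 'at', 'cat']
```

The dict uses more memory than a real prefix tree because every prefix is stored as its own key, but each lookup is a single hash probe and the code stays short.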
Ask questions if you need more help.