I need the following code to finish more quickly, without using threads or multiprocessing. If anyone knows of any tricks, that would be greatly appreciated: maybe for i in enumerate(), or changing the list to a string before calculating; I’m not sure.
For the example below, I have attempted to recreate the variables using a random sequence. However, this has rendered some of the conditions inside the loop useless … which is OK for this example; it just means the ‘true’ application of the code will take slightly longer.
Currently, on my i7, the example below (which will mostly bypass some of its conditions) completes in 1 second. I would like to get this down as much as possible.

    import random
    import time
    import collections
    import cProfile

    def random_string(length=7):
        """Return a random string of given length"""
        return "".join([chr(random.randint(65, 90)) for i in range(length)])

    LIST_LEN = 18400
    original = [[random_string() for i in range(LIST_LEN)] for j in range(6)]
    LIST_LEN = 5
    SufxList = [random_string() for i in range(LIST_LEN)]
    LIST_LEN = 28
    TerminateHook = [random_string() for i in range(LIST_LEN)]
    #^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Exclude above from benchmark

    ListVar = original[:]
    for b in range(len(ListVar)):
        for c in range(len(ListVar[b])):

            #If its an int ... remove
            try:
                int(ListVar[b][c].replace(' ', ''))
                ListVar[b][c] = ''
            except: pass

            #if any second sufxList delete
            for d in range(len(SufxList)):
                if ListVar[b][c].find(SufxList[d]) != -1: ListVar[b][c] = ''

            for d in range(len(TerminateHook)):
                if ListVar[b][c].find(TerminateHook[d]) != -1: ListVar[b][c] = ''

        #remove all '' from list
        while '' in ListVar[b]: ListVar[b].remove('')

    print(ListVar[b])

ListVar = original[:] makes a shallow copy of original, so your changes to the second-level lists are going to affect the original as well. Are you sure that is what you want? It would be much better to build the new, modified list from scratch.
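To see the shallow-copy point in isolation, here is a minimal sketch with made-up data (the names shallow, source and rebuilt are illustrative, not from the question). It shows that slicing shares the inner lists, whereas building new lists leaves the source alone.

```python
original = [["AAA", "123"], ["BBB", "456"]]

# Slicing copies only the outer list; the inner lists are the same objects.
shallow = original[:]
shallow[0][1] = ""
shared = original[0][1] == ""  # the change is visible through original too

# Building new inner lists leaves the source untouched.
source = [["AAA", "123"], ["BBB", "456"]]
rebuilt = [[item for item in row if not item.isdigit()] for row in source]

print(shared)
print(source)
print(rebuilt)
```

If a fully independent copy really is wanted, copy.deepcopy from the standard library is the usual tool, though here it would only add work.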
for b in range(len(ListVar)): yuck. Whenever possible, iterate directly over lists rather than over their indices.
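As a small illustration of the difference (the data and variable names are invented for the example), the two loops below visit the same items. enumerate() is there for the cases where the position is genuinely needed.

```python
rows = [["AB", "CD"], ["EF"]]

# Index-based traversal, in the style of the question.
seen_by_index = []
for b in range(len(rows)):
    for c in range(len(rows[b])):
        seen_by_index.append(rows[b][c])

# Direct traversal: no indexing, no repeated rows[b][c] lookups.
seen_directly = []
for row in rows:
    for item in row:
        seen_directly.append(item)

# enumerate() supplies the index alongside the item when it is needed.
positions = [(b, c, item)
             for b, row in enumerate(rows)
             for c, item in enumerate(row)]

print(seen_by_index == seen_directly)
print(positions)
```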
int(ListVar[b][c].replace(' ', '')): you want to ignore spaces in the middle of numbers? That doesn’t sound right. If the numbers can be negative you may want to keep the try..except, but if they are only positive just use .isdigit().
SufxList: is that just bad naming? SufxList implies you are looking for suffixes; if so, just use .endswith() (and note that you can pass in a tuple to avoid the loop). If you really do want to find whether the suffix is anywhere in the string, use the in operator.
For TerminateHook, again use the in operator. Also, any() is useful here.
And that while loop is O(n^2), i.e. it will be slow. You could use a list comprehension instead to strip out the blanks, but it is better just to build clean lists to begin with.
I think maybe your indentation was wrong on that print.
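The individual checks mentioned above can be tried on their own. This is only a sketch: the helper names (is_unwanted, ends_with_any, strip_blanks_slow, strip_blanks) and the sample data are invented for illustration. It also assumes non-negative numbers, since .isdigit() does not treat a leading minus sign the way int() does.

```python
def is_unwanted(item, suffixes, hooks):
    """Return True if item is numeric or contains any suffix or hook."""
    # Spaces are ignored here only to mirror the question's replace(' ', '').
    if item.replace(" ", "").isdigit():
        return True
    # Substring tests with the in operator; any() stops at the first hit.
    return any(s in item for s in suffixes) or any(h in item for h in hooks)


def ends_with_any(item, suffixes):
    """True suffix test: str.endswith accepts a tuple of candidates."""
    return item.endswith(tuple(suffixes))


def strip_blanks_slow(row):
    """The question's approach: each membership test and remove() rescans the list."""
    row = list(row)
    while "" in row:
        row.remove("")
    return row


def strip_blanks(row):
    """Single pass with a list comprehension."""
    return [item for item in row if item != ""]


row = ["HELLO", "12 34", "", "XXSUF", "ABHOOKCD", "", "KEEP"]
kept = [item for item in strip_blanks(row)
        if not is_unwanted(item, ["SUF"], ["HOOK"])]
print(kept)
```

Note that the in-based check and ends_with_any answer different questions: the first matches the text anywhere in the string, the second only at the end.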
Putting these suggestions together gives something like: