What would be the best way to reduce each run of repeated letters to either 1 or 2 occurrences, and get every resulting combination? For example:
appppppppple => aple and apple
bbbbbeeeeeer => ber, beer, bber, bbeer
Right now, I have this:
import re

a = "hellllllllllooooooooooooo"
match = re.search('(.)\\1+', a)
if match:
    print('found')
    print(re.sub('(.)\\1+', '\\1', a))
    print(re.sub('(.)\\1+', '\\1\\1', a))
else:
    print('not found')
But it only returns:
helo
helloo
How can I make it work the way I want to?
Don’t use REs for this. REs are good for searching, matching, and transforming, but not for generating strings.
We can consider a string as a vector: each run of identical letters is a dimension, and the number of repetitions in that run is the component along that dimension. Given a vector V, you want all possible vectors of the same dimension as V, such that each component is 1 if the corresponding component of V is 1, and is either 1 or 2 otherwise. For "bbbbbeeeeeer", V = (5, 5, 1), so the results are (1, 1, 1), (1, 2, 1), (2, 1, 1), and (2, 2, 1), i.e. ber, beer, bber, and bbeer. Based on that, here’s a function that does what you want.
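A minimal sketch of such a function, assuming itertools.groupby to split the string into runs and itertools.product to enumerate the combinations. The name squeeze_variants is made up for illustration:

```python
from itertools import groupby, product


def squeeze_variants(s):
    """Return every string made by shortening each run of repeated
    letters in s to length 1 or 2 (runs of length 1 stay as they are)."""
    # Split the string into (letter, run_length) pairs,
    # e.g. "appple" -> [('a', 1), ('p', 3), ('l', 1), ('e', 1)].
    runs = [(ch, len(list(grp))) for ch, grp in groupby(s)]
    # A run of length 1 has one option; a longer run has two.
    choices = [(ch,) if n == 1 else (ch, ch * 2) for ch, n in runs]
    # The Cartesian product of the options gives every combination.
    return [''.join(parts) for parts in product(*choices)]


print(squeeze_variants("appppppppple"))  # ['aple', 'apple']
print(squeeze_variants("bbbbbeeeeeer"))  # ['ber', 'beer', 'bber', 'bbeer']
```

For the string in the question, "hellllllllllooooooooooooo", this yields helo, heloo, hello, and helloo.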
Here’s a more compact version that uses slicing. It may be a bit less readable, but at least it keeps within the 78-char limit:
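One possible compact form, offered as a sketch rather than a definitive version. It assumes the same itertools helpers, and squeeze_variants_compact is a hypothetical name. Each run is kept as a string, and slicing picks its first one and first two letters:

```python
from itertools import groupby, product


def squeeze_variants_compact(s):
    # Each run as a string, e.g. "bbbeer" -> ['bbb', 'ee', 'r'].
    runs = [''.join(g) for _, g in groupby(s)]
    # r[:1] and r[:2] are the 1- and 2-letter forms of a run; the outer
    # [:len(r)] drops the second form when the run has only one letter.
    options = [(r[:1], r[:2])[:len(r)] for r in runs]
    return [''.join(p) for p in product(*options)]


print(squeeze_variants_compact("bbbbbeeeeeer"))
# ['ber', 'beer', 'bber', 'bbeer']
```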