The problem runs as follows: if there are two strings str1 and str2 ,

Question

Asked: June 11, 2026 (2026-06-11T18:34:14+00:00)

The problem runs as follows: given two strings str1 and str2, and another string str3, write a function which checks whether str3 contains both str1's letters and str2's letters in the same order as in the original strings, though they may be interleaved. So, adbfec returns true for the subsequences adf and bec. I have written the following function in Python:

def isinter(str1,str2,str3):
    p1,p2,p3 = 0,0,0
    while p3 < len(str3):
        if p1 < len(str1) and str3[p3] == str1[p1]:
            p1 += 1
        elif p2 < len(str2) and str3[p3] == str2[p2]:
            p2 += 1
        else:
            break
        p3 = p1+p2
    return p3 == len(str3)

There is another version of this program, at ardentart (the last solution). Now which one is better? I think mine, for it probably does it in linear time. Whether it is better or not, is there any further room for optimization in my algo?

1 Answer

Editorial Team · Answer 1 · 2026-06-11T18:34:15+00:00

You could split all three strings into lists:

list1 = list(str1)

and then walk list3 with the same algorithm you use now, checking whether list3[i] is equal to list1[0] or list2[0]. If it is, you’d del the item from the appropriate list.

Premature list end could then be caught as an exception.

The algorithm would be exactly the same, but implementation ought to be more performant.

UPDATE: turns out it actually isn’t (about double the time). Oh well, might be useful to know.
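For reference, the list-based variant described above might look roughly like the sketch below. The name isinter_lists is made up for illustration, and it tests for an empty list instead of catching the exception, so it is only an approximation of what was benchmarked. Deleting from the front of a Python list shifts every remaining item, which may be part of the reason it came out slower.

```python
def isinter_lists(str1, str2, str3):
    # Hypothetical name; a sketch of the list-based idea, not the benchmarked code.
    list1, list2 = list(str1), list(str2)
    for ch in str3:
        if list1 and ch == list1[0]:
            del list1[0]          # consume the matched letter from str1
        elif list2 and ch == list2[0]:
            del list2[0]          # consume the matched letter from str2
        else:
            return False          # ch matches neither pending letter
    # Both lists must be fully consumed for str3 to be a complete interleaving.
    return not list1 and not list2


print(isinter_lists("adf", "bec", "adbfec"))
```

Like the function in the question, this always prefers str1 when both lists could supply the next letter, so it inherits the same greedy behaviour.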

While benchmarking different scenarios, it also turned out that, unless the three string lengths are guaranteed to be “exact” (i.e., len(str1)+len(str2) == len(str3)), the most effective optimization is to check this first thing. That immediately discards all cases where the two input strings can’t match the third because the lengths don’t add up.

Then I encountered some cases where the same letter comes next in both strings, and assigning it to list1 or list2 might leave one of the strings unable to match. In those cases the algorithm fails with a false negative, and fixing that requires recursion (trying both assignments).
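To make the false negative concrete, here is a minimal reproduction. It copies the greedy function from the question under the hypothetical name isinter_greedy; the input strings are an example chosen for illustration, not taken from the benchmarks.

```python
def isinter_greedy(str1, str2, str3):
    # Same logic as the function in the question, renamed for this demo.
    p1, p2, p3 = 0, 0, 0
    while p3 < len(str3):
        if p1 < len(str1) and str3[p3] == str1[p1]:
            p1 += 1
        elif p2 < len(str2) and str3[p3] == str2[p2]:
            p2 += 1
        else:
            break
        p3 = p1 + p2
    return p3 == len(str3)


# "acab" is "ac" (all of str2) followed by "ab" (all of str1), so it is a
# valid interleaving. The greedy version gives the first "a" to str1 and
# then gets stuck on "c".
print(isinter_greedy("ab", "ac", "acab"))      # False (a false negative)
print(isinter_greedy("adf", "bec", "adbfec"))  # True
```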

def isinter(str1, str2, str3, check=True):
    p1, p2, p3 = 0, 0, 0
    # Cheap rejection: the lengths must add up. Recursive calls pass
    # check=False because the top-level call has already verified this.
    if check and len(str1) + len(str2) != len(str3):
        return False
    while p3 < len(str3):
        if p1 < len(str1) and str3[p3] == str1[p1]:
            if p2 < len(str2) and str3[p3] == str2[p2]:
                # Ambiguous: str3[p3] could belong to str1 or str2, so try both.
                return (isinter(str1[p1+1:], str2[p2:], str3[p3+1:], False) or
                        isinter(str1[p1:], str2[p2+1:], str3[p3+1:], False))
            p1 += 1
        elif p2 < len(str2) and str3[p3] == str2[p2]:
            p2 += 1
        else:
            return False
        p3 += 1
    return p1 == len(str1) and p2 == len(str2) and p3 == len(str3)

Then I ran some benchmarks on random strings. This is the instrumentation (notice that it always generates valid shuffles, which may bias the results):

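import random
import time

# isShuffle2 is assumed to be the cached DP version mentioned in the question;
# it is not reproduced here and must be defined before running this.
for j in range(3, 50):
    str1 = ''
    str2 = ''
    for k in range(1, j):
        if random.choice([True, False]):
            str1 += chr(random.randint(97, 122))  # random letter a-z
        if random.choice([True, False]):
            str2 += chr(random.randint(97, 122))
    # Build str3 as a random interleaving of str1 and str2 (always valid).
    p1 = 0
    p2 = 0
    str3 = ''
    while len(str3) < len(str1) + len(str2):
        if p1 < len(str1) and random.choice([True, False]):
            str3 += str1[p1]
            p1 += 1
        if p2 < len(str2) and random.choice([True, False]):
            str3 += str2[p2]
            p2 += 1
    # One million calls timed in seconds, so each total reads as us per call.
    a = time.time()
    for i in range(1000000):
        isShuffle2(str1, str2, str3)
    a = time.time() - a
    b = time.time()
    for i in range(1000000):
        isinter(str1, str2, str3)
    b = time.time() - b

    print("(%s,%s = %s) in %f against %f us" % (str1, str2, str3, a, b))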
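Since the harness above only ever produces valid shuffles, one way to reduce that bias would be to time invalid inputs as well. The sketch below is an assumption about how such cases could be generated and is not part of the original benchmark. make_case and its parameters are hypothetical names, and it assumes the character '#' never appears in the alphabet.

```python
import random


def make_case(max_len, alphabet="abcdefghijklmnopqrstuvwxyz", valid=True):
    # Hypothetical helper: returns (str1, str2, str3) for benchmarking.
    str1 = "".join(random.choice(alphabet) for _ in range(random.randint(1, max_len)))
    str2 = "".join(random.choice(alphabet) for _ in range(random.randint(1, max_len)))

    # Decide, position by position, which string supplies the next letter.
    sources = [1] * len(str1) + [2] * len(str2)
    random.shuffle(sources)

    out = []
    i1 = i2 = 0
    for src in sources:
        if src == 1:
            out.append(str1[i1])
            i1 += 1
        else:
            out.append(str2[i2])
            i2 += 1
    str3 = "".join(out)

    if not valid:
        # Overwrite one position with a character outside the alphabet.
        # The length stays the same, so a length check alone cannot reject it.
        k = random.randrange(len(str3))
        str3 = str3[:k] + "#" + str3[k + 1:]
    return str1, str2, str3


print(make_case(5))
print(make_case(5, valid=False))
```

Restricting the alphabet (for example to "abcdefghi") would produce more repeated letters and therefore more ambiguous positions.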

The results suggest that the cached DP algorithm is the more efficient one only for very short strings. Once the strings grow beyond 3-4 characters, it starts losing ground, and at around length 10 the algorithm above runs about twice as fast as the fully recursive, cached version.

The DP algorithm does relatively better, though still worse than the one above, when the strings contain repeated characters (I forced this by restricting the range from a-z to a-i) and when the overlap is slight. For example, in this case the DP loses by only about 1.3 us:

(cfccha,ddehhg = cfcchaddehhg) in 68.139601 against 66.826320 us

Not surprisingly, full overlap (one letter from each string in turn) shows the largest difference, with a ratio as high as 364:178 (a bit more than 2:1).
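The DP version used in these timings is not reproduced in this answer. For readers who want something to compare against, below is a minimal bottom-up sketch of the standard dynamic-programming approach to this problem. It is not the isShuffle2 function that was benchmarked, and the name is_interleave_dp is hypothetical.

```python
def is_interleave_dp(str1, str2, str3):
    # Hypothetical name; a generic DP sketch, not the benchmarked isShuffle2.
    n1, n2 = len(str1), len(str2)
    if n1 + n2 != len(str3):
        return False

    # While processing row i, reachable[j] is True when str3[:i+j] can be
    # built by interleaving str1[:i] and str2[:j]. One row is reused in place.
    reachable = [False] * (n2 + 1)
    for i in range(n1 + 1):
        for j in range(n2 + 1):
            if i == 0 and j == 0:
                reachable[j] = True
                continue
            ch = str3[i + j - 1]
            # reachable[j] still holds the value from row i-1 at this point.
            from_str1 = i > 0 and reachable[j] and str1[i - 1] == ch
            # reachable[j-1] has already been updated for row i.
            from_str2 = j > 0 and reachable[j - 1] and str2[j - 1] == ch
            reachable[j] = from_str1 or from_str2
    return reachable[n2]


print(is_interleave_dp("adf", "bec", "adbfec"))  # True
print(is_interleave_dp("ab", "ac", "acab"))      # True
```

This always does len(str1) * len(str2) steps, whereas the recursive function above only branches at ambiguous positions, which is consistent with the DP being slower on mostly unambiguous random strings.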
