The Archive Base Latest Questions

Editorial Team
Asked: May 31, 2026

We have two sets, A and B. Each of these sets contains strings.

We have two sets, A and B. Each of these sets contains strings.
E.g.: A = {“abwcd”, “dwas”, “www”} and B = {“opqr”, “tops”, “ibmd”}.
How can I count the subsequences that appear in all strings from set A, but in none of the strings in set B? For the example above, the answer is 1 (the subsequence “w”).
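
To pin down what is being counted, here is a minimal brute-force sketch in Python. It is only a reference for small inputs, not an optimal solution: it enumerates every distinct subsequence, which is exponential in the string length. The function names `distinct_subsequences` and `count_brute_force` are hypothetical, and the sketch assumes that A is non-empty and that the empty subsequence is not counted.

```python
from itertools import combinations


def distinct_subsequences(s):
    """Return the set of distinct non-empty subsequences of s.

    Exponential in len(s); intended for small inputs only.
    """
    result = set()
    for length in range(1, len(s) + 1):
        # combinations() keeps the original order of the characters,
        # so every tuple it yields spells a subsequence of s.
        for chars in combinations(s, length):
            result.add("".join(chars))
    return result


def count_brute_force(A, B):
    """Count subsequences found in every string of A and in no string of B.

    Assumes A is non-empty.
    """
    common = set.intersection(*(distinct_subsequences(s) for s in A))
    for s in B:
        common -= distinct_subsequences(s)
    return len(common)


A = ["abwcd", "dwas", "www"]
B = ["opqr", "tops", "ibmd"]
print(count_brute_force(A, B))  # 1
```

Using a set per string means a subsequence such as “a” in “aba” is counted once per string, which matches the deduplication rule described in the question.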

All of this should be done in an optimal way. I thought about using two tries. First, I put all the subsequences of all the strings in B into a trie t_B. Then I put all the subsequences of all the strings in A into a trie t_A, without updating the trie if the same subsequence was already found in the same string (e.g., if I have the string “aba”, I don’t count the subsequence “a” twice). This way, if I find a subsequence that has n appearances in t_A (where n is the size of A), I check whether it is in t_B, and if it is not, I count it. But this is very slow: if A and B have size 15 and the strings are about 100 characters long, my program runs for over 1 second.

EDIT: Since any subsequence ends at the last character of the string or at a character before it, we don’t have to generate all the subsequences, only the ones that end at the last character of the string. When I push them into the trie, I mark every node with 1. So if I have the string “abcd”, I only push “abcd”, “bcd”, “cd” and “d”, since this should be the ‘skeleton’ of the trie. But this is not a very big optimization; I’m still looking for something better.
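
The suffix-only insertion described in the EDIT can be sketched as follows. This is an illustration under assumptions, not the asker's code: the trie is a plain nested `dict`, and `build_suffix_trie` and `paths` are hypothetical names. One thing the sketch makes visible is that the root-to-node paths of such a trie spell the contiguous substrings of the string; a non-contiguous subsequence such as “ad” is not reachable this way.

```python
def build_suffix_trie(s):
    """Insert every suffix of s into a trie made of nested dicts."""
    root = {}
    for start in range(len(s)):
        node = root
        for ch in s[start:]:
            # Reuse the child for ch if it exists, otherwise create it.
            node = node.setdefault(ch, {})
    return root


def paths(node, prefix=""):
    """Yield every non-empty string spelled by a root-to-node path."""
    for ch, child in node.items():
        word = prefix + ch
        yield word
        yield from paths(child, word)


trie = build_suffix_trie("abcd")
print(sorted(paths(trie)))
print("ad" in set(paths(trie)))  # False
```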


1 Answer

  1. Editorial Team
    Added an answer on May 31, 2026 at 8:37 am

    You shouldn’t have to put all the subsequences of all the strings in A into the trie.
    Only put in the valid ones: test whether a sequence is valid before adding it. I’m assuming a membership test is faster than adding a new item. A smaller trie should fail membership tests faster, so this strategy is designed to trim down the trie as quickly as possible.

    Specifically:
    Put all the subsequences from the first string in A into the trie. (For efficiency, use the shortest string as the first.) Keep a set of references to all the leaf nodes.
    Next, for all the strings in B, test each subsequence to see if it exists in the trie built from A. If it does, remove that sequence and its reference. (Start with the longest string in B to pare the trie as quickly as possible.)

    Now you have the minimum set of possibilities to test against.
    For each of the remaining strings in A, test each subsequence to see if it exists in the trie. If it does, mark the node as valid; otherwise move on to the next subsequence.
    After each string, remove all the invalid nodes from the trie, and reset the flags on the rest to invalid.
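
The pruning order above can be sketched in Python as follows. This is a simplified variant under stated assumptions, not the exact method from the answer: a plain `set` of candidate strings stands in for the trie, and instead of enumerating the subsequences of every other string, each surviving candidate is checked directly with a linear-time subsequence test. The names `is_subsequence` and `count_by_pruning` are hypothetical, and A is assumed to be non-empty. The initial candidate set is still exponential in the length of the shortest string in A, so this only illustrates the order of the steps.

```python
from itertools import combinations


def is_subsequence(cand, s):
    """Return True if cand is a subsequence of s (single pass over s)."""
    it = iter(s)
    # Each `ch in it` consumes the iterator up to the first match,
    # so characters must be found in order.
    return all(ch in it for ch in cand)


def count_by_pruning(A, B):
    # Step 1: seed the candidates from the shortest string in A.
    shortest = min(A, key=len)
    candidates = {
        "".join(chars)
        for length in range(1, len(shortest) + 1)
        for chars in combinations(shortest, length)
    }
    # Step 2: drop every candidate that occurs in some string of B,
    # visiting the longest strings of B first.
    for s in sorted(B, key=len, reverse=True):
        candidates = {c for c in candidates if not is_subsequence(c, s)}
    # Step 3: keep only candidates that occur in every string of A.
    # Re-checking the shortest string is harmless.
    for s in A:
        candidates = {c for c in candidates if is_subsequence(c, s)}
    return len(candidates)


A = ["abwcd", "dwas", "www"]
B = ["opqr", "tops", "ibmd"]
print(count_by_pruning(A, B))  # 1
```

On the example from the question, the shortest string “www” yields only three candidates (“w”, “ww”, “www”), none of which is removed by B, and the remaining strings in A reduce them to “w”.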

© 2021 The Archive Base. All Rights Reserved