The Archive Base Latest Questions

Editorial Team
Asked: May 28, 2026

I am looking for a fast algorithm for search purpose in a huge string

I am looking for a fast algorithm to search a huge string: an organism's genome sequence, hundreds of millions to billions of characters long.

The string contains only the four characters {A, C, G, T}. “A” can pair only with “T”, and “C” only with “G”.

I am searching for pairs of substrings that can pair with one another in antiparallel orientation. The length of each substring must be between minLen and maxLen, and the length of the interval between the two substrings must be between intervalMinLen and intervalMaxLen.

For example,
The string is: ATCAG GACCA TACGC CTGAT

Constraints: minLen = 4, maxLen = 5, intervalMinLen = 9, intervalMaxLen = 10

The result should be:

  1. “ATCAG” pairs with “CTGAT”

  2. “TCAG” pairs with “CTGA”
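
The pairing rule in this example can be checked mechanically. Below is a minimal sketch in C#; the names `PairCheck`, `Complement` and `PairsAntiparallel` are hypothetical and not taken from the question, and the code assumes upper-case input containing only A, C, G and T.

```csharp
using System;

static class PairCheck
{
    // Complement of a single base (assumes upper-case A, C, G or T).
    static char Complement(char b)
    {
        switch (b)
        {
            case 'A': return 'T';
            case 'T': return 'A';
            case 'C': return 'G';
            case 'G': return 'C';
            default: throw new ArgumentException("Unexpected base: " + b);
        }
    }

    // True when left, read forwards, pairs base by base with right read backwards.
    public static bool PairsAntiparallel(string left, string right)
    {
        if (left.Length != right.Length) return false;
        for (int j = 0; j < left.Length; j++)
        {
            if (Complement(left[j]) != right[right.Length - 1 - j]) return false;
        }
        return true;
    }

    static void Main()
    {
        Console.WriteLine(PairsAntiparallel("ATCAG", "CTGAT")); // True
        Console.WriteLine(PairsAntiparallel("TCAG", "CTGA"));   // True
        Console.WriteLine(PairsAntiparallel("ATCAG", "GACCA")); // False
    }
}
```

In both expected results the interval between the two substrings is the 10 characters GACCATACGC, which lies within the allowed range of 9 to 10.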

Thanks in advance.

Update: I already have a method to determine whether two strings can pair with one another. My only concern is that an exhaustive search is very time-consuming.
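
For reference, an exhaustive search of the kind mentioned above might look like the following sketch. It is only a baseline for checking a faster method on small inputs, not the asker's actual code; `BruteForce` and `FindAll` are hypothetical names. It tries every left start, length and interval, so its cost grows with the product of the genome length, the number of allowed lengths, the number of allowed intervals and maxLen.

```csharp
using System;
using System.Collections.Generic;

static class BruteForce
{
    static bool IsPaired(char a, char b)
    {
        return (a == 'A' && b == 'T') || (a == 'T' && b == 'A')
            || (a == 'C' && b == 'G') || (a == 'G' && b == 'C');
    }

    // Yields (left start, right start, length) for every pair of substrings that
    // satisfies the length and interval constraints and pairs antiparallel.
    public static IEnumerable<Tuple<int, int, int>> FindAll(
        string genome, int minLen, int maxLen, int intervalMinLen, int intervalMaxLen)
    {
        for (int left = 0; left < genome.Length; left++)
        {
            for (int len = minLen; len <= maxLen; len++)
            {
                for (int gap = intervalMinLen; gap <= intervalMaxLen; gap++)
                {
                    int right = left + len + gap;
                    if (right + len > genome.Length) break; // larger gaps cannot fit either

                    // Compare the left substring forwards with the right substring backwards
                    bool ok = true;
                    for (int j = 0; j < len && ok; j++)
                    {
                        ok = IsPaired(genome[left + j], genome[right + len - 1 - j]);
                    }
                    if (ok) yield return Tuple.Create(left, right, len);
                }
            }
        }
    }

    static void Main()
    {
        string genome = "ATCAGGACCATACGCCTGAT";
        foreach (var hit in FindAll(genome, 4, 5, 9, 10))
        {
            Console.WriteLine("'{0}' pair with '{1}'",
                genome.Substring(hit.Item1, hit.Item3),
                genome.Substring(hit.Item2, hit.Item3));
        }
        // Prints:
        // 'ATCAG' pair with 'CTGAT'
        // 'TCAG' pair with 'CTGA'
    }
}
```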


1 Answer

  1. Editorial Team
    Added an answer on May 28, 2026 at 4:19 am

    I thought this was an interesting problem, so I put together a program based on considering ‘foldings’: it scans outward from each ‘fold point’ for possible symmetrical matches. If N is the number of nucleotides and M is intervalMaxLen - intervalMinLen, the running time should be roughly O(N*M); more precisely, each fold point scans at most about M/2 + maxLen positions outward for each of the two fold types, so O(N*(M + maxLen)). I may have missed some boundary cases, so use the code with care, but it does work for the example provided. Note that I’ve used a padded intermediate buffer to store the genome, as this reduces the number of boundary comparisons required in the inner loops; this trades additional memory allocation for better speed. Feel free to edit the post if you make any corrections or improvements.

    using System;
    using System.Collections.Generic;

    class Program
    {
        public sealed class Pairing
        {
            // Start of the left substring within the genome
            public int Index { get; private set; }

            // Length of each of the two paired substrings
            public int Length { get; private set; }

            // Distance from the start of the left substring to the start of the right one
            public int Offset { get; private set; }

            public Pairing(int index, int length, int offset)
            {
                Index = index;
                Length = length;
                Offset = offset;
            }
        }

        public static IEnumerable<Pairing> FindPairings(string genome, int minLen, int maxLen, int intervalMinLen, int intervalMaxLen)
        {
            int n = genome.Length;

            //k is the distance of a pair from the fold point: an innermost pair at
            //distance k leaves an interval of 2k-1 (odd fold) or 2k (even fold)
            int minKOdd = (intervalMinLen + 2)/2, maxKOdd = (intervalMaxLen + 1)/2;
            int minKEven = (intervalMinLen + 1)/2, maxKEven = intervalMaxLen/2;

            //Pad both ends with NUL characters, which never pair, so that the
            //inner loops need no bounds checks
            int pad = maxKOdd + maxLen + 1;
            var padding = new string((char)0, pad);
            var padded = string.Concat(padding, genome, padding);

            int start = (intervalMinLen + minLen)/2 + pad;
            int end = n - (intervalMinLen + minLen)/2 + pad;

            //Consider 'fold locations' along the genome
            for (int i = start; i < end; i++)
            {
                //Consider 'odd' folding (centered on index) about index i
                int run = 0;
                for (int k = minKOdd; k < maxKOdd + maxLen; k++)
                {
                    //run counts the consecutive pairs ending at distance k
                    run = IsPaired(padded[i - k], padded[i + k]) ? run + 1 : 0;

                    //Report every match ending here whose innermost pair is within maxKOdd
                    for (int len = Math.Max(minLen, k - maxKOdd + 1); len <= Math.Min(run, maxLen); len++)
                    {
                        yield return new Pairing(i - k - pad, len, 2*k - (len - 1));
                    }
                }

                //Consider 'even' folding (centered before index) about index i
                run = 0;
                for (int k = minKEven; k < maxKEven + maxLen; k++)
                {
                    run = IsPaired(padded[i - (k+1)], padded[i + k]) ? run + 1 : 0;

                    for (int len = Math.Max(minLen, k - maxKEven + 1); len <= Math.Min(run, maxLen); len++)
                    {
                        yield return new Pairing(i - (k+1) - pad, len, 2*k + 1 - (len - 1));
                    }
                }
            }
        }

        //Among A, C, G, T and the NUL padding, only A+T and G+C produce these sums
        private const int SumAT = 'A' + 'T';
        private const int SumGC = 'G' + 'C';
        private static bool IsPaired(char a, char b)
        {
            return (a + b) == SumAT || (a + b) == SumGC;
        }

        static void Main(string[] args)
        {
            string genome = "ATCAGGACCATACGCCTGAT";
            foreach (var pairing in FindPairings(genome, 4, 5, 9, 10))
            {
                Console.WriteLine("'{0}' pair with '{1}'",
                                  genome.Substring(pairing.Index, pairing.Length),
                                  genome.Substring(pairing.Index + pairing.Offset, pairing.Length));
            }
            Console.ReadKey();
        }
    }
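
    The padded-buffer idea does not depend on `string`. A .NET `string` stores two bytes per character, so for very large genomes it may be worth holding the padded buffer as a `byte[]` instead, which halves the memory for the buffer. The following is a minimal sketch under that assumption; `CompactGenome`, `Encode` and the code values 0 to 4 are hypothetical choices, not part of the program above.

    ```csharp
    using System;

    static class CompactGenome
    {
        const byte Sentinel = 4; // padding value; never pairs with anything

        // Encodes A,C,G,T as 0,1,2,3 so that complementary bases always sum to 3,
        // and surrounds the sequence with 'pad' sentinel bytes on each side.
        public static byte[] Encode(string genome, int pad)
        {
            var buffer = new byte[genome.Length + 2 * pad];
            for (int i = 0; i < buffer.Length; i++) buffer[i] = Sentinel;
            for (int i = 0; i < genome.Length; i++)
            {
                switch (genome[i])
                {
                    case 'A': buffer[pad + i] = 0; break;
                    case 'C': buffer[pad + i] = 1; break;
                    case 'G': buffer[pad + i] = 2; break;
                    case 'T': buffer[pad + i] = 3; break;
                    default: break; // anything else (e.g. 'N') stays a sentinel
                }
            }
            return buffer;
        }

        // A+T = 0+3 and C+G = 1+2; a sentinel (4) pushes the sum to 4 or more.
        public static bool IsPaired(byte a, byte b)
        {
            return a + b == 3;
        }

        static void Main()
        {
            byte[] g = Encode("ATCG", 2);
            Console.WriteLine(IsPaired(g[2], g[3])); // A with T: True
            Console.WriteLine(IsPaired(g[4], g[5])); // C with G: True
            Console.WriteLine(IsPaired(g[2], g[4])); // A with C: False
            Console.WriteLine(IsPaired(g[1], g[3])); // padding with T: False
        }
    }
    ```

    In practice the bytes would be filled while reading the sequence from disk rather than from an in-memory `string`; the sketch takes a `string` only to stay short.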
    