Sign Up

Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.

Have an account? Sign In

Have an account? Sign In Now

Sign In

Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.

Sign Up Here

Forgot Password?

Don't have account, Sign Up Here

Forgot Password

Lost your password? Please enter your email address. You will receive a link and will create a new password via email.

Have an account? Sign In Now

You must login to ask a question.

Forgot Password?

Need An Account, Sign Up Here

Please briefly explain why you feel this question should be reported.

Please briefly explain why you feel this answer should be reported.

Please briefly explain why you feel this user should be reported.

Sign InSign Up

The Archive Base

The Archive Base Logo The Archive Base Logo

The Archive Base Navigation

  • SEARCH
  • Home
  • About Us
  • Blog
  • Contact Us
Search
Ask A Question

Mobile menu

Close
Ask a Question
  • Home
  • Add group
  • Groups page
  • Feed
  • User Profile
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Buy Points
  • Users
  • Help
  • Buy Theme
  • SEARCH
Home/ Questions/Q 6200893
In Process

The Archive Base Latest Questions

Editorial Team
  • 0
Editorial Team
Asked: May 24, 20262026-05-24T04:27:10+00:00 2026-05-24T04:27:10+00:00

I am trying to learn Java by doing some assignments from a Stanford class

  • 0

I am trying to learn Java by doing some assignments from a Stanford class and am having trouble answering this question.

boolean stringIntersect(String a, String b, int len): Given 2 strings,
consider all the substrings within them of length len. Returns true if
there are any such substrings which appear in both strings. Compute
this in O(n) time using a HashSet.

I can’t figure out how to do it using a Hashset because you cannot store repeating characters. So stringIntersect(hoopla, loopla, 5) should return true.

thanks!

Edit: Thanks so much for all your prompt responses. It was helpful to see explanations as well as code. I guess I couldn’t see why storing substrings in a hashset would make the algorithm more efficient. I originally had a solution like :

public static boolean stringIntersect(String a, String b, int len) {
    assert (len>=1);
    if (len>a.length() || len>b.length()) return false;
    String s1=new String(),s2=new String();
    if (a.length()<b.length()){
        s1=a;
        s2=b;
    }
    else {
        s1=b;
        s2=a;
    }
    int index = 0;
    while (index<=s1.length()-len){
        if (s2.contains(s1.substring(index,index+len)))return true;
        index++;
    }
    return false;
}
  • 1 1 Answer
  • 0 Views
  • 0 Followers
  • 0
Share
  • Facebook
  • Report

Leave an answer
Cancel reply

You must login to add an answer.

Forgot Password?

Need An Account, Sign Up Here

1 Answer

  • Voted
  • Oldest
  • Recent
  • Random
  1. Editorial Team
    Editorial Team
    2026-05-24T04:27:10+00:00Added an answer on May 24, 2026 at 4:27 am

    I’m not sure I understand what you mean by “you cannot store repeating characters” A hashset is a Set, so it can do two things: you can add value to it, and you can add values to it, and you can check if a value is already in it. In this case, the problem wants you to answer the question by storing strings, not chars, in the HashSet. To do this in java:

    Set<String> stringSet = new HashSet<String>();
    

    Try breaking this problem into two parts:
    1. Generate all the substrings of length len of a string
    2. Use this to solve the problem.

    The hint for part two is:
    Step 1: For the first string enter the substrings into a hashset
    Step 2: For the second string, check the values in the hashset

    Note (Advanced): this problem is poorly specified. Entering and checking strings in a hashtable is O the length of the string. For string a of length n you have O(n-k) substrings of length k. So for string a being a string of length n and string b being a string of length m you have O((n-k)*k+(m-k)*k) this is not really big Oh of n, since your running time for k = n/2 is O((n/2)*(n/2)) = O(n^2)


    Edit: So what if you actually want to do this in O(n) (or perhaps O(n+m+k))? My belief is that the original homework was asking for something like the algorithm I described above. But we can do better. Whats more, we can do better and still make a HashSet the crucial tool for our algorithm. The idea is to perform our search using a “Rolling Hash.” Wikipedia describes a couple: http://en.wikipedia.org/wiki/Rolling_hash, but we will implement our own.

    A simple solution would be to XOR the values of the character hashes together. This could allow us to add a new char to the hash O(1) and remove one O(1) making computing the next hash trivial. But this simple algorithm wont work for two reasons

    1. The character hashes may not provide sufficient entropy. Okay, we dont know if we will have this problem, but lets solve it anyways, just for fun.
    2. We will hash permutations to the same value … “abc” should not have the same hash as “cba”

    To solve the first problem we can use an idea from AI, namely lets steel from Zobrist hashing. The idea is to assign every possible character a random value of a greater length. If we were using ASCI, we could easily create an array with all the ASCI characters, but that will run into problems when using unicode characters. The alternative is to assign values lazily.

    object LazyCharHash{
      private val map = HashMap.empty[Char,Int]
      private val r = new Random
      def lHash(c: Char): Int = {
        val d = map.get(c)
        d match {
          case None => {
            map.put(c,r.nextInt)
            lHash(c)
          }
          case Some(v) => v
        }
      }
    }
    

    This is Scala code. Scala tends to be less verbose than Java, but still allows me to use Java collections, as such I will be using imperative style Scala through out. It wouldn’t be that hard to translate.

    The second problem can be solved aswell. First, instead of using a pure XOR, we combine our XOR with a shift, thus the hash function is now:

    def fullHash(s: String) = {
      var h = 0
      for(i <- 0 until s.length){
        h = h >>> 1
        h = h ^ LazyCharHash.lHash(s.charAt(i))
      }
      h
    }
    

    Of-course, using fullHash wont give a performance advantage. It is just a specification

    We need a way of using our hash function to store values in the HashSet (I promised we would use it). We can just create a wrapper class:

    class HString(hash: Int, string: String){
      def getHash = hash
      def getString = string
      override def equals(otherHString: Any): Boolean = {
        otherHString match {
          case other: HString => (hash == other.getHash) && (string == other.getString)
          case _ => false
        }
      }
      override def hashCode = hash
    }
    

    Okay, to make the hashing function rolling, we just have to XOR the value associated with the character we will no longer be using. To that just takes shifting that value by the appropriate amount.

    def stringIntersect(a: String, b: String, len: Int): Boolean = {
      val stringSet = new HashSet[HString]()
      var h = 0
      for(i <- 0 until len){
        h = h >>> 1
        h = h ^ LazyCharHash.lHash(a.charAt(i))
      }
      stringSet.add(new HString(h,a.substring(0,len)))
      for(i <- len until a.length){
        h = h >>> 1
        h = h ^ (LazyCharHash.lHash(a.charAt(i - len)) >>> (len))
        h = h ^ LazyCharHash.lHash(a.charAt(i))
        stringSet.add(new HString(h,a.substring(i - len + 1,i + 1)))
      }
      ...
    

    You can figure out how to finish this code on your own.

    Is this O(n)? Well, it matters what mean. Big Oh, big Omega, big Theta, are all metrics of bounds. They could serve as metrics of the worst case of the algorithm, the best case, or something else. In this case these modification gives expected O(n) performance, but this only holds if we avoid hash collisions. It still take O(n) to tell if two Strings are equals. This random approach works pretty well, and you can scale up the size of the random bit arrays to make it work better, but it does not have guaranteed performance.

    • 0
    • Reply
    • Share
      Share
      • Share on Facebook
      • Share on Twitter
      • Share on LinkedIn
      • Share on WhatsApp
      • Report

Sidebar

Related Questions

No related questions found

Explore

  • Home
  • Add group
  • Groups page
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Users
  • Help
  • SEARCH

Footer

© 2021 The Archive Base. All Rights Reserved
With Love by The Archive Base

Insert/edit link

Enter the destination URL

Or link to existing content

    No search term specified. Showing recent items. Search or use up and down arrow keys to select an item.