I want to build an algorithm that returns the indices of the overlapping elements found in two arrays.
E.g., I have two arrays of strings:
Array1: {"a","c","h","d","a","o","m"}
Array2: {"d","a","o","m","c","e","o","m","c","z","e","l","p","v","c","z","c"}
It should return the matching index ranges for array1 and array2, like this:
x1,y1={3,6}
x2,y2={0,3}
That is, the sequence of strings in the two arrays should match, and the match should extend for as long as the string values keep matching, so that every record stays unique with respect to the previously matched records.
Waiting for answers and responses; if you have questions, please let me know.
As an example, I have a database table into which I insert records, and every record I insert must be unique. At insertion time, we have an array of records that needs to be inserted as a batch. Say the records already in the table look like this:
colum1
hash1
hash2
hash3
hash4
hash5
And I want to insert a bunch of records in the database which are in the form:
hash3
hash4
hash5
hash6
hash7
hash8
Then the resulting table should look like this:
colum1
hash1
hash2
hash3
hash4
hash5
hash6
hash7
hash8
But if the array to be inserted into the database looks like this:
hash2
hash3
hash4
hash6
then it should be treated as a whole new entry and will be inserted in the database as a whole.
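The two cases above can be sketched roughly as follows. This is only a minimal illustration, assuming the rule is "the end of the existing column must match the start of the new batch"; the function names `overlap_length` and `merge_batch` are made up for the example, and the table is modelled as a plain Python list rather than a real database.

```python
def overlap_length(existing, batch):
    """Length of the longest suffix of `existing` that equals a prefix of `batch`."""
    max_len = min(len(existing), len(batch))
    # Try the longest possible overlap first and shrink until one matches.
    for k in range(max_len, 0, -1):
        if existing[-k:] == batch[:k]:
            return k
    return 0


def merge_batch(existing, batch):
    """Append `batch` to `existing`, skipping the part that already overlaps.

    Assumption: if no suffix/prefix overlap exists, the whole batch is
    treated as a new entry and appended in full.
    """
    k = overlap_length(existing, batch)
    return existing + batch[k:]


table = ["hash1", "hash2", "hash3", "hash4", "hash5"]
continuing = ["hash3", "hash4", "hash5", "hash6", "hash7", "hash8"]
unrelated = ["hash2", "hash3", "hash4", "hash6"]

merged = merge_batch(table, continuing)      # overlap of 3, only hash6..hash8 are added
appended = merge_batch(table, unrelated)     # no overlap, all 4 records are added
```

With the arrays from the first example, `overlap_length` gives 4, which corresponds to indices 3..6 in array1 and 0..3 in array2.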
I hope I have explained my problem clearly this time.
The problem you’re referring to is called the Longest Common Substring problem (not to be confused with another problem about strings and an LCS acronym – longest common subsequence). As usual, the best explanation is in Wikipedia: http://en.wikipedia.org/wiki/Longest_common_substring_problem 🙂
In short: if the strings are really short and you’re a very lazy programmer, the fastest way of doing that is to check all substrings of arr2 against arr1. This will take about n^2*m time (if arr2 is of length n and arr1 is of length m), which is a lot of time for long strings.
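For what it's worth, the lazy approach could look something like the sketch below. The helper names (`find_run`, `longest_common_run_bruteforce`) are made up for illustration, and no attempt is made at efficiency: it simply tries contiguous runs of arr2 and looks each one up in arr1, returning inclusive index ranges in the same style as the question.

```python
def find_run(haystack, needle):
    """Return the first index where `needle` occurs contiguously in `haystack`, or -1."""
    for i in range(len(haystack) - len(needle) + 1):
        if haystack[i:i + len(needle)] == needle:
            return i
    return -1


def longest_common_run_bruteforce(arr1, arr2):
    """Return ((start1, end1), (start2, end2)) of the longest common run, or None.

    Indices are inclusive. Ties are resolved in favour of the first run found.
    """
    best = None
    best_len = 0
    for start2 in range(len(arr2)):
        for end2 in range(start2, len(arr2)):
            candidate = arr2[start2:end2 + 1]
            if len(candidate) <= best_len:
                continue  # cannot improve on the best run so far
            start1 = find_run(arr1, candidate)
            if start1 == -1:
                break  # extending a run that is not in arr1 cannot help
            best_len = len(candidate)
            best = ((start1, start1 + best_len - 1), (start2, end2))
    return best


array1 = ["a", "c", "h", "d", "a", "o", "m"]
array2 = ["d", "a", "o", "m", "c", "e", "o", "m", "c",
          "z", "e", "l", "p", "v", "c", "z", "c"]
result = longest_common_run_bruteforce(array1, array2)
# result is ((3, 6), (0, 3)), matching the ranges in the question
```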
If your strings are longer, or you are less lazy, the best algorithm, which uses suffix trees, runs in linear time, i.e. proportional to n + m.
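A suffix tree is too long to show here, so the sketch below is not that algorithm. It is the simpler dynamic-programming version described in the same Wikipedia article, which takes about n*m time and sits between the two options above. The function name `longest_common_run_dp` is made up for illustration.

```python
def longest_common_run_dp(arr1, arr2):
    """Return ((start1, end1), (start2, end2)) of the longest common run, or None.

    Dynamic programming: curr[j] holds the length of the common run that
    ends at arr1[i - 1] and arr2[j - 1]. Indices in the result are inclusive.
    """
    best_len = 0
    best_end1 = best_end2 = -1
    prev = [0] * (len(arr2) + 1)
    for i in range(1, len(arr1) + 1):
        curr = [0] * (len(arr2) + 1)
        for j in range(1, len(arr2) + 1):
            if arr1[i - 1] == arr2[j - 1]:
                # Extend the run that ended one step earlier in both arrays.
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len = curr[j]
                    best_end1, best_end2 = i - 1, j - 1
        prev = curr
    if best_len == 0:
        return None
    return ((best_end1 - best_len + 1, best_end1),
            (best_end2 - best_len + 1, best_end2))


array1 = ["a", "c", "h", "d", "a", "o", "m"]
array2 = ["d", "a", "o", "m", "c", "e", "o", "m", "c",
          "z", "e", "l", "p", "v", "c", "z", "c"]
dp_result = longest_common_run_dp(array1, array2)
# dp_result is ((3, 6), (0, 3)), the same ranges as in the question
```

Only two rows of the table are kept at a time, so the extra memory is proportional to the length of arr2 rather than n*m.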