I’ve written a short function for array intersection and wanted to know why one function is faster than the other.
1)
Dim list2() As String 'Assume it has values'
Dim list2length As Integer = list2.length
Function newintersect(ByRef list1() As String) As String()
Dim intersection As New ArrayList
If (list1.Length < list2length) Then
'use list2'
For Each thing As String In list2
If (Array.IndexOf(list1, thing) <> -1) Then
intersection.Add(thing)
End If
Next
Else
'use list1'
For Each thing As String In list1
If (Array.IndexOf(list2, thing) <> -1) Then
intersection.Add(thing)
End If
Next
End If
Return intersection
End Function
2)
Dim list2() As String 'Assume it has values'
Dim list2length As Integer = list2.length
Function newintersect(ByRef list1() As String) As String()
Dim intersection As New ArrayList
If (list1.Length > list2length) Then 'changed >'
'use list2'
For Each thing As String In list2
If (Array.IndexOf(list1, thing) <> -1) Then
intersection.Add(thing)
End If
Next
Else
'use list1'
For Each thing As String In list1
If (Array.IndexOf(list2, thing) <> -1) Then
intersection.Add(thing)
End If
Next
End If
Return intersection
End Function
3)
Dim list2() As String 'Assume it has values'
Dim list2length As Integer = list2.length
Function newintersect(ByRef list1() As String) As String()
For Each thing As String In list1
If (Array.IndexOf(list2, thing) <> -1) Then
intersection.Add(thing)
End If
Next
Return intersection
End Function
So for my testcase, 1 take 65 seconds, 3 takes 63 seconds, while 2 actually takes 75 seconds. Anyone know why 3 is the fastest? And why is 1 faster than 2?
(Sorry about the poor formatting…can’t seem to paste it right)
That’s not much of a difference. Also, it doesn’t seem like the methods would produce the same result, so it would be pointless to compare the performance, right?
Anyhow, the
Array.IndexOfis not very efficient way to find items, and doesn’t scale well. You should get a dramatic improvement if you use a hash key based collection as lookup instead, something like this: