I was trying this problem on SPOJ.
First of all, I came up with a fairly trivial O(b log b) algorithm (refer to the problem for what b is). But since the author of the problem gave the constraint as b in [0, 10^7], I was not convinced it would pass. Anyway, out of sheer belief, I coded it as follows:
#include<stdio.h>
#include<iostream>
#include<algorithm>
#include<cmath>
#include<vector>
#include<cstdlib>
#include<stack>
#include<queue>
#include<string>
#include<cstring>
#define PR(x) cout<<#x"="<<x<<endl
#define READ2(x,y) scanf("%d %d",&x,&y)
#define REP(i,a) for(long long i=0;i<a;i++)
#define READ(x) scanf("%d",&x)
#define PRARR(x,n) for(long long i=0;i<n;i++)printf(#x"[%lld]=\t%d\n",i,x[i])
using namespace std;
// A value together with its position in the input.
struct node {
    int val;
    int idx;
};
bool operator<(node a,node b){ return a.val<b.val;}
node contain[10000001];
int main(){
    int mx=1,count=1,t,n;
    scanf("%d",&t);
    while(t--){
        count=1;mx=1;
        scanf("%d",&n);
        for(int i=0;i<n;i++){
            scanf("%d",&contain[i].val);
            contain[i].idx=i;
        }
        // Sort by value; idx remembers where each value stood originally.
        sort(contain,contain+n);
        // Longest run of neighbours in sorted order whose original
        // positions are increasing.
        for(int j=1;j<n;j++){
            if(contain[j].idx>contain[j-1].idx)
                count++;
            else count=1;
            mx=max(count,mx);
        }
        printf("%d\n",n-mx);
    }
}
It passed in 0.01 s on the SPOJ server (which is known to be slow).
But I soon came up with an O(b) algorithm; the code is given below:
#include<stdio.h>
#include<iostream>
#include<algorithm>
#include<cmath>
#include<vector>
#include<cstdlib>
#include<stack>
#include<queue>
#include<string>
#include<cstring>
#define PR(x) printf(#x"=%d\n",x)
#define READ2(x,y) scanf("%d %d",&x,&y)
#define REP(i,a) for(int i=0;i<a;i++)
#define READ(x) scanf("%d",&x)
#define PRARR(x,n) for(int i=0;i<n;i++)printf(#x"[%d]=\t%d\n",i,x[i])
using namespace std;
int val[1001];
int arr[1001];
int main() {
    int t;
    int n;
    scanf("%d",&t);
    while(t--){
        scanf("%d",&n);
        int mn=2<<29,count=1,mx=1;
        // Read the input and track its minimum.
        for(int i=0;i<n;i++){
            scanf("%d",&arr[i]);
            if(arr[i]<mn) { mn=arr[i];}
        }
        // val[v-mn] = position of value v in the input. This only works if
        // the n values are distinct and consecutive (mn .. mn+n-1), so that
        // val[0..n-1] is fully overwritten in every test case.
        for(int i=0;i<n;i++){
            val[arr[i]-mn]=i;
        }
        // Longest run of consecutive values whose positions are increasing.
        for(int i=1;i<n;i++){
            if(val[i]>val[i-1]) count++;
            else {
                count=1;
            }
            if(mx<count) mx=count;
        }
        printf("%d\n",n-mx);
    }
}
But surprisingly it took 0.14s :O
Now my question is: isn’t O(b) better than O(b log b) for b > 2? Then why is there so much difference in time? One of the members of the community suggested that it may be due to cache misses, since the O(b) code has worse locality than the O(b log b) code. But I don’t see that causing a difference of more than 0.10 s, and that too for fewer than 1000 runs of the code. (Yes, b is actually less than 1000. I don’t know why the problem setter exaggerated so much.)
EDIT: I see all the answers point to the hidden constants in asymptotic notation, which often cause disparities in the running times of algorithms. But if you look at the two programs, you will see that all I am doing is replacing the call to sort with another pass of a loop. I am assuming sort accesses each element of the array at least once. Wouldn’t that make the two programs even closer, if we count the number of lines that get executed? Besides, yes, my past experience with SPOJ tells me that I/O has a drastic impact on a program’s running time, but I am using the same I/O routines in both programs.
Big O notation describes how an algorithm’s running time grows as the input size approaches infinity. For large enough inputs, O(n) will always beat O(n log n).
In practice, some ‘poorer-performing’ algorithms are faster because of the constant factors and lower-order terms that big O notation hides, and some more scalable algorithms can be slower. The smaller the input, the less the asymptotic class tells you about which one will actually be faster.
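To make the hidden constants concrete, here is a small worked model. The constants $c_1$ and $c_2$ are hypothetical per-element costs chosen for illustration; they are not measurements of the programs in the question.

```latex
T_{\text{lin}}(n) = c_1\, n, \qquad T_{\text{sort}}(n) = c_2\, n \log_2 n

T_{\text{lin}}(n) < T_{\text{sort}}(n)
\iff c_1 < c_2 \log_2 n
\iff n > 2^{\,c_1 / c_2}

\text{If } c_1 = 20\, c_2: \quad n > 2^{20} \approx 10^{6}.
\qquad
\text{At } n = 1000: \quad
\frac{T_{\text{sort}}}{T_{\text{lin}}} = \frac{\log_2 1000}{20} \approx \frac{9.97}{20} \approx 0.5
```

Under these assumed constants, the O(n log n) version would take roughly half the time at n = 1000, and the O(n) version would only pull ahead beyond about a million elements.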
I learned all this the hard way, when I spent hours implementing a scalable solution, and when testing, found that it would only be faster for large data sets.
Edit:
Regarding the specific case, some people mentioned that the same line of code can vary enormously in performance depending on how it is executed. This is likely the case here. That means that the constant factors hidden by big O notation are very relevant. The better you understand how a computer works on the inside, the more optimization techniques you have up your sleeve.
If you only remember one thing, remember this: never compare two algorithms’ performance by just reading the code. If it’s that important, time an actual implementation on realistic data sets.
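As a rough illustration of how the same statement can cost very different amounts, the sketch below sums an array twice with an identical loop body, once visiting elements in sequential order and once in a shuffled order. All function names here (`sequential_order`, `scattered_order`, `sum_in_order`, `time_sum_ms`) are made up for this example, and it is not a diagnosis of the two programs in the question.

```cpp
#include <algorithm>
#include <chrono>
#include <numeric>
#include <random>
#include <vector>

// Visiting order 0, 1, 2, ..., n-1.
std::vector<int> sequential_order(int n) {
    std::vector<int> order(n);
    std::iota(order.begin(), order.end(), 0);
    return order;
}

// The same indices in a pseudo-random order (seed is an arbitrary choice).
std::vector<int> scattered_order(int n, unsigned seed) {
    std::vector<int> order = sequential_order(n);
    std::mt19937 rng(seed);
    std::shuffle(order.begin(), order.end(), rng);
    return order;
}

// Sum data[] in the given visiting order. The work per element is the
// same for every order; only the memory access pattern differs.
long long sum_in_order(const std::vector<int>& data,
                       const std::vector<int>& order) {
    long long total = 0;
    for (int i : order) total += data[i];
    return total;
}

// Wall-clock milliseconds for one call; the sum is returned via `result`.
double time_sum_ms(const std::vector<int>& data,
                   const std::vector<int>& order, long long& result) {
    auto start = std::chrono::steady_clock::now();
    result = sum_in_order(data, order);
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Calling `time_sum_ms` with both orders on an array much larger than the CPU cache typically shows the scattered order taking longer, even though both calls execute the same number of additions. The exact gap depends on the machine, so it has to be measured rather than assumed.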
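A minimal sketch of what such a timing comparison could look like is shown below, modeled on the two approaches from the question. The names `longest_run_sorted`, `longest_run_indexed`, `random_permutation` and `average_ms` are hypothetical, and the indexed version assumes the input values are distinct and consecutive, as the question's O(b) code does.

```cpp
#include <algorithm>
#include <chrono>
#include <numeric>
#include <random>
#include <utility>
#include <vector>

// Sort-based approach, O(n log n): sort (value, index) pairs, then find the
// longest run of neighbours whose original indices are increasing.
int longest_run_sorted(const std::vector<int>& a) {
    const int n = static_cast<int>(a.size());
    if (n == 0) return 0;
    std::vector<std::pair<int, int>> v(n);
    for (int i = 0; i < n; ++i) v[i] = {a[i], i};
    std::sort(v.begin(), v.end());
    int best = 1, run = 1;
    for (int j = 1; j < n; ++j) {
        run = (v[j].second > v[j - 1].second) ? run + 1 : 1;
        best = std::max(best, run);
    }
    return best;
}

// Index-table approach, O(n).
// Assumption: values are distinct and consecutive (min .. min + n - 1).
int longest_run_indexed(const std::vector<int>& a) {
    const int n = static_cast<int>(a.size());
    if (n == 0) return 0;
    const int mn = *std::min_element(a.begin(), a.end());
    std::vector<int> pos(n);
    for (int i = 0; i < n; ++i) pos[a[i] - mn] = i;
    int best = 1, run = 1;
    for (int i = 1; i < n; ++i) {
        run = (pos[i] > pos[i - 1]) ? run + 1 : 1;
        best = std::max(best, run);
    }
    return best;
}

// A shuffled permutation of 0 .. n-1, used as test input.
std::vector<int> random_permutation(int n, unsigned seed) {
    std::vector<int> p(n);
    std::iota(p.begin(), p.end(), 0);
    std::mt19937 rng(seed);
    std::shuffle(p.begin(), p.end(), rng);
    return p;
}

// Average wall-clock milliseconds per call of f(input) over `reps` calls.
template <class F>
double average_ms(F f, const std::vector<int>& input, int reps) {
    volatile int sink = 0;  // keeps the optimizer from discarding the calls
    auto start = std::chrono::steady_clock::now();
    for (int r = 0; r < reps; ++r) sink = f(input);
    auto stop = std::chrono::steady_clock::now();
    (void)sink;
    return std::chrono::duration<double, std::milli>(stop - start).count() / reps;
}
```

Timing both functions with `average_ms` on inputs of the size you actually expect (here, fewer than 1000 elements) and on much larger ones shows where, if anywhere, the O(n) version starts to win. This measures only the algorithm, so process start-up and I/O costs on a judge server would need to be examined separately.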