I came up with the below code to generate 100001 random strings.the strings should be unique. However, the below code takes hours to do the job. Can someone let me know how i can optimize it and why is it so slow?
string getRandomString(int length) {
static string charset = "abcdefghijklmnopqrstuvwxyz";
string result;
result.resize(length);
for (int i = 0; i < length; i++) {
result[i] = charset[rand() % charset.length()];
}
return result;
}
void main(){
srand(time(NULL));
vector<string> storeUnigrams;
int numUnigram = 100001;
string temp = "";
int minLen = 3;
int maxLen = 26;
int range = maxLen - minLen + 1;
int i =0;
while(i < numUnigram){
int lenOfRanString = rand()%range + minLen;
temp = getRandomString(lenOfRanString);
bool doesithave = false;
for(int j =0 ; j < storeUnigrams.size() ; j++){
if(temp.compare(storeUnigrams[j]) == 0){
doesithave = true;
break;
}
if(temp.compare(storeUnigrams[j]) < 0){
break;
}
}
if(!doesithave){
storeUnigrams.push_back(temp);
sort(storeUnigrams.begin(),storeUnigrams.end());
i++;
}
}
There are two factors that make your code slow:
Use e.g. a
setfor storing the strings – it’s sorted automatically, and checking for existence is fast: