I wrote the following program which should answer this question
Write an efficient function to find the first nonrepeated character in
a string. For instance, the first nonrepeated character in “total” is
‘o’ and the first nonrepeated character in “teeter” is ‘r’. Discuss
the efficiency of your algorithm.
This is what I did:
#include <stdio.h>
#include <iostream>
#include <vector>
using namespace std;
class Node
{
public:
Node::Node(char ch)
{
c = ch;
next = NULL;
}
char c;
Node *next;
};
Node* addNode(Node *tail, char ch)
{
if(tail == NULL)
return new Node(ch);
else
{
Node *newN = new Node(ch);
tail->next = newN;
return newN;
}
}
void deleteNode(char ch, Node** head, Node**tail)
{
Node *prev = NULL;
Node *cur = *head;
while(cur!=NULL)
{
if(cur->c == ch)
{
// found cut it
if(prev == NULL)
{
// head cut off
if(*tail == *head)
{
// worst possible, just one element
delete *head;
*head = NULL;
return;
}
else
{
// Head cut off but not just first element
Node *tmp = *head;
*head = (*head)->next;
delete tmp;
return;
}
}
else
{
// delete normal node
if(*tail == cur)
{
// delete tail
Node *tmp = *tail;
*tail = prev;
delete tmp;
return;
}
else
{
// Normal node not tail
prev->next = cur->next;
delete cur;
return;
}
}
}
// no match keep searching
prev = cur;
cur = cur->next;
}
}
int main()
{
char str[] = "total";
char htable[26];
memset(htable, 0, sizeof(char)*26);
Node *head = NULL;
Node *tail = head;
for(unsigned int i=0;;i++)
{
if(str[i] == '\0')
break;
// check first match
char m = htable[str[i]-'a'];
switch(m)
{
case 0:
{
// first time, add it to linked list
htable[str[i]-'a']++;
tail = addNode(tail, str[i]);
if(head == NULL)
head = tail;
}break;
case 1:
{
// bam, cut it out
htable[str[i]-'a']++;
deleteNode(str[i], &head, &tail);
}break;
}
}
if(head != NULL)
printf("First char without repetition: %c", head->c);
else
printf("No char matched");
return 0;
}
and it works (although I didn’t free the memory at the end of the program for the linked list). Basically I keep an hashtable with a 0 if a character hasn’t been found yet, a 1 if it has been found once (and it’s added to the linked list at the tail position) and 2 if there are at least two occurrences of it (and should be removed by the linked list).
What’s this program’s computational complexity with big-O notation?
Since this algorithm just passes once per each element I think it’s O(n), although the removal of the values in the linked list (in the worst case possible) would require additional O(k^2) where k is the length of the alphabet used. Something like O(n+k^2) it’s my pick and if the string is very long and the alphabet restricted the algorithm becomes very efficient.
Well, yeah on the surface this looks like
O(N)complexity, but you have introduced undesirable inefficiencies by using dynamic allocation.However, because you call
deleteNodeand that has to search through a list, you no longer haveO(N)complexity.Think what happens if you have the string:
This has a complexity of roughly
O(N*(N-1)/2)because everydeleteNodecall must scan to the end of the remaining list.All you really need to do is scan the string once and store the index of each character (if not found already), then scan that array for the lowest index. If you like, you can use an index of
-1to indicate a character has not been encountered. That would be complexity ofO(2N)