What is the quickest way to find the first character which only appears once

Editorial Team

Asked: May 13, 2026

What is the quickest way to find the first character which only appears once in a string?

1 Answer

Editorial Team · Answer 1 · 2026-05-13T19:03:23+00:00

I see that people have posted some delightful answers below, so I’d like to offer something more in-depth.

An idiomatic solution in Ruby

We can find the first un-repeated character in a string like so:

def first_unrepeated_char(string)
  # Count each character, then take the first one seen exactly once (nil if none).
  string.each_char.tally.find { |_, n| n == 1 }&.first
end

How does Ruby accomplish this?

Reading Ruby’s source

Let’s break down the solution and consider what algorithms Ruby uses for each step.

First we call each_char on the string. This creates an enumerator which allows us to visit the string one character at a time. This is complicated by the fact that Ruby handles Unicode characters, so each value we get from the enumerator can be a variable number of bytes. If we know our input is ASCII or similar, we could use each_byte instead.

The each_char method is implemented like so:

static VALUE
rb_str_each_char(VALUE str)
{
    RETURN_SIZED_ENUMERATOR(str, 0, 0, rb_str_each_char_size);
    return rb_str_enumerate_chars(str, 0);
}

In turn, rb_str_enumerate_chars is implemented as:

static VALUE
rb_str_enumerate_chars(VALUE str, VALUE ary)
{
    VALUE orig = str;
    long i, len, n;
    const char *ptr;
    rb_encoding *enc;

    str = rb_str_new_frozen(str);
    ptr = RSTRING_PTR(str);
    len = RSTRING_LEN(str);
    enc = rb_enc_get(str);

    if (ENC_CODERANGE_CLEAN_P(ENC_CODERANGE(str))) {
        for (i = 0; i < len; i += n) {
            n = rb_enc_fast_mbclen(ptr + i, ptr + len, enc);
            ENUM_ELEM(ary, rb_str_subseq(str, i, n));
        }
    }
    else {
        for (i = 0; i < len; i += n) {
            n = rb_enc_mbclen(ptr + i, ptr + len, enc);
            ENUM_ELEM(ary, rb_str_subseq(str, i, n));
        }
    }
    RB_GC_GUARD(str);
    if (ary)
        return ary;
    else
        return orig;
}

From this we can see that it calls rb_enc_mbclen (or its fast version, rb_enc_fast_mbclen, when the string's code range is known to be clean) to get the length in bytes of the next character, so that it knows how far to advance. Because the string is iterated lazily, one character at a time, we end up doing just one full pass over the input string as tally consumes the enumerator.
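
To make the variable-width stepping concrete, here is a minimal Python sketch of the same idea for UTF-8 only. It is not Ruby's implementation: utf8_char_len and each_char are hypothetical helpers written for illustration, and the sketch assumes the input bytes are valid UTF-8.

```python
def utf8_char_len(lead_byte: int) -> int:
    """Return the byte length of a UTF-8 character, judged from its first byte."""
    if lead_byte < 0x80:            # 0xxxxxxx: single-byte (ASCII)
        return 1
    if lead_byte >> 5 == 0b110:     # 110xxxxx: two-byte sequence
        return 2
    if lead_byte >> 4 == 0b1110:    # 1110xxxx: three-byte sequence
        return 3
    if lead_byte >> 3 == 0b11110:   # 11110xxx: four-byte sequence
        return 4
    raise ValueError(f"invalid UTF-8 lead byte: {lead_byte:#x}")


def each_char(data: bytes):
    """Lazily yield one character at a time from UTF-8 encoded bytes."""
    i = 0
    while i < len(data):
        n = utf8_char_len(data[i])          # how many bytes the next character spans
        yield data[i:i + n].decode("utf-8")
        i += n                              # advance by that many bytes
```

Like the C loop, the generator never builds a list of characters; it only computes how far to step before yielding the next slice.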

Tally is then implemented like so:

static void
tally_up(VALUE hash, VALUE group)
{
    VALUE tally = rb_hash_aref(hash, group);
    if (NIL_P(tally)) {
        tally = INT2FIX(1);
    }
    else if (FIXNUM_P(tally) && tally < INT2FIX(FIXNUM_MAX)) {
        tally += INT2FIX(1) & ~FIXNUM_FLAG;
    }
    else {
        tally = rb_big_plus(tally, INT2FIX(1));
    }
    rb_hash_aset(hash, group, tally);
}


static VALUE
tally_i(RB_BLOCK_CALL_FUNC_ARGLIST(i, hash))
{
    ENUM_WANT_SVALUE();
    tally_up(hash, i);
    return Qnil;
}

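Here, tally_i is the block callback (its parameter list is declared with the RB_BLOCK_CALL_FUNC_ARGLIST macro). It is invoked once per element and calls tally_up, which looks up the element's current count in the hash, increments it (switching to a bignum if the fixnum would overflow), and stores it back.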
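
The counting step can be sketched in Python as follows. This is an illustrative analogue rather than a translation of the C code: the function name tally is borrowed from Ruby, and Python integers grow automatically, so there is no separate bignum branch.

```python
def tally(items):
    """Count occurrences of each item, keeping keys in first-seen order."""
    counts = {}
    for item in items:
        # Look up the current count (0 if absent), add one, and store it back.
        counts[item] = counts.get(item, 0) + 1
    return counts
```

Python dicts (3.7+) preserve insertion order, much like Ruby hashes, so the keys come out in the order each character was first seen.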

Rough time & memory analysis

The each_char method doesn’t allocate an array to eagerly hold the characters of the string, so it has a small constant memory overhead. When we tally the characters, we allocate a hash and put our tally data into it, which in the worst case (every character distinct) can take up as much memory as the input string times some constant factor.

Time-wise, tally does a full scan of the string, and find then scans the hash in insertion order (which is why it returns the first un-repeated character) until it reaches an entry with a count of 1. Each of these steps is O(n) in the worst case.

However, tally also updates a hash on every iteration. A single hash update can be as slow as O(n) in the worst case (for example, when many keys collide), so the worst-case complexity of this Ruby solution is perhaps O(n^2).

Under reasonable assumptions, though, updating a hash is O(1), so we can expect the amortized average case to be O(n).
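
For comparison, the same count-then-scan approach might look like this in Python. It is a sketch under the same assumption of O(1) average-case hash updates; first_unique_char is a hypothetical name, and returning None when nothing qualifies is a choice made here, not something taken from the Ruby code.

```python
from collections import Counter


def first_unique_char(string):
    """Return the first character that occurs exactly once, or None if there is none."""
    counts = Counter(string)          # one pass over the string: O(n) on average
    for char, n in counts.items():    # one pass over the distinct characters,
        if n == 1:                    # visited in first-seen order
            return char
    return None
```

For example, first_unique_char("stress") returns "t", while first_unique_char("aabb") returns None.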

My old accepted answer in Python

You can’t know that the character is un-repeated until you’ve processed the whole string, so my suggestion would be this:

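def first_non_repeated_character(string):
  chars = []     # characters seen exactly once so far, in order of first appearance
  repeated = []  # characters seen more than once
  for character in string:
    if character in chars:
      # Second sighting: move it from chars to repeated.
      chars.remove(character)
      repeated.append(character)
    elif character not in repeated:
      chars.append(character)
  # False signals that every character repeats (or the string is empty).
  return chars[0] if chars else False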

Edit: originally posted code was bad, but this latest snippet is Certified To Work On Ryan’s Computer™.
