Given a list of KeyValuePairs, where each pair has a getValue() method, what would


Asked: June 12, 2026


Given a list of KeyValuePairs, where each pair has a getValue() method, what would be the fastest way to obtain a List (or Set) of unique Values?

All of the approaches below produce acceptable results. u1 seems to be the fastest for the expected list size (about 1000-2000 pairs).

Can we do better (faster)?

private static Set<String> u1(List<_KVPair> pairs) {
    Set<String> undefined = new HashSet<String>();

    for (_KVPair pair : pairs) {
        undefined.add(pair.getValue());
    }

    if (undefined.size() == 1) {
        return new HashSet<String>();
    }
    return undefined;
}

private static List<String> u2(List<_KVPair> pairs) {

    List<String> undefined = new ArrayList<String>();
    for (_KVPair pair : pairs) {
        if (!undefined.contains(pair.getValue())) {
            undefined.add(pair.getValue());
        }
    }

    return undefined;
}

private static List<String> u3(List<_KVPair> pairs) {

    List<String> undefined = new LinkedList<String>();

    Iterator<_KVPair> it = pairs.iterator();
    while (it.hasNext()) {
        String value = it.next().getValue();
        if (!undefined.contains(value)) {
            undefined.add(value);
        }
    }
    return undefined;
}

At about 3600 pairs, u3 wins. At about 1500 pairs, u1 wins.


1 Answer

Editorial Team · Answer 1 · 2026-06-12T00:23:24+00:00

The first option (u1, the HashSet) should be the fastest. You could possibly make it even faster by pre-sizing the set before using it, typically when you expect a small number of duplicates:

Set<String> undefined = new HashSet<String>(pairs.size(), 1);

Note that I used 1 for the load factor to prevent any resizing.
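Applied to the question's u1, the pre-sizing idea could look roughly like the sketch below. KVPair here is a hypothetical stand-in for the question's _KVPair (only getValue() is assumed), the method name uniqueValues is made up for illustration, and the special case in u1 that returns an empty set when only one value is found is left out.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class PresizedUnique {

    // Hypothetical stand-in for the question's _KVPair; only getValue() is assumed.
    static final class KVPair {
        private final String key;
        private final String value;

        KVPair(String key, String value) {
            this.key = key;
            this.value = value;
        }

        String getValue() {
            return value;
        }
    }

    // Collects the distinct values into a set sized for the worst case (no duplicates).
    static Set<String> uniqueValues(List<KVPair> pairs) {
        // Capacity of at least pairs.size() with load factor 1:
        // the table does not need to grow while it is being filled.
        Set<String> unique = new HashSet<>(pairs.size(), 1f);
        for (KVPair pair : pairs) {
            unique.add(pair.getValue());
        }
        return unique;
    }

    public static void main(String[] args) {
        List<KVPair> pairs = Arrays.asList(
                new KVPair("a", "x"),
                new KVPair("b", "y"),
                new KVPair("c", "x"));
        System.out.println(uniqueValues(pairs).size()); // prints 2
    }
}
```

Whether this actually beats a default-sized HashSet depends on the data: with many duplicates the set is larger than it needs to be.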

Out of curiosity I ran a test (code below). The results, measured after JIT warm-up and given in milliseconds for 100 runs, are:

Test 1 (note: takes a few minutes with warm up)

size of original list = 3,000 with no duplicates:
set: 8
arraylist: 668
linkedlist: 1166

Test 2

size of original list = 30,000 – all strings identical:
set: 25
arraylist: 11
linkedlist: 13

That kind of makes sense:

When there are many duplicates, List#contains runs fairly fast because a match is found early in the list, while the set is penalised by the cost of allocating a large table and hashing every string.
When there are no or very few duplicates, List#contains has to scan almost the whole list for every element, so the set wins by a large margin.
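To make the reasoning above concrete, here is a small sketch that counts how many elements the list-based approach has to look at in each scenario. It is an illustration only, not part of the benchmark; the class and method names are made up, and it counts comparisons rather than measuring time.

```java
import java.util.ArrayList;
import java.util.List;

class ContainsCost {

    // Runs the list-based de-duplication and counts how many elements
    // contains() would have to compare against in total.
    static long comparisons(List<String> input) {
        List<String> unique = new ArrayList<>();
        long count = 0;
        for (String s : input) {
            int index = unique.indexOf(s);
            if (index >= 0) {
                count += index + 1;      // the scan stops at the first match
            } else {
                count += unique.size();  // the whole list was scanned, no match
                unique.add(s);
            }
        }
        return count;
    }

    public static void main(String[] args) {
        int n = 3_000;
        List<String> distinct = new ArrayList<>();
        List<String> identical = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            distinct.add("value-" + i);
            identical.add("same");
        }
        System.out.println(comparisons(distinct));  // n * (n - 1) / 2 = 4498500
        System.out.println(comparisons(identical)); // n - 1 = 2999
    }
}
```

With no duplicates the work grows quadratically with the list size, while with identical strings it stays linear, which matches the direction of the two test results. The set does roughly one hash lookup per element in both cases.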

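import java.util.ArrayList;
import java.util.HashSet;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;
import java.util.Random;
import java.util.Set;

public class TestPerf {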

    private static int NUM_RUN;
    private static Random r = new Random(System.currentTimeMillis());
    // true: random strings (few duplicates); false: every string is identical (all duplicates)
    private static boolean random = false;


    public static void main(String[] args) {

        List<String> list = new ArrayList<>();

        for (int i = 0; i < 30_000; i++) {
            list.add(getRandomString());
        }

        //warm up
        for (int i = 0; i < 10_000; i++) {
            method1(list);
            method2(list);
            method3(list);
        }

        NUM_RUN = 100;
        long sum = 0;
        long start = System.nanoTime();
        for (int i = 0; i < NUM_RUN; i++) {
            sum += method1(list);
        }
        long end = System.nanoTime();
        System.out.println("set: " + (end - start) / 1000000);

        sum = 0;
        start = System.nanoTime();
        for (int i = 0; i < NUM_RUN; i++) {
            sum += method2(list);
        }
        end = System.nanoTime();
        System.out.println("arraylist: " + (end - start) / 1000000);

        sum = 0;
        start = System.nanoTime();
        for (int i = 0; i < NUM_RUN; i++) {
            sum += method3(list);
        }
        end = System.nanoTime();
        System.out.println("linkedlist: " + (end - start) / 1000000);

        System.out.println(sum);
    }

    private static int method1(final List<String> list) {
        Set<String> set = new HashSet<>(list.size(), 1);
        for (String s : list) {
            set.add(s);
        }
        return set.size();
    }

    private static int method2(final List<String> list) {
        List<String> undefined = new ArrayList<>();
        for (String s : list) {
            if (!undefined.contains(s)) {
                undefined.add(s);
            }
        }
        return undefined.size();
    }

    private static int method3(final List<String> list) {
        List<String> undefined = new LinkedList<>();

        Iterator<String> it = list.iterator();
        while (it.hasNext()) {
            String value = it.next();
            if (!undefined.contains(value)) {
                undefined.add(value);
            }
        }
        return undefined.size();
    }

    private static String getRandomString() {
        if (!random) {
            return "skdjhflkjrglajhsdkhkjqwhkdjahkshd";
        }
        int size = r.nextInt(100); // random length between 0 and 99
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < size; i++) {
            // nextInt(26) keeps the character within 'a'..'z'
            char c = (char) ('a' + r.nextInt(26));
            sb.append(c);
        }
        return sb.toString();
    }
}
