I’m currently looking at a simple programming problem that might be fun to optimize – at least for anybody who believes that programming is art 🙂 So here it is:
How to best represent longs as Strings while keeping their natural order?
Additionally, the String representation should match ^[A-Za-z0-9]+$. (I’m not too strict here, but avoid using control characters or anything that might cause headaches with encodings, is illegal in XML, has line breaks, or similar characters that will certainly cause problems)
Here’s a JUnit test case:
    @Test
    public void longConversion() {
        final long[] longs = { Long.MIN_VALUE, Long.MAX_VALUE, -5664572164553633853L,
                -8089688774612278460L, 7275969614015446693L, 6698053890185294393L,
                734107703014507538L, -350843201400906614L, -4760869192643699168L,
                -2113787362183747885L, -5933876587372268970L, -7214749093842310327L, };
        // keep it reproducible
        //Collections.shuffle(Arrays.asList(longs));
        final String[] strings = new String[longs.length];
        for (int i = 0; i < longs.length; i++) {
            strings[i] = Converter.convertLong(longs[i]);
        }
        // Note: Comparator is not an option
        Arrays.sort(longs);
        Arrays.sort(strings);
        final Pattern allowed = Pattern.compile("^[A-Za-z0-9]+$");
        for (int i = 0; i < longs.length; i++) {
            assertTrue("string: " + strings[i], allowed.matcher(strings[i]).matches());
            assertEquals("string: " + strings[i], longs[i], Converter.parseLong(strings[i]));
        }
    }
and here are the methods I’m looking for
    public static class Converter {
        public static String convertLong(final long value) {
            // TODO
        }
        public static long parseLong(final String value) {
            // TODO
        }
    }
I already have some ideas on how to approach this problem. Still, I thought I might get some nice (creative) suggestions from the community.
Additionally, it would be nice if this conversion were
- as short as possible
- easy to implement in other languages
EDIT: I’m quite glad to see that two very reputable programmers ran into the same problem as I did: using '-' for negative numbers can’t work, as the '-' doesn’t reverse the sort order of the negative values:
- -0001
- -0002
- 0000
- 0001
- 0002
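To make the ordering problem concrete, here is a minimal sketch. The class name `MinusPrefixDemo`, the helper `naive`, and the four-digit zero padding are made up for illustration (the padding is chosen to match the list above); this is not anybody's proposed solution, just the failure mode.

```java
import java.util.Arrays;
import java.util.Locale;

class MinusPrefixDemo {
    // Hypothetical naive encoding: zero-padded decimal digits,
    // with a leading '-' for negative values.
    static String naive(long value) {
        String digits = String.format(Locale.ROOT, "%04d", Math.abs(value));
        return value < 0 ? "-" + digits : digits;
    }

    // Encode every value, then sort the strings lexicographically,
    // exactly as Arrays.sort(String[]) does in the test case.
    static String[] sortedNaive(long... values) {
        String[] out = new String[values.length];
        for (int i = 0; i < values.length; i++) {
            out[i] = naive(values[i]);
        }
        Arrays.sort(out);
        return out;
    }

    public static void main(String[] args) {
        // -1 ends up before -2, although -2 < -1 numerically.
        System.out.println(Arrays.toString(sortedNaive(-2, -1, 0, 1, 2)));
    }
}
```

The negative block as a whole lands before the non-negative values, because '-' sorts before the digits, but inside that block the order is the reverse of the numeric one.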
Ok, take two:
This one takes a little explanation. First, let me demonstrate that it is reversible and that the resulting strings preserve the ordering:
Output:
As you can see, Long.MIN_VALUE and Long.MAX_VALUE (the first two rows) are correct and the other values basically fall in line.
What is this doing?
Assuming signed byte values you have:
Now if you add 0x80 to those values you get:
which is the correct order (with overflow).
Basically the above is doing that with 64 bit signed longs instead of 8 bit signed bytes.
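The byte-sized version of the idea can be sketched as follows. The class `ByteOffsetDemo` and its two helpers are hypothetical names for illustration, and the sample values are my own choice, not the ones from the original answer.

```java
class ByteOffsetDemo {
    // Two hex digits of the raw two's-complement bit pattern.
    static String rawHex(byte b) {
        return String.format("%02x", b & 0xFF);
    }

    // Add 0x80 and keep only the low 8 bits (i.e. let it overflow).
    static String shiftedHex(byte b) {
        return String.format("%02x", (b + 0x80) & 0xFF);
    }

    public static void main(String[] args) {
        byte[] samples = { -128, -1, 0, 1, 127 }; // already in numeric order
        for (byte b : samples) {
            System.out.println(b + "\t" + rawHex(b) + "\t" + shiftedHex(b));
        }
    }
}
```

For these samples the raw patterns are 80, ff, 00, 01, 7f, which do not sort in numeric order as strings, while the shifted patterns are 00, 7f, 80, 81, ff, which do.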
The conversion back is a little more roundabout. You might think you can use:
but you can’t. Pass in 16 f’s to that function (-1) and it will throw an exception. It seems to be treating that as an unsigned hex value, which long cannot accommodate. So instead I split it in half and parse each piece, combining them together, left-shifting the first half by 32 bits.
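Putting the description together, a minimal sketch of such a converter might look like this. It is a reconstruction from the prose, not necessarily the exact code of the original answer; the class name `OffsetHexConverter` and the choice of 16 lowercase hex digits are assumptions.

```java
class OffsetHexConverter {
    // Shift the value range so that Long.MIN_VALUE maps to 0 and
    // Long.MAX_VALUE maps to all ones, then print 16 zero-padded hex digits.
    // The subtraction overflows on purpose.
    static String convertLong(long value) {
        return String.format("%016x", value - Long.MIN_VALUE);
    }

    // Parse the two 8-digit halves separately (each fits into a positive
    // long), recombine them, and undo the offset.
    static long parseLong(String value) {
        long high = Long.parseLong(value.substring(0, 8), 16);
        long low = Long.parseLong(value.substring(8), 16);
        return ((high << 32) | low) + Long.MIN_VALUE;
    }

    public static void main(String[] args) {
        long[] samples = { Long.MIN_VALUE, -1L, 0L, 1L, Long.MAX_VALUE };
        for (long v : samples) {
            String s = convertLong(v);
            System.out.println(v + " -> " + s + " -> " + parseLong(s));
        }
        try {
            // Parsing all 16 digits at once overflows the signed range.
            Long.parseLong("ffffffffffffffff", 16);
        } catch (NumberFormatException e) {
            System.out.println("parsing 16 f's in one go fails: " + e.getMessage());
        }
    }
}
```

Because every string has the same length and '0'-'9' sort before 'a'-'f', plain lexicographic sorting matches numeric order. On Java 8 and later, `Long.parseUnsignedLong(value, 16)` should be able to replace the split-and-shift step.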