Has anyone seen this data format before? I’ve been given a huge number of records to import from a flat file that contains numeric fields in some sort of packed binary format. I know from context that they represent numbers, and I have some existing translations/decodings, enough to tell me a bit about how to convert them. The lowest-order byte represents the least significant digit and might also encode a sign. Each line below gives the decoded digit, then the encoded byte in hex, then the corresponding bit pattern.
0, 0c, 0000 1100
1, 1c, 0001 1100
2, b1, 1011 0001
3, 14, 0001 0100
4, 3c, 0011 1100
5, 2a, 0010 1010
6, 25, 0010 0101
7, 40, 0100 0000
8, d0, 1101 0000
9, 91, 1001 0001
Bytes beyond this first one seem to pack two digits each. There seem to be 100 mappings, from 00 to 99; I will only show a few here. Each line gives the decoded pair of digits, then the hex value, then the bit pattern.
00, 00, 0000 0000
01, 01, 0000 0001
02, 02, 0000 0010
03, 03, 0000 0011
04, dc, 1101 1100
05, 09, 0000 1001
06, c3, 1100 0011
07, 7f, 0111 1111
08, ca, 1100 1010
09, b2, 1011 0010
10, 10, 0001 0000
11, 11, 0001 0001
12, 12, 0001 0010
13, 13, 0001 0011
14, db, 1101 1011
15, da, 1101 1010
16, 08, 0000 1000
17, c1, 1100 0001
18, 18, 0001 1000
19, 19, 0001 1001
20, c4, 1100 0100
21, b3, 1011 0011
22, c0, 1100 0000
23, d9, 1101 1001
24, bf, 1011 1111
If I encounter 000125, the result is 16 (byte 01 gives the digit pair 01, and byte 25 gives the final digit 6). 000000c90c converts to 350. If I find 000000000000000f it should convert to 0, but I don’t see how, and 0000ec is supposed to result in -8.
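The table lookup described above can be sketched in Python as follows. This is only a minimal sketch under stated assumptions: the names `LOW_BYTE`, `PAIR_BYTE`, and `decode_field` are made up for illustration, only the mappings listed in this post are included, and the `c9` entry is not from the table but inferred from the 350 example. Negative values and the `0f` case are not handled, since their encoding is unknown.

```python
# Low-order byte -> least significant digit (from the first table).
LOW_BYTE = {
    0x0C: 0, 0x1C: 1, 0xB1: 2, 0x14: 3, 0x3C: 4,
    0x2A: 5, 0x25: 6, 0x40: 7, 0xD0: 8, 0x91: 9,
}

# Higher-order byte -> two decimal digits (only the pairs listed above).
PAIR_BYTE = {
    0x00: 0, 0x01: 1, 0x02: 2, 0x03: 3, 0xDC: 4,
    0x09: 5, 0xC3: 6, 0x7F: 7, 0xCA: 8, 0xB2: 9,
    0x10: 10, 0x11: 11, 0x12: 12, 0x13: 13, 0xDB: 14,
    0xDA: 15, 0x08: 16, 0xC1: 17, 0x18: 18, 0x19: 19,
    0xC4: 20, 0xB3: 21, 0xC0: 22, 0xD9: 23, 0xBF: 24,
    # Inferred, not from the table: 000000c90c -> 350 implies c9 -> 35.
    0xC9: 35,
}


def decode_field(raw: bytes) -> int:
    """Decode a non-negative field; raises KeyError on any unmapped byte."""
    if not raw:
        raise ValueError("empty field")
    value = 0
    for b in raw[:-1]:  # most significant digit pair comes first
        value = value * 100 + PAIR_BYTE[b]
    # The last byte carries a single digit.
    return value * 10 + LOW_BYTE[raw[-1]]


print(decode_field(bytes.fromhex("000125")))      # 16
print(decode_field(bytes.fromhex("000000c90c")))  # 350
```

Raising `KeyError` on an unmapped byte (such as `ec` or `0f`) is a deliberate choice here, so that unknown encodings surface during the import instead of being silently converted to a wrong number.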
There are enough repeating patterns here to make me suspect that it is some sort of encoding. What I have now is enough to decode many positive numbers, but not all of them. I have no idea how to handle the negative values, and I am not sure whether information is being lost in my mapping (I’m thinking of IEEE floating-point formats).
Any ideas? Thanks!
Since it uses none of the traditional mainframe formats, nor any parity or error-correcting scheme (count the set bits), I can only assume that it is not something common in recent history. There might be some kind of XOR operation being applied to one of those old formats, but if so it doesn’t seem to follow a pattern I can detect.
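The set-bit count is easy to reproduce for the ten low-byte codes listed earlier. A minimal Python sketch (the names `LOW_BYTE_CODES` and `popcount` are illustrative only):

```python
# Encoded low-order bytes for digits 0..9, taken from the first table.
LOW_BYTE_CODES = [0x0C, 0x1C, 0xB1, 0x14, 0x3C, 0x2A, 0x25, 0x40, 0xD0, 0x91]


def popcount(b: int) -> int:
    """Number of set bits in a byte."""
    return bin(b).count("1")


counts = [popcount(b) for b in LOW_BYTE_CODES]
print(counts)                      # [2, 3, 4, 2, 4, 3, 3, 1, 3, 3]
print({c % 2 for c in counts})     # {0, 1}: both even and odd counts occur
```

Because both even and odd bit counts appear, neither an even-parity nor an odd-parity bit is being applied to these bytes.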
Given that nobody has seen this format or has any clue about how to write an algorithm to decode it, I’m just going to assume that it was meant to be a half-baked attempt to encrypt the numbers. If I can find the time I will write some code to analyze all 100 million values and see if I can find anything of use, but for now I’m just going to wait and see if the originators of the data can/will provide an answer. Or a clue.
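One possible starting point for that analysis, sketched in Python, is to count byte values separately for the low-order position and for all other positions. This assumes each field has already been cut out of the record as a `bytes` object; the record layout is not shown here, and `byte_histograms` is a hypothetical helper name. If the low-order position turns out to use roughly twice as many distinct values as there are digits, that might hint at a sign bit or a separate negative table.

```python
from collections import Counter
from typing import Iterable, Tuple


def byte_histograms(fields: Iterable[bytes]) -> Tuple[Counter, Counter]:
    """Count byte values for the low-order byte and for all other bytes."""
    low, high = Counter(), Counter()
    for raw in fields:
        if not raw:
            continue  # skip empty fields
        low[raw[-1]] += 1
        high.update(raw[:-1])
    return low, high


# The four sample fields quoted in this post.
samples = [
    bytes.fromhex(h)
    for h in ("000125", "000000c90c", "000000000000000f", "0000ec")
]
low, high = byte_histograms(samples)
print(sorted(hex(b) for b in low))   # distinct low-order bytes seen
print(sorted(hex(b) for b in high))  # distinct higher-order bytes seen
```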
I’m going to mark it answered, as I don’t want to torture people with an unsolvable puzzle. I’m sorry if anyone was frustrated; I was only hoping that it was something obscure that someone here might have seen before.