I have an array of point data, the values of points are represented as x co-ordinate and y co-ordinate.
These points could be in the range of 500 upto 2000 points or more.
The data represents a motion path which could range from the simple to very complex and can also have cusps in it.
Can I represent this data as one spline or a collection of splines or some other format with very tight compression.
I have tried representing them as a collection of beziers but at best I am getting a saving of 40 %.
For instance if I have an array of 500 points , that gives me 500 x and 500 y values so I have 1000 data pieces.
I around 100 quadratic beziers from this. each bezier is represented as controlx, controly, anchorx, anchory.
which gives me 100 x 4 = 400 pcs of data.
So input = 1000pcs , output = 400pcs.
I would like to further tighen this, any suggestions?
By its nature, spline is an approximation. You can reduce the number of splines you use to reach a higher compression ratio.
You can also achieve lossless compression by using some kind of encoding scheme. I am just making this up as I am typing, using the range example in previous answer (1000 for x and 400 for y),
To decode the sequence properly, you would need some prefix to identify the type of encoding. Let’s say we use 110 for full point, 10 for displacement and 0 for short displacement.
The bit layout will look like this,
Unless your sequence is totally random, you can easily achieve high compression ratio with this scheme.
Let’s see how it works using a short example.
3 points: A(500, 400), B(550, 380), C(545, 381)
Let’s say you were using 2 byte for each coordinate. It will take 16 bytes to encode this without compression.
To encode the sequence using the compression scheme,
A is first point so full coordinate will be used. 3 bytes.
B’s displacement from A is (50, -20) and can be encoded as displacement. 2 bytes.
C’s displacement from B is (-5, 1) and it fits the range of short displacement 1 byte.
So you save 10 bytes out of 16 bytes. Real compression ratio is totally depending on the data pattern. It works best on points forming a moving path. If the points are random, only 25% saving can be achieved.