C
cornelis van der bent
I have a limited range of positive integer values that occurs all over
my data to be compressed. After decompression these values may have
changed a little, by up to a fixed percentage either way. So we're
talking about lossy [de]compression here.
My idea is/was to compress a value by taking its log() and scaling
this up by a factor, the factor being just large enough that the
original value can be recovered from the integer compressed value,
give or take the allowed error percentage. I also subtract an offset
from the log() result so that my compressed range starts at 0; this
increases compression even further. The offset is log(<lowest-
value-in-range>).
By just including the above-mentioned 'factor' and 'offset' in the
data, I can decompress these values.
Here's my code (I've copy-pasted most of it from working source but
typed the test() routine here, so please forgive typos):
#include <math.h>
#include <stdio.h>

static double factor;
static double offset;

int scaleDown(int originalValue)
{
    return (int)round((log((double)originalValue) - offset) * factor);
}

int scaleUp(int scaledValue)
{
    return (int)round(exp(((double)scaledValue / factor) + offset));
}

void test(int minValue, int maxValue, int errorPercentage)
{
    double resolution;
    int n;

    /* Log-space width one compressed step is allowed to span. */
    resolution = log(maxValue * (1.0 + (errorPercentage / 100.0))) -
                 log(maxValue * (1.0 - (errorPercentage / 100.0)));
    factor = 1 / resolution;
    offset = log((double)minValue);

    for (n = minValue; n < maxValue; n++)
    {
        int down = scaleDown(n);
        int up = scaleUp(down);

        printf("%4d => %d => %d => %.2f\n", n, down, up,
               (100.0 * (up - n)) / (double)n);
    }
}
When running test(200, 2000, 20), I see that the error is asymmetrical:
for a given compressed value, the lowest input values mapping to it
have more than 20% error, while the highest input values mapping to it
have less than 20% error.
1060 => 12 => 1297 => 22.36
....
1589 => 12 => 1297 => -18.38
Does anyone have an idea what causes this and how to fix it?
Thanks for listening!