I’m reading (in binary format) a file of unsigned 8-bit integers, which I then

Asked: June 12, 2026

I’m reading (in binary format) a file of unsigned 8-bit integers, which I then need to convert to an array of floats. Normally I’d just do something like the following:

uint8_t *s1_tmp = (uint8_t *)malloc(sizeof(uint8_t)*num_elements);
float *s1 = (float *)malloc(sizeof(float)*num_elements);

fread(s1_tmp, sizeof(uint8_t), num_elements, file_id);

for(int i = 0; i < num_elements; i++){
    s1[i] = s1_tmp[i];
}

free(s1_tmp);

Uninspired to be sure, but it works. However, currently num_elements is around 2.7 million, so the process is super slow and IMO wasteful.

Is there a better way to read in the 8-bit integers as floats or convert the uint8_t array into a float array?

1 Answer

Editorial Team · Answer 1 · 2026-06-12T02:35:25+00:00

Firstly, this is going to be I/O-bound from reading the data in. Secondly, it’s going to be memory-bound. You’ll get much better cache performance if you interleave the conversion with the reading.

Pick some reasonable buffer size that’s large enough for good I/O performance but small enough to fit in your cache, maybe 8-32 KB or so. Read in that much data, convert, and repeat.

For example:

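#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define BUFSIZE 16384

uint8_t *buffer = malloc(BUFSIZE);
float *s1 = malloc(num_elements * sizeof(float));
/* Check buffer and s1 for NULL before using them. */

const size_t count = (size_t)num_elements;  /* assumes num_elements >= 0 */
size_t total_read = 0;
size_t n;  /* fread returns size_t, not int */
while (total_read < count && (n = fread(buffer, 1, BUFSIZE, file_id)) > 0)
{
    /* Clamp to the elements still wanted; standard C has no min(). */
    size_t remaining = count - total_read;
    if (n > remaining)
        n = remaining;
    for (size_t i = 0; i < n; i++)
        s1[total_read + i] = (float)buffer[i];
    total_read += n;
}
free(buffer);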

You might also see improved performance by using SIMD operations to convert multiple items at once. However, the total performance will still be bottlenecked by the I/O from fread, so how much improvement you might see from SIMD will be questionable.
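As a rough illustration of the SIMD idea, here is a minimal sketch that widens 8 bytes at a time with SSE2 intrinsics and falls back to a plain loop elsewhere. The helper name u8_to_float_simd is made up for this example, and the choice of SSE2 is an assumption about the target machine, not something from the question. Whether it beats what the compiler already auto-vectorizes is something to measure.

```c
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper: convert n bytes from src into n floats in dst. */
static void u8_to_float_simd(const uint8_t *src, float *dst, size_t n)
{
    size_t i = 0;
#if defined(__SSE2__)
    const __m128i zero = _mm_setzero_si128();
    for (; i + 8 <= n; i += 8) {
        /* Load 8 bytes into the low half of a 128-bit register. */
        __m128i bytes = _mm_loadl_epi64((const __m128i *)(src + i));
        /* Zero-extend 8-bit -> 16-bit, then 16-bit -> 32-bit. */
        __m128i u16  = _mm_unpacklo_epi8(bytes, zero);
        __m128i lo32 = _mm_unpacklo_epi16(u16, zero);
        __m128i hi32 = _mm_unpackhi_epi16(u16, zero);
        /* Convert the 32-bit integers to float and store 4 + 4 results. */
        _mm_storeu_ps(dst + i,     _mm_cvtepi32_ps(lo32));
        _mm_storeu_ps(dst + i + 4, _mm_cvtepi32_ps(hi32));
    }
#endif
    /* Scalar tail, and the whole job when SSE2 is not available. */
    for (; i < n; i++)
        dst[i] = (float)src[i];
}
```

In the chunked loop, this would stand in for the inner for loop, e.g. u8_to_float_simd(buffer, s1 + total_read, n).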

Since you’re converting a large number of uint8_t values, it’s also possible you could get some improved performance by using a lookup table instead of doing the integer to floating point conversion. You’d only need a lookup table of 256 float values (1 KB), which easily fits in cache. I don’t know if that would be faster or not, so you should definitely profile the code to figure out what the best option is.
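For reference, the lookup-table variant could look something like the sketch below. The names u8_lut, u8_lut_init and u8_to_float_lut are invented for this example. The table has to be filled once before the first conversion.

```c
#include <stddef.h>
#include <stdint.h>

/* 256 floats = 1 KB, one entry per possible uint8_t value. */
static float u8_lut[256];

/* Fill the table once, before any conversion. */
static void u8_lut_init(void)
{
    for (int v = 0; v < 256; v++)
        u8_lut[v] = (float)v;
}

/* Hypothetical helper: convert n bytes by table lookup instead of a cast. */
static void u8_to_float_lut(const uint8_t *src, float *dst, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = u8_lut[src[i]];
}
```

Timing this against the plain cast and a SIMD version on the real file is the only way to know which one wins on a given machine.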
