I have read a few links on the topic of file formats and encoding,

Asked: May 15, 2026 (2026-05-15T18:35:55+00:00)

I have read a few links on the topic of file formats and encoding, but how is it done?

If all data is binary, what splits data into different file formats? What exactly does encoding the data involve? How is it done?

1 Answer

Editorial Team · Answer 1 · 2026-05-15T18:35:56+00:00

The format of a file is usually identified by its file extension or its MIME type, and less often by “magic numbers” (a fixed byte signature at the start of the file).
An OS or application checks the file extension to decide what to do with the file: which app to open it in, or which part of the code to run for it.
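
As a rough illustration of the “magic numbers” approach, here is a minimal sketch that looks at the first bytes of a file. The function name sniff_format is made up for this example, and the signature list is deliberately tiny; real tools know far more signatures.

```c
#include <stddef.h>
#include <string.h>

/* Identify a few common formats from the first bytes of a file.
 * Returns a MIME-style name, or NULL when no signature matches.
 * sniff_format is a hypothetical helper, not part of any library. */
const char *sniff_format(const unsigned char *buf, size_t len)
{
    static const unsigned char png_sig[8] =
        {0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A};
    static const unsigned char jpeg_sig[3] = {0xFF, 0xD8, 0xFF};

    if (len >= sizeof png_sig && memcmp(buf, png_sig, sizeof png_sig) == 0)
        return "image/png";
    if (len >= sizeof jpeg_sig && memcmp(buf, jpeg_sig, sizeof jpeg_sig) == 0)
        return "image/jpeg";
    if (len >= 6 && (memcmp(buf, "GIF87a", 6) == 0 ||
                     memcmp(buf, "GIF89a", 6) == 0))
        return "image/gif";
    if (len >= 5 && memcmp(buf, "%PDF-", 5) == 0)
        return "application/pdf";
    return NULL; /* unknown: fall back to the extension or MIME type */
}
```

The length checks come first so that a very short file is never read past its end.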

MIME types are used where an extension (or filename) isn’t always applicable. For example, when downloading a file over HTTP, the URI might be something like ~.php?id=12973. The file type cannot be determined from this alone, but the HTTP response includes a “Content-Type” header that states the format, so the browser can handle it correctly. E.g. Content-Type: image/png tells the browser to pass the data to a PNG decoding function.

When the application knows what the file format is, it’ll pass the data to code which is written specifically for that format. If the program doesn’t have code to read a format, it will fail to read it.
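
The routing step can be sketched as a lookup table from MIME type to a decoding function. Everything named here (decoder_fn, find_decoder, handle_content, and the stub decoders) is hypothetical and only stands in for real format-specific code. The comparison is also case-sensitive for simplicity, which real HTTP clients should not assume.

```c
#include <stddef.h>
#include <string.h>

/* A decoder takes the raw bytes and returns 0 on success. */
typedef int (*decoder_fn)(const unsigned char *data, size_t len);

/* Hypothetical stand-ins for real decoders; they only check for input. */
static int decode_png(const unsigned char *data, size_t len)
{
    (void)data;
    return len > 0 ? 0 : -1;
}

static int decode_jpeg(const unsigned char *data, size_t len)
{
    (void)data;
    return len > 0 ? 0 : -1;
}

static const struct {
    const char *mime;
    decoder_fn  decode;
} decoders[] = {
    {"image/png",  decode_png},
    {"image/jpeg", decode_jpeg},
};

/* Look up the decoder for a Content-Type value such as "image/png",
 * optionally followed by ";parameters". Returns NULL if unsupported. */
decoder_fn find_decoder(const char *content_type)
{
    /* Compare only the media type, ignoring any parameter suffix. */
    size_t type_len = strcspn(content_type, "; \t");
    for (size_t i = 0; i < sizeof decoders / sizeof decoders[0]; i++) {
        if (strlen(decoders[i].mime) == type_len &&
            strncmp(decoders[i].mime, content_type, type_len) == 0)
            return decoders[i].decode;
    }
    return NULL;
}

/* Route data to the matching decoder; -1 means there is no code for
 * this format (or the decoder itself failed). */
int handle_content(const char *content_type,
                   const unsigned char *data, size_t len)
{
    decoder_fn decode = find_decoder(content_type);
    return decode ? decode(data, len) : -1;
}
```

A type with no entry in the table simply has no decoder, which is the “fail to read it” case.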

How a file is encoded is specific to its format. Most standard formats have a specification that describes their binary encoding, and any application reading that file type must implement code that matches the specification (although this is usually done with a library that already handles the reading for you).

To give an example of how binary encodings work, consider an image. The specification might say that bytes 10-13 hold the width of the image and bytes 14-17 hold the height. To read those values from the file, the code must explicitly read the correct amount of data at the locations given by the spec. For example: fseek(f, 10, SEEK_SET); fread(&width, 4, 1, f); // read 4 bytes at offset 10 into "width"

I think your confusion is “what separates pieces of data in binary files?” (in text files this can be done with new lines, spaces, comma-separated values (CSV), etc.). The answer is: usually the size of the data determines where it ends. A specification will state the binary type of each field (for example int32, meaning 32 bits, or 4 bytes).
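
A slightly fuller sketch of the width and height example is below. The offsets come from the made-up spec above. Little-endian byte order is my assumption, since a real specification would state it, and read_u32_le and read_dimensions are hypothetical names.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical layout: a 32-bit width at byte offset 10 and a 32-bit
 * height at offset 14. Little-endian order is an assumption here; a
 * real specification states the byte order explicitly. */
enum { WIDTH_OFFSET = 10, HEIGHT_OFFSET = 14 };

/* Assemble a 32-bit value from 4 bytes, least significant byte first. */
static uint32_t read_u32_le(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Read width and height from an open file. Returns 0 on success,
 * -1 if seeking fails or the file is too short. */
int read_dimensions(FILE *f, uint32_t *width, uint32_t *height)
{
    unsigned char field[8];

    if (fseek(f, WIDTH_OFFSET, SEEK_SET) != 0)
        return -1;
    /* The two fields are adjacent, so one 8-byte read covers both. */
    if (fread(field, 1, sizeof field, f) != sizeof field)
        return -1;

    *width  = read_u32_le(field);
    *height = read_u32_le(field + (HEIGHT_OFFSET - WIDTH_OFFSET));
    return 0;
}
```

Building the integer byte by byte, rather than calling fread straight into an int, keeps the result the same regardless of the byte order of the machine running the code.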

Other than that, there can be ambiguities in file formats, but this usually happens with text files, where the text inside can sometimes be read to determine the format. This isn’t always possible: often a text file simply has the extension “.txt”, so the application cannot know what character encoding the text uses. (This was, and still is, a problem for applications that do not use Unicode.)
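
One hint that some text files do carry is a byte-order mark (BOM) at the very start. The sketch below checks for one. The name detect_bom is made up, UTF-32 marks are ignored to keep it short, and many files have no BOM at all, so a NULL result is common and settles nothing.

```c
#include <stddef.h>
#include <string.h>

/* Guess a text encoding from a leading byte-order mark (BOM).
 * Returns NULL when there is no BOM; the encoding then has to come
 * from elsewhere (metadata, user choice, or heuristics).
 * Simplification: UTF-32 BOMs are not handled in this sketch. */
const char *detect_bom(const unsigned char *buf, size_t len)
{
    if (len >= 3 && memcmp(buf, "\xEF\xBB\xBF", 3) == 0)
        return "UTF-8";
    if (len >= 2 && memcmp(buf, "\xFF\xFE", 2) == 0)
        return "UTF-16LE";
    if (len >= 2 && memcmp(buf, "\xFE\xFF", 2) == 0)
        return "UTF-16BE";
    return NULL;
}
```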
