The Archive Base


Editorial Team
Asked: May 23, 2026

I’m currently facing a problem I find more than interesting: detecting the mime-type of a given file.
By detecting, I mean guessing the mime-type using only information present in the file. By file, I mean a structure that has a name and some content.

Here are the solutions I know to this problem:

  • Guessing the file type from the file name. For example, if the file name is foo.txt, I can assume that the mime-type is text/plain.
  • Determining the type from the content, especially the first bytes, which usually contain some sort of magic code. For example, if the file begins with the octets 0xCAFEBABE, I can assume the mime-type is application/x-java-class.
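The two approaches listed above can be sketched roughly as follows. This is a minimal illustration, not a complete detector: the tables `EXTENSION_TYPES` and `MAGIC_PREFIXES` and the function names are hypothetical, and apart from foo.txt and 0xCAFEBABE the entries are examples added here.

```python
import os

# Illustrative tables only; a usable detector would need far more entries.
EXTENSION_TYPES = {
    ".txt": "text/plain",
    ".html": "text/html",
    ".class": "application/x-java-class",
}

MAGIC_PREFIXES = [
    (b"\xca\xfe\xba\xbe", "application/x-java-class"),
    (b"%PDF-", "application/pdf"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
]


def guess_from_name(name):
    """Return a MIME type based on the file extension, or None."""
    _, ext = os.path.splitext(name)
    return EXTENSION_TYPES.get(ext.lower())


def guess_from_content(data):
    """Return a MIME type based on the leading bytes, or None."""
    for prefix, mime in MAGIC_PREFIXES:
        if data.startswith(prefix):
            return mime
    return None
```

Both functions return `None` when they have no opinion, so a caller can decide which guess to trust first.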

The two approaches to this problem come with their advantages and drawbacks.

The first solution is very efficient, but it assumes that the file has a correct name and an extension. How do you detect the mime-type of a file named LICENSE or README?

The second technique is a bit more complex and has to actually read the data. It works very well for files that contain a magic code, but poorly for other files. Some problems may arise: how do you tell the difference between an MS-DOS EXE file (starting with MZ as its magic code) and an actual text/plain file that starts with the letters MZ? A lot of similar problems arise when you consider other file types (txt vs csv; html vs xml vs xhtml).
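One possible way to approach the MZ ambiguity is to look beyond the magic code and ask whether the rest of the sample looks like text. The sketch below assumes that a sample with no NUL bytes that decodes as UTF-8 is text; that is a heuristic chosen for illustration, not a rule, and the function names and the `application/x-dosexec` label are assumptions.

```python
import codecs


def looks_like_text(sample):
    """Heuristic: no NUL bytes and valid UTF-8 suggests text."""
    if b"\x00" in sample:
        return False
    # An incremental decoder tolerates a multi-byte character that was
    # cut in half by truncating the sample.
    decoder = codecs.getincrementaldecoder("utf-8")()
    try:
        decoder.decode(sample, final=False)
    except UnicodeDecodeError:
        return False
    return True


def classify_mz(data):
    """Classify data starting with b'MZ'; return None for anything else."""
    sample = data[:4096]
    if not sample.startswith(b"MZ"):
        return None
    if looks_like_text(sample):
        return "text/plain"
    return "application/x-dosexec"  # assumed label for DOS/Windows executables
```

A stricter variant could also validate the structure that follows the magic code, at the cost of knowing each format in more detail.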

So here comes the real question:
How can the mime-type of a file be detected efficiently and reliably?


Some side notes:

  • I know lots and lots of libraries exist out there that do the job. I’m not interested in the libraries. I’m interested in getting my hands dirty.
  • No specific language. I’m interested in the general algorithm(s), not a specific implementation.
1 Answer

  1. Editorial Team
    Added an answer on May 23, 2026 at 10:43 am

    The answer to your question is probably just “regular expressions”, since you are asking for algorithms, not tools. Looking for patterns in a file’s content is surely the best way to decide what it is. If in doubt, you can also look at the file extension (if available), but you shouldn’t rely on it. On UNIX systems, for example, the OS doesn’t care about the file extension when deciding whether it can execute a file.

    The task itself is trivial from an algorithmic point of view: gather regular expressions that identify different file types. But that’s a lot of work: for every file type you’d like to recognize, you need to become familiar with its design before you can write an expression that recognizes it with a minimum of false positives and false negatives.
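As a rough illustration of gathering such expressions, the sketch below keeps an ordered list of byte-level regular expressions and returns the first match. The patterns, the `SIGNATURES` name and the 1024-byte window are assumptions made for this example; they are nowhere near precise enough for real use and only hint at the html vs xml vs xhtml distinction.

```python
import re

# Ordered from most specific to least specific; the first match wins.
SIGNATURES = [
    # XML declaration followed by an HTML doctype or an <html xmlns=...> root
    (re.compile(rb"\A\s*<\?xml\b[^>]*\?>\s*"
                rb"(?:<!DOCTYPE\s+html\b|<html\b[^>]*xmlns=)", re.I),
     "application/xhtml+xml"),
    # HTML doctype or bare <html> root element
    (re.compile(rb"\A\s*(?:<!DOCTYPE\s+html\b|<html\b)", re.I), "text/html"),
    # Any other document that starts with an XML declaration
    (re.compile(rb"\A\s*<\?xml\b"), "application/xml"),
]


def detect_by_regex(data, default=None):
    """Return the MIME type of the first matching signature, or default."""
    head = data[:1024]  # only inspect the beginning of the content
    for pattern, mime in SIGNATURES:
        if pattern.match(head):
            return mime
    return default
```

The ordering carries most of the logic here: because an XHTML document also matches the plain XML pattern, the more specific expression has to be tried first.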

    So why bother trying to solve a problem that other people have already invested heavily in? As you probably know, the most widespread solution is the UNIX tool file and its library libmagic, which can easily be used in your own programs. Bindings for the most common scripting languages exist. The file utility’s “magic” database is probably the most comprehensive out there: it knows about exotic file types you’ve never heard of (because they have been out of widespread use for years or decades) and has been tuned and fixed for a long time (a whopping 38 years now).
