I’m making a small system for personal use that handles files. In this system I want to categorize files based on their names and automate as much as possible. This has led me to a problem with matching strings.
Say I have a category called “A category” and two files called:
a.category.file
lotsofgarbage.a-big-kateory.file
I need to match these file names to the category. I guess it would be more like a “how much are they alike” score, since there is no good way to do an exact match.
Can anyone give me a simple and good algorithm for this problem? Or point me in the direction of one?
Probably the best way to tackle this would be to calculate the edit distance from each filename to your category name; if the distance is under a certain threshold, treat it as a match.
Check out this link; apparently PHP can do that for you.
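To make the idea concrete, here is a minimal sketch in Python of edit distance (Levenshtein distance) with a threshold check. The helper names `normalize` and `matches`, the choice to strip punctuation and case before comparing, and the default threshold value are all assumptions for illustration, not something from the question.

```python
import re


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of single-character
    insertions, deletions, or substitutions to turn a into b."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


def normalize(name: str) -> str:
    """Hypothetical helper: lowercase and drop everything except
    letters and digits, so "a.category" and "A category" compare equal."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


def matches(filename: str, category: str, threshold: int = 4) -> bool:
    """Assumed rule: match when the distance is at or below the threshold."""
    return edit_distance(normalize(filename), normalize(category)) <= threshold


print(edit_distance("kateory", "category"))
print(matches("a.category.file", "A category"))
print(matches("lotsofgarbage.a-big-kateory.file", "A category"))
```

Note that comparing the whole filename against the category counts every extra character as an edit, so a long name with the category buried in it scores badly. The threshold will need tuning for your files, and you may get better results by comparing the category against each dot-separated part of the filename instead.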