I need to generate a string regular expression (or at least a format string) from an already formatted string. How is this done?
My use case: I am passed a document title such as “Collected stuff (part 3).doc” and need to locate all the related documents (e.g. part 1, part 2, and part 3). The complication is that a related document could be called “Very old collected stuff [part 2].txt” or even “Misc stuff, vol 4.doc”.
Is there an easy way to do this? Bonus for being in Python.
Thanks. Kent
Logically, how would you relate your original string to what you want to match? What portion of the original string exists in the others and what do they have in common? It’s going to be pretty difficult to come up with a program that is complex enough to cover all your bases, and that is reliable.
I think your best bet is to come up with a hierarchy of preferred matches.
For example, if you start with “Collected stuff (part 3).doc”, you would probably want to try to match something that contains everything outside the parentheses first, before you start trying to match things based on their volume number.
Would “Very old collected stuff [part 2].txt” not be a better match than “Misc stuff, vol 4.doc”?
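As a rough illustration of that hierarchy, here is a minimal Python sketch. It assumes the part marker always sits at the end of the name and uses the words "part", "vol" or "volume" followed by a number; the `MARKER` pattern and the helpers `split_title` and `rank_related` are names made up for this example, not anything standard. Candidates are ranked in three tiers: same base title, base title contained in the candidate, then everything else, with `difflib` similarity breaking ties inside a tier.

```python
import os
import re
from difflib import SequenceMatcher

# Assumed marker shapes: "(part 3)", "[part 2]", ", vol 4", "volume 7" at the end of the name.
MARKER = re.compile(
    r"[\s,]*[\(\[]?\s*\b(?:part|vol(?:ume)?)\.?\s*(\d+)\s*[\)\]]?\s*$",
    re.IGNORECASE,
)

def split_title(filename):
    """Return (base title, part number or None), ignoring the file extension."""
    stem, _ext = os.path.splitext(filename)
    m = MARKER.search(stem)
    if m:
        return stem[:m.start()].strip(), int(m.group(1))
    return stem.strip(), None

def rank_related(title, candidates):
    """Order candidates from most to least likely to belong with `title`."""
    base = split_title(title)[0].lower()
    ranked = []
    for name in candidates:
        cand_base = split_title(name)[0].lower()
        if cand_base == base:
            tier = 0          # same title, different part number
        elif base in cand_base:
            tier = 1          # title appears inside a longer one
        else:
            tier = 2          # only loosely similar, if at all
        score = SequenceMatcher(None, base, cand_base).ratio()
        ranked.append((tier, -score, name))
    ranked.sort()
    return [name for _tier, _score, name in ranked]

files = [
    "Misc stuff, vol 4.doc",
    "Very old collected stuff [part 2].txt",
    "Collected stuff (part 1).doc",
]
for name in rank_related("Collected stuff (part 3).doc", files):
    print(name)
```

With these inputs, "Collected stuff (part 1).doc" comes first, "Very old collected stuff [part 2].txt" second, and "Misc stuff, vol 4.doc" last. You would still need to decide where to cut the list off, for example by dropping the third tier or anything below a similarity threshold.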