I am going to be starting work soon on a new project at work. Essentially, there are many chemical compounds, and each has its own prefix / identifier: for example, a couple of chars followed by a few ints and that sort of thing, though they all vary.
I was wondering if there is an algorithm for matching these identifiers efficiently, as opposed to having a massive if/else chain.
I guess a hash map with key -> value, with the key being some mask, may be good, but I was hoping someone could suggest something a little more sophisticated that I could use.
Because it's not just for chemical compounds, the number of different values it could be is huge.
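To make the mask idea concrete, here is a minimal sketch in Python. It assumes a mask is built by collapsing every letter to "A" and every digit to "9"; the table entries and the names shape_mask, HANDLERS and classify are hypothetical, not taken from any real system.

```python
def shape_mask(identifier: str) -> str:
    """Collapse an identifier to its shape: letters -> 'A', digits -> '9'.

    Any other character (dash, dot, ...) is kept as-is.
    """
    out = []
    for ch in identifier:
        if ch.isalpha():
            out.append("A")
        elif ch.isdigit():
            out.append("9")
        else:
            out.append(ch)
    return "".join(out)


# Hypothetical mask -> category table; real entries would come from the
# actual identifier schemes in use.
HANDLERS = {
    "AA999": "two-letter prefix, three-digit code",
    "AAA-9999": "three-letter prefix, dash, four-digit code",
}


def classify(identifier: str) -> str:
    """Look up the identifier's shape in the table with one dict access."""
    return HANDLERS.get(shape_mask(identifier), "unknown")


print(classify("AB123"))
print(classify("XYZ-0042"))
print(classify("A1"))
```

This replaces the if/else chain with one dictionary lookup per identifier, but it only distinguishes identifiers by shape, so two schemes with the same shape would need a second check.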
Thanks
Consider these facts:
1) Two molecules can have the same structural identifier, for example because of stereoisomerism, or when comparing two complex molecules (especially ones with many benzene rings).
2) Consider http://en.wikipedia.org/wiki/International_Chemical_Identifier. It defines an unambiguous representation of a molecule's structure, and you can extract the structural formula from it. For example:
is representing
3) You can check out MQL, the Molecular Query Language.
4) Implementing it on your own may take a lot of time. There are some context-free grammars, but they are very complex; try to find some free Molecule Query
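
To illustrate point 2, here is a rough sketch in Python of pulling the formula layer out of an InChI string. It assumes the usual layout of "InChI=", a version, then layers separated by "/", with the formula as the first layer when one is present. The function names are made up for this example, and the ethanol string is used only as sample input. For real work a cheminformatics toolkit is a safer choice than hand-rolled parsing.

```python
def inchi_layers(inchi: str):
    """Split an InChI string into (version, list of layers)."""
    prefix = "InChI="
    if not inchi.startswith(prefix):
        raise ValueError("not an InChI string: %r" % inchi)
    version, *layers = inchi[len(prefix):].split("/")
    return version, layers


def formula_of(inchi: str):
    """Return the formula layer, or None if the string has none.

    Assumption: prefixed layers (c, h, q, p, ...) start with a lowercase
    letter, while the formula layer does not.
    """
    _, layers = inchi_layers(inchi)
    if not layers or layers[0][:1].islower():
        return None
    return layers[0]


ETHANOL = "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3"

print(inchi_layers(ETHANOL))
print(formula_of(ETHANOL))
```

The layer starting with "c" holds the atom connections and the one starting with "h" holds the hydrogens, so the same split gives access to the structural part as well.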