I have some labels and attributes from text.
I am looking for patterns (combinations of key-value pairs that occur across many documents) of labels and attributes amongst these documents.
What kind of algorithm and tool should I be looking into? I want to score these patterns by relevance and importance, not just by string matching.
Any input would be great.
Thanks
If I understand your question correctly, you are talking about association rule mining. Example: attr1==value1 ==> label=label1 (95% precision)
There are several algorithms for this; one of them is Apriori.
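To make the idea concrete, here is a minimal sketch of Apriori in plain Python. It assumes each document is represented as a set of "key=value" strings; the sample documents, the `min_support` threshold and the function names are made up for illustration. It is a toy version of the technique, not an optimized or library implementation.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset (frozenset): support} for every frequent itemset."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    # Level 1 candidates: every single item seen in any document.
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:
        # Count in how many documents each candidate itemset occurs.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: cnt / n for c, cnt in counts.items() if cnt / n >= min_support}
        frequent.update(level)
        k += 1
        # Join step: merge frequent itemsets into candidates of size k.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent.
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent

def confidence(frequent, antecedent, consequent):
    """Confidence of the rule antecedent ==> consequent."""
    a = frozenset(antecedent)
    return frequent[a | frozenset(consequent)] / frequent[a]

# Hypothetical documents, each a set of key=value pairs.
docs = [
    {"attr1=value1", "attr2=x", "label=label1"},
    {"attr1=value1", "attr2=y", "label=label1"},
    {"attr1=value1", "attr2=x", "label=label1"},
    {"attr1=value2", "attr2=x", "label=label2"},
]

frequent = apriori(docs, min_support=0.5)
for itemset, support in sorted(frequent.items(), key=lambda kv: -kv[1]):
    print(sorted(itemset), support)
print(confidence(frequent, {"attr1=value1"}, {"label=label1"}))
```

Support tells you how often a pattern occurs, and confidence tells you how reliably the left side implies the right side, so together they give you a score that goes beyond plain string matching.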
The second interpretation of your question is feature selection, i.e. selecting the attributes that have the most impact on label prediction. There you can check information gain or chi^2 selection. All of this stuff you can find in Weka (www.cs.waikato.ac.nz/ml/weka).
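If you want to see what these two scores actually compute, here is a rough sketch in plain Python. It assumes categorical attributes and labels stored as parallel lists; the data and function names are invented for the example, and this is not how Weka implements it internally.

```python
from collections import Counter
from math import log2

def entropy(values):
    """Shannon entropy (in bits) of a list of categorical values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())

def info_gain(attr_values, labels):
    """Reduction in label entropy after splitting on the attribute."""
    n = len(labels)
    groups = {}
    for a, y in zip(attr_values, labels):
        groups.setdefault(a, []).append(y)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

def chi2(attr_values, labels):
    """Pearson chi-squared statistic of the attribute/label contingency table."""
    n = len(labels)
    observed = Counter(zip(attr_values, labels))
    attr_totals = Counter(attr_values)
    label_totals = Counter(labels)
    stat = 0.0
    for a in attr_totals:
        for y in label_totals:
            expected = attr_totals[a] * label_totals[y] / n
            stat += (observed[(a, y)] - expected) ** 2 / expected
    return stat

# Hypothetical data: attr1 determines the label, attr2 is unrelated to it.
labels = ["label1", "label1", "label2", "label2"]
attr1 = ["value1", "value1", "value2", "value2"]
attr2 = ["x", "y", "x", "y"]

for name, column in [("attr1", attr1), ("attr2", attr2)]:
    print(name, info_gain(column, labels), chi2(column, labels))
```

Attributes with a higher score are more informative about the label, so you can rank them and keep the top ones.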
If you don't want to use such algorithms or implement them, the simplest implementation will look like: