I’d like to data-mine my Outlook mailboxes at work to learn more about how people interact and what their areas of expertise are:
- Generate social graphs from To: and Cc: lists to show people as nodes on a network with lines between who they interact with
- Tag people with concepts (e.g. from pronouns and from recognised company-specific concepts and their synonyms within emails)
This would give an insight into who does what (including how their work changes over time) and perhaps support other forms of knowledge sharing and documentation.
My question, broken down into parts:
- Is there an alternative Outlook client that can data-mine emails?
- Or are there working examples that use some kind of API library?
- I would like this to run live at a set frequency, so that it updates as more emails are created
Also, I’m considering applying what I learn from the following book: “Natural Language Processing with Python”: http://shop.oreilly.com/product/9780596516499.do (this post is not a stealth advert for the book or for Outlook; I am not the author, nor do I work for the publisher). But any language will be fine. I can always bolt on ideas from this book later.
One concern is that this has Big Brother connotations. It also has benefits, though: the primary email system can continue to be used without being affected, and the overlay adds value as a kind of business social network for quickly finding people with particular knowledge.
Update
The parallel question on superuser.com has been voted closed (duplication is not necessary, and I expect the widest choice of answers to be programming-based), so please add your answers to this Stack Overflow question, not the one on superuser.com.
The question was also posted here for a non-programming angle (to see whether an existing application is available): https://superuser.com/questions/343981/outlook-alternative-or-working-demos-of-apis-into-outlook-to-datamine-emails-to-p
Update 2
Any language is fine. Python is in this question’s tags only because a book on Python is mentioned, not because I require Python-based answers. If I also wanted to apply what is in the book, I could always bolt Python code onto the answer.
You can link Outlook folders to a database:
— outlookcode.com/article.aspx?ID=25
— support.microsoft.com/kb/209946
…and once you have extracted the To: list, Cc: list and email body text as strings from the database, you could apply data mining ideas from the book “Natural Language Processing with Python”: http://shop.oreilly.com/product/9780596516499.do (this post is not a stealth advert for the book or for Outlook; I am not the author, nor do I work for the publisher).
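As a rough illustration of that last step, here is a minimal Python sketch that uses only the standard library. Everything in it is an assumption made for the example rather than anything Outlook or the linked database gives you directly: the row layout (sender, To:, Cc:, body), the ";" recipient separator, the sample addresses, the `concepts` dictionary and the function names are all hypothetical. It counts sender-to-recipient edges for the social graph and tags each sender with concepts found in their message bodies by plain keyword matching, which is a much cruder technique than anything in the book.

```python
import re
from collections import Counter, defaultdict

# Hypothetical rows as they might come out of the linked database:
# (sender, To: list, Cc: list, body). The ";" separator is an assumption.
rows = [
    ("alice@example.com", "bob@example.com; carol@example.com", "",
     "Please review the billing migration plan."),
    ("bob@example.com", "alice@example.com", "carol@example.com",
     "Billing tests pass; the firewall change is next."),
]

# Hypothetical company concepts, each mapped to a set of synonyms.
concepts = {
    "billing": {"billing", "invoicing"},
    "network": {"firewall", "vpn"},
}

def split_addresses(field):
    """Split a ';'-separated recipient string into lower-cased addresses."""
    return [a.strip().lower() for a in field.split(";") if a.strip()]

def build_graph(rows):
    """Count undirected sender-recipient edges over all messages."""
    edges = Counter()
    for sender, to, cc, _body in rows:
        sender = sender.lower()
        for recipient in set(split_addresses(to) + split_addresses(cc)):
            if recipient != sender:
                # Sort the pair so (a, b) and (b, a) count as the same edge.
                edges[tuple(sorted((sender, recipient)))] += 1
    return edges

def tag_senders(rows, concepts):
    """Count, per sender, how many of their messages mention each concept."""
    tags = defaultdict(Counter)
    for sender, _to, _cc, body in rows:
        words = set(re.findall(r"[a-z]+", body.lower()))
        for concept, synonyms in concepts.items():
            if words & synonyms:
                tags[sender.lower()][concept] += 1
    return tags

edges = build_graph(rows)
tags = tag_senders(rows, concepts)

for (a, b), weight in sorted(edges.items()):
    print(a, "<->", b, "weight", weight)
for person, counts in sorted(tags.items()):
    print(person, dict(counts))
```

The edge weights could then be handed to whatever graph-drawing tool you prefer. To see how someone's work changes over time, you could add a date column to the rows and keep one counter per period.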