I often time code snippets with rdtsc, which reads the time stamp counter and gives me a rough cycle count. I know that processors (mine is an Intel Xeon) also have performance counters that measure branch misses and other events. How do I read those? Can it be done with code similar to rdtsc (http://en.wikipedia.org/wiki/Rdtsc)? I am aware of a tool called perfmon that does this, but I would like to do it myself in a simple programmatic way, partly to learn more. How can I get started?
Have a look at PAPI. It provides an API for doing this.
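If you want to see what such an API does without installing anything, here is a rough sketch that skips PAPI and talks to the Linux kernel directly through the perf_event_open system call. It assumes Linux on hardware that exposes a PMU, and the helper names (perf_open, branch_miss_counter_open, counter_start, counter_stop) are made up for illustration; they are not part of PAPI or any library. Treat it as a starting point rather than a finished tool.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>

/* glibc ships no wrapper for perf_event_open, so go through syscall(2). */
static int perf_open(struct perf_event_attr *attr)
{
    /* pid = 0, cpu = -1: count for the calling thread on whatever CPU it runs on.
       group_fd = -1, flags = 0: a standalone counter with default behaviour. */
    return (int)syscall(__NR_perf_event_open, attr, 0, -1, -1, 0UL);
}

/* Hypothetical helper: open a disabled user-space branch-miss counter.
   Returns a file descriptor, or -1 if the kernel refuses (for example
   a restrictive perf_event_paranoid setting, or no PMU inside a VM). */
int branch_miss_counter_open(void)
{
    struct perf_event_attr attr;
    memset(&attr, 0, sizeof attr);
    attr.type = PERF_TYPE_HARDWARE;
    attr.size = sizeof attr;
    attr.config = PERF_COUNT_HW_BRANCH_MISSES;
    attr.disabled = 1;        /* stay idle until counter_start() */
    attr.exclude_kernel = 1;  /* count user-space code only */
    attr.exclude_hv = 1;      /* ignore hypervisor activity */
    return perf_open(&attr);
}

/* Hypothetical helper: zero the counter and start counting. 0 on success. */
int counter_start(int fd)
{
    if (ioctl(fd, PERF_EVENT_IOC_RESET, 0) == -1)
        return -1;
    return ioctl(fd, PERF_EVENT_IOC_ENABLE, 0) == -1 ? -1 : 0;
}

/* Hypothetical helper: stop counting and fetch the value. 0 on success. */
int counter_stop(int fd, uint64_t *count)
{
    assert(count != NULL);
    if (ioctl(fd, PERF_EVENT_IOC_DISABLE, 0) == -1)
        return -1;
    /* With read_format left at 0, read() yields a single 64-bit count. */
    return read(fd, count, sizeof *count) == (ssize_t)sizeof *count ? 0 : -1;
}
```

To use it, call branch_miss_counter_open() once, wrap the code you want to measure between counter_start(fd) and counter_stop(fd, &n), and close(fd) when done. Unlike rdtsc this costs a few system calls per measurement, so it fits longer regions better than very short snippets.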