Will I get fewer cache misses if I use thread-local storage (TLS) in my multithreaded program?
Edit:
Since each thread is given its own memory pool, is it more likely that the most recently accessed memory is still in the CPU cache?
Fewer cache misses than what?
TLS is just one of many ways to ensure that different threads operate on different data (the obvious one is to just put each thread’s data on its own stack).
You’ll get better cache behavior if your threads don’t write to the same data (since each such write invalidates the corresponding cache line in every other core’s cache), but which method you use to ensure that the threads operate on different data is irrelevant in this respect.
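To make that concrete, here is a minimal C++ sketch (not from the original question) that sums an array in parallel in two ways: once with a per-thread accumulator on the stack, once with a `thread_local` accumulator. All of the names (`parallel_sum`, `chunk_sum_stack`, `chunk_sum_tls`, `tls_sum`) are made up for illustration. In both variants each thread writes only to its own data inside the hot loop and touches shared memory exactly once at the end, which is the property that matters for the cache.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <thread>
#include <vector>

// Hypothetical per-thread accumulator kept in thread-local storage.
// Every thread that touches it gets its own zero-initialized copy.
thread_local std::uint64_t tls_sum = 0;

// Variant 1: the accumulator lives on the calling thread's stack.
std::uint64_t chunk_sum_stack(const std::uint32_t* first,
                              const std::uint32_t* last) {
    std::uint64_t local = 0;
    for (; first != last; ++first) local += *first;
    return local;
}

// Variant 2: the accumulator lives in thread-local storage.
std::uint64_t chunk_sum_tls(const std::uint32_t* first,
                            const std::uint32_t* last) {
    tls_sum = 0;
    for (; first != last; ++first) tls_sum += *first;
    return tls_sum;
}

// Splits `data` into one contiguous chunk per thread and sums each chunk
// with `chunk_sum`. Each worker writes to shared memory only once.
template <typename ChunkSum>
std::uint64_t parallel_sum(const std::vector<std::uint32_t>& data,
                           std::size_t num_threads, ChunkSum chunk_sum) {
    assert(num_threads > 0);
    std::vector<std::uint64_t> partial(num_threads, 0);
    std::vector<std::thread> workers;
    const std::size_t chunk = (data.size() + num_threads - 1) / num_threads;

    for (std::size_t t = 0; t < num_threads; ++t) {
        workers.emplace_back([&data, &partial, chunk, chunk_sum, t] {
            const std::size_t begin = std::min(t * chunk, data.size());
            const std::size_t end = std::min(begin + chunk, data.size());
            // Single write to the shared vector, after the loop is done.
            partial[t] = chunk_sum(data.data() + begin, data.data() + end);
        });
    }
    for (auto& w : workers) w.join();

    return std::accumulate(partial.begin(), partial.end(), std::uint64_t{0});
}
```

Calling `parallel_sum(v, 4, chunk_sum_stack)` and `parallel_sum(v, 4, chunk_sum_tls)` on the same vector gives the same result. Neither variant has threads fighting over a cache line in the loop. The version to avoid is the one where every thread does `partial[t] += x` (or updates one shared counter) on each iteration.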
(There is other overhead associated with TLS, though. It’s not magic, and it’s not a silver bullet. Most of the time, it’s the wrong solution.)