I have an application with two threads. thread1 receives multicast packets
from network card eth1, and I use sched_setaffinity to pin thread1 to CPU
core 1. thread2 then processes these packets (received by thread1 and stored
in global variables on the heap), and I pin thread2 to core 7. Suppose
core 1 and core 7 are hyper-threading siblings on the same physical core.
I think the performance would be good, since core 1 and core 7 can share
the L1 cache.
I have looked at /proc/interrupts, and I see that eth1 has interrupts on several CPU cores.
So in my case thread1 is pinned to core 1, but the interrupts happen
on many cores. Would that affect performance? Do the packets received from eth1
go directly to main memory no matter which core handles the interrupt?
I don’t know much about networking in the Linux kernel. Could anyone suggest
books or websites that would help me with this topic? Thanks for any comments.
Edit: according to “What every programmer should know about memory”, section 6.3.5 “Direct Cache Access”, I think “DCA” is what I would like to know about …
The interrupt will (quite likely) happen on a different core than the one running the thread that receives the packet. Depending on how the driver deals with packets, that may or may not matter. If the driver reads the packet (e.g. to make a copy), then it’s not ideal, as the cache gets filled on a different CPU. But if the packet is just loaded into memory somewhere using DMA, and left there for the software to pick up later, then it doesn’t matter [in fact, it’s better to have it happen on a different CPU, as “your” CPU gets more time to do other things].
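To see how the eth1 interrupts are actually spread across CPUs, you can total the per-CPU counters in /proc/interrupts. The following is only a rough sketch: the function name `irq_counts_per_cpu` is made up for illustration, and it assumes the usual layout of that file (a header row of CPU names, then one row per IRQ with the device name in the trailing description). Note that it matches by substring, so "eth1" would also match "eth10".

```python
def irq_counts_per_cpu(interrupts_text, device):
    """Sum the per-CPU interrupt counts of every IRQ row that mentions `device`.

    `interrupts_text` is the content of /proc/interrupts.
    Returns a dict mapping CPU number -> total interrupt count.
    """
    lines = interrupts_text.splitlines()
    if not lines:
        return {}
    # Header row looks like: "CPU0 CPU1 CPU2 ..."
    cpus = [int(name[3:]) for name in lines[0].split()]
    totals = {cpu: 0 for cpu in cpus}
    for line in lines[1:]:
        _label, _, rest = line.partition(":")
        fields = rest.split()
        # Everything after the per-CPU counters describes the IRQ source.
        description = fields[len(cpus):]
        if not any(device in word for word in description):
            continue
        for cpu, count in zip(cpus, fields[:len(cpus)]):
            totals[cpu] += int(count)
    return totals

# Example usage on a Linux machine:
#   with open("/proc/interrupts") as f:
#       print(irq_counts_per_cpu(f.read(), "eth1"))
```

Running this twice a few seconds apart and comparing the totals shows which cores are currently servicing the card.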
As to using hyperthreading, my experience (and that of many others) is that hyperthreading SOMETIMES gives a benefit, but often ends up being similar to not having hyperthreading, because the two threads share the execution units of the same core. You may want to compare the throughput with both threads pinned to the same core as well, to see if that makes it “better” or “worse”. Like most things, it’s often the details that make a difference, and your code may be slightly different from someone else’s, so it may work better in one case or the other.
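Before comparing placements, it is worth checking that the two CPU numbers really are hyperthread siblings. Below is a small sketch, assuming a Linux system that exposes the CPU topology under /sys/devices/system/cpu. The helper names (`parse_cpu_list`, `hyperthread_siblings`, `run_pinned`) are invented for this example, and Python is used only to show the calls; a real receiver would normally call sched_setaffinity from C.

```python
import os
import threading

def parse_cpu_list(text):
    """Parse a kernel CPU list such as "1,7" or "0-3,8" into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus

def hyperthread_siblings(cpu, root="/sys/devices/system/cpu"):
    """Return the logical CPUs that share a physical core with `cpu`."""
    path = f"{root}/cpu{cpu}/topology/thread_siblings_list"
    with open(path) as f:
        return parse_cpu_list(f.read())

def run_pinned(cpu, work):
    """Start a thread that pins itself to one CPU, then runs work()."""
    def body():
        # On Linux, pid 0 means "the calling thread".
        os.sched_setaffinity(0, {cpu})
        work()
    thread = threading.Thread(target=body)
    thread.start()
    return thread

# Example usage on Linux:
#   print(7 in hyperthread_siblings(1))   # are CPU 1 and CPU 7 siblings?
#   t = run_pinned(1, lambda: print(os.sched_getaffinity(0)))
#   t.join()
```

With something like this you can run the same workload with the threads on siblings, on separate physical cores, and on the same logical CPU, and measure which placement gives the best throughput.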
Edit: if your system has multiple sockets, you may also want to ensure that your threads run on a CPU in the socket “nearest” (in terms of the number of QPI/PCI bridge hops) to the network card.
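One way to find the “nearest” CPUs is to ask sysfs which NUMA node the card is attached to. This is a sketch under the assumption that eth1 is a PCI device on a Linux system with NUMA information exposed in /sys. The function names and the `sys_root` parameter are hypothetical and exist only so the code can be tried against a different directory.

```python
from pathlib import Path

def _parse_cpu_list(text):
    """Parse a kernel CPU list such as "8-15,24-31" into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        lo, _, hi = part.partition("-")
        cpus.update(range(int(lo), int(hi or lo) + 1))
    return cpus

def nic_numa_node(iface, sys_root="/sys"):
    """Return the NUMA node of a network interface (-1 if the kernel reports none)."""
    path = Path(sys_root, "class", "net", iface, "device", "numa_node")
    return int(path.read_text().strip())

def cpus_near_nic(iface, sys_root="/sys"):
    """Return the CPUs on the NIC's NUMA node, or None if there is no NUMA info."""
    node = nic_numa_node(iface, sys_root)
    if node < 0:
        return None
    path = Path(sys_root, "devices", "system", "node", f"node{node}", "cpulist")
    return _parse_cpu_list(path.read_text())

# Example usage on Linux:
#   print(cpus_near_nic("eth1"))
```

Picking the affinity cores for both threads from that set keeps them on the same socket as the card.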