In the paper Loop Recognition in C++/Java/Go/Scala (pdf) we find the following quote in the section C++ Tunings:
Structure Peeling. The structure
UnionFindNode has 3 cold fields:
type_, loop_, and header_. Since nodes are allocated in an
array, this is a good candidate for peeling optimization. The three
fields can be peeled out into a separate array. Note the header_
field is also dead – but removing it has very little performance
impact. The name_ field in the BasicBlock structure is also dead,
but it fits well in the padding space so it is not removed.
Can someone explain to me what cold/dead fields are, and what a peeling optimization is? (I understand what the author did there, but what is the rationale behind it?)
Structure peeling is an optimization where you split a structure into several smaller ones to improve data locality and thereby reduce cache misses. You separate “hot” data (frequently accessed) from “cold” data (seldom accessed) into two structures, so that each cache line fetched while working on the hot fields holds more useful data and cache hits become more likely.

In the article, the authors decided to move the type_, loop_ and header_ fields away from the more frequently accessed fields.

For more information, you can have a look at this scientific article about structure layout optimization, which contains a description of structure peeling among other techniques: Structure Layout Optimizations in the Open64 Compiler: Design, Implementation and Measurements
If you have access to the ACM digital library, you can also download Practical structure layout optimization and Advice.
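
To make the hot/cold split concrete, here is a minimal C++ sketch of what peeling could look like. It is not the paper's actual code: only the names type_, loop_ and header_ come from the quote, while the hot fields (parent, dfs_number), the struct names and the find_root helper are hypothetical and chosen purely for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Unpeeled layout (hypothetical): hot and cold fields share one struct,
// so every node pulled into the cache drags its cold fields along.
struct NodeUnpeeled {
    std::uint32_t parent;      // hot: assumed to be read on every lookup
    std::uint32_t dfs_number;  // hot (assumed)
    std::int32_t  type_;       // cold, per the quote
    void*         loop_;       // cold, per the quote
    void*         header_;     // cold, and per the quote also dead
};

// Peeled layout: the hot fields stay together in a small struct...
struct NodeHot {
    std::uint32_t parent;
    std::uint32_t dfs_number;
};

// ...and the cold fields move to a parallel array with the same index.
struct NodeCold {
    std::int32_t type_;
    void*        loop_;
    void*        header_;
};

struct PeeledNodes {
    std::vector<NodeHot>  hot;   // densely packed, traversed often
    std::vector<NodeCold> cold;  // touched rarely, stays out of the way
    explicit PeeledNodes(std::size_t n) : hot(n), cold(n) {}
};

// A hot loop in the style of union-find: it follows parent links until it
// reaches a node that is its own parent, reading only the hot array.
std::uint32_t find_root(const PeeledNodes& nodes, std::uint32_t i) {
    while (nodes.hot[i].parent != i) {
        i = nodes.hot[i].parent;
    }
    return i;
}

// The hot struct is smaller, so more nodes fit into each cache line.
static_assert(sizeof(NodeHot) < sizeof(NodeUnpeeled),
              "peeling should shrink the hot part of the node");
```

With this layout, code that walks parent links never loads type_, loop_ or header_ into the cache; code that does need them looks them up in nodes.cold at the same index. Whether this pays off in practice depends on how rarely the cold fields really are accessed, which is best confirmed by profiling.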