I read some notes that say that if we keep the frequently accessed
structures aligned to the size of the CPU cache line, then we may get
improved performance.
It is also possible that this will make performance *worse*
(due to "bad" cache aliasing). It all depends on the design
of the cache and the actual access patterns.
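
For concreteness, here is a minimal sketch of what "aligned to the size
of a cache line" can look like in C11. The 64-byte line size, the
struct hot_counter type and the is_line_aligned() helper are all
assumptions made up for illustration; the real line size depends on
your CPU, and nothing here says the alignment will actually help.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* ASSUMPTION: 64-byte cache lines. Check your CPU's documentation. */
#define CACHE_LINE 64

/* Hypothetical frequently accessed structure. Aligning the first
   member raises the alignment of the whole struct to CACHE_LINE,
   and the compiler pads its size up to a multiple of that. */
struct hot_counter {
    alignas(CACHE_LINE) uint64_t hits;
    uint64_t misses;
};

/* Compile-time check that each object occupies exactly one line. */
static_assert(sizeof(struct hot_counter) == CACHE_LINE,
              "hot_counter should fill exactly one cache line");

/* Returns 1 if p starts on a cache-line boundary, 0 otherwise. */
static int is_line_aligned(const void *p)
{
    return ((uintptr_t)p % CACHE_LINE) == 0;
}
```

With this declaration every element of an array of struct hot_counter
starts on its own (assumed) line boundary, at the cost of 48 bytes of
padding per element. Whether that trade is a win is exactly the
question under discussion.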
Is anyone aware of how this alignment can help in increasing the
performance?
Cache design issues can occupy much of a college-course semester.
The basics are not that hard, but actual performance depends
critically on things well beyond the basics. So your best bet is
to get a good book on computer architecture.
To find out what effect alignment has on your particular problem
as run on your particular system, the only general-purpose answer
is to measure it. Note that apparently irrelevant system changes,
such as expanding main (non-cache) memory, can have odd effects
like making your code run *slower*, so these are very much "my
problem on my system as it exists at this very moment" measurements.
Generalizing is difficult.
See, e.g., <http://www.e-articles.info/e/a/title/What-is-CAS-latency/>,
and all of chapter 5 at <http://www.csee.umbc.edu/~plusquel/611/index.html>.
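
As a rough sketch of the "measure it" advice, the following C11
fragment times the same loop over records that either start on a line
boundary or are shifted by a given byte offset, so that each record
straddles two lines. Everything here is an assumption for
illustration: the 64-byte line size, the struct record layout, and the
walk() and time_walk() helpers. It also relies on aligned_alloc(),
which not every C library provides. Treat the numbers it produces as
valid only for your machine, your compiler flags and your data sizes.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* ASSUMPTION: 64-byte cache lines. */
#define CACHE_LINE 64

/* Hypothetical 64-byte record: one full line when aligned. */
struct record { uint64_t field[8]; };

/* Touch the first and last field of every record on each pass and
   return a checksum so the work cannot be discarded entirely. */
static uint64_t walk(struct record *r, size_t n, int passes)
{
    uint64_t sum = 0;
    for (int p = 0; p < passes; p++) {
        for (size_t i = 0; i < n; i++) {
            r[i].field[0] += 1;
            r[i].field[7] += 1;
            sum += r[i].field[0] + r[i].field[7];
        }
    }
    return sum;
}

/* Time walk() over n records that begin `offset` bytes past a line
   boundary. Returns CPU seconds as reported by clock(), or -1.0 if
   the allocation fails. The checksum is stored through `checksum`. */
static double time_walk(size_t offset, size_t n, int passes,
                        uint64_t *checksum)
{
    /* Keep uint64_t accesses naturally aligned and stay in the slack. */
    assert(offset % sizeof(uint64_t) == 0 && offset < CACHE_LINE);

    /* One spare record of slack; size stays a multiple of CACHE_LINE. */
    size_t bytes = (n + 1) * sizeof(struct record);
    unsigned char *buf = aligned_alloc(CACHE_LINE, bytes);
    if (buf == NULL)
        return -1.0;
    memset(buf, 0, bytes);

    struct record *r = (struct record *)(buf + offset);
    clock_t t0 = clock();
    *checksum = walk(r, n, passes);
    clock_t t1 = clock();

    free(buf);
    return (double)(t1 - t0) / CLOCKS_PER_SEC;
}
```

Calling time_walk(0, n, passes, &sum) and time_walk(32, n, passes, &sum)
with the same n and passes gives two timings to compare. Repeat each
several times and vary n so the working set falls inside and outside
each cache level; a single run tells you very little.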