Aligning data on boundaries can help performance. The Intel® compiler attempts to align data on boundaries for you. However, as in all areas of optimization, coding practices can either help or hinder the compiler, and poor practices can lead to performance problems. Always attempt to optimize using compiler options first. See Optimization Options Summary for more information.
To avoid performance problems, keep the following guidelines in mind. They are grouped by architecture:
IA-32, Intel® EM64T, Intel® Itanium® architectures:
Do not access or create data at large intervals that are separated by exactly 2^n (for example, 1 KB, 2 KB, 4 KB, 16 KB, 32 KB, 64 KB, 128 KB, 512 KB, 1 MB, 2 MB, 4 MB, 8 MB, etc.).
Align data so that memory accesses do not cross cache lines (for example, 32 bytes, 64 bytes, 128 bytes).
Use the Application Binary Interface (ABI) for the Itanium® compiler to ensure that ITP pointers are 16-byte aligned.
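The first two guidelines above can be combined when laying out rows of a two-dimensional array. The following C sketch is only an illustration, not a method prescribed by the compiler: the 64-byte cache-line size, the 1 KB threshold for a "large" interval, and the helper names padded_row_bytes and alloc_rows are all assumptions made for this example. Each row starts on a cache-line boundary, and a row stride that would be exactly a large power of two is padded by one extra cache line.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed cache-line size; 64 bytes is common but not universal. */
#define CACHE_LINE 64
/* Assumed threshold above which a power-of-two stride counts as "large". */
#define LARGE_STRIDE 1024

/* Hypothetical helper: round a row length up to a whole number of cache
   lines, then add one more line if the result is a large power of two,
   so that consecutive rows are not exactly 2^n bytes apart. */
static size_t padded_row_bytes(size_t row_bytes)
{
    size_t stride = (row_bytes + CACHE_LINE - 1) / CACHE_LINE * CACHE_LINE;
    int is_pow2 = stride != 0 && (stride & (stride - 1)) == 0;
    if (is_pow2 && stride >= LARGE_STRIDE)
        stride += CACHE_LINE;
    return stride;
}

/* Hypothetical helper: allocate `rows` rows, each starting on a
   cache-line boundary. Returns NULL on failure; release with free(). */
static void *alloc_rows(size_t rows, size_t row_bytes, size_t *stride_out)
{
    size_t stride = padded_row_bytes(row_bytes);
    /* C11 aligned_alloc expects a size that is a multiple of the
       alignment; every stride is a multiple of CACHE_LINE. */
    assert(stride % CACHE_LINE == 0);
    void *p = aligned_alloc(CACHE_LINE, rows * stride);
    if (p != NULL)
        *stride_out = stride;
    return p;
}
```

Row i then begins at (char *)p + i * stride. The padding costs one cache line per row, which is usually small next to a row of 1 KB or more.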
IA-32 and Intel® EM64T architectures:
Align data to match the size of the SIMD or Streaming SIMD Extensions (SSE) registers.
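As a minimal sketch of this guideline, the C example below aligns a float array to 16 bytes, the width of a 128-bit SSE register. It uses the standard C11 alignas specifier rather than any compiler-specific extension; the names SIMD_ALIGN, samples, is_aligned, and scale are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: the target is a 128-bit SSE register, hence 16 bytes. */
#define SIMD_ALIGN 16

/* Static buffer aligned so that each group of four floats occupies
   exactly one 16-byte block. */
static alignas(SIMD_ALIGN) float samples[1024];

/* Hypothetical helper: nonzero if p is aligned to `align` bytes.
   `align` must be a power of two. */
static int is_aligned(const void *p, size_t align)
{
    return ((uintptr_t)p & (align - 1)) == 0;
}

/* Scale n floats in place. The loop is plain C; the alignment check
   documents the precondition a vectorizing compiler can benefit from. */
static void scale(float *data, size_t n, float k)
{
    assert(is_aligned(data, SIMD_ALIGN));
    for (size_t i = 0; i < n; ++i)
        data[i] *= k;
}
```

A call such as scale(samples, 1024, 0.5f) satisfies the precondition because samples is declared with alignas. Heap buffers would need an aligned allocator to give the same guarantee.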
Itanium® architecture:
Avoid using packed structures.
Avoid casting pointers of small data elements to pointers of large data elements.
Do computations on unpacked data, then repack the data if necessary to output it correctly.
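The three guidelines above can be illustrated together with a short C sketch. It assumes a hypothetical 5-byte packed record (a 1-byte tag followed by a 4-byte count in host byte order); the layout and the names record, unpack, repack, and increment_count are invented for this example. Instead of casting the byte pointer to a wider pointer type, the code copies the bytes into a naturally aligned structure with memcpy, computes on that structure, and copies the result back.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical packed layout: 1-byte tag + 4-byte count, no padding. */
#define PACKED_SIZE 5

/* Unpacked form: the compiler is free to pad and align the members. */
struct record {
    uint8_t  tag;
    uint32_t count;
};

/* Copy the packed bytes into an aligned structure. memcpy avoids the
   misaligned load that *(uint32_t *)(buf + 1) could cause. */
static struct record unpack(const unsigned char *buf)
{
    struct record r;
    assert(buf != NULL);
    r.tag = buf[0];
    memcpy(&r.count, buf + 1, sizeof r.count);
    return r;
}

/* Write the structure back into the packed byte layout. */
static void repack(const struct record *r, unsigned char *buf)
{
    buf[0] = r->tag;
    memcpy(buf + 1, &r->count, sizeof r->count);
}

/* Unpack, compute on aligned data, then repack for output. */
static void increment_count(unsigned char *buf)
{
    struct record r = unpack(buf);
    r.count += 1;
    repack(&r, buf);
}
```

All arithmetic happens on the aligned struct record; only unpack and repack touch the packed bytes.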
In general, keeping data in cache improves performance more than keeping the data aligned. Try to use techniques that align the data without greatly expanding its size.
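One such technique, shown here only as a sketch, is to order structure members from largest to smallest so that every member stays naturally aligned with less padding. The structure and function names below are hypothetical. On a typical 64-bit ABI the first layout occupies 24 bytes and the second 16, but the exact sizes depend on the compiler and target.

```c
#include <assert.h>
#include <stddef.h>

/* Members in declaration order that forces padding around `value`. */
struct loose {
    char   flag;
    double value;
    char   kind;
};

/* Same members, largest first: both chars share the tail padding. */
struct tight {
    double value;
    char   flag;
    char   kind;
};

/* Hypothetical helper: bytes saved per element by the reordering. */
static size_t bytes_saved_per_element(void)
{
    assert(sizeof(struct tight) <= sizeof(struct loose));
    return sizeof(struct loose) - sizeof(struct tight);
}
```

Both layouts keep value aligned. The second simply fits more elements into each cache line when the structures are stored in an array.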
See Setting Data Type and Alignment for more detailed information on aligning data.