Use these source-coding guidelines to improve run-time performance. The amount of improvement depends on how many times a statement is executed. For example, improving an arithmetic expression that is executed many times within a loop has more potential to improve performance than improving a similar expression that is executed once outside a loop.
Avoid using integer or logical data smaller than 32 bits. Accessing a 16-bit (or 8-bit) data type can make data access less efficient, especially on Itanium®-based systems.
To minimize data storage and memory cache misses with arrays, use 32-bit data rather than 64-bit data, unless you require the greater numeric range of 8-byte integers or the greater range and precision of double precision floating-point numbers.
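The storage difference can be checked directly. The following is a minimal sketch, assuming a compiler on which KIND=4 and KIND=8 denote 4-byte and 8-byte reals (as in the kind notation used in this section) and which supports the Fortran 2008 STORAGE_SIZE intrinsic. The program and array names are illustrative only.

```fortran
program storage_demo
  implicit none
  integer, parameter :: n = 1000
  real(kind=4) :: a4(n)   ! 32-bit elements (assumed kind numbering)
  real(kind=8) :: a8(n)   ! 64-bit elements (assumed kind numbering)

  a4 = 0.0
  a8 = 0.0d0

  ! STORAGE_SIZE returns the size of one array element in bits.
  print *, 'bytes used by a4:', n * storage_size(a4) / 8
  print *, 'bytes used by a8:', n * storage_size(a8) / 8
end program storage_demo
```

Under the stated assumption, the 64-bit array occupies twice the memory of the 32-bit array, so half as many of its elements fit in a cache line.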
Avoid mixing integer and floating-point (REAL) data in the same computation. Expressing all numbers in a floating-point arithmetic expression (assignment statement) as floating-point values eliminates the need to convert data between fixed and floating-point formats. Expressing all numbers in an integer arithmetic expression as integer values also achieves this. This improves run-time performance.
For example, assuming that I and J are both INTEGER variables, expressing a constant number (2.) as an integer value (2) eliminates the need to convert the data. The following examples demonstrate inefficient and efficient code.
Example: Inefficient Code

    INTEGER I, J
    I = J / 2.

Example: Efficient Code

    INTEGER I, J
    I = J / 2
You can use different sizes of the same general data type in an expression with minimal or no effect on run-time performance. For example, using REAL, DOUBLE PRECISION, and COMPLEX floating-point numbers in the same floating-point arithmetic expression has minimal or no effect on run-time performance. However, this practice of mixing different sizes of the same general data type in an expression can lead to unexpected results due to operations being performed in a lower precision than desired.
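One common way the precision problem appears is a single-precision constant in a double-precision assignment. The following sketch assumes default compiler settings, that is, no option that promotes real constants to double precision. The program and variable names are illustrative.

```fortran
program mixed_precision_demo
  implicit none
  double precision :: x, y

  x = 0.1      ! single-precision constant: rounded to REAL, then widened
  y = 0.1d0    ! double-precision constant: rounded once, to DOUBLE PRECISION

  print *, 'x          = ', x
  print *, 'y          = ', y
  print *, 'difference = ', abs(x - y)
end program mixed_precision_demo
```

With IEEE single and double precision, the two values differ by roughly 1.5E-9, because 0.1 was rounded to single precision before it was stored in x.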
In cases where more than one data type can be used for a variable, consider selecting the data types based on the following hierarchy, listed from most to least efficient:
Integer (see above example)
Single-precision real, expressed explicitly as REAL, REAL (KIND=4), or REAL*4
Double-precision real, expressed explicitly as DOUBLE PRECISION, REAL (KIND=8), or REAL*8
Extended-precision real, expressed explicitly as REAL (KIND=16) or REAL*16
However, keep in mind that in an arithmetic expression, you should avoid mixing integer and floating-point (REAL) data (see example in the previous subsection).
Before you modify source code to avoid slow arithmetic operators, be aware that optimizations convert many slow arithmetic operators to faster arithmetic operators. For example, the compiler optimizes the expression H=J**2 to be H=J*J.
Consider also whether replacing a slow arithmetic operator with a faster arithmetic operator will change the accuracy of the results or impact the maintainability (readability) of the source code.
Replacing slow arithmetic operators with faster ones should be reserved for critical code areas. The following list shows the Intel Fortran arithmetic operators, from fastest to slowest:
Addition (+), Subtraction (-), and Floating-point multiplication (*)
Integer multiplication (*)
Division (/)
Exponentiation (**)
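As an illustration of the accuracy trade-off mentioned above, the following sketch replaces a division inside a loop with a multiplication by a reciprocal computed once. It is one possible hand optimization, not a recommended form, and all names are illustrative.

```fortran
program reciprocal_demo
  implicit none
  integer, parameter :: n = 5
  real :: x(n), by_div(n), by_mul(n), d, rinv
  integer :: i

  x = [1.0, 2.0, 3.0, 4.0, 5.0]
  d = 3.0

  ! Straightforward form: one division per element.
  do i = 1, n
     by_div(i) = x(i) / d
  end do

  ! Hand-optimized form: one division, then one multiplication per element.
  rinv = 1.0 / d
  do i = 1, n
     by_mul(i) = x(i) * rinv
  end do

  ! The two forms are not guaranteed to round identically.
  print *, 'largest absolute difference:', maxval(abs(by_div - by_mul))
end program reciprocal_demo
```

Because 1.0 / d is itself rounded, the multiplied results can differ from the divided results in the last bit. Whether that difference matters depends on the application, which is why such rewrites are best limited to critical code.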
Avoid using EQUIVALENCE statements. EQUIVALENCE statements can:
Force unaligned data or cause data to span natural boundaries.
Prevent certain optimizations, including:
Global data analysis under certain conditions; see the -O2 (Linux*) or /O2 (Windows*) option in Setting Optimizations.
Implied-DO loop collapsing when the control variable is contained in an EQUIVALENCE statement
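The guideline above does not name a replacement for EQUIVALENCE. When the statement is used only to reinterpret the bits of one variable as another type, one commonly used alternative is the standard TRANSFER intrinsic. The following is a minimal sketch under that assumption. It also assumes that REAL(KIND=4) and INTEGER(KIND=4) are both 32 bits wide, and the names are illustrative.

```fortran
program transfer_demo
  implicit none
  real(kind=4)    :: r
  integer(kind=4) :: bits

  r = 1.0

  ! Instead of:  EQUIVALENCE (r, bits)
  ! TRANSFER copies the bit pattern of r into a value with the type of bits,
  ! so r and bits remain separate, naturally aligned variables.
  bits = transfer(r, bits)

  print '(a, z8.8)', 'bit pattern of 1.0: ', bits
end program transfer_demo
```

On systems that use IEEE 754 single precision, the program prints the bit pattern 3F800000.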
Whenever the Intel compiler has access to the use and definition of a subprogram during compilation, it may choose to inline the subprogram. Using statement functions and internal subprograms maximizes the number of subprogram references that will be inlined, especially when multiple source files are compiled together at optimization level -O3 (Linux) or /O3 (Windows).
For more information, see Efficient Compilation.
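The following sketch shows both forms in one program unit: a statement function and an internal function after CONTAINS. In each case the definition is in the same source as the call, which is the condition described above for inlining. Whether the compiler actually inlines either call depends on the optimization level and its heuristics. All names are illustrative.

```fortran
program inline_demo
  implicit none
  real    :: x, total, sq
  integer :: i

  ! Statement function: must appear before the first executable statement.
  ! x serves only as its dummy argument name.
  sq(x) = x * x

  total = 0.0
  do i = 1, 10
     total = total + sq(real(i)) + cube(real(i))
  end do
  print *, 'total =', total

contains

  ! Internal function: its definition is visible at every call site
  ! in the host program unit.
  pure real function cube(v)
    real, intent(in) :: v
    cube = v * v * v
  end function cube

end program inline_demo
```

Note that statement functions are marked obsolescent in recent Fortran standards, so internal subprograms are usually the more maintainable of the two forms.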
Minimize the arithmetic operations and other operations in a DO loop whenever possible. Moving unnecessary operations outside the loop improves performance (for example, computing a value that does not vary within the loop once, before the loop starts).
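A minimal sketch of moving a nonvarying computation out of a DO loop follows. The names are illustrative, and an optimizing compiler may already perform this transformation on its own, so the manual rewrite matters most when the compiler cannot prove that the value is invariant.

```fortran
program hoist_demo
  implicit none
  integer, parameter :: n = 1000
  real    :: b(n), a_before(n), a_after(n)
  real    :: scale, offset, factor
  integer :: i

  scale  = 2.5
  offset = 0.5
  call random_number(b)

  ! Before: scale * offset is recomputed on every iteration.
  do i = 1, n
     a_before(i) = b(i) * (scale * offset)
  end do

  ! After: the nonvarying product is computed once, outside the loop.
  factor = scale * offset
  do i = 1, n
     a_after(i) = b(i) * factor
  end do

  print *, 'largest absolute difference:', maxval(abs(a_before - a_after))
end program hoist_demo
```

The second loop performs one multiplication per iteration instead of two, while computing the same expression.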
For more information on loop optimizations, see Pipelining for Itanium®-based Applications and Loop Unrolling; for information on coding Intel® Fortran statements, see the Intel® Fortran Language Reference.