Vectorization Support

The directives discussed in this topic support vectorization.

IVDEP Directive

The IVDEP directive instructs the compiler to ignore assumed vector dependences. To ensure correct code, the compiler treats an assumed dependence as a proven dependence, which prevents vectorization. This directive overrides that decision. Use IVDEP only when you know that the assumed loop dependences are safe to ignore.

For example, if the expression j >= 0 is always true in the code fragment below, the IVDEP directive can communicate this information to the compiler. The directive tells the compiler that the conservatively assumed loop-carried flow dependences for values j < 0 can be safely ignored:

Example

!DEC$ IVDEP
do i = 1, 100
  a(i) = a(i+j)
enddo

Note

Proven dependences that prevent vectorization are not ignored; only assumed dependences are ignored.
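The IVDEP fragment above can be placed in a small, self-checking program, sketched here. The program and routine names, the array size, and the offset value 5 are illustrative assumptions, not part of the original example. The offset is passed as a dummy argument so that the compiler cannot see its sign inside the routine, which is the situation IVDEP addresses. Compilers that do not recognize the directive should treat the line as an ordinary comment.

```fortran
! Sketch only: names, sizes, and the offset value are illustrative assumptions.
program ivdep_demo
  implicit none
  integer, parameter :: n = 100
  real :: a(2*n)
  integer :: i, j

  j = 5                          ! the caller knows that j >= 0
  do i = 1, 2*n
    a(i) = real(i)
  enddo

  call shift(a, n, j)

  ! Every element must hold the original value found j positions ahead.
  do i = 1, n
    if (a(i) /= real(i + j)) error stop 'shift produced an unexpected value'
  enddo
  print *, 'ok'

contains

  subroutine shift(a, n, j)
    integer, intent(in) :: n, j
    real, intent(inout) :: a(*)
    integer :: i
    ! Inside this routine the sign of j is unknown, so a dependence is assumed.
    ! IVDEP asserts that the assumed dependence for j < 0 never occurs.
!DEC$ IVDEP
    do i = 1, n
      a(i) = a(i+j)
    enddo
  end subroutine shift

end program ivdep_demo
```

If the routine were ever called with a negative j, the assertion made by IVDEP would be false and a vectorized loop could produce different results from the scalar loop.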

The usage of the directive differs depending on the loop form, as the following examples show.

Example: Loop 1

do i
       = a(*) + 1
  a(*) =
enddo

Example: Loop 2

do i
  a(*) =
       = a(*) + 1
enddo

For loops of form 1, the compiler uses the old values of a and assumes that there are no loop-carried flow dependencies from DEF to USE.

For loops of form 2, the compiler uses the new values of a and assumes that there are no loop-carried anti-dependencies from USE to DEF.

In both cases, it is valid to distribute the loop, and there is no loop-carried output dependency.

Example 1

!DEC$ IVDEP
do j=1,n
  a(j) = a(j+m) + 1
enddo

Example 2

!DEC$ IVDEP
do j=1,n
  a(j) = b(j) + 1
  b(j) = a(j+m) + 1
enddo

Example 1 ignores the possible backward dependencies and enables the loop to be software pipelined.

In Example 2, array a is involved in possible forward and backward dependencies that together create a dependency cycle. With IVDEP, the backward dependencies are ignored.

IVDEP has two options: IVDEP:LOOP and IVDEP:BACK. The IVDEP:LOOP option asserts that the loop has no loop-carried dependencies. The IVDEP:BACK option asserts that the loop has no backward dependencies.
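The two options might be used as in the following sketch. The directive spelling is taken from the option names given above; the routine names, array sizes, and data are illustrative assumptions. The first loop updates each element once through a permutation index, so no iteration depends on another. The second loop reads ahead by a non-negative offset, so it has no backward dependence.

```fortran
! Sketch only: routine names, sizes, and data are illustrative assumptions.
program ivdep_options_demo
  implicit none
  integer, parameter :: n = 8
  real :: a(n), c(2*n)
  integer :: idx(n), i

  do i = 1, n
    idx(i) = n + 1 - i           ! a permutation: each element is touched once
    a(i) = real(i)
  enddo
  do i = 1, 2*n
    c(i) = real(i)
  enddo

  call scatter_add(a, idx, n)
  call pull_forward(c, n, 3)

  do i = 1, n
    if (a(i) /= real(i) + 1.0) error stop 'scatter_add failed'
    if (c(i) /= real(i + 3) + 1.0) error stop 'pull_forward failed'
  enddo
  print *, 'ok'

contains

  subroutine scatter_add(a, idx, n)
    integer, intent(in) :: n, idx(n)
    real, intent(inout) :: a(n)
    integer :: i
    ! idx holds no repeated values, so there are no loop-carried dependencies.
!DEC$ IVDEP:LOOP
    do i = 1, n
      a(idx(i)) = a(idx(i)) + 1.0
    enddo
  end subroutine scatter_add

  subroutine pull_forward(c, n, m)
    integer, intent(in) :: n, m
    real, intent(inout) :: c(*)
    integer :: i
    ! The caller guarantees m >= 0, so there are no backward dependencies.
!DEC$ IVDEP:BACK
    do i = 1, n
      c(i) = c(i+m) + 1.0
    enddo
  end subroutine pull_forward

end program ivdep_options_demo
```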

The IVDEP directive is also used for Itanium®-based applications.

Overriding the Efficiency Heuristics in the Vectorizer

In addition to the IVDEP directive, the following VECTOR directives can be used to override the efficiency heuristics of the vectorizer: VECTOR ALWAYS, NOVECTOR, VECTOR ALIGNED, VECTOR UNALIGNED, and VECTOR NONTEMPORAL.

The VECTOR directives control the vectorization of the loop that immediately follows them; the compiler does not apply them to nested loops. Each nested loop needs its own preceding directive. Place the VECTOR directive before the loop control statement.
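The placement rule for nested loops can be sketched as follows. The program name, array shapes, and values are illustrative assumptions. The directive sits directly before the control statement of the inner loop, because a directive placed only before the outer loop would not reach the inner one.

```fortran
! Sketch only: program name, array shapes, and values are illustrative assumptions.
program nested_vector_demo
  implicit none
  integer, parameter :: m = 16, n = 4
  real :: a(m,n), b(m,n)
  integer :: i, j

  a = 0.0
  do j = 1, n
    do i = 1, m
      b(i,j) = real(i + 100*j)
    enddo
  enddo

  do j = 1, n
    ! The directive applies to the next loop only, so it is placed here,
    ! immediately before the inner loop control statement.
!DEC$ VECTOR ALWAYS
    do i = 1, m, 2
      a(i,j) = b(i,j)
    enddo
  enddo

  ! Odd rows were copied; even rows must still hold their initial value.
  do j = 1, n
    do i = 1, m
      if (mod(i,2) == 1) then
        if (a(i,j) /= b(i,j)) error stop 'odd element not copied'
      else
        if (a(i,j) /= 0.0) error stop 'even element was modified'
      endif
    enddo
  enddo
  print *, 'ok'
end program nested_vector_demo
```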

VECTOR ALWAYS and NOVECTOR Directives

The VECTOR ALWAYS directive overrides the efficiency heuristics of the vectorizer, but it takes effect only if the loop can actually be vectorized. If assumed dependences are what prevent vectorization, use IVDEP to ignore them.

The VECTOR ALWAYS directive can be used to override the default behavior of the compiler in the following situation. Vectorization of non-unit stride references usually yields no speedup, so by default the compiler does not vectorize loops that have a large number of non-unit stride references compared to the number of unit stride references. The following loop has two references with stride 2. Vectorization would be disabled by default, but the directive overrides this behavior.

Example

!DEC$ VECTOR ALWAYS
do i = 1, 100, 2
  a(i) = b(i)
enddo
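Because VECTOR ALWAYS has no effect on a loop that cannot be vectorized, a strided loop with an assumed dependence may need both directives. The following sketch stacks IVDEP and VECTOR ALWAYS before one loop, in the same way that two directives are stacked in the VECTOR NONTEMPORAL example in this topic. The names, sizes, and the offset 4 are illustrative assumptions, and the order of the two directives is a choice made for this sketch.

```fortran
! Sketch only: names, sizes, and the offset value are illustrative assumptions.
program vector_always_ivdep_demo
  implicit none
  integer, parameter :: n = 100
  real :: a(n + 8)
  integer :: i

  do i = 1, n + 8
    a(i) = real(i)
  enddo

  call strided_shift(a, n, 4)

  ! Odd elements take the value 4 positions ahead; even elements are untouched.
  do i = 1, n
    if (mod(i,2) == 1) then
      if (a(i) /= real(i + 4)) error stop 'odd element has an unexpected value'
    else
      if (a(i) /= real(i)) error stop 'even element was modified'
    endif
  enddo
  print *, 'ok'

contains

  subroutine strided_shift(a, n, k)
    integer, intent(in) :: n, k
    real, intent(inout) :: a(*)
    integer :: i
    ! IVDEP: the caller guarantees k >= 0, so the assumed dependence is ignored.
    ! VECTOR ALWAYS: vectorize despite the stride-2 references.
!DEC$ IVDEP
!DEC$ VECTOR ALWAYS
    do i = 1, n, 2
      a(i) = a(i+k)
    enddo
  end subroutine strided_shift

end program vector_always_ivdep_demo
```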

If, on the other hand, vectorizing a loop is undesirable (for example, because vectorization results in a performance regression rather than an improvement), the NOVECTOR directive can be used in the source text to disable vectorization of that loop. For instance, the compiler vectorizes the following example loop by default. If this behavior is not appropriate, the NOVECTOR directive can be used, as shown below.

Example

!DEC$ NOVECTOR
do i = 1, 100
  a(i) = b(i) + c(i)
enddo

VECTOR ALIGNED/UNALIGNED Directives

Like VECTOR ALWAYS, these directives also override the efficiency heuristics. The difference is that the qualifiers UNALIGNED and ALIGNED instruct the compiler to use, respectively, unaligned and aligned data movement instructions for all array references. This disables all the advanced alignment optimizations of the compiler, such as determining alignment properties from the program context or using dynamic loop peeling to make references aligned.

Note

The VECTOR [ALWAYS|UNALIGNED|ALIGNED] directives should be used with care. Override the efficiency heuristics of the compiler only if you are sure that vectorization will improve performance. Furthermore, instructing the compiler to implement all array references with aligned data movement instructions causes a run-time exception if any of the accesses are actually unaligned.
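When the alignment of the data is not known, VECTOR UNALIGNED is the safer of the two qualifiers, because unaligned data movement instructions are valid for any address. The following sketch passes an array section that starts at the second element of an array, so its alignment cannot be assumed. The program and routine names and the sizes are illustrative assumptions.

```fortran
! Sketch only: names and sizes are illustrative assumptions.
program vector_unaligned_demo
  implicit none
  integer, parameter :: n = 64
  real :: a(n + 1), b(n)
  integer :: i

  a = -1.0
  do i = 1, n
    b(i) = real(i)
  enddo

  ! The section a(2:n+1) starts one element past the start of a,
  ! so VECTOR ALIGNED would not be a safe choice for this call.
  call add_one(a(2:n+1), b, n)

  if (a(1) /= -1.0) error stop 'element outside the section was modified'
  do i = 1, n
    if (a(i+1) /= real(i) + 1.0) error stop 'add_one produced an unexpected value'
  enddo
  print *, 'ok'

contains

  subroutine add_one(dst, src, n)
    integer, intent(in) :: n
    real, intent(out) :: dst(n)
    real, intent(in) :: src(n)
    integer :: i
    ! Request unaligned data movement for all array references in this loop.
!DEC$ VECTOR UNALIGNED
    do i = 1, n
      dst(i) = src(i) + 1.0
    enddo
  end subroutine add_one

end program vector_unaligned_demo
```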

The VECTOR NONTEMPORAL Directive

The VECTOR NONTEMPORAL directive results in streaming stores on Intel® Pentium® 4 processor-based systems. For large n, streaming stores yield significant performance improvements on a Pentium 4 system over a non-streaming implementation.

The following example applies the VECTOR NONTEMPORAL directive to a floating-point loop:

Example

subroutine set(a,n)
integer i,n
real a(n)
!DEC$ VECTOR NONTEMPORAL
!DEC$ VECTOR ALIGNED
do i = 1, n
  a(i) = 1
enddo
end

program setit
parameter(n=1024*1024)
real a(n)
integer i
! Clear the array, then let set() fill it using streaming stores
do i = 1, n
  a(i) = 0
enddo
call set(a,n)
! Verify that every element was written
do i = 1, n
  if (a(i).ne.1) then
    print *, 'failed nontemp.f', a(i), i
    stop
  endif
enddo
print *, 'passed nontemp.f'
end

For more details on these directives, see "Directive Enhanced Compilation", section "General Directives", in the Intel® Fortran Language Reference.