Auto-parallelization Overview

The auto-parallelization feature of the Intel® compiler automatically translates serial portions of the input program into equivalent multithreaded code. The auto-parallelizer analyzes the dataflow of the program’s loops and generates multithreaded code for those loops that can be executed safely and efficiently in parallel. This lets the program exploit the parallel architecture of symmetric multiprocessor (SMP) systems.

Automatic parallelization relieves the user from:

Dealing with the details of finding loops that are good worksharing candidates
Performing the dataflow analysis to verify correct parallel execution
Partitioning the data for threaded code generation, as is needed when programming with OpenMP* directives

The parallel run-time support provides the same run-time features as OpenMP, such as handling the details of loop iteration modification, thread scheduling, and synchronization.

While OpenMP directives let programmers turn serial applications into parallel ones quickly, the programmer must explicitly identify the portions of the application code that contain parallelism and add the appropriate compiler directives.
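For comparison, the explicit approach can be sketched as follows. This is a minimal illustration, not code from the compiler documentation: the program name, the array size, and the initial values are assumptions chosen for the example. The programmer decides that the loop iterations are independent and marks the loop with an OpenMP worksharing directive.

```fortran
program omp_worksharing
  implicit none
  integer, parameter :: n = 100   ! assumed problem size
  real :: a(n), b(n), c(n)
  integer :: i

  ! Arbitrary illustrative data (assumption, not from the original text)
  do i = 1, n
    a(i) = 1.0
    b(i) = real(i)
    c(i) = 2.0
  end do

  ! The programmer asserts that the iterations are independent and
  ! asks the OpenMP runtime to share them among the threads.
  !$omp parallel do
  do i = 1, n
    a(i) = a(i) + b(i) * c(i)
  end do
  !$omp end parallel do

  print *, a(1), a(n)
end program omp_worksharing
```

The directive takes effect only when the program is compiled with OpenMP support enabled; otherwise it is treated as an ordinary comment and the loop runs serially.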

Auto-parallelization, triggered by the -parallel (Linux*) or /Qparallel (Windows*) option, automatically identifies the loop structures that contain parallelism. During compilation, the compiler attempts to decompose the code sequences into separate threads for parallel processing. No further effort is required from the programmer.
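The kind of distinction the dataflow analysis has to draw can be sketched with two loops. This is an illustrative assumption rather than a statement of what the compiler does for this exact program: the program name, file name, array names, and data are hypothetical, and whether a given loop is actually parallelized also depends on the compiler's judgment of efficiency.

```fortran
! Hypothetical build line (file name is an assumption):
!   ifort -parallel autopar_candidates.f90
program autopar_candidates
  implicit none
  integer, parameter :: n = 100   ! assumed problem size
  real :: a(n), b(n), c(n), s(n)
  integer :: i

  ! Arbitrary illustrative data
  do i = 1, n
    a(i) = 1.0
    b(i) = real(i)
    c(i) = 2.0
  end do

  ! Candidate loop: iteration i reads and writes only element i,
  ! so the iterations do not depend on one another.
  do i = 1, n
    a(i) = a(i) + b(i) * c(i)
  end do

  ! Loop-carried dependence: iteration i reads s(i-1), which is
  ! written by iteration i-1, so the iteration space cannot simply
  ! be divided among threads.
  s(1) = b(1)
  do i = 2, n
    s(i) = s(i-1) + b(i)
  end do

  print *, a(n), s(n)
end program autopar_candidates
```

No directives appear in the source; the only change from a serial build is the compiler option.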

The following examples illustrate how a loop’s iteration space can be divided so that it can be executed concurrently on two threads:

Example 1: Original Serial Code
do i=1,100
  a(i) = a(i) + b(i) * c(i)
enddo

Example 2: Transformed Parallel Code
! Thread 1
do i=1,50
  a(i) = a(i) + b(i) * c(i)
enddo

! Thread 2
do i=51,100
  a(i) = a(i) + b(i) * c(i)
enddo
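The equivalence of the two forms can be checked with a small self-contained program. This is a sketch under stated assumptions: the program name, array names, and data are hypothetical, and the two half-ranges are executed one after the other here rather than on separate threads, which only demonstrates that dividing the iteration space leaves the result unchanged.

```fortran
program split_iteration_space
  implicit none
  integer, parameter :: n = 100   ! same trip count as the examples
  real :: a_serial(n), a_split(n), b(n), c(n)
  integer :: i

  ! Arbitrary illustrative data (assumption)
  do i = 1, n
    b(i) = real(i)
    c(i) = 0.5
  end do
  a_serial = 1.0
  a_split  = 1.0

  ! Original serial loop over the full iteration space
  do i = 1, n
    a_serial(i) = a_serial(i) + b(i) * c(i)
  end do

  ! Range assigned to thread 1 in the transformed code
  do i = 1, n/2
    a_split(i) = a_split(i) + b(i) * c(i)
  end do

  ! Range assigned to thread 2 in the transformed code
  do i = n/2 + 1, n
    a_split(i) = a_split(i) + b(i) * c(i)
  end do

  ! Each element is computed by the same operations in both versions
  if (all(a_serial == a_split)) then
    print *, 'split and serial results match'
  else
    print *, 'results differ'
  end if
end program split_iteration_space
```

Because each iteration touches only its own element of the arrays, the two half-ranges could run in either order, or at the same time, without affecting the outcome.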

