Programming with Auto-parallelization

The auto-parallelization feature implements some concepts of OpenMP*, such as the worksharing construct (with the PARALLEL DO directive). See Programming with OpenMP for details on the worksharing construct. This section provides details on auto-parallelization.

Guidelines for Effective Auto-parallelization Usage

A loop can be parallelized if:

For loops whose parameters are not compile-time constants, the compiler may generate a run-time test that checks whether executing the loop in parallel is profitable.
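The idea of a run-time profitability test can be sketched by hand. The C example below is an illustration only, not the code the compiler generates: the cutoff PAR_THRESHOLD, the function names, and the two-way split are all assumptions made for this sketch. The trip count n is known only at run time, so the choice between the serial and the "parallel" path is made when the function is called.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cutoff: below this trip count we assume thread start-up
   overhead outweighs any gain. A real compiler picks its own criteria. */
#define PAR_THRESHOLD 1000

/* Serial version of the loop over the half-open range [lo, hi). */
static void scale_serial(double *a, size_t lo, size_t hi, double s) {
    for (size_t i = lo; i < hi; ++i) {
        a[i] *= s;
    }
}

/* Stand-in for the threaded version. It runs the two halves one after
   the other; a real runtime would give each half to a separate thread. */
static void scale_parallel_stub(double *a, size_t n, double s) {
    size_t mid = n / 2;
    scale_serial(a, 0, mid, s);
    scale_serial(a, mid, n, s);
}

/* Run-time test: n is not a compile-time constant, so the decision is
   made here. Returns 1 if the parallel path was taken, 0 otherwise. */
int scale(double *a, size_t n, double s) {
    assert(a != NULL || n == 0);
    if (n >= PAR_THRESHOLD) {
        scale_parallel_stub(a, n, s);
        return 1;
    }
    scale_serial(a, 0, n, s);
    return 0;
}
```

Calling scale on a 4-element array takes the serial path, while a 2000-element array takes the stub parallel path; both produce the same values.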

Coding Guidelines

Enhance the power and effectiveness of the auto-parallelizer by following these coding guidelines:

Auto-parallelization Data Flow

For auto-parallelization processing, the compiler performs the following steps:

  1. Data flow analysis: Computing the flow of data through the program.

  2. Loop classification: Determining loop candidates for parallelization based on correctness and efficiency, as shown by threshold analysis.

  3. Dependency analysis: Computing the dependencies among references in each loop nest.

  4. High-level parallelization: Analyzing the dependency graph to determine which loops can execute in parallel, and computing run-time dependencies.

  5. Data partitioning: Examining data references and partitioning them based on the following types of access: SHARED, PRIVATE, and FIRSTPRIVATE.

  6. Multi-threaded code generation: Modifying loop parameters, generating entry/exit per threaded task, and generating calls to parallel run-time routines for thread creation and synchronization.
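
The last two steps above can be made concrete with a small hand-written sketch. The C code below is not the compiler's output: the task_t layout and the functions chunk_bounds, run_task, and run_all are hypothetical names introduced for this example, and the tasks run one after another rather than on real threads. It shows how data might be classified as SHARED, PRIVATE, or FIRSTPRIVATE, and how the original loop bounds 0..n can be rewritten into per-task bounds.

```c
#include <assert.h>
#include <stddef.h>

/* Per-task argument block (hypothetical layout for illustration). */
typedef struct {
    const double *in;  /* SHARED: every task reads the same input array   */
    double *out;       /* SHARED: tasks write disjoint elements           */
    double scale;      /* FIRSTPRIVATE: per-task copy, initialized from
                          the value the variable had before the loop      */
    size_t lo, hi;     /* modified loop parameters: task runs [lo, hi)    */
} task_t;

/* Split n iterations into num_tasks contiguous chunks. The first
   (n % num_tasks) chunks receive one extra iteration. */
void chunk_bounds(size_t n, size_t num_tasks, size_t t,
                  size_t *lo, size_t *hi) {
    assert(num_tasks > 0 && t < num_tasks);
    size_t base = n / num_tasks;
    size_t extra = n % num_tasks;
    *lo = t * base + (t < extra ? t : extra);
    *hi = *lo + base + (t < extra ? 1 : 0);
}

/* Body of one task: the original loop restricted to [lo, hi). */
void run_task(task_t *task) {
    for (size_t i = task->lo; i < task->hi; ++i) {
        /* PRIVATE: tmp exists separately in each task and carries
           no value from one iteration to the next. */
        double tmp = task->in[i] * task->scale;
        task->out[i] = tmp + 1.0;
    }
}

/* Runs the tasks sequentially. A real run-time library would create
   threads and synchronize them where this loop simply iterates. */
void run_all(const double *in, double *out, size_t n,
             double scale, size_t num_tasks) {
    for (size_t t = 0; t < num_tasks; ++t) {
        task_t task = { in, out, scale, 0, 0 };
        chunk_bounds(n, num_tasks, t, &task.lo, &task.hi);
        run_task(&task);
    }
}
```

With n = 10 and three tasks, chunk_bounds yields the ranges [0, 4), [4, 7), and [7, 10). Because every iteration writes a different element of out, the result does not depend on the order in which the tasks run.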