
effects of multifile IPO
efficient
   Improving I/O Performance
   Improving Run-time Efficiency
    compilation
       Efficient Compilation
       Optimizing Compilation Process Overview
    use of arrays
    use of record buffers
enabling
    auto-parallelizer
    implied-DO loop collapsing
    inlining
    parallelizer
    PGO options
    SIMD-encodings
END PARALLEL DO
endian data
enhancing optimization
enhancing performance
environment variables
    and OpenMP* extension routines
    for auto-parallelization
    for little endian conversion
    FORT_BUFFERED
    OMP_NUM_THREADS
    OMP_SCHEDULE
    OpenMP*
    PROF_DUMP_INTERVAL
    routines overriding
EQUIVALENCE
example of
    auto-parallelization
    auto-vectorization
    dumping profile information
    loop constructs
    parallel program development
    using OpenMP*
    using profile-guided optimization
    vectorization
exceptions
    denormal
exclude code
    code-coverage tool
execution environment routines
execution flow
execution mode
exit conditions
explicit-shape arrays



files
    .dpi
       Basic PGO Options
       Code-coverage Tool
       Profmerge and Proforder Utilities
       Test-prioritization Tool
    .dyn
       Advanced PGO Options
       Basic PGO Options
       Code-coverage Tool
       Dumping and Resetting Profile Information
       Dumping Profile Information
       PGO Environment Variables
       Profmerge and Proforder Utilities
       Test-prioritization Tool
    .hpi
    .spi
       Code-coverage Tool
       Test-prioritization Tool
    .tb5
    formatted
    IR
    object files
       Command Line for Creating an IPO Executable
       Creating a Multifile IPO Executable
    OpenMP* header
    real object
    source
       Command Line for Creating an IPO Executable
       Efficient Compilation
       Example of Profile-Guided Optimization
    unformatted
FIRSTPRIVATE
   PRIVATE, FIRSTPRIVATE, and LASTPRIVATE Clauses
   Worksharing Construct Directives
floating-point applications
    comparisons
    optimizing
floating-point arithmetic
   Compiler Optimizations Overview
   Improving Run-time Efficiency
   Understanding Floating-point Performance
    improving precision of
    on IA-32 systems
    on Itanium®-based systems
    options
    options for
    overview
    performance
    restricting precision of
floating-point arithmetics
    array operations
flush-to-zero mode
formatted files
FORT_BUFFERED environment variable
FTZ mode
function order list
function splitting
    enabling or disabling



general compiler directives
   Loop Unrolling Support
   Prefetching Support
generating
    instrumented code
    processor-specific code
    profile-optimized executable
    profiling information
    reports
guidelines
    for auto-parallelization
    for high performance programming
    for IA-32 architecture
    for improving run-time efficiency
    for profile-guided optimization
       Advanced PGO Options
       Profile-guided Optimizations Overview
    for vectorization
       Key Programming Guidelines for Vectorization
       Vectorization Overview



helper thread optimization
heuristics
    affecting data prefetches
       Loop Count and Loop Distribution
       Prefetching Support
    affecting software pipelining
       Loop Count and Loop Distribution
       Pipelining for Itanium®-based Applications
    for inlining functions
       Criteria for Inline Function Expansion
       Using Qoption Specifiers
high-level language optimizer
   HLO Overview
   Optimizer Report Generation
high performance
high performance programming
   Profile-guided Optimizations Overview
   Programming for High Performance Overview
HLO
    reports
hotspots
   Using a Performance Methodology
   Using Tuning Tools and Strategies
Hyper-Threading Technology
    parallel loops
    thread pools
    using OpenMP*



I/O
    improving performance
    list
    parsing
    performance
IA-32 architecture
    applications for
    dispatch options for
    guidelines for
    options for
       Automatic Processor-specific Optimization (IA-32 Only)
       Processor-specific Optimization (IA-32 only)
    options targeting
       Optimizing for Specific Processors Overview
       Targeting a Processor
    processors for
       Automatic Processor-specific Optimization (IA-32 Only)
       Parallelism Overview
       Processor-specific Optimization (IA-32 only)
    report generation
    targeting
       Processor-specific Optimization (IA-32 only)
       Targeting a Processor
ILO
implied-DO loop
improving
    floating-point arithmetic precision
    I/O performance
    run-time performance
inefficient
    code
initialization values for reduction variables
inlining
   Controlling Inline Expansion of User Functions
   Efficient Compilation
   Improving Run-time Efficiency
   Profile-guided Optimizations Overview
instruction-level parallelism
instrumentation
    compilation
    repeat
instrumented code
    execution - run
    generating
    program
integer pointers
    preventing aliasing
Intel®-extended intrinsics
Intel® architectures
Intel® compiler-generated code
Intel® Debugger
Intel® extension environment variables
Intel® extension routines
Intel® Itanium® 2 processors
Intel® Itanium® processors
Intel® Pentium® 4 processors
Intel® Pentium® II processors
Intel® Pentium® III processors
Intel® Pentium® Pro processors
Intel® Pentium® processors
Intel® Threading Tools
intermediate language files (IL)
    implementing with version number
intermediate language scalar optimizer
intermediate representation
intermediate results
    using memory for
internal subprograms
interprocedural optimizations
   Compiler Optimizations Overview
   Controlling Inline Expansion of User Functions
   Efficient Compilation
   Interprocedural Optimizations Overview
   Optimizer Report Generation
   Profile-guided Optimizations Overview
    code layout
interval profile dumping
    initiating
intrinsics
introduction to Optimizing Applications
IPO
    code layout
    creating libraries
    generating multiple IPO object files
    issues
    overview
    performance
    reports
Itanium®-based applications
    auto-vectorization in
    floating-point arithmetic precision
    floating-point options
       Floating-point Options for Itanium®-based Systems
       Floating-point Options for Multiple Architectures
    HLO
    options targeting
    pipelining for
    report generation
    targeting
    using intrinsics in
IVDEP
   HLO Overview
   Loop Transformations
   Memory Dependency with IVDEP Directive

