effects of multifile IPO
efficient
Improving I/O Performance
Improving Run-time Efficiency
compilation
Efficient Compilation
Optimizing Compilation Process Overview
use of arrays
use of record buffers
enabling
auto-parallelizer
implied-DO loop collapsing
inlining
parallelizer
PGO options
SIMD-encodings
END PARALLEL DO
endian data
enhancing optimization
enhancing performance
environment variables
and OpenMP* extension routines
for auto-parallelization
for little endian conversion
FORT_BUFFERED
OMP_NUM_THREADS
OMP_SCHEDULE
OpenMP*
PROF_DUMP_INTERVAL
routines overriding
EQUIVALENCE
example of
auto-parallelization
auto-vectorization
dumping profile information
loop constructs
parallel program development
using OpenMP*
using profile-guided optimization
vectorization
exceptions
denormal
exclude code
code-coverage tool
execution environment routines
execution flow
execution mode
exit conditions
explicit-shape arrays
files
.dpi
Basic PGO Options
Code-coverage Tool
Profmerge and Proforder Utilities
Test-prioritization Tool
.dyn
Advanced PGO Options
Basic PGO Options
Code-coverage Tool
Dumping and Resetting Profile Information
Dumping Profile Information
PGO Environment Variables
Profmerge and Proforder Utilities
Test-prioritization Tool
.hpi
.spi
Code-coverage Tool
Test-prioritization Tool
.tb5
formatted
IR
object files
Command Line for Creating an IPO Executable
Creating a Multifile IPO Executable
OpenMP* header
real object
source
Command Line for Creating an IPO Executable
Efficient Compilation
Example of Profile-Guided Optimization
unformatted
FIRSTPRIVATE
PRIVATE, FIRSTPRIVATE, and LASTPRIVATE Clauses
Worksharing Construct Directives
floating-point applications
comparisons
optimizing
floating-point arithmetic
Compiler Optimizations Overview
Improving Run-time Efficiency
Understanding Floating-point Performance
improving precision of
on IA-32 systems
on Itanium®-based systems
options
options for
overview
performance
restricting precision of
floating-point arithmetics
array operations
flush-to-zero mode
formatted files
FORT_BUFFERED environment variable
FTZ mode
function order list
function splitting
enabling or disabling
general compiler directives
Loop Unrolling Support
Prefetching Support
generating
instrumented code
processor-specific code
profile-optimized executable
profiling information
reports
guidelines
for auto-parallelization
for high performance programming
for IA-32 architecture
for improving run-time efficiency
for profile-guided optimization
Advanced PGO Options
Profile-guided Optimizations Overview
for vectorization
Key Programming Guidelines for Vectorization
Vectorization Overview
helper thread optimization
heuristics
affecting data prefetches
Loop Count and Loop Distribution
Prefetching Support
affecting software pipelining
Loop Count and Loop Distribution
Pipelining for Itanium®-based Applications
for inlining functions
Criteria for Inline Function Expansion
Using Qoption Specifiers
high-level language optimizer
HLO Overview
Optimizer Report Generation
high performance
high performance programming
Profile-guided Optimizations Overview
Programming for High Performance Overview
HLO
reports
hotspots
Using a Performance Methodology
Using Tuning Tools and Strategies
Hyper-Threading Technology
parallel loops
thread pools
using OpenMP*
I/O
improving performance
list
parsing
performance
IA-32 architecture
applications for
dispatch options for
guidelines for
options for
Automatic Processor-specific Optimization (IA-32 Only)
Processor-specific Optimization (IA-32 only)
options targeting
Optimizing for Specific Processors Overview
Targeting a Processor
processors for
Automatic Processor-specific Optimization (IA-32 Only)
Parallelism Overview
Processor-specific Optimization (IA-32 only)
report generation
targeting
Processor-specific Optimization (IA-32 only)
Targeting a Processor
ILO
implied-DO loop
improving
floating-point arithmetic precision
I/O performance
run-time performance
inefficient
code
initialization values for reduction variables
inlining
Controlling Inline Expansion of User Functions
Efficient Compilation
Improving Run-time Efficiency
Profile-guided Optimizations Overview
instruction-level parallelism
instrumentation
compilation
repeat
instrumented code
execution - run
generating
program
integer pointers
preventing aliasing
Intel®-extended intrinsics
Intel® architectures
Intel® compiler-generated code
Intel® Debugger
Intel® extension environment variables
Intel® extension routines
Intel® Itanium® 2 processors
Intel® Itanium® processors
Intel® Pentium® 4 processors
Intel® Pentium® II processors
Intel® Pentium® III processors
Intel® Pentium® Pro processors
Intel® Pentium® processors
Intel® Threading Tools
intermediate language files (IL)
implementing with version number
intermediate language scalar optimizer
intermediate representation
intermediate results
using memory for
internal subprograms
interprocedural optimizations
Compiler Optimizations Overview
Controlling Inline Expansion of User Functions
Efficient Compilation
Interprocedural Optimizations Overview
Optimizer Report Generation
Profile-guided Optimizations Overview
code layout
interval profile dumping
initiating
intrinsics
introduction to Optimizing Applications
IPO
code layout
creating libraries
generating multiple IPO object files
issues
overview
performance
reports
Itanium®-based applications
auto-vectorization in
floating-point arithmetic precision
floating point options
Floating-point Options for Itanium®-based Systems
Floating-point Options for Multiple Architectures
HLO
options targeting
pipelining for
report generation
targeting
using intrinsics in
IVDEP
HLO Overview
Loop Transformations
Memory Dependency with IVDEP Directive