Term |
Definition |
---|---|
|
|
alignment constraint |
The proper boundary of the stack where data must be stored. |
alternate loop transformation |
An optimization in which the compiler generates a copy of a loop and executes the new loop depending on the boundary size. |
|
|
branch count profiler |
A tool that counts the number of times a program executes each branch statement. The utility also generates a database that shows how the program executed. |
branch probability database |
The database generated by the branch count profiler. The database contains the number of times each branch is executed. |
|
|
cache hit |
The situation when the information the processor wants is in the cache. |
call site |
A call site consists of the instructions immediately preceding a call instruction and the call instruction itself. |
common subexpression elimination |
An optimization in which the compiler detects and combines redundant computations. |
conditionals |
Any operation that takes place depending on whether or not a certain condition is true. |
constant argument propagation |
An optimization in which the compiler replaces the formal arguments of a routine with actual constant values. The compiler then propagates constant variables used as actual arguments. |
constant branches |
Conditionals that always take the same branch. |
constant folding |
An optimization in which the compiler evaluates a constant expression at compile time and uses the result, instead of storing the numbers and operators for computation when the program executes. |
copy propagation |
An optimization in which the compiler eliminates unnecessary assignments by using the value assigned to a variable instead of using the variable itself. |
|
|
dataflow |
The movement of data through a system, from entry to destination. |
dead-code elimination |
An optimization in which the compiler eliminates any code that generates unused values or any code that will never be executed in the program. |
denormal values |
Nonzero computed floating-point values whose absolute values are smaller than the smallest normalized floating-point number. |
dynamic linking |
The process in which a shared object is mapped into the virtual address space of your program at run time. |
|
|
empty declaration |
A semicolon and nothing before it. |
|
|
frame pointer |
A pointer that holds the base address of the current stack frame and is used to access that frame. |
|
|
gradual underflow |
Gradual underflow occurs when computed floating-point values have absolute values smaller than the smallest normalized floating-point number. Such floating-point values are called denormal values. Gradual underflow can degrade the performance of an application. |
|
|
HLO |
High-level optimization |
Hyper-Threading Technology |
Hyper-Threading Technology enables multiple logical processors to share the execution resources of each physical processor package. It increases system throughput when executing multithreaded applications or when multitasked workloads are running concurrently. Hyper-Threading Technology enables simultaneous multithreading on IA-32 systems by making a single physical processor appear as two logical processors. Each logical processor can execute a software thread, so at most two software threads execute simultaneously on one physical processor, sharing its execution engine. |
|
|
ILP32 |
int, long, and pointer types are 32-bit |
IPO |
Interprocedural optimization |
in-line function expansion |
An optimization in which the compiler replaces each function call with the function body expanded in place. |
induction variable simplification |
An optimization in which the compiler reduces the complexity of an array index calculation by using only additions. |
instruction scheduling |
An optimization in which the compiler reorders the generated machine instructions so that more than one can execute in parallel. |
instruction sequencing |
An optimization in which the compiler eliminates less efficient instructions and replaces them with instruction sequences that take advantage of a particular processor's features. |
interprocedural optimization |
An optimization that applies to the entire program except for library routines. |
|
|
LP64 |
long and pointer types are 64-bit |
load balancing |
The equal division of work among threads. A balanced load keeps processors busy most, if not all, of the time. If a load is not balanced, some threads may finish significantly before others, leaving processor resources idle and wasting performance opportunities. |
loop blocking |
An optimization in which the compiler reorders the execution sequence of instructions so that iterations from outer loops can execute before all the iterations of the inner loop have completed. |
loop unrolling |
An optimization in which the compiler duplicates the executed statements inside a loop to reduce the number of loop iterations. |
loop-invariant code movement |
An optimization in which the compiler detects a computation whose result does not change within a loop and moves it outside the loop, so that it executes once rather than on every iteration. |
|
|
P64 |
pointer types are 64-bit |
padding |
The addition of bytes or words at the end of each data type in order to meet size and alignment constraints. |
PGO |
Profile-guided optimization |
preloading |
An optimization in which the compiler loads the vectors, one cache at a time, so that during the loop computation the number of external bus turnarounds is reduced. |
privatization of scalars |
Privatization of scalars is an operation of re-assigning the storage of scalars from the static or parent stack area to the local stack of a thread to enable parallelization. This operation requires a WRITE permission and is usually performed to remove a data dependency between concurrently executing threads. |
profiling |
A process in which detailed information is produced about the program's execution. |
|
|
register variable detection |
An optimization in which the compiler detects the variables that never need to be stored in memory and places them in register variables. |
|
|
side effects |
Results of the optimization process that might increase the code size and/or processing time. |
static linking |
The process in which a copy of the object file that contains a function used in your program is incorporated in your executable file at link time. |
strength reduction |
An optimization in which the compiler replaces an expensive operation with an equivalent cheaper one, for example by reducing an array index calculation to additions only. |
strip mining |
An optimization in which the compiler creates an additional level of nesting to enable inner loop computations on vectors that can be held in the cache. This optimization reduces the size of inner loops so that the amount of data required for the inner loop can fit the cache size. |
SWP |
Software pipelining |
|
|
token pasting |
The process in which the compiler treats two tokens separated by a comment as one (for example, a/**/b becomes ab). |
transformation |
A rearrangement of code. In contrast, an optimization is a rearrangement of code where improved run-time performance is guaranteed. |
|
|
unreachable code |
Instructions that are never executed when the program runs. |
unused code |
Instructions that produce results that are not used in the program. |
|
|
variable renaming |
An optimization in which the compiler renames instances of a variable that refer to distinct entities. |