Optimizing Applications Glossary

Term	Definition
A
alignment constraint	The proper boundary of the stack where data must be stored.
alternate loop transformation	An optimization in which the compiler generates a copy of a loop and executes the new loop depending on the boundary size.
B
branch count profiler	A tool that counts the number of times a program executes each branch statement. The utility also generates a database that shows how the program executed.
branch probability database	The database generated by the branch count profiler. The database contains the number of times each branch is executed.
C
cache hit	The situation when the information the processor wants is in the cache.
call site	A call site consists of the instructions immediately preceding a call instruction and the call instruction itself.
common subexpression elimination	An optimization in which the compiler detects and combines redundant computations.
conditionals	Any operation that takes place depending on whether or not a certain condition is true.
constant argument propagation	An optimization in which the compiler replaces the formal arguments of a routine with actual constant values. The compiler then propagates constant variables used as actual arguments.
constant branches	Conditionals that always take the same branch.
constant folding	An optimization in which the compiler, instead of storing the numbers and operators for computation when the program executes, evaluates the constant expression and uses the result.
copy propagation	An optimization in which the compiler eliminates unnecessary assignments by using the value assigned to a variable instead of using the variable itself.
D
dataflow	The movement of data through a system, from entry to destination.
dead-code elimination	An optimization in which the compiler eliminates any code that generates unused values or any code that will never be executed in the program.
denormal values	Computed floating-point values whose absolute values are smaller than the smallest normalized floating-point number.
dynamic linking	The process in which a shared object is mapped into the virtual address space of your program at run time.
E
empty declaration	A semicolon and nothing before it.
F
frame pointer	A pointer that holds a base address for the current stack and is used to access the stack frame.
G
gradual underflow	Gradual underflow occurs when computed floating-point values have absolute values smaller than the smallest normalized floating-point number. Such floating-point values are called denormal values. Gradual underflow can degrade the performance of an application.
H
HLO	High-level optimization
Hyper-Threading Technology	Hyper-Threading Technology lets multiple logical processors share the execution resources of each physical processor package. It increases system throughput when executing multithreaded applications or when multitasked workloads are running concurrently. It enables simultaneous multithreading on IA-32 systems by making a single physical processor appear as two logical processors. Each logical processor can execute a software thread, so at most two software threads execute simultaneously on one physical processor, sharing its execution engine.
I
ILP32	int, long, and pointer types are 32-bit
IPO	Interprocedural optimization
in-line function expansion	An optimization in which the compiler replaces each function call with the function body expanded in place.
induction variable simplification	An optimization in which the compiler reduces the complexity of an array index calculation by using only additions.
instruction scheduling	An optimization in which the compiler reorders the generated machine instructions so that more than one can execute in parallel.
instruction sequencing	An optimization in which the compiler eliminates less efficient instructions and replaces them with instruction sequences that take advantage of a particular processor's features.
interprocedural optimization	An optimization that applies to the entire program except for library routines.
L
LP64	long and pointer types are 64-bit
load balancing	The equal division of work among threads. A balanced load keeps processors busy most, if not all, of the time. If a load is not balanced, some threads may finish significantly before others, leaving processor resources idle and wasting performance opportunities.
loop blocking	An optimization in which the compiler reorders the execution sequence of instructions so that the program can execute iterations from outer loops before completing all the iterations of the inner loop.
loop unrolling	An optimization in which the compiler duplicates the executed statements inside a loop to reduce the number of loop iterations.
loop-invariant code movement	An optimization in which the compiler detects a computation that does not change within a loop and moves it outside the loop so that it is evaluated only once.
P
P64	pointer types are 64-bit
padding	The addition of bytes or words at the end of each data type in order to meet size and alignment constraints.
PGO	Profile-guided optimization
preloading	An optimization in which the compiler loads the vectors, one cache at a time, so that during the loop computation the number of external bus turnarounds is reduced.
privatization of scalars	Privatization of scalars is an operation of re-assigning the storage of scalars from the static or parent stack area to the local stack of a thread to enable parallelization. This operation requires a WRITE permission and is usually performed to remove a data dependency between concurrently executing threads.
profiling	A process in which detailed information is produced about the program's execution.
R
register variable detection	An optimization in which the compiler detects the variables that never need to be stored in memory and places them in register variables.
S
side effects	Results of the optimization process that might increase the code size and/or processing time.
static linking	The process in which a copy of the object file that contains a function used in your program is incorporated in your executable file at link time.
strength reduction	An optimization in which the compiler replaces an expensive operation with a cheaper equivalent one, for example reducing the complexity of an array index calculation by using only additions.
strip mining	An optimization in which the compiler creates an additional level of nesting to enable inner loop computations on vectors that can be held in the cache. This optimization reduces the size of inner loops so that the amount of data required for the inner loop can fit the cache size.
SWP	Software Pipelining
T
token pasting	The process in which the compiler treats two tokens separated by a comment as one (for example, a/**/b becomes ab).
transformation	A rearrangement of code. In contrast, an optimization is a rearrangement of code where improved run-time performance is guaranteed.
U
unreachable code	Instructions that are never executed when the program runs.
unused code	Instructions that produce results that are not used in the program.
V
variable renaming	An optimization in which the compiler renames instances of a variable that refer to distinct entities.
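
Several of the loop optimizations defined above are easier to picture as source-level rewrites. The following is a minimal sketch in C, assuming small inputs that do not overflow int. Each pair of functions shows a loop as written and a hand-transformed equivalent for loop-invariant code movement, strength reduction, and loop unrolling. All function names are hypothetical and chosen only for illustration. A real compiler applies these transformations internally to its own representation of the program, and what it actually generates may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Loop-invariant code movement: scale * offset never changes in the loop. */
void fill_original(int *dst, size_t n, int scale, int offset)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = (int)i + scale * offset;   /* recomputed on every iteration */
}

void fill_hoisted(int *dst, size_t n, int scale, int offset)
{
    int invariant = scale * offset;         /* computed once, before the loop */
    for (size_t i = 0; i < n; i++)
        dst[i] = (int)i + invariant;
}

/* Strength reduction: the multiplication by the loop index is replaced
   with a running addition. */
void stride_original(int *dst, size_t n, int stride)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = (int)i * stride;
}

void stride_reduced(int *dst, size_t n, int stride)
{
    int value = 0;
    for (size_t i = 0; i < n; i++) {
        dst[i] = value;
        value += stride;                    /* addition only, no multiply */
    }
}

/* Loop unrolling: four copies of the loop body per iteration, followed by
   a remainder loop for the elements that are left over. */
long sum_original(const int *src, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += src[i];
    return total;
}

long sum_unrolled(const int *src, size_t n)
{
    long total = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        total += src[i];
        total += src[i + 1];
        total += src[i + 2];
        total += src[i + 3];
    }
    for (; i < n; i++)                      /* remainder iterations */
        total += src[i];
    return total;
}

/* Checks that each transformed version matches its original on a small input. */
void check_transformations(void)
{
    enum { N = 10 };
    int a[N], b[N];

    fill_original(a, N, 3, 7);
    fill_hoisted(b, N, 3, 7);
    for (size_t i = 0; i < N; i++)
        assert(a[i] == b[i]);

    stride_original(a, N, 5);
    stride_reduced(b, N, 5);
    for (size_t i = 0; i < N; i++)
        assert(a[i] == b[i]);

    assert(sum_original(a, N) == sum_unrolled(b, N));
}
```

Calling check_transformations() from a main function confirms that each rewritten loop produces the same result as its original on this input. That is the property a transformation must preserve. Whether the rewritten form actually runs faster depends on the processor and the surrounding code.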