The compiler implements a parallel region by enabling the code in the region and putting it into a separate, compiler-created entry point. This differs from outlining, the technique used by other compilers, in which the region becomes a separate subroutine. The same debugging technique can nevertheless be applied.
The name of the compiler-generated entry point for a parallel region is the concatenation of the following strings, in this order:
1. "__" characters
2. entry point name for the original routine (for example, _parallel)
3. "_" character
4. line number of the parallel region
5. one of the following:
   __par_region for OpenMP parallel regions (!$OMP PARALLEL)
   __par_loop for OpenMP parallel loops (!$OMP PARALLEL DO)
   __par_section for OpenMP parallel sections (!$OMP PARALLEL SECTIONS)
6. sequence number of the parallel region (for each source file, the sequence number starts from zero)
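The concatenation rule above can be sketched as a small Fortran program. This is only an illustration of the rule as stated here: the helper region_entry_name and its argument names are hypothetical and are not part of the compiler or its run-time library. The inputs are the PADD example used in this section.

```fortran
program entry_name_demo
  implicit none
  character(len=128) :: name

  ! Entry point _PADD, parallel loop at line 6, first region in the file.
  name = region_entry_name('_PADD', 6, '__par_loop', 0)
  print '(a)', trim(name)          ! prints ___PADD_6__par_loop0

contains

  ! Hypothetical helper: joins the six parts in the documented order.
  ! The fixed result length (128) is an assumption; longer names are truncated.
  function region_entry_name(entry_pt, line, suffix, seq) result(full_name)
    character(len=*), intent(in) :: entry_pt  ! entry point of the original routine
    integer,          intent(in) :: line      ! line number of the parallel region
    character(len=*), intent(in) :: suffix    ! __par_region, __par_loop or __par_section
    integer,          intent(in) :: seq       ! per-file sequence number, from zero
    character(len=128) :: full_name
    character(len=16)  :: line_str, seq_str

    ! Convert the two integers to text without leading blanks.
    write (line_str, '(i0)') line
    write (seq_str,  '(i0)') seq

    full_name = '__' // entry_pt // '_' // trim(line_str) // suffix // trim(seq_str)
  end function region_entry_name

end program entry_name_demo
```

The helper only reproduces the naming scheme for reasoning about symbol names; the names actually emitted by a given compiler version and platform should be checked in the generated assembly.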
When you use routine names (for example, padd) and entry names (for example, _PADD, ___PADD_6__par_loop0), the following occurs. By default, the Fortran compiler first converts lowercase and mixed-case routine names to uppercase; for example, pAdD becomes PADD. Adding one leading underscore then gives the entry name, _PADD. The parallel-region entry name is derived only after that step, which is why the "__par_loop" part of the entry name stays lowercase.
Note
The debugger does not accept the uppercase routine name "PADD" when you set a breakpoint. Instead, it accepts the lowercase routine name "padd".
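To make the padd example concrete, the following is a hypothetical source file (it does not come from the original documentation) in which the routine is written in mixed case and the parallel loop directive sits on line 6. Assuming the default name conversion described above, and that this is the first parallel region in the file, the expected entry names would be _PADD for the routine and ___PADD_6__par_loop0 for the loop. The driver program call_padd is included only so that the file is complete.

```fortran
subroutine pAdD(a, b, c, n)
  implicit none
  integer :: n, i
  real :: a(n), b(n), c(n)
  ! The directive on the next line is line 6 of this file.
!$OMP PARALLEL DO PRIVATE(i)
  do i = 1, n
    c(i) = a(i) + b(i)        ! element-wise sum, one chunk of i per thread
  end do
!$OMP END PARALLEL DO
end subroutine pAdD

! Hypothetical driver: calls the routine once and prints the result.
program call_padd
  implicit none
  real :: a(4), b(4), c(4)

  a = 1.0
  b = 2.0
  call padd(a, b, c, 4)       ! Fortran names are case-insensitive in source
  print *, c
end program call_padd
```

In the debugger, the routine would still be referred to by its lowercase source name, padd, as the note above explains.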
Example 1 shows how to debug code that contains a parallel region. The listing in Example 1 was produced with this command:
ifort -openmp -g -O0 -S file.f90
Let us consider the code of subroutine parallel in Example 1.
Subroutine PARALLEL() source listing
1 subroutine parallel
2 integer id,OMP_GET_THREAD_NUM
3 !$OMP PARALLEL PRIVATE(id)
4 id = OMP_GET_THREAD_NUM()
5 !$OMP END PARALLEL
6 end
The parallel region is at line 3. The compiler created two entry points: parallel_ and _parallel__3__par_region0, as they appear in the machine code listing of Example 1. The first entry point corresponds to the subroutine parallel(), while the second corresponds to the OpenMP parallel region at line 3.
Example 1 Debugging Code with Parallel Region
Machine Code Listing of the Subroutine parallel()
        .globl parallel_
parallel_:
..B1.1:                         # Preds ..B1.0
..LN1:
        pushl     %ebp                                      #1.0
        movl      %esp, %ebp                                #1.0
        subl      $44, %esp                                 #1.0
        pushl     %edi                                      #1.0
... ... ... ... ... ... ... ... ... ... ... ... ...
..B1.13:                        # Preds ..B1.9
        addl      $-12, %esp                                #6.0
        movl      $.2.1_2_kmpc_loc_struct_pack.2, (%esp)    #6.0
        movl      $0, 4(%esp)                               #6.0
        movl      $_parallel__6__par_region1, 8(%esp)       #6.0
        call      __kmpc_fork_call                          #6.0
                                # LOE
..B1.31:                        # Preds ..B1.13
        addl      $12, %esp                                 #6.0
                                # LOE
..B1.14:                        # Preds ..B1.31 ..B1.30
..LN4:
        leave                                               #9.0
        ret                                                 #9.0
                                # LOE
        .type parallel_,@function
        .size parallel_,.-parallel_
        .globl _parallel__3__par_region0
_parallel__3__par_region0:
# parameter 1: 8 + %ebp
# parameter 2: 12 + %ebp
..B1.15:                        # Preds ..B1.0
        pushl     %ebp                                      #9.0
        movl      %esp, %ebp                                #9.0
        subl      $44, %esp                                 #9.0
..LN5:
        call      omp_get_thread_num_                       #4.0
                                # LOE eax
..B1.32:                        # Preds ..B1.15
        movl      %eax, -32(%ebp)                           #4.0
                                # LOE
..B1.16:                        # Preds ..B1.32
        movl      -32(%ebp), %eax                           #4.0
        movl      %eax, -20(%ebp)                           #4.0
..LN6:
        leave                                               #9.0
        ret                                                 #9.0
                                # LOE
        .type _parallel__3__par_region0,@function
        .size _parallel__3__par_region0,.-_parallel__3__par_region0
        .globl _parallel__6__par_region1
_parallel__6__par_region1:
# parameter 1: 8 + %ebp
# parameter 2: 12 + %ebp
..B1.17:                        # Preds ..B1.0
        pushl     %ebp                                      #9.0
        movl      %esp, %ebp                                #9.0
        subl      $44, %esp                                 #9.0
..LN7:
        call      omp_get_thread_num_                       #7.0
                                # LOE eax
..B1.33:                        # Preds ..B1.17
        movl      %eax, -28(%ebp)                           #7.0
                                # LOE
..B1.18:                        # Preds ..B1.33
        movl      -28(%ebp), %eax                           #7.0
        movl      %eax, -16(%ebp)                           #7.0
..LN8:
        leave                                               #9.0
        ret                                                 #9.0
        .align 4,0x90
# mark_end;
Debugging the program at this level is just like debugging a program that uses POSIX threads directly. Breakpoints can be set in the threaded code just as in any other routine. With the Intel® Debugger (idb) or the GNU debugger, breakpoints can be set on source-level routine names (such as parallel). Breakpoints can also be set on entry point names (such as parallel_ and _parallel__3__par_region0). Note that the Intel® Fortran Compiler for Linux* converts the uppercase Fortran subroutine name to lowercase.