Debugging Multiple Threads

In a debugger, you can switch from one thread to another. Each thread has its own program counter, so different threads can be at different places in the code. Example 2 shows a Fortran subroutine, PADD(). A breakpoint can be set at the entry point of its OpenMP parallel region.

Source Listing of Subroutine PADD()

      SUBROUTINE PADD(A, B, C, N)
      INTEGER N
      INTEGER A(N), B(N), C(N)
      INTEGER I, ID, OMP_GET_THREAD_NUM
!$OMP PARALLEL DO SHARED (A, B, C, N) PRIVATE(ID)
      DO I = 1, N
        ID = OMP_GET_THREAD_NUM()
        C(I) = A(I) + B(I) + ID
      ENDDO
!$OMP END PARALLEL DO
      END
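To reach a breakpoint in PADD, the subroutine needs a caller. The following is a minimal sketch of a driver, not the original test program: the array size of 8 and the input values are assumptions chosen for illustration, and only the program name parallel comes from the call stack description. PADD is repeated so that the file builds on its own as fixed-form source with OpenMP and debug information enabled (for example, gfortran -g -fopenmp; the matching Intel option depends on the compiler version).

```fortran
      PROGRAM PARALLEL
      INTEGER N
      PARAMETER (N = 8)
      INTEGER A(N), B(N), C(N), I
C     Hypothetical input data; any values would do.
      DO I = 1, N
        A(I) = I
        B(I) = 10 * I
      ENDDO
      CALL PADD(A, B, C, N)
C     C(I) - A(I) - B(I) recovers the ID of the thread that ran I.
      DO I = 1, N
        PRINT *, 'I =', I, ' C =', C(I), ' thread =',
     &           C(I) - A(I) - B(I)
      ENDDO
      END

C     PADD as in the source listing, repeated so this file builds.
      SUBROUTINE PADD(A, B, C, N)
      INTEGER N
      INTEGER A(N), B(N), C(N)
      INTEGER I, ID, OMP_GET_THREAD_NUM
!$OMP PARALLEL DO SHARED (A, B, C, N) PRIVATE(ID)
      DO I = 1, N
        ID = OMP_GET_THREAD_NUM()
        C(I) = A(I) + B(I) + ID
      ENDDO
!$OMP END PARALLEL DO
      END
```

Because each element has its thread ID added, the last column shows which thread executed each iteration, which can be compared against the thread currently selected in the debugger. The values vary with the number of threads.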

The Call Stack Dumps

The first call stack below is obtained by breaking at the entry to subroutine PADD using the GNU debugger. At this point, the program has not executed any OpenMP regions and therefore has only one thread. The call stack shows the system run-time function __libc_start_main calling the Fortran main program parallel, which in turn calls subroutine padd.

When the program is executed by more than one thread, you can switch from one thread to another. The second and third call stacks are obtained by breaking at the entry to the parallel region. The call stack of the master thread contains the complete call sequence, with _padd__6__par_loop0 at the top. Invoking a threaded entry point involves a layer of Intel OpenMP library function calls (that is, functions with the __kmp prefix). The call stack of the worker thread contains only a partial call sequence, which begins with a layer of Intel OpenMP library function calls.
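The single-thread state before the first parallel region can also be observed from inside the program. The sketch below is illustrative only, and the program name NTEAM is hypothetical. It prints the OpenMP team size, which is 1 outside a parallel region. This is the team size as defined by OpenMP, not a count of operating system threads: a run-time library may keep its worker threads alive after a region ends, so the debugger may still list them.

```fortran
      PROGRAM NTEAM
      INTEGER OMP_GET_NUM_THREADS
C     Outside any parallel region the team has exactly one thread.
      PRINT *, 'before parallel region:', OMP_GET_NUM_THREADS()
!$OMP PARALLEL
!$OMP MASTER
C     Inside the region the master thread reports the team size.
      PRINT *, 'inside parallel region:', OMP_GET_NUM_THREADS()
!$OMP END MASTER
!$OMP END PARALLEL
      PRINT *, 'after parallel region: ', OMP_GET_NUM_THREADS()
      END
```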

ERRATA: The GNU debugger sometimes fails to properly unwind the call stack of the immediate caller of the Intel OpenMP library function __kmpc_fork_call().

Call Stack Dump of Master Thread upon Entry to Subroutine PADD

Switching from One Thread to Another

Call Stack Dump of Master Thread upon Entry to Parallel Region

Call Stack Dump of Worker Thread upon Entry to Parallel Region

Example 2: Debugging Code Using Multiple Threads with Shared Variables

Subroutine PADD() Machine Code Listing

     .globl padd_
padd_:
# parameter 1: 8 + %ebp
# parameter 2: 12 + %ebp
# parameter 3: 16 + %ebp
# parameter 4(n): 20 + %ebp
..B1.1:                      # Preds ..B1.0
..LN1:
pushl     %ebp                                   #1.0

... ... ... ... ... ... ... ... ... ... ... ... ...

 

..B1.19:                    # Preds ..B1.15
addl      $-28, %esp                             #6.0
movl      $.2.1_2_kmpc_loc_struct_pack.1, (%esp) #6.0
movl      $4, 4(%esp)                            #6.0
movl      $_padd__6__par_loop0, 8(%esp)          #6.0
movl      -196(%ebp), %eax                       #6.0
movl      %eax, 12(%esp)                         #6.0
movl      -152(%ebp), %eax                       #6.0
movl      %eax, 16(%esp)                         #6.0
movl      -112(%ebp), %eax                       #6.0
movl      %eax, 20(%esp)                         #6.0
lea       20(%ebp), %eax                         #6.0
movl      %eax, 24(%esp)                         #6.0
call      __kmpc_fork_call                       #6.0
             # LOE
..B1.39:                    # Preds ..B1.19
addl      $28, %esp                              #6.0
jmp       ..B1.31        # Prob 100%             #6.0
             # LOE
..B1.20:                    # Preds ..B1.30

... ... ... ... ... ... ... ... ... ... ... ... ...


call      __kmpc_for_static_init_4               #6.0
             # LOE
..B1.40:                    # Preds ..B1.20
addl      $36, %esp                              #6.0
             # LOE

... ... ... ... ... ... ... ... ... ... ... ... ...

 
..B1.26:                    # Preds ..B1.28 ..B1.21
addl      $-8, %esp                              #6.0
movl      $.2.1_2_kmpc_loc_struct_pack.1, (%esp) #6.0
movl      -8(%ebp), %eax                         #6.0
movl      %eax, 4(%esp)                          #6.0
call      __kmpc_for_static_fini                 #6.0
             # LOE
..B1.41:                    # Preds ..B1.26
addl      $8, %esp                               #6.0
jmp       ..B1.31        # Prob 100%             #6.0
             # LOE
..B1.27:                    # Preds ..B1.28 ..B1.25
..LN7:
call      omp_get_thread_num_                    #8.0
              # LOE eax
..B1.42:                     # Preds ..B1.27

... ... ... ... ... ... ... ... ... ... ... ... ...

 
cmpl      %edx, %eax                             #10.0
jle       ..B1.27       # Prob 50%               #10.0
jmp       ..B1.26       # Prob 100%              #10.0  
            # LOE
.type padd_,@function
.size padd_,.-padd_
.globl _padd__6__par_loop0
_padd__6__par_loop0:
# parameter 1: 8 + %ebp
# parameter 2: 12 + %ebp
# parameter 3: 16 + %ebp
# parameter 4: 20 + %ebp
# parameter 5: 24 + %ebp
# parameter 6: 28 + %ebp
..B1.30:                     # Preds ..B1.0
..LN16:
pushl     %ebp                                   #13.0
movl      %esp, %ebp                             #13.0
subl      $208, %esp                             #13.0
movl      %ebx, -4(%ebp)                         #13.0
..LN17:
movl      8(%ebp), %eax                          #6.0
movl      (%eax), %eax                           #6.0
movl      %eax, -8(%ebp)                         #6.0
movl      28(%ebp), %eax                         #6.0
..LN18:
movl      (%eax), %eax                           #7.0
movl      (%eax), %eax                           #7.0
movl      %eax, -80(%ebp)                        #7.0
movl      $1, -76(%ebp)                          #7.0
movl      -80(%ebp), %eax                        #7.0
testl     %eax, %eax                             #7.0
jg        ..B1.20       # Prob 50%               #7.0
            # LOE
..B1.31:                   # Preds ..B1.41 ..B1.39 ..B1.38 ..B1.30
..LN19:
movl      -4(%ebp), %ebx                         #13.0
leave                                            #13.0
ret                                              #13.0
.align    4,0x90
# mark_end;
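
In the listing above, padd_ passes the address of _padd__6__par_loop0 to __kmpc_fork_call, and the generated routine brackets its loop with calls to __kmpc_for_static_init_4 and __kmpc_for_static_fini. The same structure can be approximated by hand, as a sketch. The names OUTLINE, PADD2, and PBODY are hypothetical, and the block partition in PBODY is an assumption that stands in for whatever bounds the run-time library actually returns.

```fortran
      PROGRAM OUTLINE
      INTEGER N
      PARAMETER (N = 8)
      INTEGER A(N), B(N), C(N), I
C     Hypothetical input data.
      DO I = 1, N
        A(I) = I
        B(I) = 10 * I
      ENDDO
      CALL PADD2(A, B, C, N)
      PRINT *, (C(I), I = 1, N)
      END

C     Hand-written counterpart of padd_: fork a team and have every
C     thread call the loop body, much as __kmpc_fork_call does with
C     the generated routine.
      SUBROUTINE PADD2(A, B, C, N)
      INTEGER N
      INTEGER A(N), B(N), C(N)
      INTEGER OMP_GET_THREAD_NUM, OMP_GET_NUM_THREADS
!$OMP PARALLEL SHARED(A, B, C, N)
      CALL PBODY(A, B, C, N, OMP_GET_THREAD_NUM(),
     &           OMP_GET_NUM_THREADS())
!$OMP END PARALLEL
      END

C     Hand-written counterpart of the generated loop routine.
      SUBROUTINE PBODY(A, B, C, N, ID, NT)
      INTEGER N, ID, NT
      INTEGER A(N), B(N), C(N)
      INTEGER I, CHUNK, LO, HI
C     Block partition of 1..N (an assumption): thread ID gets the
C     iterations LO..HI; the range is empty if ID has no work.
      CHUNK = (N + NT - 1) / NT
      LO = ID * CHUNK + 1
      HI = MIN(N, LO + CHUNK - 1)
      DO I = LO, HI
        C(I) = A(I) + B(I) + ID
      ENDDO
      END
```

Written this way, the loop body has a name chosen by the programmer rather than by the compiler, and a breakpoint on PBODY is reached once by each thread in the team.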