Profrun Utility

The profrun utility is a tool for collecting data from the Performance Monitoring Unit (PMU) hardware for a CPU.

Profrun Utility Requirements and Behavior

This utility uses the Intel® VTune™ Performance Analyzer Driver to sample events. Therefore, you must have the VTune™ analyzer installed to use the profrun utility. The requirements differ depending on platform:

Platform

Requirements

Linux*

  1. Install Intel® VTune™ Performance Analyzer 3.0 for Linux.

  2. Grant the utility user the appropriate access to the VTune™ Performance Analyzer Driver.

    Every user running the utility must be a member of the group used by the Intel® VTune™ Performance Analyzer 3.0 for Linux Driver; the default group is vtune. If another group was specified during installation, add the user to that group.

Windows*

  1. Install Intel® VTune™ Performance Analyzer 7.2.

  2. Grant the utility user the appropriate access to the VTune™ Performance Analyzer Driver.

    Every user running the utility must have the Profile Single Process and Profile System Performance rights on the system. Adding the user to either the Administrators or the Power Users group should grant these rights. Refer to Notes on Windows-family installations in the Intel® VTune™ Performance Analyzer Release Notes for detailed information.

  3. Specify a local disk as the default directory.

The utility, in coordination with the analyzer driver, collects samples of events monitored by the PMU and creates a hardware profiling information file (.hpi). The hardware profiling data contained in the file can be used by the Intel® compiler to further enhance optimizations for some programs.

During the initial target program analysis phase, the VTune™ analyzer driver creates a file that contains event data for all processes running on the system. By default, the file is named pgopti.tb5. The profrun utility then coalesces the data for the target executable into a significantly smaller pgopti.hpi file and deletes the .tb5 file.

Note

Your system must have sufficient hard disk space to hold the .tb5 file temporarily. The file size varies widely; plan for as much as 400 MB.
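The disk space guideline in this note can be checked before a profiling run. The following Python sketch is illustrative only and is not part of the profrun utility; the function name has_room_for_tb5 is hypothetical, and the sketch assumes that 400 MB means 400 × 1024 × 1024 bytes.

```python
import shutil

# Planning figure from the note above. Treating 1 MB as 1024 * 1024 bytes
# is an assumption made for this sketch.
TB5_PLANNING_BYTES = 400 * 1024 * 1024


def has_room_for_tb5(directory=".", needed_bytes=TB5_PLANNING_BYTES):
    """Return True if `directory` has at least `needed_bytes` of free space."""
    free_bytes = shutil.disk_usage(directory).free
    return free_bytes >= needed_bytes


if __name__ == "__main__":
    if not has_room_for_tb5("."):
        print("Less than 400 MB free here; consider another disk for the .tb5 file.")
```

If the check fails for the current directory, the -tb5:file command described under Profrun Utility Options can place the .tb5 file elsewhere.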

The VTune™ analyzer driver can be used by only one process at a time. If the driver is already in use by another process, profrun by default waits up to 10 minutes for the driver to become available, and returns an error if the driver is still in use.
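The wait period is controlled by the -wait[:time] command listed under Profrun Utility Options. The following Python sketch formats that command from a value given in minutes. The helper name wait_option is hypothetical, and the sketch assumes that the time argument is expressed in seconds, as the documented default of 600 seconds suggests.

```python
def wait_option(minutes):
    """Format the -wait[:time] command for a wait given in minutes.

    Assumption: profrun expects the time in seconds (the documented
    default is 600 seconds). A value of 0 disables the wait.
    """
    if minutes < 0:
        raise ValueError("wait time cannot be negative")
    return f"-wait:{int(minutes * 60)}"


if __name__ == "__main__":
    # Wait up to 20 minutes for the driver on a busy machine.
    print("profrun", wait_option(20), "-dcache", "MyApp")
```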

Using the profrun Utility

  1. Compile your target application using the -prof-gen-sampling (Linux*) or /Qprof-gen-sampling (Windows*) option. The following examples illustrate possible combinations:

Platform

Command Examples

Linux

ifort -oMyApp -O2 -prof-gen-sampling source1.f source2.f

Windows

ifort /FeMyApp.exe /O2 /Qprof-gen-sampling source1.f source2.f

  2. Run the resulting executable by entering a command similar to the following:

Platform

Command Examples

Linux

profrun -dcache MyApp

Windows

profrun -dcache MyApp.exe

This step uses the VTune™ analyzer driver to produce the necessary .hpi file.

  3. Compile your target application again, this time using the -prof-use (Linux) or /Qprof-use (Windows) option. The following examples illustrate valid combinations:

Platform

Command Examples

Linux

ifort -oMyApp -O2 -prof-use source1.f source2.f

Windows

ifort /FeMyApp.exe /O2 /Qprof-use source1.f source2.f

The -prof-use (Linux) or /Qprof-use (Windows) option instructs the compiler to read the .hpi file and optimize the application using the collected sample data.
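The three steps above can be scripted. The following Python sketch lists the Linux commands from the examples in order and, by default, only prints them. The source and application names are the placeholders used in the examples, and the run_workflow helper is hypothetical rather than part of any Intel tool.

```python
import subprocess

# Placeholder names taken from the command examples above.
SOURCES = ["source1.f", "source2.f"]

STEPS = [
    # 1. Build with sampling support.
    ["ifort", "-oMyApp", "-O2", "-prof-gen-sampling", *SOURCES],
    # 2. Run under profrun to collect data cache miss samples into the .hpi file.
    ["profrun", "-dcache", "MyApp"],
    # 3. Rebuild so that the compiler reads the .hpi file.
    ["ifort", "-oMyApp", "-O2", "-prof-use", *SOURCES],
]


def run_workflow(steps=STEPS, dry_run=True):
    """Print each step; execute it only when dry_run is False."""
    for argv in steps:
        print(" ".join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)  # stop at the first failing step


if __name__ == "__main__":
    run_workflow()  # dry run: prints the three commands only
```

Calling run_workflow(dry_run=False) would execute the commands, which requires the compiler, the VTune™ analyzer driver, and the source files to be available.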

The profrun utility uses the following syntax:

Syntax

profrun -command [argument...] application

where command is one or more of the commands listed in the following table, argument is one or more of the extended switches, and application is the command line for running the application, which is usually the application name.

Note

Windows* systems: Unlike the compiler options, which are preceded by a forward slash ("/"), the utility options are preceded by a hyphen ("-").
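As an illustration of the syntax, the following Python sketch assembles a profrun command line as an argument list. The profrun_argv helper and the file name MyApp.hpi are hypothetical. The sketch simply places every option before the application and always uses the hyphen prefix, on Windows as well as Linux.

```python
def profrun_argv(commands, application):
    """Assemble `profrun -command [argument...] application` as an argv list.

    `commands` maps a command name (without the hyphen) to its value,
    or to None when the command takes no value.
    `application` is the list of tokens that runs the target program.
    """
    argv = ["profrun"]
    for name, value in commands.items():
        # Utility options start with "-" on both Linux and Windows.
        argv.append(f"-{name}" if value is None else f"-{name}:{value}")
    return argv + list(application)


if __name__ == "__main__":
    print(" ".join(profrun_argv({"dcache": None}, ["MyApp"])))
    print(" ".join(profrun_argv({"branch": 5003, "hpi": "MyApp.hpi"}, ["MyApp.exe"])))
```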

Profrun Utility Options

The following table summarizes the available profrun utility commands, lists defaults where applicable, and provides a brief description of each.

Command

Default

Description

-help

 

Lists the supported tool options.

[sav]

10007

An optional argument supported by the -branch, -dcache, -icache or -event commands.

This integer value specifies the sample-after value used for collecting samples. The default value is 10007, which is a moderately sized prime number. If you do not specify a value for sav when using any of the supported commands, the default value is used. Use a prime number for best results.

When changing the value, keep the following guidelines in mind:

  • Decreasing the value forces the utility to sample more frequently. Frequent sampling results in a more accurate profile and a larger output file than the default value produces.

  • Increasing the value forces the utility to sample less frequently. Less frequent sampling results in a less accurate profile and a smaller output file.

-branch[:sav]

10007

Collect branch samples. A sample is taken after every interval, as defined by the sav argument.

Use this command to gather information that guides the compiler in laying out hot and cold blocks, predicting the direction of conditional branches, and deciding where it is most profitable to inline functions. Gathering information with this command is similar to using the -prof-gen (Linux) or /Qprof-gen (Windows) option to instrument your program, but this command is much less intrusive.

-dcache[:sav]

10007

Collect samples of data cache misses. A sample is taken after every interval, as defined by the sav argument.

Use this command to gather information to guide the compiler in placing prefetch operations and performing data layout.

-icache[:sav]

10007

Collect samples of instruction cache misses. A sample is taken after every interval, as defined by the sav argument.

Use this command to gather information to guide the compiler in placing prefetch operations and performing data layout.

-event:eventname[:sav]

10007

Collect information about valid VTune™ analyzer events; eventname specifies a specific event name. Use this command when -branch, -dcache, or -icache do not apply.

Some event names contain embedded spaces. In that case, you can use a period in place of each space; the utility changes periods to spaces before passing the event name to the VTune™ analyzer.

Refer to Intel® VTune™ Performance Analyzer documentation for more information on valid events.

-tb5:file

pgopti.tb5

Specifies the name of the .tb5 file generated by the VTune™ analyzer driver while it profiles the application. By default, the file resides in the current directory.

The specified file is deleted when profrun completes unless -tb5only is also specified.

Consider overriding the default location to place the .tb5 file on a disk with more available space.

-tb5only

 

Produces only the .tb5 file. If this command is not specified, the utility reduces the data into a single .hpi file and deletes the .tb5 file.

-wait[:time]

600 seconds (10 minutes)

Forces the utility to wait for the specified time before attempting to access the VTune™ analyzer driver. This option is most useful in cases where you anticipate the driver will be busy.

Disable the command by specifying a value of 0 (zero).

-hpi:file

pgopti.hpi

Specifies the name of the .hpi file that contains the profile information for the application. The file resides in the current directory.

-executable:file

 

Specifies the name of the executable being profiled. By default, this is the first token of the command.

-bufsize:size

65536 KB (64 MB)

Specifies the buffer size, in kilobytes (KB).

-sampint:interval

 

Specifies the sampling interval in milliseconds (ms). This command is rarely needed because all sampling is event-based.

--

 

Stop parsing options for the tool. Use this if the command name starts with a hyphen.
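Several rules from the table can be sketched in Python: choosing a prime sample-after value, writing an event name that contains spaces, and inserting the -- separator. All helper names are hypothetical, the event name in the usage lines is a placeholder rather than a real VTune™ analyzer event, and the code only formats command lines without running profrun.

```python
def is_prime(n):
    """Trial-division primality test; adequate for sample-after values."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def next_prime(n):
    """Return the smallest prime greater than or equal to n."""
    candidate = max(n, 2)
    while not is_prime(candidate):
        candidate += 1
    return candidate


def event_command(event_name, sav=None):
    """Format -event:eventname[:sav], writing embedded spaces as periods."""
    token = "-event:" + event_name.replace(" ", ".")
    return token if sav is None else f"{token}:{sav}"


def with_separator(options, application):
    """Insert "--" before an application whose name begins with a hyphen."""
    separator = ["--"] if application[0].startswith("-") else []
    return ["profrun", *options, *separator, *application]


if __name__ == "__main__":
    sav = next_prime(5000)  # sample roughly twice as often as the default
    # "Hypothetical Event Name" is a placeholder, not a real event.
    options = [event_command("Hypothetical Event Name", sav)]
    print(" ".join(with_separator(options, ["-myapp"])))
```

Choosing the next prime at or above a target value keeps to the "use a prime number" guidance while still letting you pick an approximate sampling frequency.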