Name

report — generate a report from a sample file

Synopsis

report [-h, --help] [--version] [--verbose] -i, --in-file FILENAME [-t, --title TITLE] [-o, --report NAME]
[-s, --source-dir DIRECTORY...] [-D, --debug-dir DIRECTORY...] [-b, --binary BINARY...] [--debug-symbol-level LEVEL] [--c++filt[=FILTER]]
[-p, --percent PERCENT] [-d, --depth FRAMES] [--new-hwpf-algorithm] [--no-barriers]
[--cpu CPU] [--number-of-cpus NUM] [--level LEVEL]
[-c, --cache-size SIZE] [-l, --line-size SIZE] [-r, --replacement POLICY] [-n, --number-of-caches NUM] [--group THREAD[,THREAD...]] [--real-thread-id]

Description

report generates a report based on an existing sample file.

Options

-h, --help

Print help message.

--version

Print version information.

--verbose

Show which filenames are tried when searching for debug information and source code.

-i, --in-file FILENAME

Specifies the input file.

-t, --title TITLE

Title of report.

-o, --report NAME

Name of the generated report file. The default is report.tsr.

-s, --source-dir DIRECTORY

Additional directories to look for source code in.

-D, --debug-dir DIRECTORY

Additional directories to search for external debug information. By default, the report tool looks in the system-wide debug directory (/usr/lib/debug), in the .debug directory located next to the binary, and in the directory that contains the binary itself.
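The default locations described above can be sketched as a small shell helper. This is only an illustration: default_debug_dirs is a hypothetical function, not part of the report tool, the path /opt/app/bin/server is made up, and the order shown simply follows the order in which the text lists the directories (the actual search order is not documented here).

```shell
# Hypothetical helper: print the default debug-information directories
# for the binary path given as the first argument.
default_debug_dirs() {
    bin_dir=$(dirname "$1")
    echo "/usr/lib/debug"     # system-wide debug directory
    echo "$bin_dir/.debug"    # .debug directory next to the binary
    echo "$bin_dir"           # directory containing the binary
}

# Made-up binary path, used only for illustration.
default_debug_dirs /opt/app/bin/server
```

Any directory passed with -D would be searched in addition to these.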

-b, --binary BINARY

Specifies an additional binary containing debug information to use if the sampled binary with the same file name cannot be found.

--debug-symbol-level LEVEL

(experimental) Balance debug symbol detail and processing speed.

0

no debug symbols

1

line number

2

line number and public symbols

3

full debug info (default)

--c++filt[=FILTER]

Specify an external symbol demangler program to translate the symbols for presentation. Useful for C++ code compiled with the 'stabs' debugging format.

If --c++filt is given without a program, the external program c++filt is used.

If --c++filt=FILTER is specified, use the program FILTER.

Default is to not translate symbols.

-p, --percent PERCENT

Percent of total fetches, upgrades or write-backs required for advice to be reported. The default is 1.

-d, --depth FRAMES

Stack depth to use for separating issues caused by different calls to the same function. Default 1. Use 0 to merge all different call paths into a function for analysis.

--new-hwpf-algorithm

Use a new, experimental version of the hardware prefetch analysis algorithm that captures complex patterns more diligently. For some input sets it consumes more memory and takes longer than the default algorithm.

--no-barriers

Show fusion and blocking advice which would otherwise be suppressed due to detected possible data dependencies.

--cpu CPU

Selects the processor model to use in the analysis. CPU is specified as vendor-id/cpu-id. The default is 'auto'.

The following special processor models are defined:

help

Lists available processor models.

auto

Auto-detects the processor model of the computer the report is being generated on.

--number-of-cpus NUM

Number of physical processors to include in the analysis. Each physical processor may have multiple logical processors (cores/threads). The special value '0' may be used to indicate that auto-detection should be used, which is also the default.

--level LEVEL

Selects the cache level to analyze. The number of available cache levels depends on the selected processor model. The default is to analyze the highest cache level.

-c, --cache-size SIZE

Overrides the cache size specified in the processor model.

-l, --line-size SIZE

Overrides the cache line size specified in the processor model. Must be a power of two.
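A line size can be checked before it is passed to -l. The sketch below is a hypothetical helper, not something the report tool provides; it uses the usual bit trick that a positive power of two has exactly one bit set, so n & (n - 1) is zero.

```shell
# Hypothetical helper, not part of the report tool: succeeds only when
# its argument is a positive power of two.
is_power_of_two() {
    [ "$1" -gt 0 ] && [ $(( $1 & ($1 - 1) )) -eq 0 ]
}

# Try a few candidate line sizes.
for size in 32 64 96 128; do
    if is_power_of_two "$size"; then
        echo "$size: power of two"
    else
        echo "$size: not a power of two"
    fi
done
```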

-r, --replacement POLICY

Cache replacement policy. Must be 'random' or 'lru'. The default is 'random'.

-n, --number-of-caches NUM

Total number of caches to assign threads to. Should match the number of caches of the desired cache level for the intended processor/architecture. Default: Determined by the processor model, cache level and the number of physical processors.

The special value '0' may be used to assign one private cache to each thread in the application.

--group THREAD[,THREAD...]

Manually specify thread-to-cache mappings. Each instance of the --group parameter creates a new cache with the specified threads. Threads that are left unmapped will be automatically assigned to the created caches.

This option overrides the number of caches specified by the processor model and by the --number-of-caches option.

--real-thread-id

Use real thread IDs instead of virtual ones when specifying cache groups using the --group parameter. Virtual thread IDs are assigned sequentially starting from 0 in the order the threads are seen in the sample file. The real thread IDs are the IDs exposed by the operating system during sampling.
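The numbering rule for virtual thread IDs can be illustrated with a short sketch. It assumes a plain list of real thread IDs, one per line, in the order they are encountered; this is not the actual sample file format, the IDs are made up, and assign_virtual_ids is a hypothetical helper.

```shell
# Hypothetical helper: read real thread IDs (one per line) and print
# each distinct ID with a virtual ID assigned in order of first
# appearance, starting from 0.
assign_virtual_ids() {
    awk '!($1 in seen) { seen[$1] = vid++; print $1 " -> " seen[$1] }'
}

# Made-up real thread IDs; repeated IDs keep their first virtual ID.
printf '%s\n' 4711 4713 4711 4712 4713 | assign_virtual_ids
```

With this input, 4711 is seen first and gets virtual ID 0, 4713 gets 1, and 4712 gets 2.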

Examples

Example 8. Analyzing sample files using autodetected CPU models

Perform an analysis of sample.smp for the currently running processor on cache level 2:

$ report --level 2 -i sample.smp

The report tool will create a report file named report.tsr by default.


Example 9. Specifying a CPU model

If you are generating the report on a different processor than the one you are analyzing for, you may specify the processor model with the --cpu option.

First, use --cpu help to get a list of available CPUs.

$ report --cpu help

Find the processor you want to perform the analysis for, for example the Intel Quad-Core Xeon E5345, which has the model name 'clovertown_4_8'.

Use the manufacturer name together with the model name like this when calling report:

$ report --cpu intel/clovertown_4_8 -i sample.smp


Example 10. Using custom thread to cache mappings

Some effects of communication between threads change depending on how threads are mapped to different caches. It is possible to explicitly specify thread-to-cache mappings to evaluate such effects.

Assume that the sampled application contains 4 threads and we are interested in what happens in the L2 cache of a system with two coherent caches at this level. Assuming we are only interested in the cases where two threads are mapped to each cache, the following commands will create reports for all unique such cases:

$ report --group 0,1 --group 2,3 --level 2 -o case1 -i sample.smp
$ report --group 0,2 --group 1,3 --level 2 -o case2 -i sample.smp
$ report --group 0,3 --group 1,2 --level 2 -o case3 -i sample.smp

There are other possible permutations as well, but they will be identical to one of the above mappings due to symmetry.
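The three mappings can also be generated rather than typed by hand. The sketch below only prints the command lines and does not run them; print_group_commands is a hypothetical helper. It pins thread 0 to the first group and varies its partner, which is one simple way to avoid emitting the symmetric duplicates.

```shell
# Hypothetical helper: print one report command per unique way of
# splitting virtual threads 0-3 into two caches of two threads each.
# Thread 0 always goes in the first group, so mirrored mappings such
# as "2,3 / 0,1" are never generated.
print_group_commands() {
    n=1
    for partner in 1 2 3; do
        rest=""
        for t in 1 2 3; do
            # Threads that are not paired with thread 0 form the second group.
            if [ "$t" -ne "$partner" ]; then
                rest="${rest:+$rest,}$t"
            fi
        done
        echo "report --group 0,$partner --group $rest --level 2 -o case$n -i sample.smp"
        n=$((n + 1))
    done
}

print_group_commands
```

The printed lines match the three commands listed in this example and can be reviewed before being executed.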


Exit Status

0

Successful program execution.

>0

An error occurred.
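In scripts, the exit status convention above can be checked as follows. This is a minimal sketch; run_and_check is a hypothetical wrapper, and the command to run is passed as arguments so that the wrapper can be tried with any program.

```shell
# Hypothetical wrapper: run the given command and report whether it
# exited with status 0 (success) or a status greater than 0 (error).
run_and_check() {
    if "$@"; then
        echo "ok"
    else
        status=$?
        echo "failed with exit status $status"
        return "$status"
    fi
}
```

It could be used like this:

$ run_and_check report --level 2 -i sample.smp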

Environment

RW_LICENSE_FILE

Environment variable pointing to the license file. Should only be used to override default license locations.

Files

$HOME/.threadspotter/license

Directory containing per-user license files. Freja looks for license files here if no system-wide license file can be found.