2.4. Advanced Use

2.4.1. Burst Sampling

By default Freja will sample the application continuously during the run, collecting information about its memory access pattern. The sampling adds some overhead to the application, making it run slower than usual.

However, for long application runs Freja can collect enough information without continuous sampling. Instead it can engage periodically during the execution. This is called burst sampling. Since a smaller part of the execution run is sampled the overhead of the sampling becomes much lower.

Burst sampling can be used to sample executions that take at least 5 minutes. With runs shorter than that the sampler needs to engage so frequently that the execution becomes more or less continuously sampled, and no reduction in overhead is gained.

With burst sampling the overhead for a sampling run becomes independent of the length of the execution. Sampling a 1-hour application run may, for example, cause a 20-minute overhead, taking 1 hour and 20 minutes. Sampling a 4-hour run of the application would then also cause approximately a 20-minute overhead, taking 4 hours and 20 minutes.
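The arithmetic in this example can be sketched in shell as follows. The fixed 20-minute overhead is only the example figure used above, not a measured or guaranteed value, and the function name report is made up for this illustration.

```shell
# Fixed burst-sampling overhead in minutes (example figure from the text).
overhead=20

# Print the total time and the relative overhead for a run of the given
# normal length in minutes.
report() {
    run=$1
    total=$((run + overhead))
    percent=$((100 * overhead / run))    # integer division, rounded down
    echo "run=${run}min total=${total}min overhead=${percent}%"
}

report 60     # 1-hour run
report 240    # 4-hour run
```

The absolute overhead stays the same in both cases, so the relative overhead drops from about 33% for the 1-hour run to about 8% for the 4-hour run.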

When you enable burst sampling you also need to specify an estimated (normal) execution time for the part of the application run that you intend to sample. Freja needs this information to calculate how densely the bursts need to be placed to collect enough information for analysis. If Freja does not capture enough information you will get this warning when the sample file is post-processed:

Post-processing sample file, please wait...
Warning: Not enough samples for reliable results.
Got 6252 samples of the largest line size.
Required number of samples: 10000.
Consider decreasing the estimated execution time by 87%.
Sample file post-processing finished.

If you get the warning above it does not necessarily mean your estimate of the execution time was inaccurate. You may also get this warning if the overhead of sampling this particular application is unusually low. Adjust the estimated execution time as suggested and sample the application again.
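As a small illustration of adjusting the estimate, the sketch below applies the percentage from the warning to an old estimate. The old estimate of 3600 seconds and the function name adjust_estimate are hypothetical, chosen only for this example.

```shell
# Reduce an estimated execution time by the percentage suggested in the
# post-processing warning.
# usage: adjust_estimate <old_estimate_seconds> <suggested_decrease_percent>
adjust_estimate() {
    echo $(( $1 * (100 - $2) / 100 ))
}

# Hypothetical old estimate of one hour, decreased by 87% as in the warning.
adjust_estimate 3600 87
```

With these values the new estimate is 468 seconds, which is the value to enter as the estimated execution time for the next sampling run.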

It is possible to tweak the overhead of the burst sampling by adjusting the quality parameter. Switching this parameter to 'fast' yields a lower overhead but may have a negative impact on the data quality. Conversely, using 'detailed' quality increases both the sampler overhead and the data quality.

The effects from changes in burst quality are usually limited to applications with low fetch ratios when analyzing large caches. Some issues are more sensitive to the quality setting than others, specifically 'Loop Fusion', 'Blocking' and 'Inefficient Loop Nesting'. The locations affected by these issues will still be flagged as issues, even when the burst quality is too low, but with less specific issue types.

It is usually not necessary to change the burst quality parameter. The default quality should provide an acceptable overhead and good data quality for most common cache sizes.

See Section 2.1.5, “Advanced Sampling Settings” and Section 2.2.1.1, “Burst Sampling” for instructions on how to enable burst sampling in the graphical interface and from the command line, respectively.

2.4.2. Sampling Start Conditions

By default Freja will start to sample the application as soon as you have launched it or attached to it. However, you may sometimes want to delay the start of the sampling, for example, to avoid sampling the application startup. There are a few ways to do that.

The simplest way is to specify a delay in seconds before the sampling is started. This can be done in the Advanced sampling settings dialog if you are using the GUI, or with the -d seconds option if you are using the sample command.

You can also specify that the sampling should start when a specific function is called. This can again be done in the Advanced sampling settings dialog if you are using the GUI, or with the --start-at-function function option if you are using the sample command. The sampling is started the first time the specified function is called. Note that the function you specify must be in the application binary. It cannot be in a shared library loaded by the application.

If you want to start the sampling at a function in a shared library you can instead determine the address where the function will be loaded and specify that address. You find the address of the function by starting the program and looking up the function in a debugger.

The address where the sampling should be started is specified in the Advanced sampling settings dialog if you are using the GUI, or with the --start-at-address address option if you are using the sample command. The sampling is started the first time the instruction at the address is executed.

It is also possible to ignore the first few times the function or address is executed. The number of executions to ignore is specified in the Advanced sampling settings dialog if you are using the GUI, or with the --start-at-ignore count option if you are using the sample command. For example, to start the sampling the fourth time the function mult is called you could add the options --start-at-function mult --start-at-ignore 3 to the sample command.
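The start conditions described in this section might be written as sample command lines as sketched below. The application name ./myapp and the address 0x4005d0 are hypothetical placeholders, and both the hexadecimal address format and the placement of the options before -r are assumptions, so check the sample command reference for the exact syntax. The script only prints the command lines so that they can be reviewed before use.

```shell
APP=./myapp    # hypothetical application binary

# Delay the start of the sampling by 30 seconds.
start_delay="sample -d 30 -r $APP"

# Start the sampling the fourth time the function mult is called.
start_func="sample --start-at-function mult --start-at-ignore 3 -r $APP"

# Start the sampling the first time the instruction at an address is executed.
# 0x4005d0 is a placeholder; use the address found with your debugger.
start_addr="sample --start-at-address 0x4005d0 -r $APP"

printf '%s\n' "$start_delay" "$start_func" "$start_addr"
```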

2.4.3. Sampling Stop Conditions

By default the sampling ends when the application terminates or when you manually stop the sampling at the sampler prompt. However, you can also stop the sampling after a fixed time or when a specific function or address is executed, just as you can control when the sampling starts (see Section 2.4.2, “Sampling Start Conditions”).

If you are using the GUI you can find the controls for stopping the sampling in the Advanced sampling settings dialog. They work just like the controls for starting the sampling.

To stop the sampling after a fixed time using the sample command, use the -t seconds option. The options for stopping the sampling when a function or address is executed are --stop-at-function function, --stop-at-address address and --stop-at-ignore count. They work just like the corresponding options for controlling the start of the sampling.

The same limitation applies as when specifying a function to start the sampling at: the function must be in the application binary.

If you specify that the sampling should be stopped after a fixed time, that time is counted from when the sampling is started. Similarly, if you specify that the sampling should be stopped after a function or address has been executed a number of times, those executions are counted from when the sampling is started.
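The way the stop time is counted can be illustrated with the sketch below. The application name ./myapp, the function name cleanup, and the values for the delay and the duration are hypothetical, and combining -d and -t on one command line is an assumption based on the description above. The script only prints the command lines instead of running them.

```shell
APP=./myapp    # hypothetical application binary
delay=10       # seconds to wait before the sampling starts (-d)
duration=60    # seconds to sample, counted from the start of the sampling (-t)

# With both options, the sampling ends delay + duration seconds after launch.
stop_after_launch=$((delay + duration))
echo "sampling stops about ${stop_after_launch} s after launch"

# Stop the sampling after a fixed time.
stop_time="sample -d $delay -t $duration -r $APP"

# Stop the sampling the first time the hypothetical function cleanup is
# called after the sampling has started.
stop_func="sample --stop-at-function cleanup -r $APP"

printf '%s\n' "$stop_time" "$stop_func"
```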

2.4.4. Sample Files

A sample file contains the fingerprint of the application memory access behavior collected during sampling. If you have a sample file you can generate a report for that sampling of the application for any cache configuration.

The amount of information captured in a sample file is measured in the number of samples. Too few samples will cause unreliable results, while capturing too many samples will cause the sampling of the program to run slower and the report generation to take longer and use more memory. 10000 samples is enough to get a reliable result, and by default the sampler will try to capture 50000 samples.

The report will contain a clearly visible warning if the sample file contained too few samples for a reliable result.

2.4.4.1. Tuning the Sample Period

The sampler automatically adapts to the running program and attempts to capture 50000 samples during the sampling run. There is therefore usually no need to adjust the sample rate.

However, if the program runs for a very short time, a few seconds or less, the sampler may fail to capture enough samples. You will then get a warning like this when the sample file is post-processed:

Warning: Not enough samples for reliable results.
Got 3011 samples of the largest line size.
Required number of samples: 10000.
Consider decreasing the sample period to at least 301.

To collect enough samples you then have to manually tell the sampler to use a lower sample period than the default of 100000, as suggested in the warning. If you are using the graphical user interface you can specify the sample period in the Advanced sampling settings dialog. If you run the sample command from the command line, use the -s sample period option. For example, to sample ls -l with a sample period of 100:

$ sample -s 100 -r ls -l

If you are sampling a program for a long time you may get a small sampling speedup if you increase the sample period from the default. There is no simple rule of thumb for how to set the sample period in this case. Try increasing the sample period step by step. When you get the warning above, you know you have exceeded the maximum sample period that can be used.

2.4.4.2. Disabling Application Stack Use by the Sampler

By default the sampler will use the application stack for temporary storage, since that is faster than alternative storage options. This is safe to do with the vast majority of applications, since the sampler obeys the standard conventions for how the stack should be managed.

However, with a few applications that use the stack in a non-standard way this may lead to incorrect execution or crashes. If you experience such problems you may want to try disabling the sampler's use of the application stack. This will result in slower sampling, but avoids the potential problems.

To disable the sampler's use of the application stack add the --safe-stack option to the sample command, or check the Safe stack handling check box in the Advanced sampling settings dialog if you are using the GUI.
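A command line using this option might look like the sketch below. The application name ./myapp is a hypothetical placeholder, and the placement of --safe-stack before -r is an assumption. The script only prints the command line instead of running it.

```shell
APP=./myapp    # hypothetical application binary

# Sample the application without letting the sampler use its stack.
safe_cmd="sample --safe-stack -r $APP"
echo "$safe_cmd"
```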