4.10. Hardware Prefetch Probability

Modern processors detect regular access patterns in applications and use this information to prefetch data into the cache before the application needs it. This can hide memory access latency and avoid the stalls that cache misses would otherwise cause.

Freja models a generic hardware prefetcher and estimates the percentage of cache misses that the hardware prefetcher can avoid. This estimate assumes an idle memory bus and does not take limited memory bandwidth into account.
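To make the idea of such an estimate concrete, the following is a minimal sketch of a toy stride detector. It is not Freja's actual model, which is not described here. The function name prefetch_probability, the 64-byte line size, and the rule that a stride must repeat twice before a miss counts as prefetchable are all assumptions chosen for illustration.

```python
import random


def prefetch_probability(miss_addresses, line_size=64, confirmations=2):
    """Estimate the fraction of misses a simple stride prefetcher could cover.

    Toy model (an assumption, not Freja's algorithm): a miss counts as
    prefetchable when its stride, measured in cache lines from the previous
    miss, matches a stride already observed `confirmations` times in a row.
    """
    lines = [addr // line_size for addr in miss_addresses]
    if not lines:
        return 0.0

    covered = 0
    last_stride = None
    streak = 0  # number of consecutive steps that used last_stride
    for prev, cur in zip(lines, lines[1:]):
        stride = cur - prev
        # The prefetcher only helps once the pattern has been confirmed.
        if streak >= confirmations and stride == last_stride:
            covered += 1
        if stride == last_stride:
            streak += 1
        else:
            last_stride = stride
            streak = 1
    return covered / len(lines)


# Sequential misses, one per cache line: almost all are predictable.
sequential = [i * 64 for i in range(100)]
print(prefetch_probability(sequential))  # 0.97

# Misses at scattered lines: the stride rarely repeats.
scattered = [line * 64 for line in random.Random(0).sample(range(1_000_000), 100)]
print(prefetch_probability(scattered))
```

In the sequential case, only the first miss and the two misses needed to confirm the stride are left uncovered. Like the estimate described above, this sketch ignores memory bandwidth entirely.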

The hardware prefetch probability can be used to judge whether addressing an issue will be worthwhile and have a noticeable effect, or whether the processor is likely to be able to handle the problem by itself. Fixing an issue with a high hardware prefetch probability may not result in a performance improvement, since the processor is likely to be able to prefetch the data and therefore avoid the cache misses.

However, if the application runs into the memory bandwidth limitation, prefetching will be ineffective, and fixing the issue will improve performance anyway.

Another use for the hardware prefetch probability is to find data structures that are not effectively prefetched by the hardware prefetcher, and try to reorganize them so that they can be effectively prefetched.
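As a rough illustration of such a reorganization, the sketch below compares the addresses touched when traversing a linked list whose nodes sit in arbitrary pool slots with the same list after its nodes have been laid out in traversal order. The node size, base address, and helper names are hypothetical, and the "fraction of repeated strides" is only a crude stand-in for prefetchability, not the metric Freja reports.

```python
import random

NODE_SIZE = 32  # bytes per node; an assumed value for illustration


def traversal_addresses(slot_order, base=0x10000):
    """Addresses touched when following a list whose i-th node
    lives in pool slot slot_order[i]."""
    return [base + slot * NODE_SIZE for slot in slot_order]


def constant_stride_fraction(addresses):
    """Fraction of steps whose stride equals the previous step's stride."""
    strides = [b - a for a, b in zip(addresses, addresses[1:])]
    if len(strides) < 2:
        return 0.0
    repeats = sum(1 for s, t in zip(strides, strides[1:]) if s == t)
    return repeats / (len(strides) - 1)


n = 1000

# Nodes allocated in arbitrary slots: traversal jumps around the pool.
scattered = list(range(n))
random.Random(1).shuffle(scattered)

# Nodes re-laid out so that traversal order matches memory order.
compacted = list(range(n))

print(constant_stride_fraction(traversal_addresses(scattered)))
print(constant_stride_fraction(traversal_addresses(compacted)))  # 1.0
```

After compaction every step advances by the same number of bytes, which is the kind of regular pattern a hardware prefetcher can detect.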