From ArcoWiki
Revision as of 18:05, 12 March 2012 by Jmanel

  • Vijay S. Pai. Improving the Speed and Quality of Architectural Performance Evaluation. March 12th. (Slides)

    Abstract: The performance of multithreaded applications on multicore architectures depends largely on locality and communication. However, most performance analyses are architecture-dependent, and hence insights gleaned from an application's behavior on one platform may not apply when the application is run on another. In contrast, architecture-independent metrics allow a program's performance to be analyzed across a range of architectures without incurring the overhead of repeated profiling and analysis. We propose multicore-aware reuse distance, which captures the inherent locality properties of an application along with the impact of inter-thread data interactions. We then show how statistical sampling and parallelization can speed this analysis up by orders of magnitude with minimal loss of accuracy, enabling the use of privatized O(1) data structures, reduced synchronization, and sampling rates as low as one in a million.
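For readers unfamiliar with the metric, the following is a minimal sketch of classic single-threaded reuse distance (also called LRU stack distance): for each access, the number of distinct addresses touched since the previous access to the same address. It is not the multicore-aware variant proposed in the talk, and it uses a plain linear scan rather than the sampled, parallelized analysis the abstract describes. The function names `reuse_distances` and `lru_hits` and the toy trace are illustrative assumptions, not taken from the talk.

```python
from collections import OrderedDict


def reuse_distances(trace):
    """Return one reuse distance per access in `trace`.

    The distance is the number of distinct addresses accessed since the
    previous access to the same address; first-time accesses get None.
    Illustrative O(N * M) version (N accesses, M distinct addresses).
    """
    stack = OrderedDict()  # keys ordered from least to most recently used
    distances = []
    for addr in trace:
        if addr in stack:
            keys = list(stack)
            # Addresses positioned after `addr` were touched more recently.
            distances.append(len(keys) - 1 - keys.index(addr))
            stack.move_to_end(addr)  # addr becomes most recently used
        else:
            distances.append(None)  # cold access: no previous use
            stack[addr] = True
    return distances


def lru_hits(distances, capacity):
    """Count the accesses that would hit in a fully associative LRU cache
    holding `capacity` addresses: exactly those with distance < capacity."""
    return sum(1 for d in distances if d is not None and d < capacity)


if __name__ == "__main__":
    trace = ["a", "b", "c", "a", "b", "b"]
    dists = reuse_distances(trace)
    print(dists)
    for capacity in (1, 2, 3):
        print(capacity, lru_hits(dists, capacity))
```

Because the distances depend only on the access trace, the same histogram can be reused to estimate hit counts for any fully associative LRU cache size without re-profiling, which is the sense in which the metric is architecture-independent.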