Difference: SPMV (13 vs. 14)

Revision 142009-06-25 - VasileiosKarakasis

Line: 1 to 1
META TOPICPARENT name="ActivitiesProjects"

Sparse Matrix-Vector Multiplication

Line: 8 to 8
 Another important optimization that has been widely applied to SPMV is blocking. While such optimization was initially intended to increase register reuse and the locality of references to the input vector, the main advantage of blocking is that it can significantly reduce the working set of the algorithm, thus leading to less memory traffic, meaning better performance for the case of the memory-bound SPMV. After attacking the memory intensiveness of the algorithm, the computational part of the kernel becomes more significant as a result. This allows for computation optimizations, such as vectorization, which can add a further performance boost to the blocked kernel. However, preliminary experiments have shown that the performance among different block shapes, which lead to the same working set, can have significantly varying performance, especially in the case of vectorization, due to a number of architectural issues raised from modern commodity microprocessors. Considering the multitude of blocked storage formats available for sparse matrices and the fact that block shapes can heavily affect the performance of SPMV, a performance model that could accurately predict the performance of a blocked storage format and then guide through the proper selection of both the right method and the right block shape would be desirable. Towards this direction, we have developed a performance model that accounts for both the memory and computation times of the blocked kernels. Our model is capable to better approximate the real execution time and lead to a more fit selection of method and block shape than earlier models, which solely based their prediction to memory traffic.


  • V. Karakasis, G. Goumas, and N. Koziris, "Performance models for blocked Sparse Matrix-Vector multiplication kernels," International Conference on Parallel Processing (ICPP), Vienna, Austria, September 22-25, 2009. (to appear).
  • V. Karakasis, G. Goumas, and N. Koziris, "A comparative study of blocking storage methods for sparse matrices on multicore architectures," IEEE International Conference on Computational Science and Engineering (CSE), Vancouver, BC, Canada, August 29-31, 2009. (to appear).
  • V. Karakasis, G. Goumas, and N. Koziris, "Exploring the effect of block shapes on the performance of sparse kernels," Proceedings of the 23rd International Parallel and Distributed Processing Symposium (IPDPS-PDSEC), Rome, Italy, May 25-29, 2009.

Diploma Thesis

  • Implementation of the Sparse Matrix-Vector Multiplication computational kernel on massively threaded graphics processors (pdf).
This site is powered by the TWiki collaboration platform Powered by Perl

No permission to view TWiki.WebBottomBar