This portal provides links to various research papers on simultaneous multithreading.

A more extensive (yet outdated) paper list can be found here:

General on Simultaneous Multithreading

  • "Simultaneous Multithreading: A Platform for Next-Generation Processors". S. Eggers, J. Emer, H. Levy, J. Lo, R. Stamm, D. Tullsen. IEEE Micro, vol. 17, no. 5, 1997. (ps)
  • "ILP versus TLP on SMT". N. Mitchell, L. Carter, J. Ferrante, D. Tullsen. Proceedings of the ACM/IEEE conference on Supercomputing, 1999. (ps)

SMT Implementations

  • "Hyper-Threading Technology Architecture and Microarchitecture". D. Marr, F. Binns, D. Hill, G. Hinton, D. Koufaty, J. Miller, M. Upton. Intel Technology Journal, vol. 6, issue 1, 2002. (pdf)
  • "Initial Observations of the Simultaneous Multithreading Pentium 4 Processor". N. Tuck, D. Tullsen. Proceedings of the 12th International Conference on Parallel Architectures and Compilation Techniques, 2003. (pdf)
  • "IBM Power5 Chip: A Dual-Core Multithreaded Processor". R. Kalla, B. Sinharoy, J. Tendler. IEEE Micro, vol. 24, no. 2, 2004. (pdf)(cslab)

Resource Sharing in SMTs

  • "Exploiting choice: instruction fetch and issue on an implementable simultaneous multithreading processor". D. Tullsen, S. Eggers, J. Emer, H. Levy, J. Lo, R. Stamm. Proceedings of the 23rd annual international symposium on Computer architecture, 1996. (ps)
  • "Handling long-latency loads in a simultaneous multithreading processor". D. Tullsen, J. Brown. Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture, 2001. (pdf)
  • "Front-End Policies for Improved Issue Efficiency in SMT Processors". A. El-Moursy, D. Albonesi. Proceedings of the 9th International Symposium on High-Performance Computer Architecture, 2003. (pdf)
  • "Dynamically Controlled Resource Allocation in SMT Processors". F. Cazorla, A. Ramirez, M. Valero, E. Fernandez. Proceedings of the 37th annual IEEE/ACM international symposium on Microarchitecture, 2004. (pdf)
  • "Learning-Based SMT Processor Resource Distribution via Hill-Climbing". S. Choi, D. Yeung. Proceedings of the 33rd annual international symposium on Computer Architecture, 2006. (pdf)
  • "Software-Controlled Priority Characterization of POWER5 Processor". C. Boneti, F. Cazorla, R. Gioiosa, A. Buyuktosunoglu, C. Cher, M. Valero. SIGARCH Comput. Archit. News, ACM, 36, 415-426, 2008. (pdf)

Helper Threading

  • "Simultaneous subordinate microthreading (SSMT)". R. Chappell, J. Stark, S. Kim, S. Reinhardt, Y. Patt. Proceedings of the 26th annual international symposium on Computer architecture, 1999. (pdf)
  • "Tolerating Latency through Software-Controlled Pre-Execution in Simultaneous Multithreading Processors". C. Luk. Proceedings of the 28th Annual International Symposium on Computer Architecture, 2001. (ps)
  • "Speculative precomputation: long-range prefetching of delinquent loads". J. Collins, H. Wang, D. Tullsen, C. Hughes, Y. Lee, D. Lavery, J. Shen. Proceedings of the 28th annual international symposium on Computer architecture, 2001. (pdf)
  • "Speculative Precomputation: Exploring the Use of Multithreading for Latency". H. Wang, P. Wang, R. Weldon, S. Ettinger, H. Saito, M. Girkar, S. Liao, J. Shen. Intel Technology Journal, vol. 6, issue 1, 2002. (pdf)
  • "Transparent Threads: Resource Sharing in SMT Processors for High Single-Thread Performance". G. Dorai, D. Yeung. Proceedings of the 11th international conference on Parallel architectures and compilation techniques, 2002. (pdf)
  • "Physical Experimentation with Prefetching Helper Threads on Intel's Hyper-Threaded Processors". D. Kim, J. Shen, S. Liao, P. Wang, J. Cuvillo, X. Tian, X. Zou, H. Wang, D. Yeung, M. Girkar. Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization, 2004. (pdf)

Synchronization

  • "Supporting Fine-Grained Synchronization on a Simultaneous Multithreading Processor". D. Tullsen, J. Lo, S. Eggers, H. Levy. Proceedings of the IEEE 5th International Symposium on High Performance Computer Architecture, 1999. (ps)
  • "Exploiting Fine-Grained Data Parallelism with Chip Multiprocessors and Fast Barriers". J. Sampson, R. Gonzalez, J. Collard, N. Jouppi, M. Schlansker, B. Calder. Proceedings of the International Symposium on Microarchitecture, 2006. (pdf)

Job Scheduling

  • "Symbiotic job scheduling with Priorities for a Simultaneous Multithreading Processor". A. Snavely, D. Tullsen, G. Voelker. Proceedings of the 2002 ACM SIGMETRICS international conference on Measurement and modeling of computer systems, 2002. (pdf)
  • "Architectural Support for Enhanced SMT Job Scheduling". A. Settle, J. Kihm, A. Janiszewski, D. Connors. Proceedings of the 13th international conference on Parallel architectures and compilation techniques, 2004. (pdf)
  • "Scheduling Algorithms for Effective Thread Pairing on Hybrid Multiprocessors". R. McGregor, C. Antonopoulos, D. Nikolopoulos. Proceedings of the 19th International Symposium on Parallel and Distributed Processing, 2005. (pdf)
  • "Hyper-threading aware process scheduling heuristics". J. Bulpin, I. Pratt. Proceedings of the USENIX Annual Technical Conference, 2005. (pdf)
  • "Dynamic run-time architecture techniques for enabling continuous optimization". T. Moseley, A. Shye, V. Reddi, M. Iyer, D. Fay, D. Hodgdon, J. Kihm, A. Settle, D. Grunwald, D. Connors. Proceedings of the 2nd conference on Computing frontiers, 2005. (pdf)
  • "Thread clustering: sharing-aware scheduling on SMP-CMP-SMT multiprocessors". D. Tam, R. Azimi, M. Stumm. Proceedings of the 2nd ACM SIGOPS/EuroSys European Conference on Computer Systems, 2007. (pdf)

Code Optimizations and Runtime Techniques

  • "Tuning compiler optimizations for simultaneous multithreading". J. Lo, S. Eggers, H. Levy, S. Parekh, D. Tullsen. Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture, 1997. (ps)
  • "Maximizing TLP with loop-parallelization on SMT". D. Puppin, D. Tullsen. 5th Workshop on Multithreaded Execution, Architecture, and Compilation, 2001. (ps)
  • "Code and Data Transformations for Improving Shared Cache Performance on SMT Processors". D. Nikolopoulos. ISHPC, pp. 54-69, 2003. (pdf)
  • "Runtime support for integrating precomputation and thread-level parallelism on simultaneous multithreaded processors". M. Curtis-Maury, T. Wang, C. Antonopoulos, D. Nikolopoulos. Proceedings of the 7th workshop on languages, compilers, and run-time support for scalable systems, 2004. (pdf)
  • "Runtime Empirical Selection of Loop Schedulers on Hyperthreaded SMPs". Y. Zhang, M. Voss. Proceedings of the 19th International Symposium on Parallel and Distributed Processing, 2005. (pdf)(cslab)
  • "Integrating Multiple Forms of Multithreaded Execution on multi-SMT Systems: A Study with Scientific Applications". M. Curtis-Maury, T. Wang, C. Antonopoulos, D. Nikolopoulos. Proceedings of the Second International Conference on the Quantitative Evaluation of Systems, 2005. (pdf)
  • "Dynamic tiling for effective use of shared caches on multithreaded processors". D. Nikolopoulos. International Journal of High Performance Computing and Networking, Vol. 2, pp. 22-35, 2006. (pdf)

Applications

  • "Database hash-join algorithms on multithreaded computer architectures". P. Garcia, H. Korth. Proceedings of the 3rd conference on Computing frontiers, 2006. (pdf)(cslab)
  • "Multigrain parallel Delaunay Mesh generation: challenges and opportunities for multithreaded architectures". C. Antonopoulos, X. Ding, A. Chernikov, F. Blagojevic, D. Nikolopoulos, N. Chrisochoides. Proceedings of the 19th annual international conference on Supercomputing, 2005. (pdf)
  • "Stream Programming on General-Purpose Processors". J. Gummaraju, M. Rosenblum. Proceedings of the 38th annual IEEE/ACM International Symposium on Microarchitecture, 2005. (pdf)

SMT Extensions and Variations

  • "Mini-Threads: Increasing TLP on Small-Scale SMT Processors". J. Redstone, S. Eggers, H. Levy. Proceedings of the IEEE 9th International Symposium on High Performance Computer Architecture, 2003. (pdf)
  • "Balanced Multithreading: Increasing Throughput via a Low Cost Multithreading Hierarchy". E. Tune, R. Kumar, D. Tullsen, B. Calder. Proceedings of the 37th annual IEEE/ACM International Symposium on Microarchitecture, 2004. (pdf)


Topic revision: r4 - 2008-10-08 - NikosAnastopoulos
 