Past Events
8th Computing Systems Research Day - 7 January 2025
Schedule
-
12:00-12:15 | Welcome
-
Abstract
Modern cloud applications need microsecond-level responsiveness, yet current virtualization approaches often cause millisecond-scale delays. This talk presents two complementary solutions that bring virtualized environments closer to bare-metal performance. First, Rorke is a microsecond-scale VM scheduler for oversubscribed cloud environments. By approximating processor sharing at the host and dynamically adapting time slices, Rorke cuts tail latency by over 10x for popular low-latency workloads without harming throughput in non-oversubscribed scenarios. Second, Machnet is a userspace network stack designed for public clouds. Rather than relying on specialized NIC features unavailable in virtual NICs, Machnet uses a “Least Common Denominator” approach and a microkernel design to support flexible execution models. It achieves substantial latency and CPU efficiency gains, demonstrating 80% lower latency and 75% lower CPU utilization for a key-value store compared to today’s best solutions. Together, Rorke and Machnet bring virtualized infrastructure closer than ever to bare-metal levels of performance, setting a new standard for cloud computing efficiency.
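The scheduling half of this is easy to see in miniature. The toy simulation below is not Rorke's algorithm (whose details are not given here), only an illustration of the intuition behind it: shrinking the time slice toward the microsecond scale makes round-robin approximate processor sharing, so short requests stop waiting out the full slices of long-running vCPUs. All workload numbers are invented.

```python
from collections import deque

def simulate(jobs_us, slice_us):
    """Round-robin over runnable requests with a fixed time slice;
    returns each request's completion time in microseconds."""
    queue = deque(range(len(jobs_us)))
    remaining = list(jobs_us)
    t, done = 0.0, [0.0] * len(jobs_us)
    while queue:
        j = queue.popleft()
        run = min(slice_us, remaining[j])
        t += run
        remaining[j] -= run
        if remaining[j] > 0:
            queue.append(j)      # unfinished: back of the run queue
        else:
            done[j] = t
    return done

# 10 long (10 ms) requests arrive just ahead of 90 short (10 us) ones.
jobs = [10_000.0] * 10 + [10.0] * 90
for s in (1_000.0, 10.0):        # a classic ~1 ms slice vs. a 10 us slice
    shorts = sorted(simulate(jobs, s)[10:])
    print(f"slice={s:>6.0f}us  median short-request completion={shorts[45]:.0f}us")
```

With millisecond slices the short requests sit behind the long ones for ~10 ms; with 10 us slices they finish roughly 20x sooner, at the cost of more frequent context switches, which is exactly the overhead a microsecond-scale scheduler must keep cheap.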
Bio
Kostis Kaffes joined the Department of Computer Science at Columbia University as an assistant professor in June 2023. Kostis obtained an MSc and PhD from Stanford University in 2018 and 2022, respectively, and an undergraduate degree from the National Technical University of Athens in Greece in 2015. He is broadly interested in computer systems, cloud computing, and scheduling. He has worked on end-host, rack-scale, and cluster-scale scheduling for microsecond-scale tail latency. He has also been seeking ways to accelerate machine learning systems and use machine learning to improve operating systems management. Prior to Columbia, he spent a year at Google's Systems Research Group (SRG).
-
13:15-14:00 | Lunch Break
-
Abstract
The shared and distributed memory capabilities of the emerging Compute Express Link (CXL) interconnect call for a rethink of traditional system software interfaces. In this talk, we will discuss the challenges of CXL-connected distributed systems and explore one such interface: remote fork over CXL fabrics for cluster-wide process cloning. We will introduce CXLfork, which realizes zero-serialization, zero-copy process cloning across nodes. CXLfork utilizes globally-shared CXL memory for cluster-wide deduplication of process state and enables fine-grained control of state tiering between local and CXL memory. We will show how it can be integrated into serverless runtimes to achieve fearless concurrency, introducing CXLporter, an efficient horizontal autoscaler for serverless functions deployed over CXL fabrics. Overall, CXLfork attains a remote fork latency close to that of a local fork, outperforming the state of practice by 2.26x on average and reducing local memory consumption by 87% on average. Integrated with CXLporter, it achieves high throughput with 3x less local memory.
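As a rough single-host analogy of the zero-copy idea (not CXLfork's kernel mechanism, which clones page tables over hardware-shared CXL memory), the sketch below "clones" process state by attaching to a shared pool by name rather than serializing and copying it:

```python
# Loose analogy only: a child process maps the parent's state in place
# instead of receiving a serialized copy of it.
import numpy as np
from multiprocessing import Process, shared_memory

def child(name, shape):
    shm = shared_memory.SharedMemory(name=name)       # attach: no copy, no serialization
    state = np.ndarray(shape, dtype=np.float64, buffer=shm.buf)
    print("child sees sum =", state.sum())            # reads parent's state directly
    shm.close()

if __name__ == "__main__":
    shm = shared_memory.SharedMemory(create=True, size=8 * 1_000_000)
    state = np.ndarray((1_000_000,), dtype=np.float64, buffer=shm.buf)
    state[:] = 1.0                                    # "process state" in the shared pool
    p = Process(target=child, args=(shm.name, state.shape))
    p.start(); p.join()
    shm.close(); shm.unlink()
```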
Bio
Chloe Alverti is a postdoctoral researcher at the University of Illinois Urbana-Champaign (UIUC), hosted by Professor Josep Torrellas. Her ongoing research is part of the ACE Center for Evolvable Computing. She received her PhD in 2022 from the School of Electrical and Computer Engineering at the National Technical University of Athens (NTUA), where she was a member of the Computing Systems Laboratory (CSLAB) supervised by Professor Georgios Goumas. During her studies she spent 3 months as a visiting scholar at the University of Wisconsin-Madison working with Professor Michael Swift. Before her PhD she worked for two years as a research assistant at Chalmers University of Technology advised by Professor Per Stenstrom. Her research interests are focused on system software and hardware co-design for efficient memory access and virtualization, recently focusing on distributed systems.
-
Abstract
We introduce Adrias, an interference-aware memory orchestration framework that enables effective data placement decisions on memory-disaggregated cloud infrastructures. The key features of Adrias are: i) its ability to forecast the trend of system-wide metrics, driving proactive memory orchestration decisions; ii) its accurate performance predictions for deployed applications with respect to memory heterogeneity (local/fast vs. remote/slow DRAM) and interference; and iii) its ability to leverage disaggregated memory with minimal impact on the performance of deployed applications, without employing dynamic memory management mechanisms. Adrias exploits system-level performance monitoring information and leverages deep learning approaches to place incoming applications on the pool of available memory resources.
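At its simplest, such a model-driven placement decision looks like the toy rule below. The threshold, field names, and inputs are illustrative stand-ins for Adrias's learned predictors, not its actual interface:

```python
def place(app, predicted_slowdown, local_free_gb, threshold=1.10):
    """Toy placement rule in the spirit of the framework (all names and the
    threshold are invented): admit an app to disaggregated memory only if a
    model predicts it stays within `threshold`x of its local-DRAM performance,
    or if local DRAM cannot fit it anyway."""
    if local_free_gb < app["footprint_gb"]:
        return "remote"                    # no room locally; take the hit
    return "remote" if predicted_slowdown <= threshold else "local"

apps = [
    {"name": "kv-store",  "footprint_gb": 24},
    {"name": "analytics", "footprint_gb": 64},
]
print(place(apps[0], predicted_slowdown=1.35, local_free_gb=128))  # -> local
print(place(apps[1], predicted_slowdown=1.05, local_free_gb=128))  # -> remote
```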
Bio
Dr. Dimosthenis Masouros received his Diploma and Ph.D. degrees from the Department of Electrical and Computer Engineering at the National Technical University of Athens, Greece, in 2016 and 2023, respectively. His research focuses on systems optimization, with an emphasis on leveraging machine learning techniques to address challenges in resource allocation, application scheduling and systems performance prediction. His current research interests include optimizing performance and energy efficiency in emerging paradigms such as serverless computing, Large Language Models, Federated Learning, and other related technologies. He has been actively involved in five European research projects and has authored over 40 peer-reviewed papers in leading international conferences and journals.
-
Abstract
Large pages have been the de facto mitigation technique for the translation overheads of virtual memory, with prior work mostly focusing on the large page sizes supported by the x86 architecture, i.e., 2MiB and 1GiB. ARMv8-A and RISC-V support additional intermediate translation sizes, i.e., 64KiB and 32MiB, via OS-assisted TLB coalescing, but their performance potential has largely flown under the radar due to limited system software support. In this work, we propose Elastic Translations (ET), a holistic memory management solution that fully explores and exploits the aforementioned translation sizes for both native and virtualized execution. ET implements mechanisms that make the OS memory manager coalescing-aware, enabling the transparent and efficient use of intermediate-sized translations. ET also employs policies to guide translation size selection at runtime using lightweight HW-assisted TLB miss sampling. We design and implement ET for ARMv8-A in Linux and KVM. Our real-system evaluation shows that ET improves the performance of memory-intensive workloads by up to 39% in native execution and by 30% on average in virtualized execution.
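The coalescing constraint ET must respect is mechanical: a 64KiB translation on ARMv8-A requires a naturally aligned block of 16 entries mapping physically contiguous, equally aligned 4KiB frames. A minimal eligibility check over a toy page table (a plain dict here; the real mechanism lives in the OS memory manager):

```python
PAGE = 4096
CONTIG = 16                      # 16 contiguous 4 KiB entries -> one 64 KiB translation

def coalescable(vpn, mapping):
    """True if the naturally aligned 16-page block containing `vpn` maps to
    physically contiguous, identically aligned frames (the ARMv8-A
    contiguous-bit requirement). `mapping`: virtual page -> physical frame."""
    base = vpn - (vpn % CONTIG)               # start of the 64 KiB block
    first = mapping.get(base)
    if first is None or first % CONTIG != 0:  # physical side must be aligned too
        return False
    return all(mapping.get(base + i) == first + i for i in range(CONTIG))

m = {v: 512 + v for v in range(32)}           # 32 pages mapped contiguously from frame 512
print(coalescable(5, m))                      # True: frames 512..527 back pages 0..15
m[20] = 9999                                  # break contiguity in the second block
print(coalescable(20, m))                     # False
```

ET's contribution is making the allocator produce such blocks in the first place and deciding, via TLB miss sampling, when coalescing pays off.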
Bio
Stratos Psomadakis is a final-year PhD student at the National Technical University of Athens under the supervision of Prof. Georgios Goumas. His research interests lie at the intersection of Operating Systems and Hardware, with a focus on virtual memory and emerging ISAs.
-
15:30-16:00 | Coffee Break
-
Abstract
Language-agnostic composition environments such as OSes, shells, microservices, and serverless have always held the promise of significant benefits, including reduced developer effort, lower financial costs, and component specialization. Unfortunately, these environments hinder the performance optimizations and the strong correctness and security guarantees that are typical of language-aware, semantics-first environments. In this talk, I will discuss how recent developments across fields allow overcoming these challenges, offer several benefits, and enable new opportunities for exciting research that has the potential for widespread impact.
Bio
Nikos Vasilakis is on the faculty of Computer Science at Brown University. His research encompasses software systems, programming languages, and security, with a current focus on automatically transforming systems to add new capabilities such as parallelism, distribution, isolation, and correctness. Prof. Vasilakis is also the chair of the Technical Steering Committee behind PaSh, a shell-script optimization system hosted by the Linux Foundation. More: https://nikos.vasilak.is and https://atlas.cs.brown.edu
-
17:00-17:15 | Closing Remarks
7th Computing Systems Research Day - 9 January 2024
Schedule
-
11:45-12:00 | Welcome
-
Abstract
Cloud systems are experiencing significant shifts both in their hardware, with an increased adoption of heterogeneity, and their software, with the prevalence of microservices and serverless frameworks. These trends require fundamentally rethinking how the cloud system stack should be designed. In this talk, I will briefly describe the challenges these hardware and software trends introduce, and discuss under what conditions hardware acceleration can be beneficial to these new application classes, as well as how applying machine learning (ML) to systems problems can improve the cloud’s performance, efficiency, and ease of use. I will first present Sage, a performance debugging system that leverages ML to identify and resolve the root causes of performance issues in cloud microservices. I will then discuss Ursa, an analytically-driven cluster manager for microservices that addresses some of the shortcomings of applying ML to large-scale systems problems.
Bio
Christina Delimitrou is an Associate Professor at MIT, where she works on computer architecture and computer systems. She focuses on improving the performance, predictability, and resource efficiency of large-scale cloud infrastructures by revisiting the way they are designed and managed. Christina is the recipient of the 2020 TCCA Young Computer Architect Award, an Intel Rising Star Award, a Microsoft Research Faculty Fellowship, an NSF CAREER Award, a Sloan Research Fellowship, two Google Faculty Research Awards, and a Facebook Faculty Research Award. Her work has also received 5 IEEE Micro Top Picks awards and several best paper awards. Before joining MIT, Christina was an Assistant Professor at Cornell University. She received her PhD from Stanford University, an MS also from Stanford, and a diploma in Electrical and Computer Engineering from the National Technical University of Athens. More information at http://people.csail.mit.edu/delimitrou/
-
13:00-13:45 | Lunch Break
-
Abstract
Datacenters have witnessed a staggering evolution in networking technologies, driven by insatiable application demands for larger datasets and inter-server data transfers. Modern NICs can already handle 100s of Gbps of traffic, a bandwidth capability equivalent to several memory channels. Direct Cache Access mechanisms like DDIO that contain network traffic inside the CPU’s caches are therefore essential to effectively handle growing network traffic rates. However, at high rates, a large fraction of network traffic leaks from the CPU’s caches to memory, a problem often referred to as “leaky DMA”, significantly capping the network bandwidth a server can effectively utilize. This talk will present an analysis of network data leaks in the era of high-speed networking and our insights around the interactions between network buffers and the cache and memory hierarchy. We will present Sweeper, our proposed hardware extension and API that allows applications to efficiently manage the coherence state of network buffers in the cache-memory hierarchy, drastically reducing memory bandwidth consumption and boosting a server’s peak sustainable network bandwidth by up to 2.6x.
Bio
Marina Vemmou is a 5th year PhD student in the School of Computer Science at Georgia Tech, advised by assistant professor Alexandros Daglis. Her research focuses on designing new interfaces between hardware, network stacks and applications to unlock the performance potential of emerging datacenter technologies. Her work on hardware-software co-design for emerging network and memory technologies has been published at MICRO 2021 and MICRO 2022.
-
Abstract
Distributed transaction processing is a fundamental building block for large-scale data management in the cloud. Given the threats of security violations in untrusted cloud environments, our work addresses the question: how can we design a distributed transactional KV store that achieves high-performance serializable transactions while providing strong security properties? We introduce TREATY, a secure distributed transactional KV storage system that supports serializable ACID transactions while guaranteeing strong security properties: confidentiality, integrity, and freshness. TREATY leverages trusted execution environments (TEEs) to bootstrap its security properties, but it extends the trust provided by the limited enclave (volatile) memory region within a single node to build a secure (stateful) distributed transactional KV store over untrusted storage, network, and machines. To achieve this, TREATY embodies a secure two-phase commit protocol co-designed with a high-performance network library for TEEs. Further, TREATY ensures secure and crash-consistent persistency of committed transactions using a stabilization protocol. Our evaluation on a real hardware testbed based on the YCSB and TPC-C benchmarks shows that TREATY incurs reasonable overheads, while achieving strong security properties.
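The backbone of the protocol is two-phase commit. The skeleton below shows plain 2PC only; TREATY's contribution layers attestation, encrypted TEE-to-TEE messaging, and the stabilization protocol on top, all of which are omitted here:

```python
class Participant:
    def __init__(self):
        self.staged = None
    def prepare(self, txn):
        self.staged = txn          # a real system stages writes durably here
        return True                # vote yes (no conflicts in this toy)
    def commit(self):
        print("applied", self.staged)
    def abort(self):
        self.staged = None

def two_phase_commit(txn, participants):
    # Phase 1: collect votes from every participant.
    votes = [p.prepare(txn) for p in participants]
    # Phase 2: unanimous yes -> commit everywhere; otherwise abort everywhere.
    if all(votes):
        for p in participants: p.commit()
        return "committed"
    for p in participants: p.abort()
    return "aborted"

print(two_phase_commit({"k": "v"}, [Participant(), Participant()]))
```

In TREATY, every message of this exchange must additionally be authenticated and fresh, since the network and the machines outside the enclaves are untrusted.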
Bio
Dimitra Giantsidi is a final-year PhD student at the University of Edinburgh (UoE), a member of the Institute for Computing Systems Architecture (ICSA) and the Chair of Distributed and Operating Systems, advised by Prof. Pramod Bhatotia. Her research lies in the field of dependability in distributed systems, with a focus on fault tolerance and security. Her work aims to increase the security and performance of widely adopted distributed systems by exploring the applications of modern hardware, such as Trusted Execution Environments and direct I/O for networking and storage. Before joining ICSA, Dimitra graduated from the School of Electrical and Computer Engineering, NTUA, Greece.
-
Abstract
GPUs are critical for maximizing the throughput-per-Watt of deep neural network (DNN) applications. However, DNN applications often underutilize GPUs, even when using large batch sizes and eliminating input data processing or communication stalls. DNN workloads consist of data-dependent operators, with different compute and memory requirements. While an operator may saturate GPU compute units or memory bandwidth, it often leaves other GPU resources idle. Despite the prevalence of GPU sharing techniques, current approaches are not sufficiently fine-grained or interference-aware to maximize GPU utilization while minimizing interference at the granularity of 10s of microseconds. We present Orion, a system that transparently intercepts GPU kernel launches from multiple clients sharing a GPU. Orion schedules work on the GPU at the granularity of individual operators and minimizes interference by taking into account each operator’s compute and memory requirements. We integrate Orion in PyTorch and demonstrate its benefits in various DNN workload collocation use cases.
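A minimal sketch of interference-aware operator selection appears below. It reduces each operator's profile to a single "bottleneck" label, a simplification of the compute and memory requirements the system actually tracks per kernel:

```python
import collections

def pick_next(running_op, queues):
    """Toy interference-aware selection: prefer a queued operator whose
    bottleneck (compute vs. memory bandwidth) differs from the operator
    currently occupying the GPU, so they do not contend for the same resource.
    The single-label profile is a simplification for illustration."""
    for client, q in queues.items():
        if q and q[0]["bottleneck"] != running_op["bottleneck"]:
            return client, q.popleft()
    for client, q in queues.items():          # fall back to any ready work
        if q:
            return client, q.popleft()
    return None, None

queues = {
    "training":  collections.deque([{"name": "conv2d",    "bottleneck": "compute"}]),
    "inference": collections.deque([{"name": "embedding", "bottleneck": "memory"}]),
}
running = {"name": "matmul", "bottleneck": "compute"}
print(pick_next(running, queues))   # schedules the memory-bound embedding op
```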
Bio
Foteini Strati is a 3rd year PhD student at the Systems Group of ETH Zurich, working on systems for Machine Learning. She is interested in increasing resource utilization and fault tolerance of Machine Learning workloads. She obtained an MSc degree in Computer Science from ETH Zurich and a Diploma in Electrical and Computer Engineering from NTUA.
-
Abstract
Dense linear algebra operations appear very frequently in high-performance computing (HPC) applications, rendering their performance crucial for achieving optimal scalability. As many modern HPC clusters contain multi-GPU nodes, BLAS operations are frequently offloaded to GPUs, necessitating the use of optimized libraries to ensure good performance. We demonstrate that current multi-GPU BLAS libraries target very specific problems and data characteristics, resulting in serious performance degradation for any slightly deviating workload, and do not take energy efficiency into account at all. To address these issues, we propose a model-based approach: using performance estimation to provide problem-specific autotuning during runtime. We integrate this autotuning into the PARALiA framework coupled with an optimized task scheduler, leading to near-optimal data distribution and performance-aware resource utilization. We evaluate PARALiA on an HPC testbed with 8 NVIDIA V100 GPUs, improving the average performance of GEMM by 1.7x and energy efficiency by 2.5x over the state-of-the-art on a large and diverse dataset, and demonstrating the adaptability of our performance-aware approach to future heterogeneous systems.
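A stripped-down example of what "performance estimation for autotuning" can mean: pick the work split whose predicted makespan is lowest under a crude roofline-style model. The model and hardware numbers below are invented for illustration and are much simpler than PARALiA's:

```python
def predicted_time(frac, flops, bytes_moved, peak_gflops, bw_gbs):
    """Crude roofline-style estimate: a GPU's share of the work overlaps
    compute with its host-to-device transfers, so the slower of the two
    dominates. Illustrative only."""
    compute = frac * flops / (peak_gflops * 1e9)
    transfer = frac * bytes_moved / (bw_gbs * 1e9)
    return max(compute, transfer)

def best_split(flops, bytes_moved, gpus, steps=100):
    """Enumerate two-GPU work fractions and keep the lowest predicted makespan."""
    best = None
    for i in range(steps + 1):
        f = i / steps
        t = max(predicted_time(f, flops, bytes_moved, *gpus[0]),
                predicted_time(1 - f, flops, bytes_moved, *gpus[1]))
        if best is None or t < best[1]:
            best = (f, t)
    return best

# A GEMM-like workload split across one fast link and one slow link,
# with each GPU given as (peak GFLOP/s, link GB/s):
print(best_split(2e12, 4.8e9, gpus=[(7000, 12), (7000, 6)]))
```

Under these made-up numbers the tuner gives roughly two thirds of the work to the GPU with the faster link, because both shares are transfer-bound; an even split would look optimal only if data movement were ignored, which is exactly the trap a model-based tuner avoids.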
Bio
A final-year PhD candidate at CSLab, NTUA, who graduated from its Electrical and Computer Engineering (ECE) department, specializing in computer engineering through an integrated bachelor's and master's degree. His PhD explores the optimization of linear algebra routines on multi-GPU clusters with model-based autotuning, and his research interests include accelerators, parallel processing, HPC, and performance engineering.
-
15:15-15:45 | Coffee Break
-
Abstract
Fully Homomorphic Encryption (FHE) enables computing directly on encrypted data, letting clients securely offload computation to untrusted servers. While enticing, FHE suffers from two key challenges. First, it incurs very high overheads: it is about 10,000x slower than native, unencrypted computation on a CPU. Second, FHE is extremely hard to program: translating even simple applications like neural networks takes months of tedious work by FHE experts. In this talk, I will describe a hardware and software stack that tackles these challenges and enables the widespread adoption of FHE. First, I will give a systems-level introduction to FHE, describing its programming interface, key characteristics, and performance tradeoffs while abstracting away its complex, cryptography-heavy implementation details. Then, I will introduce a programmable hardware architecture that accelerates FHE programs by 5,000x vs. a CPU with similar area and power, erasing most of the overheads of FHE. Finally, I will introduce a new compiler that abstracts away the details of FHE. This compiler exposes a simple, numpy-like tensor programming interface, and produces FHE programs that match or outperform painstakingly optimized manual versions. Together, these techniques make FHE fast and easy to use across many domains, including deep learning, tensor algebra, and other learning and analytic tasks.
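For readers new to the area, the homomorphic property itself can be shown with a toy. Textbook RSA is multiplicatively homomorphic, so multiplying ciphertexts yields an encryption of the product. This is not FHE (FHE schemes support both additions and multiplications on lattice-based ciphertexts) and the parameters below are deliberately tiny and insecure:

```python
# Toy illustration of the homomorphic property only. NOT FHE and NOT secure:
# tiny primes, no padding. Real FHE uses lattice-based ciphertexts and
# supports arbitrary circuits of additions and multiplications.
p, q, e = 61, 53, 17
n = p * q
d = pow(e, -1, (p - 1) * (q - 1))       # private exponent

enc = lambda m: pow(m, e, n)
dec = lambda c: pow(c, d, n)

a, b = 7, 12
c = (enc(a) * enc(b)) % n               # multiply the ciphertexts...
print(dec(c), a * b)                    # ...and decryption yields the product: 84 84
```

The server in this toy never sees 7, 12, or 84 in the clear; FHE generalizes exactly this ability to whole programs, which is what makes its 10,000x overhead worth attacking with hardware and compilers.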
Bio
Daniel Sanchez is a Professor of Electrical Engineering and Computer Science at MIT. His research interests include scalable memory hierarchies, architectural support for parallelization, and accelerators for sparse computations and secure computing. He earned a Ph.D. in Electrical Engineering from Stanford University in 2012 and received the NSF CAREER award in 2015.
-
16:45-17:00 | Closing Remarks
6th Computing Systems Research Day - 11 January 2023
Schedule
-
Abstract
We will examine the RowHammer problem in DRAM, which is the first example of how a circuit-level failure mechanism in Dynamic Random Access Memory (DRAM) can cause a practical and widespread system security vulnerability. RowHammer is the phenomenon that repeatedly accessing a row in a modern DRAM chip predictably causes errors in physically-adjacent rows. It is caused by a hardware failure mechanism called read disturb errors, a manifestation of circuit-level cell-to-cell interference in a scaled memory technology. Building on our initial fundamental work that appeared at ISCA 2014, Google Project Zero demonstrated that this hardware phenomenon can be exploited by user-level programs to gain kernel privileges. Many other works demonstrated other attacks exploiting RowHammer, including remote takeover of a server vulnerable to RowHammer, takeover of a mobile device by a malicious user-level application, and destruction of predictive capabilities of commonly-used deep neural networks. Unfortunately, the RowHammer problem still plagues cutting-edge DRAM chips, DDR4 and beyond. Based on our recent characterization studies of more than 1500 DRAM chips from six technology generations that appeared at ISCA 2020 and MICRO 2021, we will show that RowHammer at the circuit level is getting much worse, newer DRAM chips are much more vulnerable to RowHammer than older ones, and existing mitigation techniques do not work well. We will also show that existing proprietary mitigation techniques employed in DDR4 DRAM chips, which are advertised to be RowHammer-free, can be bypassed via many-sided hammering (also known as TRRespass and Uncovering TRR). Throughout the talk, we will analyze the properties of the RowHammer problem, examine circuit/device scaling characteristics, and discuss solution ideas. We will also discuss what other problems may be lurking in DRAM and other types of memory, e.g., NAND flash memory, Phase Change Memory and other emerging memory technologies, which can potentially threaten the foundations of reliable and secure systems, as the memory technologies scale to higher densities. We will conclude by describing and advocating a principled approach to memory reliability and security research that can enable us to better anticipate and prevent such vulnerabilities.
Bio
Onur Mutlu is a Professor of Computer Science at ETH Zurich. He is also a faculty member at Carnegie Mellon University, where he previously held the Strecker Early Career Professorship. His current broader research interests are in computer architecture, systems, hardware security, and bioinformatics. Various techniques he, along with his group and collaborators, has invented over the years have influenced industry and have been employed in commercial microprocessors and memory/storage systems. He obtained his PhD and MS in ECE from the University of Texas at Austin and BS degrees in Computer Engineering and Psychology from the University of Michigan, Ann Arbor. He started the Computer Architecture Group at Microsoft Research (2006-2009), and held various product and research positions at Intel Corporation, Advanced Micro Devices, VMware, and Google. He is an ACM Fellow, IEEE Fellow, and an elected member of the Academy of Europe (Academia Europaea). His course lectures and materials are freely available on YouTube at https://www.youtube.com/OnurMutluLectures
-
11:30-12:00 | Break
-
Abstract
Current hardware and operating system abstractions were conceived at a time when we had minimal security threats, homogeneous compute, scarce memory resources, and limited numbers of users. These assumptions are not true today. On one hand, software and hardware vulnerabilities have escalated the need for confidential computing primitives. On the other hand, emerging datacenter paradigms like microservices and serverless computing have led to the sharing of computing resources among hundreds of users at a time through lightweight virtualization primitives. In this new era of computing, we can no longer afford to build each layer separately. Instead, we have to rethink the synergy between the operating system and hardware from the ground up. In this talk, I will focus on datacenter challenges and recent results focused on virtual memory, memory management, lightweight virtualization, and confidential computing.
Bio
Dimitrios Skarlatos is an assistant professor in the Computer Science Department at Carnegie Mellon University. His research bridges computer architecture and operating systems with a focus on performance, security, and scalability. He has received several awards for his cross-cutting research, including multiple Meta Faculty Awards, the joint 2021 ACM SIGARCH and IEEE CS TCCA Outstanding Dissertation Award, the David J. Kuck Outstanding Ph.D. Thesis Award for the best PhD thesis in the computer science department at the University of Illinois at Urbana-Champaign, two IEEE MICRO Top Picks in Computer Architecture, and two ASPLOS Best Paper Awards.
-
13:00-13:30 | Break
-
Abstract
We introduce the first open-source FPGA-based infrastructure, MetaSys, with a prototype in a RISC-V core, to enable the rapid implementation and evaluation of a wide range of cross-layer techniques in real hardware. Hardware-software cooperative techniques are powerful approaches to improve the performance, quality of service, and security of general-purpose processors. They are however typically challenging to rapidly implement and evaluate in real hardware as they require full-stack changes to the hardware, OS, system software, and instruction-set architecture (ISA). MetaSys implements a rich hardware-software interface and lightweight metadata support that can be used as a common basis to rapidly implement and evaluate new cross-layer techniques. We demonstrate MetaSys’s versatility and ease-of-use by implementing and evaluating three cross-layer techniques for: (i) prefetching for graph analytics; (ii) bounds checking in memory unsafe languages, and (iii) return address protection in stack frames; each technique only requiring ~100 lines of Chisel code over MetaSys. Using MetaSys, we perform the first detailed experimental study to quantify the performance overheads of using a single metadata management system to enable multiple cross-layer optimizations in CPUs. We identify the key sources of bottlenecks and system inefficiency of a general metadata management system. We design MetaSys to minimize these inefficiencies and provide increased versatility compared to previously-proposed metadata systems. Using three use cases and a detailed characterization, we demonstrate that a common metadata management system can be used to efficiently support diverse cross-layer techniques in CPUs.
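The common basis MetaSys provides can be pictured as a tag-to-metadata lookup consulted at memory access time. The toy model below shows bounds checking, use case (ii), in that style; tagging granularities and the hardware interface are heavily simplified here:

```python
class MetadataTable:
    """Toy software model of the idea: memory ranges carry a tag ID, and a
    lookup at access time returns the metadata a cross-layer technique needs
    (here, base/size for bounds checking). The real hardware tags at a fixed
    block granularity through a dedicated interface; this is per-byte for
    simplicity."""
    def __init__(self):
        self.meta = {}            # tag id -> (base, size)
        self.tags = {}            # address -> tag id

    def map_alloc(self, tag, base, size):
        self.meta[tag] = (base, size)
        for a in range(base, base + size):
            self.tags[a] = tag

    def check(self, addr):
        tag = self.tags.get(addr)
        if tag is None:
            raise MemoryError(f"untagged access at {addr:#x}")
        base, size = self.meta[tag]
        assert base <= addr < base + size

mt = MetadataTable()
mt.map_alloc(tag=1, base=0x1000, size=64)
mt.check(0x1000 + 10)             # in bounds: passes silently
try:
    mt.check(0x1000 + 80)         # past the tagged allocation
except MemoryError as err:
    print("caught:", err)
```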
Bio
Konstantinos Kanellopoulos is currently pursuing his PhD at ETH Zurich in the SAFARI research group (https://safari.ethz.ch/). He completed his MEng and BSc at NTUA. His research interests lie at the intersection of software and hardware with a focus on OS/Hardware co-design.
-
Abstract
Despite groundbreaking technological innovations and revolutions, the Memory Wall is still a major performance obstacle for modern systems. Hardware prefetching is a widely deployed latency-tolerance technique that has proven successful at shrinking the processor-memory performance gap. However, state-of-the-art hardware cache prefetchers are far from approaching the performance of an ideal prefetcher. Virtual memory is a memory management technique that has been vital to the success of computing due to its unique programmability and security benefits. However, virtual memory does not come for free, due to the page walk memory references introduced for fetching address translation entries. Virtual memory makes the Memory Wall taller by requiring the memory hierarchy to be traversed potentially multiple times upon TLB misses. To make matters worse, the advent of emerging workloads with massive data and code footprints that experience high TLB miss rates places tremendous pressure on the memory hierarchy due to frequent page walks, threatening the performance of computing. Our work demonstrates that hardware prefetching has the potential to attenuate the Memory Wall bottleneck in virtual memory systems. In this direction, we (i) propose fully legacy-preserving hardware prefetching schemes for the last-level TLBs that aim at reducing the TLB miss rates of both data and instruction references, and (ii) exploit address translation metadata available at the microarchitecture level to improve the effectiveness of hardware cache prefetchers operating in the physical address space, without opening new security vulnerabilities.
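To get a feel for direction (i): the toy TLB model below adds the simplest possible sequential next-page prefetcher (illustrative only, not the proposed schemes) and halves the demand misses of a streaming scan:

```python
def tlb_misses(pages, size=64, next_page_prefetch=False):
    """Tiny fully-associative LRU TLB model: count demand misses over a trace
    of virtual page numbers, optionally prefetching page+1 on each miss."""
    tlb, misses = [], 0
    def touch(p, demand):
        nonlocal misses
        if p in tlb:
            tlb.remove(p); tlb.append(p)     # refresh LRU position
            return
        if demand:
            misses += 1
        tlb.append(p)
        if len(tlb) > size:
            tlb.pop(0)                        # evict least recently used
    for p in pages:
        hit = p in tlb
        touch(p, demand=True)
        if next_page_prefetch and not hit:
            touch(p + 1, demand=False)        # prefetches are not demand misses
    return misses

trace = [i // 8 for i in range(4096)]         # sequential scan, 8 accesses per page
print(tlb_misses(trace), tlb_misses(trace, next_page_prefetch=True))  # 512 256
```

Real TLB prefetchers must of course handle far less regular patterns; that is where the proposed schemes and the translation-metadata hints of direction (ii) come in.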
Bio
Georgios Vavouliotis is a 4th year Ph.D. student at the Barcelona Supercomputing Center and Universitat Politecnica de Catalunya, supervised by Marc Casas and Lluc Alvarez while closely collaborating with professors Daniel A. Jimenez, Boris Grot, and Paul Gratz. Georgios also holds an Electrical and Computer Engineering diploma from the National Technical University of Athens, where he was supervised by professors Vasileios Karakostas and Georgios Goumas. His work has been published at top-tier computer architecture conferences and has received several distinctions.
-
Abstract
Reliability evaluation in the early stages of microprocessor design varies in the level of hardware modelling accuracy, the speed of the evaluation, and the granularity of the assessment report. In this talk, we revisit the system vulnerability stack for transient faults and present a new microarchitecture-driven methodology for fast and accurate reliability evaluation. We reveal severe pitfalls in widely used vulnerability measurement approaches that operate at the software or architecture abstraction layers, aiming to assess the effects of hardware faults. These approaches separate the hardware and software layers for the assessment, under the assumption that they can reasonably model the effect of hardware faults on the software layer. However, thanks to their speed advantage, they have eventually become common practice for evaluating overall system resilience. We show that estimations based on higher abstraction layers can deliver contradicting results compared to cross-layer methodologies. To address this, we present AVGI, a new cross-layer evaluation methodology, which delivers orders-of-magnitude faster assessment of the Architectural Vulnerability Factor (AVF) of a microprocessor chip, while retaining the high accuracy of cross-layer reliability evaluation (including the microarchitecture, architecture, and software layers).
Bio
George Papadimitriou is a postdoctoral researcher in the Dept. of Informatics and Telecommunications at the University of Athens. He earned a PhD in Computer Science from the same department in 2019, where he worked with Prof. Dimitris Gizopoulos. His research focuses on dependable and energy-efficient computer architectures, microprocessor reliability, functional correctness of hardware designs and design validation of microprocessors. He has published more than 35 papers in international conferences and journals.
-
15:00-15:30 | Break
-
Abstract
Recent shell-script parallelization systems enjoy mostly automated parallel speedups by compiling scripts ahead of time. Unfortunately, such static parallelization is hampered by the dynamic behaviors pervasive in shell scripts, such as variable expansion and command substitution, which often require reasoning about the current state of the shell and filesystem. We present a just-in-time (JIT) shell-script compiler, PaSh-JIT, that intermixes evaluation and parallelization during a script's run-time execution. JIT parallelization collects run-time information about the system's state, but must not alter the behavior of the original script and must maintain minimal overhead. PaSh-JIT addresses these challenges by (1) using a dynamic interposition framework, guided by a static preprocessing pass, (2) developing runtime support for transparently pausing and resuming shell execution, and (3) operating as a stateful server, communicating with the current shell by passing messages, all without requiring modifications to the system's underlying shell interpreter. When run on a wide variety of benchmarks, including the POSIX shell test suite, PaSh-JIT does not break scripts, even in cases that are likely to break shells in widespread use, and offers significant speedups whenever parallelization is possible.
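The payoff being chased can be seen in miniature: a pure, line-streaming command can run over input chunks in parallel with its outputs concatenated in order. The sketch below parallelizes a grep-like filter this way; the real system does this for whole pipelines, at run time, without breaking shell semantics:

```python
# Simplified illustration of data-parallel shell semantics, not PaSh-JIT's
# machinery: split, filter in parallel, concatenate in order.
import re
from multiprocessing import Pool

def grep_chunk(args):
    pattern, lines = args
    return [l for l in lines if re.search(pattern, l)]

def parallel_grep(pattern, lines, workers=4):
    n = max(1, len(lines) // workers)
    chunks = [lines[i:i + n] for i in range(0, len(lines), n)]
    with Pool(len(chunks)) as pool:
        parts = pool.map(grep_chunk, [(pattern, c) for c in chunks])
    return [l for part in parts for l in part]   # order-preserving merge

if __name__ == "__main__":
    lines = [f"request {i} status={'ok' if i % 7 else 'error'}" for i in range(100_000)]
    assert parallel_grep("error", lines) == [l for l in lines if "error" in l]
    print("parallel grep matches sequential grep")
```

The hard part, and the subject of the talk, is knowing at run time that a command is pure enough for this transformation to be safe.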
Bio
Konstantinos Kallas is a 5th year PhD student at the University of Pennsylvania working with Rajeev Alur. His area of interest is the intersection of programming languages and computer systems. The first paper in this line of work, introducing PaSh, a system for parallelizing shell scripts, received the best paper award at EuroSys 2021. He has also published papers improving the shell at ICFP 2021, HotOS 2021, and OSDI 2022. More information at https://angelhof.github.io/
-
Abstract
When processing join queries over big data, a DBMS can become unresponsive due to the sheer size of the output that has to be returned to the user. However, oftentimes, users have preferences over the answers and only some of these answers are required. To exploit this, and guarantee that intermediate results are relatively small, new query processing algorithms are necessary. We develop “any-k” algorithms that return the most important answers as quickly as possible, followed by the rest in quick succession. For a large class of queries, the top results are returned in linear time in input size, even if the entire set of answers is much larger. Our prototype implementation of any-k outperforms by orders of magnitude the traditional DBMS approach that first performs the join and then sorts the output in order to find the top answers. Project website: https://northeastern-datalab.github.io/anyk/
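The flavor of ranked enumeration is captured by the classic k-smallest-pairs pattern below: a toy two-relation join whose answers stream out in increasing combined weight without materializing the full cross product. The paper's any-k algorithms generalize this far beyond the simple case sketched here:

```python
import heapq

def anyk_join(r, s):
    """Ranked enumeration over a toy join: r and s are lists of
    (weight, value) sorted by weight, and every r-tuple joins with every
    s-tuple. Answers are yielded in increasing combined weight using a
    frontier heap, never building all |r| x |s| results up front."""
    if not r or not s:
        return
    heap = [(r[0][0] + s[0][0], 0, 0)]
    seen = {(0, 0)}
    while heap:
        w, i, j = heapq.heappop(heap)
        yield w, r[i][1], s[j][1]
        for ni, nj in ((i + 1, j), (i, j + 1)):   # expand the frontier
            if ni < len(r) and nj < len(s) and (ni, nj) not in seen:
                seen.add((ni, nj))
                heapq.heappush(heap, (r[ni][0] + s[nj][0], ni, nj))

r = [(1, "a"), (3, "b"), (9, "c")]
s = [(2, "x"), (4, "y")]
for answer in anyk_join(r, s):
    print(answer)   # (3,'a','x'), (5,'a','y'), (5,'b','x'), (7,'b','y'), ...
```

A user who only wants the top handful of answers pays nothing for the millions that are never requested, which is exactly the contrast with join-then-sort.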
Bio
Nikolaos (Nikos) Tziavelis is a 5th year PhD candidate at Northeastern University, advised by Mirek Riedewald and Wolfgang Gatterbauer. He received a Diploma in Electrical and Computer Engineering from the National Technical University of Athens. His research interests lie in novel algorithms for query processing, efficient data representations, and distributed computing, for which he is generously supported by a Google PhD fellowship.
-
Abstract
The development of quantum computers has been advancing rapidly in recent years. In addition to researchers and companies building bigger and bigger machines, these computers are already being actively connected to the internet and offered as cloud-based quantum computer services. As quantum computers become more widely accessible, potentially malicious users could try to execute their code on the machines to leak information from other users, to interfere with or manipulate the results of other users, or to reverse engineer the underlying quantum computer architecture and its intellectual property. To analyze such new security threats to cloud-based quantum computers, this presentation will cover recent research on and evaluation of different types of quantum computer viruses. It will also introduce a first-of-its-kind quantum computer antivirus as a new means of protecting the expensive and fragile quantum computer hardware from such viruses. The novel antivirus can analyze quantum computer programs, also called circuits, and detect possibly malicious ones before they execute on quantum computer hardware. As a compile-time technique, it introduces no new overhead at the run time of the quantum computer.
Bio
Theodoros Trochatos is a 2nd year PhD student at Yale University under the supervision of Prof. Jakub Szefer and member of the Computer Architecture and Security Laboratory (CASLAB). Before joining Yale, he received his Diploma in Electrical and Computer Engineering from the National Technical University of Athens in 2021, where he completed his thesis in CSLab under the supervision of Prof. Dionisios Pnevmatikatos and Prof. Vasileios Karakostas. His research interests broadly encompass computer architecture and hardware security of computing systems, including security of quantum computers.
5th Computing Systems Research Day - 7 January 2020
Schedule
-
Abstract
Modern datacenters are growing at phenomenal speeds and sizes that would have been considered impractical just ten years ago. The latest mega sites are now 250 MW and growing. Memory in recent years has also emerged as the most precious silicon in datacenters, because online services host data in memory for tight latency constraints and containerised third-party workloads often run in memory for faster turnaround. Unfortunately, memory capacity scaling has slowed down with Moore's Law. Modern software stacks and services also heavily fragment memory, exacerbating the pressure on it. In this talk, I will make the case that today's server blades are derived from the desktop PC and OS of the '80s, with the CPU dominating access to memory and the OS orchestrating movement to/from memory through legacy abstractions and interfaces. I will then present promising avenues for a clean-slate server design with novel abstractions for a tighter integration of memory with not just an accelerator ecosystem but also network and storage.
Bio
Babak Falsafi is Professor in the School of Computer and Communication Sciences and the founding director of the EcoCloud research center at EPFL. He has worked on server architecture for quite some time and has had contributions to a few industrial platforms including the WildFire/WildCat family of multiprocessors by Sun Microsystems (now Oracle), memory system technologies for IBM BlueGene/P and Q and ARM cores, and server evaluation methodologies in use by AMD, HP and Google (PerfKit). His recent work on scale-out server processor design laid the foundation for the first server-grade ARM CPU, Cavium ThunderX. He is a fellow of ACM and IEEE.
-
11:00-11:30 | Break
-
Abstract
The increasing demand for main memory capacity from big data analytics in datacenters and exascale computing environments is driving the integration of heterogeneous memory technologies. The new technologies exhibit vastly greater differences in access latencies, bandwidth, and capacity compared to traditional NUMA systems. Leveraging this heterogeneity while also delivering application performance enhancements requires intelligent data placement. We present Kleio, a page scheduler with machine intelligence for applications that execute across hybrid memory components. Kleio is a hybrid page scheduler that combines existing, lightweight, history-based data tiering methods for hybrid memory with novel intelligent placement decisions based on deep neural networks. We contribute new understanding of the scope of benefits that can be achieved by using intelligent page scheduling in comparison to existing history-based approaches, and of the choice of deep learning algorithms and parameters that are effective for this problem space. Kleio incorporates a new method for prioritizing the pages that yield the highest performance boost, while limiting the resulting system resource overheads. Our performance evaluation indicates that Kleio closes on average 80% of the performance gap between the existing solutions and an oracle with knowledge of future access patterns. Kleio provides hybrid memory systems with fast and effective neural network training and prediction accuracy levels, which bring significant application performance improvements with limited resource overheads, laying the grounds for its practical integration in future systems. Kleio was a best paper award finalist at HPDC 2019 (28th International Symposium on High-Performance Parallel and Distributed Computing, Phoenix, AZ, USA, June 22-29, 2019).
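The prioritization step can be caricatured in a few lines: spend the expensive per-page neural network only on pages whose access patterns a history-based policy is likely to get wrong. The variance metric and budget below are invented stand-ins for the paper's actual method:

```python
import statistics

def pick_ml_pages(page_history, budget):
    """Toy prioritization: rank pages by how erratic their per-interval access
    counts are (pure history mispredicts bursty pages most) and give only the
    top `budget` pages to the learned predictor. Metric and budget are
    illustrative, not Kleio's."""
    scored = sorted(page_history,
                    key=lambda p: statistics.pvariance(page_history[p]),
                    reverse=True)
    return set(scored[:budget])

history = {
    0xA: [9, 0, 8, 1, 9, 0],   # bursty: a good candidate for the DNN
    0xB: [5, 5, 5, 5, 5, 5],   # steady: history-based tiering is already right
    0xC: [0, 0, 1, 0, 0, 0],   # cold: not worth a model
}
print(pick_ml_pages(history, budget=1))   # {10}, i.e. page 0xA
```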
Bio
Thaleia Doudali is a PhD student in Computer Science at Georgia Tech advised by Ada Gavrilovska. Her current research focuses on building system-level solutions that optimize application performance on systems with heterogeneous memory components, such as DRAM and Non Volatile Memory. Thaleia has industry experience and patents from internships at AMD, VMware and Dell EMC. Prior to Georgia Tech, she received an undergraduate diploma in Electrical and Computer Engineering from the National Technical University of Athens where she was advised by Nectarios Koziris and Ioannis Konstantinou.
-
Abstract
With explosive growth in dataset sizes and increasing machine memory capacities, per-application memory footprints are commonly reaching into hundreds of GBs. Such huge datasets pressure the TLB, resulting in frequent misses that must be resolved through a page walk, a long-latency pointer chase through multiple levels of the in-memory radix tree-based page table. To accelerate page walks, we introduce Address Translation with Prefetching (ASAP), a light-weight technique for directly indexing individual levels of the page table radix tree. Direct indexing enables ASAP to fetch nodes from deeper levels of the page table without first accessing the preceding levels, thus lowering the page walk latency. ASAP is non-speculative and is fully legacy-preserving, requiring no modifications to the existing radix tree-based page table, TLBs and other software and hardware mechanisms for address translation.
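The enabling arithmetic is simple: if each level of the page table is allocated from a contiguous per-level region (the key assumption behind direct indexing; the base addresses below are hypothetical), every level's entry address can be computed straight from the virtual address and fetched in parallel, instead of being discovered one pointer at a time:

```python
def pte_addresses(va, level_bases):
    """Direct indexing of a 4-level x86-64-style page table, simplified.
    level_bases[k] is the (hypothetical) physical base of the contiguous
    region holding all level-k table pages, root first."""
    addrs = []
    for level, base in enumerate(level_bases):   # 0 = root (PML4) ... 3 = leaf PTEs
        shift = 39 - 9 * level                   # 9 VA bits select an entry per level
        index = (va >> shift) & ((1 << (9 * (level + 1))) - 1)
        addrs.append(base + 8 * index)           # 8-byte entries
    return addrs

bases = [0x100000, 0x200000, 0x400000, 0x800000]  # invented per-level pools
for a in pte_addresses(0x7F1234567000, bases):
    print(hex(a))        # all four fetches can be issued at once
```

A serial walk needs the contents of each level to find the next; here every address depends only on the VA, which is what lets the prefetches overlap.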
Bio
Dmitrii Ustiugov is a senior PhD student at the University of Edinburgh (UoE), co-advised by Prof. Boris Grot (UoE) and Prof. Edouard Bugnion (EPFL). His research interests span Computer Systems and Architecture, with a focus on software and hardware support for memory systems.
-
Abstract
We propose synergistic software and hardware mechanisms that alleviate the address translation overhead in virtualized systems. On the software side, we propose contiguity-aware (CA) paging, a novel physical memory allocation technique that creates larger-than-a-page contiguous mappings while preserving the flexibility of demand paging. CA paging is applicable to the hypervisor and guest OS memory manager independently, as well as in native systems. On the hardware side, we propose SpOT, a simple micro-architectural mechanism to hide TLB miss latency by exploiting the regularity of large contiguous mappings to predict address translations. We implement and emulate the proposed techniques for the x86-64 architecture in Linux and KVM, and evaluate them across a variety of memory-intensive workloads. Our results show that: (i) CA paging is highly effective at creating vast contiguous mappings in both native and virtualized scenarios, even when memory is fragmented, and (ii) SpOT exploits the provided contiguity and reduces address translation overhead of nested paging from ~18% to ~1.2%.
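The prediction side rests on simple arithmetic: within a contiguous mapping, the physical address tracks the virtual address at a constant offset. A minimal sketch (anchor addresses invented; the real mechanism verifies each guess against the actual page walk before committing):

```python
def predict_translation(va, anchor_va, anchor_pa):
    """Core of contiguity-based translation prediction, simplified: inside a
    large contiguous mapping, PA = anchor_pa + (va - anchor_va), so a TLB miss
    can be served speculatively while the page walk checks the guess."""
    return anchor_pa + (va - anchor_va)

# CA paging mapped a large region contiguously starting at these addresses:
anchor_va, anchor_pa = 0x7F40_0000_0000, 0x1_2000_0000
va = anchor_va + 123 * 4096 + 42
print(hex(predict_translation(va, anchor_va, anchor_pa)))
```

The synergy in the paper is that CA paging makes such contiguous regions common, which is precisely what makes this one-line predictor accurate.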
Bio
Chloe Alverti is a 2nd year PhD student at the Computing Systems Laboratory (CSLab) of the Electrical and Computer Engineering School (ECE, NTUA) under the supervision of professor Georgios Goumas. Her research focuses on efficient virtual memory mechanisms, spanning both operating system and architectural optimizations. Preliminary results of her current work were presented in the PACT 2018 ACM Student Research Competition poster session, where they won 1st place. Prior to her PhD studies, she was employed for two years as a research assistant at Chalmers University of Technology, Sweden, working on the FP7 EU project EuroServer under the supervision of professor Per Stenstrom.
-
13:00-14:00 | Lunch Break
-
Abstract
Pushing functionality to the hardware layer offers numerous advantages, such as increased performance, transparency of functionality, software simplification, and energy efficiency. In this presentation, I will talk about two hardware-based solutions for increasing security as well as performance. The first introduces microarchitectural changes inside the CPU to implement Instruction Set Randomization and Control Flow Integrity. The second leverages heterogeneous hardware architectures by utilizing hybrid GPU-CPU systems to increase computational performance. I will close the presentation with an outlook on future research directions.
Bio
Dr. Sotiris Ioannidis received a BSc degree in Mathematics and an MSc degree in Computer Science from the University of Crete in 1994 and 1996 respectively. In 1998 he received an MSc degree in Computer Science from the University of Rochester and in 2005 he received his PhD from the University of Pennsylvania. Ioannidis held a Research Scholar position at the Stevens Institute of Technology until 2007 and is a Research Director at the Institute of Computer Science (ICS) of the Foundation for Research and Technology Hellas (FORTH). In 2019 he was elected Associate Professor at the School of Electrical and Computer Engineering of the Technical University of Crete (TUC). He has been a Member of the ENISA Permanent Stakeholders Group (PSG) since 2017. His research interests are in the area of systems and network security, security policy, privacy and high-speed networks. Ioannidis has authored more than 150 publications in international conferences and journals, as well as book chapters, and has both chaired and served in numerous program committees in prestigious conferences such as ACM CCS and IEEE S&P. Ioannidis is a Marie-Curie Fellow and has participated in numerous international and European projects.
-
15:00-15:30 | Break
-
Abstract
The promise of automatic parallelization, freeing programmers from the error-prone and time-consuming process of making efficient use of parallel processing resources, remains unrealized. For decades, the imprecision of memory analysis limited the applicability of non-speculative automatic parallelization. The introduction of speculative automatic parallelization overcame these applicability limitations, but, even in the case of no misspeculation, these speculative techniques exhibit high communication and bookkeeping costs for validation and commit. This work presents Perspective, a speculative-DOALL parallelization framework that maintains the applicability of speculative techniques while approaching the efficiency of non-speculative ones. Unlike current approaches, which subsequently apply speculative techniques to overcome the imprecision of memory analysis, Perspective combines the first speculation-aware memory analyzer, new efficient speculative privatization methods, and a parallelization planner to find the best-performing set of parallelization techniques. By reducing speculative parallelization overheads in ways not possible with prior parallelization systems, Perspective obtains higher overall program speedup (23.0x for 12 general-purpose C/C++ programs running on a 28-core shared-memory machine) than Privateer (11.5x), the most applicable prior automatic speculative-DOALL system.
Bio
Sotiris Apostolakis is a PhD student in the Liberty Research group at Princeton University, under the supervision of Prof. David I. August. He has also collaborated with Prof. Simone Campanoni from Northwestern University. His research focus is on compilers and automatic parallelization. During his internships at Facebook (summer 2018) and Intel (summer 2017), he worked on binary analysis. Before joining Princeton, he earned his diploma in Electrical and Computer Engineering at the National Technical University of Athens, where he worked with Georgios Goumas, Nikela Papadopoulou, and Nectarios Koziris on performance prediction of large-scale systems.
-
Abstract
Extreme heterogeneity in high-performance computing has led to a plethora of programming models for intra-node programming. The increasing complexity of those approaches and the lack of a unifying model has rendered the task of developing performance-portable applications intractable. To address these challenges, we present the Data-centric Parallel Programming (DAPP) concept, which decouples program definition from its optimized implementation. The latter is realized through Stateful DataFlow multiGraph (SDFG), a data-centric intermediate representation that combines fine-grained data dependencies with high-level control-flow and is amenable to program transformations. We demonstrate the potential of the data-centric viewpoint with OMEN, a state-of-the-art quantum transport (QT) solver. We reduce the original C++ code of OMEN from 15k lines to 3k lines of Python code and 2k SDFG nodes. We subsequently tune the generated code for two of the fastest supercomputers in the world (June 2019), and achieve up to two orders of magnitude higher performance; sustained 85.45 Pflop/s on 4,560 nodes of Summit (42.55% of the peak) in double precision, and 90.89 Pflop/s in mixed precision.
Bio
Alex Ziogas is a PhD student at the Scalable Parallel Computing Laboratory at ETH Zurich, under the supervision of Prof. Torsten Hoefler. He received his Diploma in Electrical and Computer Engineering from the National Technical University of Athens, under the supervision of Prof. Georgios Goumas. His research interests lie in performance optimization and modeling for parallel and distributed computing systems. Recently, he has been working on data-centric representations and optimizations for High-Performance Computing applications. He was awarded the 2019 Gordon Bell prize for his work on optimizing Quantum Transport Simulations.
-
Abstract
Remote Procedure Calls are widely used to connect datacenter applications with strict tail-latency service level objectives on the scale of microseconds. Existing solutions utilize streaming or datagram-based transport protocols for RPCs that impose overheads and limit design flexibility. Our work exposes the RPC abstraction to the endpoints and the network, making RPCs first-class datacenter citizens and allowing for in-network RPC scheduling. We propose R2P2, a UDP-based transport protocol specifically designed for RPCs inside a datacenter. R2P2 exposes pairs of requests and responses and allows efficient and scalable RPC routing by separating RPC target selection from request and reply streaming. Leveraging R2P2, we implement a novel join-bounded-shortest-queue (JBSQ) RPC load balancing policy, which lowers tail latency by centralizing pending RPCs in the router and ensures that requests are only routed to servers with a bounded number of outstanding requests. The R2P2 router logic can be implemented either in a software middlebox or within a P4 switch ASIC pipeline. Our evaluation shows that the protocol is suitable for microsecond-scale RPCs and that its tail latency outperforms both random selection and classic HTTP reverse proxies. The P4-based implementation of R2P2 on a Tofino ASIC adds less than 1 microsecond of latency, whereas the software middlebox implementation adds 5 microseconds of latency and requires only two CPU cores to route RPCs at 10 Gbps line rate. R2P2 improves the tail latency of web index searching on a cluster of 16 workers operating at 50% of capacity by 5.7x over NGINX.
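The effect of centralizing pending RPCs is easy to reproduce in a toy queueing model. The sketch below uses a join-shortest-queue stand-in for JBSQ (real JBSQ additionally bounds per-server outstanding requests to hide router-to-server latency); all rates are invented:

```python
import random

def p99_latency(policy, n_req=20_000, n_srv=16):
    """Toy 16-server model: Poisson arrivals at ~67% load, heavy-tailed
    service times. 'random' mimics a random proxy; 'central' dispatches each
    request to the least-loaded server, approximating centralized JBSQ."""
    random.seed(0)
    t, free_at, lat = 0.0, [0.0] * n_srv, []
    for _ in range(n_req):
        t += random.expovariate(n_srv / 3.0)     # mean service time is 2 us
        svc = random.paretovariate(2.0)          # heavy-tailed service times
        if policy == "random":
            s = random.randrange(n_srv)
        else:
            s = min(range(n_srv), key=lambda i: free_at[i])
        start = max(t, free_at[s])
        free_at[s] = start + svc
        lat.append(free_at[s] - t)
    lat.sort()
    return lat[int(0.99 * n_req)]

for policy in ("random", "central"):
    print(policy, round(p99_latency(policy), 2), "us")
```

Even this crude model reproduces the qualitative result: centralizing dispatch collapses the queueing tail that random placement creates, which is what R2P2 achieves in-network at line rate.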
Bio
Marios Kogias is a 5th year PhD student at EPFL working with Edouard Bugnion. His research focuses on datacenter systems, and specifically on microsecond-scale Remote Procedure Calls. He is interested in improving the tail-latency of networked systems by rethinking both operating systems mechanisms and networking while leveraging new emerging datacenter hardware for in-network compute. Marios has interned at Microsoft Research, Google, and CERN, and he is an IBM PhD Fellow.
-
Abstract
Over the past decade, a plethora of systems have emerged to support data analytics in various domains such as SQL and machine learning, among others. In each of the data analysis domains, there are now many different specialized systems that leverage domain-specific optimizations to efficiently execute their workloads. An alternative approach is to build a general-purpose data analytics system that uses a common execution engine and programming model to support workloads in different domains. In this work, we choose representative systems of each class (Spark, TensorFlow, Presto and Hive) and benchmark their performance on a wide variety of machine learning and SQL workloads. We perform an extensive comparative analysis on the strengths and limitations of each system and highlight major areas for improvement for all systems.
Bio
Evdokia Kassela is a PhD student and researcher at the Computing Systems Laboratory of the National Technical University of Athens (NTUA). She received her diploma in Electrical and Computer Engineering from NTUA in 2013. Her research interests lie in the field of distributed systems, cloud computing, and big-data technologies. Nikodimos Provatas was born in 1993 in Athens, Greece. He graduated from the School of Electrical and Computer Engineering of NTUA in 2016 with a diploma thesis in the field of distributed systems. In 2016 he started his PhD at the Computing Systems Laboratory. His research interests focus on machine learning on big data.
-
17:30-17:45 | Break
-
Abstract
Preserving computational performance increases after the end of traditional CMOS scaling relies on making the most of emerging technologies such as devices, memories, photonics, specialized architectures, and others. However, if we are wildly successful in accelerating computation, the bottleneck will quickly shift to communication or the system management of heterogeneous resources. In this talk, I will discuss my current and future approach to changing the computational model and architectures to better fit emerging devices, as well as providing the capability to evaluate emerging devices rapidly at the system scale to better guide future device and architecture research. Then, I will discuss communication, where photonics can provide the means to efficiently perform resource disaggregation and also increase the performance and power efficiency of hierarchical networks by better matching network connectivity to application demands.
Bio
George Michelogiannakis is a research scientist at Lawrence Berkeley National Laboratory and an adjunct professor at Stanford University. He has extensive work on networking (both off- and on-chip) and computer architecture. His latest work focuses on the post-Moore's-law era, looking into specialization, emerging devices (transistors), memories, photonics, and 3D integration. He is also currently working on optics and architecture for HPC and datacenter networks.
4th Computing Systems Research Day - 7 January 2019
Schedule
-
Abstract
To meet ever-increasing computational needs within a fixed power budget, computing systems are forced to adopt more efficient computational engines. With the end of Moore and Dennard scaling, technology alone cannot satisfy these needs, hence systems incorporate heterogeneous accelerators, that is, units optimized for specific sets of functions. At the Microprocessor and Hardware Laboratory of the Technical University of Crete we have a long track record of research in reconfigurable accelerators. I will give a brief overview of our recent work on accelerators for data-intensive big-data applications (classification and frequent subgraph mining), streaming applications (stream join and ECM exponential sketch generation), and bioinformatics. These works have been designed and prototyped for high-performance reconfigurable platforms such as Convey and Maxeler.
Bio
Dionisis Pnevmatikatos is a Professor and Director of the Microprocessor and Hardware Laboratory at the School of Electrical and Computer Engineering of the Technical University of Crete. He received his PhD in Computer Science from the University of Wisconsin-Madison in 1995. His research interests include Computer Architecture with a focus on the use of Reconfigurable Logic for the creation of efficient accelerators in heterogeneous parallel systems. He has also worked on the design of dependable systems, architectures for hardware or reconfigurable logic application acceleration, network packet processors, and related areas. He has served as coordinator of the European research project FASTER (FP7) and as Principal Investigator in the European research projects DeSyRe (FP7), AXIOM, dRedBox, and EXTRA (H2020), as well as several national projects. He is a regular member of Program Committees at key conferences in his area such as FPL and DATE.
-
13:45-14:30 | Break
-
Abstract
Single core processing power has stagnated, forcing us to use increasingly complex processing systems in order to extract performance: multicores, GPUs, asymmetric multiprocessors, distributed computing, computation offloading. Writing fast and correct programs for them is tough even for experts. For most programmers it is almost impossible. Existing development tools do not help enough. Most analysis and optimization is left to the programmer, while the decisions that our tools can make are often suboptimal or wrong. With hardware becoming more complicated, the gap between what we need our tools to do and what they can achieve will only grow. In this talk, I will present a new method for bridging this gap. The central idea is substituting expert understanding of how code is structured and works with automatically trained deep neural networks. Such learned models can give us all the information we need to analyze the code and drive optimization decisions. This approach allows us to build new powerful tools with little human input, even less expertise, and in a mostly language agnostic way, dramatically reducing the difficulty and cost of creating such tools.
Bio
Pavlos Petoumenos is a Senior Researcher at the University of Edinburgh and a Research Fellow of the Royal Academy of Engineering. His work focuses on code optimization techniques for performance, energy, and size. Much of his recent output explores ways of automating optimization decisions through machine and deep learning. He was awarded a PhD from the University of Patras for his work on cache sharing and cache replacement techniques.
-
Abstract
The continuous growth of computer systems has introduced a new era for computing. The performance and power gains that came through advancements in transistor technology driven by Moore’s law have begun to diminish as Dennard scaling hits its physical limits. The increasing demand for performance, along with resource constraints, has brought energy and power efficiency to the forefront of the research agenda: power efficiency is imposed by thermal problems in modern chips, while energy efficiency is needed for long-lasting batteries and low electricity costs. The inability of multi-core processors to meet these requirements has shifted research towards heterogeneous architectures. This work explores scheduling techniques on single-ISA heterogeneous architectures, and more specifically on ARM big.LITTLE systems. The state-of-the-art schedulers for big.LITTLE systems are based on the default Time Preemptive Scheduling mechanism of the Linux kernel, which can miss rapid phase changes in the workload. This work proposes a novel scheduling mechanism, called Context Preemptive Scheduling, that exploits features of the ARM architecture to closely track phase changes in running programs and invoke the scheduler’s migration process in a timely manner.
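As a rough illustration of phase tracking (not the talk’s ARM-specific mechanism), the Linux sketch below samples hardware counters, computes IPC per interval, and flags a phase change when IPC shifts sharply; the sampling interval, the 30% threshold, and the reaction are our assumptions for illustration.

```cpp
// Illustrative sketch only: track program phases by sampling hardware
// counters and computing IPC per interval; a sharp IPC shift is treated
// as a phase change. A big.LITTLE scheduler would react by migrating the
// task (e.g., via sched_setaffinity); constants here are assumptions.
#include <linux/perf_event.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <cmath>
#include <cstdint>
#include <cstdio>

static int open_counter(uint64_t config) {
    perf_event_attr attr{};
    attr.type = PERF_TYPE_HARDWARE;
    attr.size = sizeof(attr);
    attr.config = config;
    attr.exclude_kernel = 1;
    // Count events of the calling process, on any CPU.
    return (int)syscall(__NR_perf_event_open, &attr, 0, -1, -1, 0);
}

int main() {
    int fd_ins = open_counter(PERF_COUNT_HW_INSTRUCTIONS);
    int fd_cyc = open_counter(PERF_COUNT_HW_CPU_CYCLES);
    if (fd_ins < 0 || fd_cyc < 0) { perror("perf_event_open"); return 1; }

    uint64_t prev_ins = 0, prev_cyc = 0;
    double prev_ipc = -1.0;
    for (;;) {
        usleep(1000);  // sampling interval (assumption: 1 ms)
        uint64_t ins = 0, cyc = 0;
        if (read(fd_ins, &ins, sizeof ins) != sizeof ins) break;
        if (read(fd_cyc, &cyc, sizeof cyc) != sizeof cyc) break;
        uint64_t di = ins - prev_ins, dc = cyc - prev_cyc;
        prev_ins = ins; prev_cyc = cyc;
        if (dc == 0) continue;
        double ipc = (double)di / (double)dc;
        if (prev_ipc > 0.0 && std::fabs(ipc - prev_ipc) / prev_ipc > 0.3) {
            std::printf("phase change: IPC %.2f -> %.2f\n", prev_ipc, ipc);
            // A real big.LITTLE scheduler would invoke migration here.
        }
        prev_ipc = ipc;
    }
    return 0;
}
```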
Bio
Ioanna Alifieraki is a software engineer at Canonical Ltd. Prior to this, she was a PhD student at the University of Manchester. She received her Diploma in Electrical and Computer Engineering from NTUA in 2014.
-
Abstract
Over the past few years, a large body of research has been devoted to optimizing sparse matrix-vector multiplication (SpMV) on General Purpose Graphics Processing Units (GPGPUs). Numerous sparse matrix formats and associated algorithms have been proposed, with different strengths and weaknesses. However, while previous works focus primarily on parallelization strategies that tackle load imbalance, we emphasize that other SpMV bottlenecks have not been thoroughly addressed on GPGPUs. To this end, we present a bottleneck-aware SpMV auto-tuner (BASMAT), a holistic approach to optimizing SpMV on GPGPUs that addresses all encountered bottlenecks, focusing both on fast execution and low preprocessing cost.
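As background, the kernel being tuned is short; a reference CSR implementation is shown below (sequential CPU code for clarity, since GPGPU variants typically parallelize across rows or non-zeros). It makes the bottlenecks concrete: indexed loads from x and variable-length rows.

```cpp
// Reference CSR sparse matrix-vector multiply, y = A*x. The two classic
// bottleneck sources are visible directly: rows have variable numbers of
// non-zeros (load imbalance when parallelized) and x is read through an
// index (irregular memory traffic).
#include <vector>

struct CSR {
    int rows = 0;
    std::vector<int>    row_ptr;  // size rows+1: row start offsets
    std::vector<int>    col_idx;  // column of each non-zero
    std::vector<double> values;   // value of each non-zero
};

void spmv(const CSR& A, const std::vector<double>& x,
          std::vector<double>& y) {
    for (int r = 0; r < A.rows; ++r) {
        double sum = 0.0;
        for (int k = A.row_ptr[r]; k < A.row_ptr[r + 1]; ++k)
            sum += A.values[k] * x[A.col_idx[k]];  // indexed load of x
        y[r] = sum;
    }
}
```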
Bio
Athena Elafrou is a graduate of the Electrical and Computer Engineering (ECE) School of NTUA. She is currently a PhD candidate with the parallel systems research group of CSLab at ECE/NTUA. Her current research interests focus on high-performance sparse linear algebra and deep learning on parallel systems.
-
16:00-16:30 | Break
-
Abstract
Recently proposed dataplanes for microsecond-scale applications, such as IX and ZygOS, use non-preemptive policies to schedule requests to cores. For the many real-world scenarios where request service times follow distributions with high dispersion or a heavy tail, they allow short requests to be blocked behind long requests, which leads to poor tail latency. Shinjuku is a single-address-space operating system that uses hardware support for virtualization to make preemption practical at the microsecond scale. This allows Shinjuku to implement centralized scheduling policies that preempt requests as often as every 5 microseconds and work well for both light- and heavy-tailed request service time distributions. We demonstrate that Shinjuku provides significant tail latency and throughput improvements over IX and ZygOS for a wide range of workload scenarios. For the case of a RocksDB server processing both point and range queries, Shinjuku achieves up to 6.6x higher throughput and 88% lower tail latency.
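Shinjuku’s actual preemption relies on hardware virtualization support; the sketch below is only a software approximation of the policy, where a timer thread arms a 5 us slice by setting a flag that the worker polls. It conveys why requeuing long requests protects short ones, not the real mechanism; all names and constants are illustrative.

```cpp
// Software-only approximation of microsecond time-slicing (not the real
// mechanism, which delivers true preemption interrupts). Long requests
// whose slice expires are requeued so short requests behind them are not
// blocked; this is the policy the abstract credits for good tail latency.
#include <atomic>
#include <chrono>
#include <cstdio>
#include <deque>
#include <mutex>
#include <thread>

struct Req { int id; int work_left; };   // remaining work, in small units

std::atomic<bool> preempt{false};
std::mutex q_mtx;
std::deque<Req> run_queue;

void timer() {
    for (;;) {
        std::this_thread::sleep_for(std::chrono::microseconds(5));
        preempt.store(true, std::memory_order_release);  // slice expired
    }
}

void do_unit_of_work() {                  // stand-in for request code
    for (volatile int i = 0; i < 100; ++i) {}
}

void worker() {
    for (;;) {
        Req r;
        {
            std::lock_guard<std::mutex> g(q_mtx);
            if (run_queue.empty()) continue;
            r = run_queue.front();
            run_queue.pop_front();
        }
        preempt.store(false, std::memory_order_release);  // fresh slice
        while (r.work_left > 0) {
            do_unit_of_work();
            --r.work_left;
            if (r.work_left > 0 &&
                preempt.load(std::memory_order_acquire)) {
                std::lock_guard<std::mutex> g(q_mtx);  // requeue the rest
                run_queue.push_back(r);
                break;
            }
        }
        if (r.work_left == 0) std::printf("request %d done\n", r.id);
    }
}

int main() {
    {
        std::lock_guard<std::mutex> g(q_mtx);
        run_queue.push_back({1, 100000});  // long request
        run_queue.push_back({2, 10});      // short request behind it
    }
    std::thread t(timer), w(worker);
    t.join(); w.join();                    // demo runs until killed
}
```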
Bio
Kostis Kaffes is a PhD student in Electrical Engineering at Stanford University, advised by Christos Kozyrakis. His research interests lie in the areas of computer systems, cloud computing, and scheduling. Recently, he has been working on end-host preemptive scheduling for microsecond-scale tail latency. Previously, he completed his Diploma in Electrical and Computer Engineering at the National Technical University of Athens, where he worked with Nectarios Koziris and Georgios Goumas on interference-aware VM scheduling.
-
Abstract
Modern large-scale computer clusters benefit significantly from elasticity, which allows a cluster to dynamically allocate resources based on the user’s fluctuating workload demands. Many cloud providers use threshold-based approaches, which have proven difficult to configure and optimise, while others use reinforcement learning and decision-tree approaches, which struggle with large multidimensional cluster states. In this work we use Deep Reinforcement Learning techniques to achieve automatic elasticity. We present three variants of a Deep Reinforcement Learning agent, called DERP (Deep Elastic Resource Provisioning), that takes the current multi-dimensional state of a cluster as input and converges to the optimal elasticity behaviour after a finite number of training steps. The system automatically decides when to request or release VM resources from the provider and orchestrates them inside a NoSQL cluster according to user-defined policies/rewards. We compare our agent to state-of-the-art Reinforcement Learning and decision-tree based approaches in demanding simulation environments and show that it earns up to 1.6x higher lifetime rewards. We then test our approach in a real-life cluster environment and show that the system resizes clusters in real time and adapts its performance across a variety of demanding optimisation strategies, input loads, and training loads.
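The abstract leaves the learning rule unspecified; as background, a standard deep-Q-learning objective that agents of this family minimize is sketched below. This is an assumption about the general method family, not necessarily DERP’s exact formulation.

```latex
% Generic deep Q-learning objective (illustrative; not necessarily
% DERP's exact loss). s: multi-dimensional cluster state, a: elasticity
% action (request/release VMs), r: the user-defined reward,
% \gamma: discount factor, \theta^-: target-network weights.
L(\theta) = \mathbb{E}_{(s,a,r,s')}\!\left[
  \left( r + \gamma \max_{a'} Q_{\theta^-}(s', a') - Q_{\theta}(s, a) \right)^{2}
\right]
```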
Bio
Constantinos Bitsakos is a graduate of the Electrical and Computer Engineering (ECE) School of NTUA. He worked in industry for 5 years as a full-stack web developer. He is currently a PhD candidate with the distributed systems research group of CSLab at ECE/NTUA. His current research interests focus on deep reinforcement learning and game-theoretic approaches applied to elasticity in cloud computing.
-
Abstract
In recent years we have observed rapid growth of large-scale analytics applications in a wide range of domains, from healthcare infrastructures to traffic management. The high volume of data that needs to be processed has stimulated the development of special-purpose frameworks that handle the data deluge by parallelizing data processing across multiple computing nodes. These frameworks differ significantly in the policies they follow to decompose their workloads into tasks and in the way they exploit the available computing resources. As a result, depending on the framework in which an application is implemented, we observe significant variations in resource utilization and execution time, so determining the appropriate framework for executing a big-data application is not trivial. In this work we propose Orion, a novel resource negotiator for cloud infrastructures that supports multiple big-data frameworks such as Apache Spark, Apache Flink and TensorFlow. Given an application, Orion determines the most appropriate framework to assign it to and reserves the resources required for the application to meet its performance requirements. Our negotiator exploits state-of-the-art prediction techniques to estimate the application’s execution time when it is assigned to a specific framework with varying configuration parameters and processing resources.
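To make the negotiator’s role concrete, a minimal sketch of such a framework-selection decision follows; the structure, names, and tie-breaking rule are our assumptions, not Orion’s actual algorithm.

```cpp
// Minimal skeleton of a framework-selection decision (names and scoring
// are assumptions; the real negotiator's models and search are richer).
#include <string>
#include <vector>

struct Candidate {
    std::string framework;   // e.g. "Spark", "Flink", "TensorFlow"
    int cores;               // resources that would be reserved
    double predicted_time;   // from a learned performance model
};

// Keep the cheapest candidate that meets the deadline, breaking ties by
// predicted execution time; nullptr means no configuration qualifies.
const Candidate* negotiate(const std::vector<Candidate>& cands,
                           double deadline) {
    const Candidate* best = nullptr;
    for (const auto& c : cands) {
        if (c.predicted_time > deadline) continue;  // misses requirement
        if (!best || c.cores < best->cores ||
            (c.cores == best->cores &&
             c.predicted_time < best->predicted_time))
            best = &c;
    }
    return best;
}
```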
Bio
Nikolaos Chalvantzis is a graduate of the Electrical and Computer Engineering (ECE) School of NTUA. He is a PhD candidate with the distributed systems research group of CSLab at ECE/NTUA. His current research interests and publications focus on distributed systems, cloud elasticity and resource provisioning. Nikolaos also holds a degree in Music (String Performance).
3rd Computing Systems Research Day - 8 January 2018
Schedule
-
Abstract
Cloud computing promises flexibility, high performance, and low cost. Despite its prevalence, most datacenters hosting cloud computing services still operate at very low utilization, posing serious scalability concerns. There are several reasons behind low cloud utilization, dominated by overly conservative users trying to avoid the unpredictable performance of multi-tenancy. A crucial system that can improve the efficiency of cloud infrastructures, while guaranteeing high performance for each submitted application, is the cluster manager: the system that orchestrates where applications are placed and how many resources they receive. In this talk, I will first describe Quasar, a cluster management system that leverages practical ML techniques to quickly determine the type and amount of resources a new cloud application needs to satisfy its quality-of-service constraints. Quasar also introduces a new declarative interface to cluster management, where users express their applications’ performance requirements, not resource requirements, to the system. We have built and deployed Quasar in local clusters as well as production systems, including Twitter and AT&T, and showed that it guarantees high application performance while improving system utilization by 2-3x. Second, I will talk about the security vulnerabilities cloud multi-tenancy creates, and show how ML techniques similar to those used in Quasar can enable an adversary to extract confidential information about an application and negatively impact its performance. Finally, I will briefly discuss the direction in which cloud applications and systems are evolving, and how big data can help us improve the way we design and manage these complex, large-scale systems.
Bio
Christina Delimitrou is an assistant professor of Electrical and Computer Engineering, and Computer Science at Cornell, working in computer architecture, systems, and applied data mining. She is a member of the Computer Systems Lab and directs the SAIL group at Cornell. Christina has received a PhD in Electrical Engineering from Stanford University. She previously earned an MS in Electrical Engineering, also from Stanford, and a diploma in Electrical and Computer Engineering from the National Technical University of Athens. She is the recipient of a John and Norma Balen Sesquicentennial Faculty Fellowship, a Facebook Research Fellowship, and a Stanford Graduate Fellowship.
-
14:00-14:30 | Break
-
Abstract
The vast amount of new data being generated is outpacing the development of infrastructures and continues to grow at much higher rates than Moore’s law, a problem commonly referred to as the “data deluge”. It leaves current machines struggling to reach exascale processing power by 2020, while energy sets a second, bottom-side limit: a reasonable power envelope for future supercomputers has been projected at 20 MW, yet the world’s current No. 2 supercomputer, Sunway TaihuLight, delivers 93 PFlops while already consuming 15.37 MW. In other words, we have so far reached less than 10% of the exascale target while already consuming more than 75% of the targeted energy limit. The current escape path follows the paradigm of disaggregating and disintegrating resources, while massively introducing optical technologies for interconnects. Disaggregating computing from memory and storage modules allows flexible, modular settings where hardware can be tailored to the energy and performance targets of each application. At the same time, optical interconnect and photonic integration technologies are rapidly replacing electrical interconnects, penetrating ever deeper levels of the hierarchy. In this work, we will discuss the main performance and energy challenges currently faced by the computing industry and present our recent research on photonic technologies towards realizing resource disaggregation at all hierarchy levels, spanning from rack- through board- down to disintegrated chip-scale computing.
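The two percentages follow directly from the quoted figures:

```latex
% Percentages implied by the quoted numbers (1 EFlops = 1000 PFlops):
\frac{93~\mathrm{PFlops}}{1000~\mathrm{PFlops}} \approx 9.3\%
\quad \text{of the exascale target},
\qquad
\frac{15.37~\mathrm{MW}}{20~\mathrm{MW}} \approx 77\%
\quad \text{of the power envelope}.
```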
Bio
Dr. Nikos Pleros joined the faculty of the Department of Informatics, Aristotle University of Thessaloniki, Greece, in September 2007, where he is currently serving as an Assistant Professor. He obtained the Diploma and the PhD Degree in Electrical & Computer Engineering from the National Technical University of Athens (NTUA) in 2000 and 2004, respectively. His research interests include optical interconnect technologies and architectures, photonic integrated circuit technologies, optical technologies for disaggregated data center architectures and high-performance computing, optical RAM memories and optical caches, silicon photonics and plasmonics, optical signal processing, optical switching and fiber-wireless technologies and protocols for 5G mobile networks.
-
Abstract
FPGA- and GPU-based accelerators have recently become first-class citizens in datacenters. Despite their high cost, however, accelerators remain underutilized for long periods of time, as vendors prefer to dedicate them to specific workloads for guaranteed QoS. At the same time, accelerator sharing is difficult due to vendor-locked communication paths between software applications and the hardware. In this work in progress, we modified the agents of Apache Mesos with Vinetalk, an accelerator middleware that abstracts the entire communication path between OS processes and accelerator hardware while adding no more than 10% performance overhead. We demonstrate the ease of integrating software applications with GPUs and, in collaboration with ICCS, with FPGA logic. Finally, we show that Vinetalk-enhanced Mesos allows analytics pipelines, such as Apache Spark, to use, for the first time, executors with heterogeneous characteristics.
Bio
Christos Kozanitis is a research collaborator at FORTH-ICS. He received his M.S. and Ph.D. in Computer Science and Engineering from the University of California, San Diego in 2009 and 2013, respectively. Parts of his PhD work influenced products from companies such as Cisco and Illumina. He also held a two-year postdoctoral appointment at the AMP Lab of the University of California, Berkeley, where he used and adapted state-of-the-art big-data technologies, such as Apache Spark SQL, Apache Parquet and Apache Avro, to process large amounts of DNA sequencing data. His current research interests involve improving the software, storage, and hardware levels of modern datacenters to speed up the processing of big-data workloads.
-
Abstract
In this work we introduce RCU-HTM, a technique that combines Read-Copy-Update (RCU) with Hardware Transactional Memory (HTM) to implement highly efficient concurrent Binary Search Trees (BSTs). Similarly to RCU-based algorithms, we perform the modifications of the tree structure in private copies of the affected parts of the tree rather than in-place. This allows threads that traverse the tree to proceed without any synchronization and without being affected by concurrent modifications. The novelty of RCU-HTM lies in leveraging HTM to permit multiple updating threads to execute concurrently. After appropriately modifying the private copy, we execute an HTM transaction, which atomically validates that all the affected parts of the tree have remained unchanged since they were read and, only if this validation is successful, installs the copy in the tree structure. We apply RCU-HTM to AVL and Red-Black balanced BSTs and compare their performance to state-of-the-art lock-based, non-blocking, RCU- and HTM-based BSTs. Our experimental evaluation reveals that BSTs implemented with RCU-HTM achieve high performance, not only for read-only operations but also for update operations. More specifically, our evaluation covers a diverse range of tree sizes and operation workloads and reveals that BSTs based on RCU-HTM outperform other alternatives by more than 18%, on average, on a multi-core server with 44 hardware threads.
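For readers unfamiliar with HTM, the sketch below illustrates the validate-and-install step described above, using Intel RTM intrinsics as one widely available HTM. The surrounding tree algorithm, retry policy, and memory reclamation are omitted or simplified, and the helper names are ours, not the paper’s.

```cpp
// Sketch of an RCU-HTM-style install step for a BST: the updater has
// built a private copy of the affected nodes; one short transaction
// checks that the copied region is unchanged and swings the parent
// pointer atomically. Readers traverse with no synchronization.
// Compile with -mrtm on a TSX-capable x86 CPU.
#include <immintrin.h>
#include <mutex>

struct Node {
    int key;
    Node* left;
    Node* right;
};

std::mutex fallback_lock;  // simplified fallback path

// Replace `victim` (here: the left child of `parent`) with the privately
// prepared `copy`, provided the link still points to the node we copied.
bool install_copy(Node* parent, Node* victim, Node* copy) {
    unsigned status = _xbegin();
    if (status == _XBEGIN_STARTED) {
        if (parent->left != victim)  // tree changed under us: abort so
            _xabort(0xff);           // the caller rebuilds its copy
        parent->left = copy;         // becomes visible atomically at _xend
        _xend();
        return true;                 // old subtree must be reclaimed via
    }                                // a grace period (omitted here)
    // Aborted (conflict, capacity, ...). A real implementation retries a
    // few times; we fall back to a lock, revalidating before installing.
    std::lock_guard<std::mutex> g(fallback_lock);
    if (parent->left != victim) return false;  // caller retries
    parent->left = copy;
    return true;
}
```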
Bio
Dimitrios Siakavaras is a Ph.D. candidate at the Computing Systems Laboratory of the National Technical University of Athens (NTUA). His research interests include concurrent programming, concurrent data structures and transactional memory. He received his Diploma in Electrical and Computer Engineering from NTUA in 2012.
-
16:00-16:30 | Break
-
Abstract
Review-based recommender systems have become dominant in recent years. In these systems, the traditional user-item ratings matrix is augmented with textual evaluations of the items by the users. In this talk, we are going to explore how this extra information source can be incorporated into matrix factorization algorithms, which constitute the state of the art in recommender systems. More specifically, we will examine a special category of machine learning techniques for text analysis known as neural language models. The talk will conclude with the presentation of some preliminary results of the discussed techniques on reference datasets.
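As background, the matrix factorization baseline fits latent user vectors p_u and item vectors q_i to the observed ratings; the talk’s question is how review text should enter this formulation (the coupling is model-specific and not specified in the abstract):

```latex
% Regularized matrix factorization baseline; \mathcal{K} is the set of
% observed (user, item) ratings. How review-text embeddings enter the
% objective varies by model and is not specified in the abstract.
\min_{P,\,Q} \sum_{(u,i) \in \mathcal{K}}
  \left( r_{ui} - p_u^{\top} q_i \right)^{2}
  + \lambda \left( \lVert p_u \rVert^{2} + \lVert q_i \rVert^{2} \right)
```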
Bio
Georgios Alexandridis is an electrical and computer engineer and a post-doc affiliate of the Intelligent Systems, Content and Interaction Laboratory of the National Technical University of Athens (NTUA). He graduated from the Department of Electrical and Computer Engineering of the University of Patras and also holds a doctoral degree from the School of Electrical and Computer Engineering of NTUA. His research interests are in the areas of Machine Learning, Artificial Intelligence and Big Data analysis.
-
Abstract
Task-based dataflow programming models and runtimes are promising candidates for programming multicore and manycore architectures. These programming models dynamically analyze task dependencies at runtime and concurrently schedule independent tasks onto the processing elements. In such models, cache locality and efficient utilization of the on-chip cache resources are critical for performance and energy efficiency. In this talk we will describe a number of combined hardware-software approaches that improve data movement and locality in the cache hierarchy and better utilize the on-chip cache resources. We will also present our recent research activities on interconnects and communication primitives for exascale systems.
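For concreteness, the sketch below shows this style using OpenMP tasks, one representative of such models (the talk does not name a specific runtime): the programmer declares each task’s inputs and outputs, and the runtime derives the dependence graph and runs independent tasks concurrently.

```cpp
// Representative task-based dataflow style (OpenMP tasks chosen as one
// widely available example). The runtime infers that the two producer
// tasks are independent and may run concurrently, while the consumer
// waits for both. Compile with -fopenmp.
#include <cstdio>

int main() {
    double a = 0.0, b = 0.0, c = 0.0;
    #pragma omp parallel
    #pragma omp single
    {
        #pragma omp task depend(out: a)              // producer 1
        a = 1.0;
        #pragma omp task depend(out: b)              // producer 2
        b = 2.0;
        #pragma omp task depend(in: a, b) depend(out: c)  // consumer
        c = a + b;
        #pragma omp taskwait                         // wait for all tasks
        std::printf("c = %f\n", c);
    }
}
```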
Bio
Vassilis Papaefstathiou received his Ph.D. in Computer Science (2013) from the University of Crete. From 2001 to 2003 he worked on IC design and verification at ISD S.A. and collaborated closely with STMicroelectronics on industrial SoC designs. From 2005 to 2013 he was a Research Engineer in the Computer Architecture and VLSI Systems Laboratory at the Institute of Computer Science, FORTH, Greece. From 2014 to 2016 he was a Postdoctoral Researcher in the Computer Science and Engineering Department at Chalmers University of Technology, Sweden. Since September 2016 he has been with FORTH. His research interests are in Parallel Computer Architecture, High-Performance Computing, High-Speed Interconnects, Low-Power Datacenter Servers, and Storage Systems.
-
Abstract
The advent of the Big Data era has given birth to a variety of new architectures aiming at increased scalability, robustness and fault tolerance. At the same time, though, these architectures have complicated application structure, leading to exponential growth of the applications’ configuration space and increased difficulty in predicting their performance. In this work, we describe a novel, automated profiling methodology that makes no assumptions about application structure. Our approach utilizes oblique Decision Trees to recursively partition an application’s configuration space into disjoint regions, choose a set of representative samples from each subregion according to a defined policy, and return a model for the entire space as a composition of linear models over the subregions.
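The resulting model has a compact closed form: oblique splits are hyperplanes over several configuration parameters, rather than single-parameter thresholds, and prediction composes the per-region linear models (notation below is ours):

```latex
% Final model: disjoint regions R_r produced by recursive oblique splits
% of the form w^{\top} x \le b (hyperplanes over several features, unlike
% axis-aligned trees), each fitted with its own linear model.
f(x) = \sum_{r} \mathbb{1}\!\left[ x \in R_r \right]
       \left( \alpha_r^{\top} x + \beta_r \right)
```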
Bio
Giannis Giannakopoulos is a Ph.D. candidate at the Computing Systems Laboratory of the National Technical University of Athens (NTUA). His research interests include Large Scale Data Management, Distributed Systems and Cloud Computing. He received his Diploma in Electrical and Computer Engineering from NTUA in 2012.