Blogs (1) >>
POPL 2019
Sun 13 - Sat 19 January 2019 Cascais, Portugal
Fri 18 Jan 2019 16:59 - 17:21 at Sala II - Program Analysis II Chair(s): Michael Emmi

There is a huge gap between the speeds of modern caches and main memories, and therefore cache misses account for a considerable loss of efficiency in programs. The predominant technique to address this issue has been Data Packing: data elements that are frequently accessed within time proximity are packed into the same cache block, thereby minimizing accesses to the main memory. We consider the algorithmic problem of Data Packing on a two-level memory system. Given a reference sequence $R$ of accesses to data elements, the task is to partition the elements into cache blocks such that the number of cache misses on $R$ is minimized. The problem is notoriously difficult: it is NP-hard even when the cache has size $1$, and is hard to approximate for any cache size larger than $4$. Therefore, all existing techniques for Data Packing are based on heuristics and lack theoretical guarantees.

In this work, we present the first positive theoretical results for Data Packing, along with new and stronger negative results. We consider the problem under the lens of the underlying access hypergraphs, which are hypergraphs of affinities between the data elements, where the order of an access hypergraph corresponds to the size of the affinity group. We study the problem parameterized by the treewidth of access hypergraphs, which is a standard notion in graph theory to measure the closeness of a graph to a tree. Our main results are as follows: we show that there is a number $q^$ depending on the cache parameters such that (a) if the access hypergraph of order $q^$ has constant treewidth, then there is a linear-time algorithm for Data Packing; (b) the Data Packing problem remains NP-hard even if the access hypergraph of order $q^-1$ has constant treewidth. Thus, we establish a fine-grained dichotomy depending on a single parameter, namely, the highest order among access hypegraphs that have constant treewidth; and establish the optimal value $q^$ of this parameter.

Finally, we present an experimental evaluation of a prototype implementation of our algorithm. Our results demonstrate that, in practice, access hypergraphs of many commonly-used algorithms have small treewidth. We compare our approach with several state-of-the-art heuristic-based algorithms and show that our algorithm leads to significantly fewer cache-misses.

Slides (popl2019-slides.pdf)2.44MiB

Fri 18 Jan

Displayed time zone: Belfast change

16:37 - 17:43
Program Analysis IIResearch Papers at Sala II
Chair(s): Michael Emmi SRI International
16:37
22m
Talk
Efficient Automated Repair of High Floating-Point Errors in Numerical Libraries
Research Papers
Xin Yi National University of Defense Technology, Liqian Chen National University of Defense Technology, Xiaoguang Mao National University of Defense Technology, Tao Ji National University of Defense Technology
Link to publication DOI Media Attached File Attached
16:59
22m
Talk
Efficient Parameterized Algorithms for Data Packing
Research Papers
Krishnendu Chatterjee IST Austria, Amir Kafshdar Goharshady IST Austria, Nastaran Okati Ferdowsi University of Mashhad, Andreas Pavlogiannis EPFL, Switzerland
Link to publication DOI Pre-print Media Attached File Attached
17:21
22m
Talk
Fast and exact analysis for LRU caches
Research Papers
Valentin Touzeau Univ. Grenoble Alpes, Claire Maiza Verimag, France, David Monniaux CNRS, VERIMAG, Jan Reineke Saarland University
Link to publication DOI Media Attached File Attached