streaming; temporal correlation; spatial correlation; spatio-temporal; cache; prefetching
Jevdjic Djordje, Volos Stavros, Falsafi Babak (2013), Die-stacked DRAM caches for servers: hit ratio, latency, or bandwidth? have it all with footprint cache, in
Proceedings of the 40th Annual International Symposium on Computer Architecture.
Ferdman Michael, Adileh Almutaz, Kocberber Onur, Volos Stavros, Alisafaee Mohammad, Jevdjic Djordje, Kaynak Cansu, Popescu Adrian Daniel, Ailamaki Anastasia, Falsafi Babak (2012), Clearing the Clouds: A Study of Emerging Scale-out Workloads on Modern Hardware, in
Proceedings of the seventeenth international conference on Architectural Support for Programming Lan.
Ferdman Michael, Kaynak Cansu, Falsafi Babak (2011), Proactive Instruction Fetch, in
Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture, ACM, in press.
Ferdman Michael, Adileh Almutaz, Kocberber Onur, Volos Stavros, Alisafaee Mohammad, Jevdjic Djordje, Kaynak Cansu, Popescu Adrian Daniel, Ailamaki Anastasia, Falsafi Babak, Quantifying the Mismatch Between Emerging Scale-Out Applications and Modern Processors, in
ACM Transactions on Computer Systems (TOCS).
Kaynak Cansu, Grot Boris, Falsafi Babak, SHIFT: Shared History Instruction Fetch for Manycore Lean-Core Server Processors, in
Proceedings of the 46th Annual IEEE/ACM International Symposium on Microarchitecture.
We propose Spatio-Temporal Memory Streaming (STeMS), a memory system in which data are fetched, placed, and replaced in the form of data groups that are spatially or temporally correlated (which we refer to as spatio-temporal streams) rather than individual cache blocks. STeMS can extract streams on-the-fly in hardware by observing and recording a program’s memory accesses or allow software to form and communicate the streams to hardware.
STeMS helps bridge the processor/memory performance gap in a number of ways. First, streaming helps to hide the memory access latency and increase the memory-level parallelism for arbitrary memory access patterns (including dependent accesses). Second, streaming helps to improve storage utilization by throttling the movement of data across storage levels and matching the fetch rate to the processor’s consumption rate. Finally, STeMS optimizes pin bandwidth by moving only data that is likely to be referenced and implementing replacement decisions on data groups to avoid unnecessary replacement traffic.
The specific contributions of this project will be:
Extracting spatio-temporal streams. We will develop mechanisms that can detect and extract spatial and temporal correlation in memory access sequences to construct spatio-temporal streams. We will describe both transparent streaming mechanisms that extract streams dynamically in hardware, and software-assisted streaming mechanisms that allow the programmer/compiler to specify streams.
Transferring and placing spatio-temporal streams. We will present novel hardware designs that throttle the rate at which spatio-temporal streams are loaded on chip to hide the latency of off-chip accesses. Furthermore, we will design mechanisms that intelligently allocate, place, and replace streamed data on chip to keep frequently-accessed streams in low-latency storage.
Integrated hardware/software streaming systems. We will deliver integrated hardware/software system prototypes for important server applications, such as online transaction processing, that leverage the synergies possible with explicit software-specified streams. We will design HW/SW interfaces that allow the programmer or compiler to communicate a data structure’s access patterns directly to the memory system.
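As a rough illustration of what detecting spatial and temporal correlation in a memory access sequence could look like, the following toy software model may help. It is not the STeMS hardware design: the 64-byte block size, the 2 KB region size, and the function names spatial_footprints and temporal_stream are all assumptions made for this sketch. Spatial correlation is modeled as the set of block offsets touched inside each region, and temporal correlation as the run of accesses that followed the most recent earlier occurrence of a given address.

```python
from collections import defaultdict

# Assumed parameters for illustration only; not taken from the project.
BLOCK_BITS = 6    # 64-byte cache blocks
REGION_BITS = 11  # 2 KB spatial regions, i.e. 32 blocks per region


def spatial_footprints(trace):
    """Map each region number to the set of block offsets touched in it."""
    offset_mask = (1 << (REGION_BITS - BLOCK_BITS)) - 1
    footprints = defaultdict(set)
    for addr in trace:
        region = addr >> REGION_BITS
        offset = (addr >> BLOCK_BITS) & offset_mask
        footprints[region].add(offset)
    return dict(footprints)


def temporal_stream(history, trigger, length):
    """Return up to `length` entries that followed the most recent
    occurrence of `trigger` in the recorded access history."""
    for i in range(len(history) - 1, -1, -1):
        if history[i] == trigger:
            return history[i + 1 : i + 1 + length]
    return []  # trigger never seen: nothing to stream


# Three accesses fall in region 2 (offsets 0, 1, 4); one in region 4.
print(spatial_footprints([0x1000, 0x1040, 0x1100, 0x2000]))

# A repeat access to 10 would suggest streaming 20 and 30 next.
print(temporal_stream([10, 20, 30, 40, 50, 60], 10, 2))
```

In this model, a later access to any block of region 2 could trigger fetching the whole recorded footprint, and a repeat access to address 10 could trigger fetching its recorded successors. A real design would bound the history size and decide when a recorded stream is stale, which this sketch ignores.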