Non-volatile byte-addressable memory (NVRAM) is an emerging technology that retains its contents across power loss (unlike DRAM), provides fast and fine-grained access to memory (unlike disk), and promises high performance (orders of magnitude faster than flash memory). It combines the best features of traditional RAM and disk storage, but it cannot readily be used as a drop-in replacement for either and therefore introduces a paradigm shift for developers. NVRAM is of particular interest for shared-memory data structures, which are at the core of many key infrastructure components, such as in-memory databases, key-value stores, and graph processing engines. Yet most shared-memory data structures are not persistent and, hence, not designed to tolerate failures or corruption (accidental or malicious). Traditional techniques such as logging to storage come with significant performance overheads, both during normal-case operation and during recovery.
The objective of the PersiST project is to design and implement high-performance, concurrent and secure data structures and algorithms that can leverage the capabilities of NVRAM. The primary target of such structures is to support big-data storage and processing.
We will specifically explore three complementary research directions, which will be conducted in parallel during the course of the project and will be jointly evaluated in the context of big data stores.
First, we will investigate, design, and implement persistent data structures/types (PDTs) and associated algorithms that efficiently leverage the capabilities of NVRAM. The objective is to build a collection of PDTs that offer different trade-offs between performance and robustness and are readily applicable to data-intensive applications. These PDTs will range from simple, general-purpose structures such as lists, queues, or hash tables to specialized data-oriented structures like search trees or graphs. We will also identify the challenges of transforming traditional algorithms designed for volatile memory, and devise guidelines for doing so, so that they can best exploit NVRAM.
Second, we will focus on performance and scalability by investigating how PDTs can be safely accessed concurrently by a large number of threads in multi-core systems. This is far from trivial: to support safe recovery after a failure, data structures must be kept consistent at all times as a whole, not just at the level of individual memory locations, despite concurrent accesses. Otherwise threads might not see the same state before and after power loss because, unlike with volatile memory, the effects of a partially executed operation in NVRAM persist across failures. To address this challenge, we will in particular study the notion of transactional persistence by leveraging the principles of "transactional memory", which is supported in hardware in recent processors.
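The hazard of a partially executed operation can be illustrated with a small simulation. The sketch below is not the project's design: it models NVRAM as a Python dictionary, injects a "power loss" after a chosen number of writes, and contrasts an unprotected list push with one guarded by an undo log, a classic software technique for failure atomicity. All names (`SimulatedNVRAM`, `writes_until_crash`, `tx_push`, `recover`) are hypothetical, and the sketch is single-threaded: it covers only crash consistency, not concurrent access or hardware transactional memory.

```python
class SimulatedCrash(Exception):
    """Models a power loss in the middle of an operation."""


class SimulatedNVRAM:
    """Toy persistent memory: contents of `cells` and `undo_log` survive a crash."""

    def __init__(self):
        self.cells = {}                 # persistent data
        self.undo_log = []              # persistent undo log: (key, old_value, existed)
        self.writes_until_crash = None  # hypothetical fault-injection knob

    def write(self, key, value):
        if self.writes_until_crash is not None:
            if self.writes_until_crash == 0:
                raise SimulatedCrash()
            self.writes_until_crash -= 1
        self.cells[key] = value


def unsafe_push(nv, value):
    """Push onto a linked list with three separate persistent writes."""
    node_id = nv.cells.get("size", 0)
    nv.write(("node", node_id), (value, nv.cells.get("head")))
    nv.write("head", node_id)
    nv.write("size", node_id + 1)


def logged_write(nv, key, value):
    # Record the old value first; on real hardware the log entry would have
    # to be flushed to NVRAM before the data write (assumed here, not modeled).
    nv.undo_log.append((key, nv.cells.get(key), key in nv.cells))
    nv.write(key, value)


def tx_push(nv, value):
    """Same push, but every write is undo-logged; clearing the log commits."""
    node_id = nv.cells.get("size", 0)
    logged_write(nv, ("node", node_id), (value, nv.cells.get("head")))
    logged_write(nv, "head", node_id)
    logged_write(nv, "size", node_id + 1)
    nv.undo_log.clear()  # commit point


def recover(nv):
    """After a restart, roll back any operation that did not commit."""
    nv.writes_until_crash = None
    for key, old, existed in reversed(nv.undo_log):
        if existed:
            nv.cells[key] = old
        else:
            nv.cells.pop(key, None)
    nv.undo_log.clear()


def consistent(nv):
    """Whole-structure invariant: `size` equals the number of reachable nodes."""
    count, cur = 0, nv.cells.get("head")
    while cur is not None:
        count += 1
        cur = nv.cells[("node", cur)][1]
    return count == nv.cells.get("size", 0)


def crash_during(push, nv, value, after_writes):
    """Run `push` but cut power once `after_writes` writes have completed."""
    nv.writes_until_crash = after_writes
    try:
        push(nv, value)
    except SimulatedCrash:
        pass
```

If power is cut after two of the three writes, the unprotected push leaves `head` pointing at a node that `size` does not account for, even though each individual location holds a valid value. With the undo log, `recover` restores the state that preceded the interrupted push.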
Third, we will secure these PDTs in memory, guarding them against attacks even in the case where rogue software is able to subvert legitimate system software to gain elevated privileges. This is of particular concern for NVRAM because data remains persistent even if the system is abruptly turned off. An attacker could therefore easily gain access to sensitive data by reading the NVRAM of a system that was intentionally terminated before any security measure (e.g., encryption) was applied. Hence one needs mechanisms to preserve data confidentiality (against disclosure) and integrity (against tampering) without compromising performance. To that end, we will notably study the use of hardware extensions like Intel's Software Guard Extensions (SGX) or AMD's Secure Memory Encryption (SME).
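To make the integrity requirement concrete, the following sketch shows a purely software check, not the hardware mechanisms named above: a keyed hash (HMAC-SHA256) is stored alongside the serialized state so that offline tampering is detected when the state is read back. The function names `seal` and `unseal` are hypothetical, the key is assumed to be kept outside NVRAM, and the sketch addresses integrity only, not confidentiality or replay of an older valid state.

```python
import hashlib
import hmac
import json


def seal(state, key):
    """Serialize `state` and compute an authentication tag over the bytes."""
    payload = json.dumps(state, sort_keys=True).encode()
    tag = hmac.new(key, payload, hashlib.sha256).digest()
    return payload, tag


def unseal(payload, tag, key):
    """Verify the tag in constant time before trusting the persisted bytes."""
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, tag):
        raise ValueError("integrity check failed: persisted state was modified")
    return json.loads(payload)
```

A reader that calls `unseal` on every recovery would reject a payload whose bytes were altered while the machine was powered off, at the cost of recomputing the tag on each update, which is one instance of the performance trade-off mentioned above.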