Rosetta: A Robust Space-Time Optimized Range Filter for Key-Value Stores

Date: Friday, October 23, 2020, 16:30  to 17:30
Title: Rosetta: A Robust Space-Time Optimized Range Filter for Key-Value Stores
Speaker: Prof. Siqiang LUO
In this talk, I will introduce Rosetta, a probabilistic range filter designed specifically for LSM-tree based key-value stores. The core intuition is that we can sacrifice filter probe time because it is not visible in end-to-end key-value store performance, which in turn allows us to significantly reduce the filter false positive rate for every level of the tree. Rosetta indexes all binary prefixes of a key using a hierarchically arranged set of Bloom filters. It then converts each range query into multiple probes, one for each non-overlapping binary prefix. We show how to integrate Rosetta in a full system, RocksDB, and we demonstrate that it brings as much as a 40x improvement compared to default RocksDB and 2-5x improvement compared to state-of-the-art range filters in a variety of workloads and across different levels of the memory hierarchy.
Dr. Siqiang LUO is currently a tenure-track assistant professor at the Chinese University of Hong Kong, Shenzhen. He received his B.S. degree from Fudan University in 2010, and his M.S. degree from Fudan University in 2013. He then joined the industry for two years and later did his Ph. D. from the University of Hong Kong (2015-2019). He worked as a Postdoc Fellow of Harvard University from 2019 to 2020. His research interests include big data systems and algorithms. Many of his publications appear in top conferences and journals, including SIGMOD, PVLDB, ICDE, VLDB Journal, TKDE, and ICML.
Friday, October 23, 2020 - 16:30 to 17:30