February 19, Tuesday
12:00 – 13:00

Data Reduction for Enterprise Storage: Estimation and Effective Resource Utilization
Computer Science seminar
Lecturer : Ronen Kat
Affiliation : IBM Haifa Research Labs
Location : 202/37
Host : Dr. Aryeh Kontorovich
Real-time compression and deduplication for primary storage are quickly becoming widespread as data continues to grow exponentially, but adding compression and deduplication to the data path consumes scarce CPU and memory resources on the storage system. In this talk we present different approaches to efficiently estimating the potential data reduction ratio of data, and show how these methods can be applied in advanced storage systems. The main focus is on compression ratio evaluation, where we employ two levels of filtering. The first level operates at the data set level (e.g., volume or file system), where we estimate the overall compressibility of the data at rest. Depending on the outcome, we may choose to enable or disable compression for the entire data set, or to employ a second level of finer-grained filtering. The second level examines data as it is written to the storage system, in an online manner, and determines its compressibility. We also discuss the challenges in achieving similar results when deduplication is involved and suggest alternatives for this scenario.