January 14, Tuesday
12:00 – 14:00
Data synopses for data streams and massive data sets
Computer Science seminar
Lecturer : Yossi Matias
Affiliation : Tel Aviv University
Location : -101/58
Host : Mayer Goldberg
The emerging area of data synopses and streaming data analysis has seen tremendous progress over the past few years, involving deep theoretical issues on the one hand and broad applicability on the other. Massive data sets with hundreds of gigabytes or more of raw data are becoming commonplace, and traditional algorithms and data structures fail to process such data sets effectively. Hence, there is a growing need for algorithms and data structures that enable fast response times for various classes of queries on such data, and for algorithms that can efficiently handle the data as it streams by. We discuss synopsis data structures that use very limited space to capture the demographics of massive data sets; these are designed to support fast and typically approximate answers to queries. We will point out several techniques that have proven useful, including adaptive sampling, random projection, and wavelets. Time permitting, we will discuss recent results on spectral Bloom filters and list-traversal synopses.