April 17, Tuesday
12:00 – 13:00
The ongoing effort to reconstruct the Cairo Genizah
Computer Science seminar
Lecturer : Lior Wolf
Affiliation : The Blavatnik School of Computer Science, Tel Aviv University
Location : 202/37
Host : Dr. Aryeh Kontorovich
Many significant historical corpora contain leaves that are mixed up
and no longer bound in their original state as multi-page documents.
The reconstruction of old manuscripts from a mix of disjoint leaves
can therefore be of paramount importance to historians and literary
scholars. In collaboration with the Friedberg Genizah Project, we
show that visual similarity provides meaningful pairwise similarities
between handwritten leaves and then go a step further and suggest a
semi-automatic clustering tool that helps reconstruct the original
documents. The proposed solution is based on a graphical model that
makes inferences based on catalog information provided for each leaf
as well as on the pairwise similarities of handwriting. Several novel
active clustering techniques are explored, and the solution is applied
to a significant part of the Cairo Genizah, where the problem of
joining leaves remains unsolved even after a century of extensive
study by hundreds of human scholars.