link

January 18, Sunday
12:00 – 14:00

Semantics for Incomplete Information as Hereditary Graph Properties
Computer Science seminar
Lecturer : Mrs. Sara Cohen
Lecturer homepage : http://www.cs.huji.ac.il/~sarina
Affiliation : CS dept. at Hebrew U
Location : -101/58
Host : Dr. Eitan Bachmat
Databases often contain incomplete information. This is especially common when the database is formed by integrating information from heterogeneous sources or in the context of data extracted from the Web. In such situations, there may not be enough information to completely answer a query. Hence, it is of interest to find maximal answers for a given query. In this talk I will discuss semantics for incomplete query answers, and I will focus on the following topics:

(1) Semantically-Related Nodes: Given an XML document, we consider the problem of automatically determining when nodes are semantically related. I will present one method, called interconnection, that can be used to solve this problem. I will discuss how interconnection can be used even when the data being queried is incomplete.

(2) Semantic Search Engine: Building on the idea of interconnection, we developed and implemented XSEarch, a semantic search engine for XML. Using XSEarch one can efficiently find maximal subtrees of XML documents that answer a given search query and also contain semantically related nodes.

(3) Hereditary, Connected-Hereditary and Rooted-Hereditary Graph Properties: I will briefly survey other semantics for dealing with incomplete information (e.g., full disjunctions). These semantics for maximal query answers differ significantly one from another. Nonetheless, it has been shown that each of these semantics is polynomially-solvable, i.e., it is possible to generate all maximal solutions under each of these semantics in polynomial time in the size of the input and output. I will show that the above semantics can be couched as computing maximal solutions for hereditary, connected-hereditary or rooted-hereditary graph properties, and that this is the underlying reason why the semantics are polynomially-solvable. I will also present general algorithms that generate all maximal solutions for such graph properties in polynomial time in the size of the input and output. Our results include and improve upon previous results in databases, in particular, and in graph theory, in general.