June 4, Wednesday
12:00 – 13:30

Initial conditions for unsupervised learning of morphological model
Students seminar
Lecturer : Dr. Meni Adler
Lecturer homepage : http://www.cs.bgu.ac.il/~adlerm/
Affiliation : CS, BGU
Location : 202/37
Host : Students seminar
Morphological disambiguation is the process of assigning a single set of morphological features to each word in a text (e.g., ילד analyzed as a singular masculine noun, "a child"). When a word is ambiguous (e.g., ילד can alternatively be analyzed as a verb, "gave birth"), a disambiguation procedure based on the word's context must be applied (e.g., in the phrase ילד ירח, the noun analysis is more probable). The most common model for unsupervised learning of stochastic processes is the Hidden Markov Model (HMM). In this generative model, a given sequence of observed events is treated as the output emitted by a stochastic process over a set of hidden states. To estimate the parameters of the model, the Baum-Welch algorithm, an instance of Expectation Maximization (EM), is applied to the observed emissions. In the first part of this talk (a kind of "end of PhD" event), I will give an overview of the Hebrew morphological disambiguation problem and present a word-based text encoding that enables EM learning of an HMM for this task. I will then discuss the influence of the initial conditions on the learning process and suggest several methods for the initial estimation of the syntagmatic and morpho-lexical distributions.
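To make the role of the initial conditions concrete, here is a minimal sketch of Baum-Welch for a small discrete HMM, run from two different starting points on the same toy data. It is not the speaker's model or encoding. The function name `baum_welch`, the two-state setup, the toy observation sequence, and the reading of the transition matrix `A` as the "syntagmatic" distribution and the emission matrix `B` as the "morpho-lexical" one are all assumptions made for illustration.

```python
import math

def baum_welch(obs, pi, A, B, iterations=15):
    """Toy Baum-Welch (EM) for a discrete HMM (illustrative sketch only).

    obs: list of symbol indices; pi: initial state probabilities;
    A[i][j]: transition probabilities; B[i][k]: emission probabilities.
    Returns (pi, A, B, history); history[k] is the log-likelihood of obs
    under the parameters held at the start of iteration k.
    """
    n, m, T = len(pi), len(B[0]), len(obs)
    pi, A, B = list(pi), [r[:] for r in A], [r[:] for r in B]
    history = []
    for _ in range(iterations):
        # E-step, forward pass with per-step scaling to avoid underflow.
        alpha, scale = [], []
        for t in range(T):
            if t == 0:
                row = [pi[i] * B[i][obs[0]] for i in range(n)]
            else:
                row = [sum(alpha[t - 1][i] * A[i][j] for i in range(n))
                       * B[j][obs[t]] for j in range(n)]
            c = sum(row)
            scale.append(c)
            alpha.append([v / c for v in row])
        history.append(sum(math.log(c) for c in scale))

        # E-step, backward pass using the same scaling factors.
        beta = [[1.0] * n for _ in range(T)]
        for t in range(T - 2, -1, -1):
            for i in range(n):
                beta[t][i] = sum(A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j]
                                 for j in range(n)) / scale[t + 1]

        # Posterior probability of being in state i at time t.
        gamma = [[alpha[t][i] * beta[t][i] for i in range(n)] for t in range(T)]

        # M-step: re-estimate parameters from expected counts.
        new_A = [[0.0] * n for _ in range(n)]
        new_B = [[0.0] * m for _ in range(n)]
        for i in range(n):
            out_of_i = sum(gamma[t][i] for t in range(T - 1))
            for j in range(n):
                i_to_j = sum(alpha[t][i] * A[i][j] * B[j][obs[t + 1]]
                             * beta[t + 1][j] / scale[t + 1]
                             for t in range(T - 1))
                new_A[i][j] = i_to_j / out_of_i
            in_i = sum(gamma[t][i] for t in range(T))
            for k in range(m):
                new_B[i][k] = sum(gamma[t][i] for t in range(T)
                                  if obs[t] == k) / in_i
        pi, A, B = gamma[0][:], new_A, new_B
    return pi, A, B, history

# Toy data (an assumption): two symbols that strictly alternate.
obs = [0, 1] * 10

# Uniform start: both states are identical, so EM cannot tell them apart.
_, _, _, uniform_ll = baum_welch(
    obs, [0.5, 0.5], [[0.5, 0.5], [0.5, 0.5]], [[0.5, 0.5], [0.5, 0.5]])

# Informed start: a rough prior guess for transitions and emissions.
inf_pi, inf_A, inf_B, informed_ll = baum_welch(
    obs, [0.6, 0.4], [[0.1, 0.9], [0.9, 0.1]], [[0.9, 0.1], [0.1, 0.9]])

print("uniform start, final log-likelihood: ", uniform_ll[-1])
print("informed start, final log-likelihood:", informed_ll[-1])
```

With a fully uniform start, the two states receive identical expected counts at every iteration, so EM stays at a symmetric fixed point that amounts to a unigram model of the symbols. The informed start breaks that symmetry and reaches a higher likelihood on the same data, which is the kind of sensitivity to initialization the talk addresses.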