January 13, Wednesday
12:00 – 13:30
Applied NLP in Medical Informatics
Graduate seminar
Lecturer : Rafi Cohen
Affiliation : CS, BGU
Location : 202/37
Host : Graduate Seminar
Digitization of health care data is creating an opportunity for a new way of studying diseases and improving medical care.
The vast amount of patient data collected daily at hospitals can help us circumvent an inherent limitation of current medical research: because experimenting on humans is unethical and illegal, phenomena must be studied through various imperfect models.
The majority of information is stored in free text written by doctors.
Using that data requires adapted Natural Language Processing methods combined with domain-specific knowledge.
Here I will present one project that originated from challenges in Medical Hebrew Processing:
In most professional domains of languages with a non-Latin alphabet, proper names, named entities and adjectives are transliterated from English.
We show that recognizing these transliterated words, in addition to the original words, is important for term recognition.
We developed a method for identifying such words that combines unsupervised classifiers with a lexicon.
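As a rough illustration of how a lexicon and a classifier might be combined, the sketch below consults a lexicon first and falls back to a classifier for words the lexicon does not cover. This is an assumption about the general shape of such a system, not the method presented in the talk. The names `make_combined_classifier` and `toy_model` are hypothetical, the lexicon entries are invented, and romanized placeholders stand in for Hebrew tokens.

```python
from typing import Callable, Dict

def make_combined_classifier(
    lexicon: Dict[str, bool],
    fallback: Callable[[str], bool],
) -> Callable[[str], bool]:
    """Return a function that decides whether a word is transliterated.

    Assumption: the lexicon maps known words to a yes/no label, and the
    fallback classifier is used only for words missing from the lexicon.
    """
    def is_transliterated(word: str) -> bool:
        if word in lexicon:
            return lexicon[word]   # trust the lexicon when it has an entry
        return fallback(word)      # otherwise defer to the classifier
    return is_transliterated

# Hypothetical stand-in for a trained unsupervised classifier.
def toy_model(word: str) -> bool:
    return word.endswith("in")

# Invented example entries; a real lexicon would hold Hebrew tokens.
toy_lexicon = {"aspirin": True, "keev": False}

classify = make_combined_classifier(toy_lexicon, toy_model)
```

Calling `classify("insulin")` reaches the fallback because the word is not in the toy lexicon, while `classify("keev")` is answered by the lexicon alone.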
The lexicon-based approach produced F-measures of 87%-92% across domains; the combined approach produced F-measures of 93%-94%.
Using this classifier to improve term matching, we obtained 77% more matches with a precision of 92%.
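For readers unfamiliar with the metric, the balanced F-measure is the harmonic mean of precision and recall. The sketch below computes it from raw counts. The function name `f_measure` and the example counts are illustrative assumptions and are not taken from the reported experiments.

```python
def f_measure(true_pos: int, false_pos: int, false_neg: int) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    predicted = true_pos + false_pos   # everything the classifier flagged
    actual = true_pos + false_neg      # everything it should have flagged
    if predicted == 0 or actual == 0 or true_pos == 0:
        return 0.0                     # avoid division by zero
    precision = true_pos / predicted
    recall = true_pos / actual
    return 2 * precision * recall / (precision + recall)

# Illustrative counts only: 90 correct hits, 10 false alarms, 10 misses.
example_score = f_measure(90, 10, 10)
```

With these illustrative counts, precision and recall are both 0.9, so the F-measure is also 0.9.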