October 27, Tuesday
12:00 – 14:00
The Invisible Programmers
Computer Science seminar
Lecturer : Prof. Moti Ben-Ari
Affiliation : Weizmann Institute of Science
Location : 202/37
Host : Shlomi Dolev
The visibility of personal computers hides the fact that most software development is done for systems in industry and business, and not for standalone software packages. Programmers and engineers in these environments are required to have a different set of skills and a different approach to software than are currently taught in most computer science (CS) curricula. Furthermore, the talk of a hi-tech, internet-driven revolution during the last decade is inaccurate from a historical perspective, and this loss of perspective has led to demands for an artifact-driven CS curriculum. A comparison of the ACM/IEEE CC2001 curriculum with the curriculum of a traditional engineering discipline points to what I believe the future of CS education should be.
The talk will conclude with a survey of my work on teaching concurrent and distributed computation using model checking. I will show how this advanced technique can be presented to undergraduate and even high-school students.
October 13, Tuesday
14:00 – 16:00
Seeded Search Techniques for DNA Homology Detection and Mapping of Next Generation Sequencing Reads
Computer Science seminar
Lecturer : Gary Benson
Affiliation : Boston University
Location : 202/37
Host : Dr Dekel Tsur
Standard search techniques for detecting homology in DNA sequences start by detecting small matching parts, called seeds, between a query sequence and database sequences. Contiguous seed models (k-mers, k-tuples, etc.) have existed for many years and are used in programs like BLAST and BLAT. Newer models include spaced seeds and indel seeds. Both of these seed models have been shown to be more sensitive than contiguous seeds while maintaining similar specificity, where sensitivity measures the ability to find true homologies, and specificity measures the ability to avoid wasting computation time on false candidates for homology. The domains of application for the two seed classes differ: spaced seeds are superior under alignment models that allow only matches and mismatches, while indel seeds are superior under models that also allow insertions and deletions in the alignments.
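As an illustration of the difference between contiguous and spaced seeds, here is a minimal Python sketch. It assumes the common notation in which a seed is written as a string of 1s (positions that must match) and 0s (positions that are ignored); the function name `seed_hits` and the example sequences are hypothetical and are not taken from the talk or from BLAST or BLAT.

```python
def seed_hits(query, target, seed):
    """Return all (i, j) pairs where `seed` matches query[i:] against target[j:].

    `seed` is a string over {"1", "0"}: "1" marks a position that must match,
    "0" marks a position that is ignored. A contiguous seed is all "1"s.
    This is a brute-force scan for illustration, not an indexed search.
    """
    span = len(seed)
    care = [k for k, c in enumerate(seed) if c == "1"]
    hits = []
    for i in range(len(query) - span + 1):
        for j in range(len(target) - span + 1):
            if all(query[i + k] == target[j + k] for k in care):
                hits.append((i, j))
    return hits


# Two sequences that differ only at the third position.
query = "ACGTAC"
target = "ACTTAC"

contiguous = seed_hits(query, target, "111")   # weight 3, span 3
spaced = seed_hits(query, target, "1101")      # weight 3, span 4
```

Both seeds require three matching positions, but the spaced seed `1101` skips over the mismatch and reports a hit at the start of the two sequences, which the contiguous seed cannot do.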
For any value k, there is only one contiguous seed of length k, but there can be a very large number of spaced seeds and indel seeds. Optimal seed selection is a resource-intensive activity because essentially all possible seed shapes must be tested. In this talk, I describe the various seed models, show how to efficiently compute optimal seeds, and discuss an application in the context of new technologies for genome sequencing, in particular, mapping of short sequencing reads to a reference genome.
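To make the cost of testing all seed shapes concrete, the following Python sketch enumerates spaced seeds of a given weight and scores each one by exhaustive enumeration. It is a naive brute-force illustration, not the efficient method presented in the talk. The assumptions are ours: seeds start and end with a 1, alignments are ungapped, and each alignment column matches independently with probability `p`. The names `spaced_seeds`, `sensitivity`, and `best_seed` are hypothetical.

```python
from itertools import combinations, product


def spaced_seeds(weight, span):
    """Yield every seed of length `span` with `weight` ones (weight >= 2),
    where the first and last positions are required to be 1."""
    for inner in combinations(range(1, span - 1), weight - 2):
        ones = {0, span - 1, *inner}
        yield "".join("1" if k in ones else "0" for k in range(span))


def sensitivity(seed, length, p):
    """Probability that a random ungapped alignment with `length` columns,
    each a match with independent probability `p`, contains a seed hit.

    Enumerates all 2**length match/mismatch patterns, so it is only
    usable for small `length`.
    """
    care = [k for k, c in enumerate(seed) if c == "1"]
    windows = range(length - len(seed) + 1)
    total = 0.0
    for cols in product((0, 1), repeat=length):
        if any(all(cols[i + k] for k in care) for i in windows):
            matches = sum(cols)
            total += p ** matches * (1 - p) ** (length - matches)
    return total


def best_seed(weight, max_span, length, p):
    """Return the most sensitive seed of the given weight with span <= max_span."""
    candidates = [
        seed
        for span in range(weight, max_span + 1)
        for seed in spaced_seeds(weight, span)
    ]
    return max(candidates, key=lambda s: sensitivity(s, length, p))


# Small example: weight-3 seeds of span at most 5, alignments of 10 columns.
best = best_seed(weight=3, max_span=5, length=10, p=0.7)
```

Even in this toy setting the work grows quickly: the number of candidate seeds grows combinatorially with the span, and each sensitivity evaluation here is exponential in the alignment length.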