January 30, Wednesday
12:00 – 13:30
Distributed Private Data Analysis: Simultaneously Solving How and What
Students seminar
Lecturer : Mr. Eran Omri
Affiliation : CS, BGU
Location : 202/37
Host : Students seminar
In this work we combine two directions in the field of privacy. While in both directions the goal is to privately evaluate some function $f$ on $n$ inputs, the two approaches differ in their privacy goals and requirements. The first direction is secure function evaluation (SFE), where $n$ parties, sharing a common interest in distributively and securely computing $f$ applied to their private inputs, are given a protocol to do so. An SFE protocol for evaluating $f$ is private if no subset of the parties may deduce information other than what can be learned from the result of $f$ and the initial inputs of the members of the subset. The second direction is private data analysis, where the computation of $f$ is considered private if the privacy of each record is preserved individually. One possible way to obtain useful information from a collection of individual records, while protecting individual data, is by adding carefully chosen "noise", i.e., by evaluating a random approximation $\widehat{f}$ of $f$. This choice of $\widehat{f}$ is usually made while abstracting away implementation issues, assuming that the computation is performed by a trusted server holding the data.
A natural paradigm for implementing the computation without a trusted party is to construct an SFE protocol for $\widehat{f}$. This conceptual separation between deciding which $\widehat{f}$ to compute and deciding how to compute it is valuable. However, this approach may result in non-optimal protocols, as SFE protocols need to comply with strong privacy requirements. For instance, publishing outputs of intermediate computations, as long as they do not compromise individual privacy, may be useful. However, publishing these outputs may be impossible under SFE definitions (e.g., if for some party these outputs cannot be simulated from the input of the party and the output of the function).
We initiate an examination of whether there are advantages to choosing $\widehat{f}$ and the protocol for computing it simultaneously. In particular, we investigate the case of sum queries and specify under which accuracy requirements it is beneficial to adopt this paradigm.
Joint work with Amos Beimel and Kobbi Nissim.
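To make the "noise" approach concrete, here is a minimal sketch of a noisy sum query with the standard Laplace-mechanism calibration (an illustration, not the protocol considered in the talk; names and parameters are illustrative):

import numpy as np

def noisy_sum(values, epsilon=0.5, sensitivity=1.0):
    # Exact sum query f; each record is assumed to lie in [0, sensitivity],
    # so changing one record moves the sum by at most `sensitivity`.
    true_sum = float(np.sum(values))
    # Laplace noise with scale sensitivity/epsilon yields a random
    # approximation of f that protects each record individually.
    return true_sum + np.random.laplace(loc=0.0, scale=sensitivity / epsilon)

# Example: 1000 binary records; the noisy answer stays close to the true count.
data = np.random.randint(0, 2, size=1000)
print(noisy_sum(data))

In the distributed setting considered in the talk, the question is how to realize such an approximation by a protocol among the $n$ parties rather than by a trusted server.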
January 29, Tuesday
12:00 – 14:00
What We Shouldn't Ignore When Studying Gene Regulation
Computer Science seminar
Lecturer : Prof. Peter F. Stadler
Affiliation : Department of Computer Science and Interdisciplinary Center of Bioinformatics
Location : 202/37
Host : Dr. Barash
Until recently, the study of gene regulation has focused mostly on two major mechanisms: the regulation of transcription (typically described in terms of the binding of transcription factors to specific DNA sequence motifs) and direct protein-protein interactions. A string of high-throughput experiments, however, has produced overwhelming evidence that several other layers of regulation are of comparable importance and cannot be neglected. Indeed, at least half of the diversity of the observable transcriptome consists of non-coding RNAs, and novel types and classes of ncRNAs keep being discovered. The oldest and best known, but by no means the only one of these RNA-mediated mechanisms, is post-transcriptional regulation through microRNAs. Anti-sense transcription is another of the better-known effects.
In my presentation I will briefly review recent findings, both experimental and computational, suggesting that a substantial fraction of transcripts have regulatory functions. I will then focus on bioinformatics approaches for disentangling, at least partially, the various layers, using two systems as examples: direct RNA-RNA and RNA-protein interactions. Because of their abundance (even though we understand very few specific systems in detail), we cannot ignore these multiple regulatory layers in attempts to reconstruct and eventually understand gene regulation networks.
January 28, Monday
14:00 – 16:00
A Framework for Binary Code Analysis and Static and Dynamic Patching
Computer Science seminar
Lecturer : Prof. Barton Miller
Affiliation : Computer Science Department, University of Wisconsin - Madison
Location : 202/37
Host : Prof. S. Dolev
Tools that analyze and modify binary code are crucial to many areas of computer science, including cyber forensics, program tracing, debugging, testing, performance profiling, performance modeling, and software engineering. Many tools used to support these activities have significant limitations in functionality, efficiency, accuracy, portability and availability.
Our goal is the design and implementation of a new framework that will allow for interoperability of static and dynamic code modification, and enable this complex technology to be shared and leveraged.
Characteristics of this framework include: multi-architecture, multi-format, and multi-operating-system support; library-based design; open source; extensible and exportable data structures; batch operation; testability, with each separate component provided with a detailed test suite; and support for functions with non-contiguous and shared code.
This component-based approach requires identifying key portable and multi-platform abstractions at every level of functionality. Transitioning to this model successfully will enable us to break the "hero model" of tool development, in which each group tries to produce its own complete end-to-end tool set.
January 23, Wednesday
12:00 – 13:00
Polychromatic Colorings of Plane Graphs
Students seminar
Lecturer : Mr. Roi Krakovski
Affiliation : Department of Computer Science, Ben-Gurion University
Location : 202/37
A polychromatic $k$-coloring of a plane graph $G$ is an assignment of $k$ colors to the vertices of $G$ such that each face of $G$, except possibly the outer face, has all $k$ colors on its boundary. Polychromatic colorings of plane graphs are an essential tool for guarding and covering problems: in a polychromatic coloring of a plane graph $G$, every color class is present on every face of $G$, and therefore any color class forms a set of vertex guards for the faces of $G$.
We survey recent results concerning polychromatic colorings, and prove that every 2-connected plane bipartite cubic graph admits a polychromatic $4$-coloring.
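To make the definition concrete, here is a minimal sketch in Python (assuming the inner faces of the plane graph are given as vertex lists; all names are illustrative) that checks whether a given coloring is polychromatic:

def is_polychromatic(faces, coloring, k):
    # Every inner face must see all k colors on its boundary.
    # `faces` is a list of vertex lists (outer face excluded);
    # `coloring` maps each vertex to a color in range(k).
    for face in faces:
        if {coloring[v] for v in face} != set(range(k)):
            return False
    return True

# Example: a plane 4-cycle whose single inner face is (0, 1, 2, 3).
faces = [[0, 1, 2, 3]]
coloring = {0: 0, 1: 1, 2: 0, 3: 1}
print(is_polychromatic(faces, coloring, 2))  # True: both colors on the face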
January 22, Tuesday
12:00 – 14:00
Developing fair negotiation support systems
Computer Science seminar
Lecturer : Prof. John Zeleznikow
Affiliation : School of Information Systems, Victoria University, Melbourne, Australia
Location : 202/37
Host : Prof. Mira Balaban
In previous work on mediation and negotiation support systems, we have focused upon integrative bargaining. When end users of our family-mediation and plea-bargaining systems evaluated them, they commented that the systems met their personal needs but not the interests of fairness or justice. For example, in family mediation, the systems meet the interests of the parents but not those of the children.
For negotiation support systems to be widely used, issues of fairness and justice must be addressed. We are currently developing measures for assessing the outcomes of online negotiation in the legal domains of sentencing, plea-bargaining and family mediation. Such methods will form the basis of a new model for evaluating fairness and consistency within online dispute resolution systems. This model will inform the construction of fairer and more consistent systems of IT-based negotiation support.
We conclude by discussing a new project we are about to commence focusing upon developing negotiation support systems that promote constructive relationships following disputes.
January 16, Wednesday
12:00 – 14:00
Deterministic Extractors for Affine Sources over Large Fields
Bio-Informatics seminar
Lecturer : Mr. Ariel Gabizon
Affiliation : Faculty of Mathematics and Computer Science, Weizmann Institute of Science
Location : 202/37
Host : Student Seminar
An $(n,k)$-affine source over a finite field $F$ is a random variable $X=(X_1,\ldots,X_n) \in F^n$ which is uniformly distributed over an (unknown) $k$-dimensional affine subspace of $F^n$. We show how to (deterministically) extract practically all the randomness from affine sources, for any field of size larger than $n^c$ (where $c$ is a large enough constant).
Our main results are as follows.

For arbitrary $k$: for any $n$, $k$ and any $F$ of size larger than $n^{20}$, we give an explicit construction of a function $D: F^n \rightarrow F^{k-1}$ such that for any $(n,k)$-affine source $X$ over $F$, the distribution of $D(X)$ is $\epsilon$-close to uniform, where $\epsilon$ is polynomially small in $|F|$.

For $k=1$: for any $n$ and any $F$ of size larger than $n^{c}$, we give an explicit construction of a function $D: F^n \rightarrow \{0,1\}^{(1-\delta) \log_2|F|}$ such that for any $(n,1)$-affine source $X$ over $F$, the distribution of $D(X)$ is $\epsilon$-close to uniform, where $\epsilon$ is polynomially small in $|F|$. Here $\delta > 0$ is an arbitrarily small constant and $c$ is a constant depending on $\delta$.
Joint work with Ran Raz.
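For intuition, here is a minimal sketch of sampling such a source over a prime field (illustrative only; the extractor $D$ of the talk would then be applied to the sampled $X$):

import numpy as np

def sample_affine_source(n, k, p, rng):
    # An (n,k)-affine source: uniform over offset + span{b_1,...,b_k} in F^n,
    # with F = GF(p) for a prime p. Over a large field, k random vectors are
    # linearly independent with high probability, so the dimension is exactly k.
    offset = rng.integers(0, p, size=n)
    basis = rng.integers(0, p, size=(k, n))
    coeffs = rng.integers(0, p, size=k)  # uniform F-linear combination
    return (offset + coeffs @ basis) % p

rng = np.random.default_rng(0)
print(sample_affine_source(n=8, k=2, p=101, rng=rng))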
January 15, Tuesday
12:00 – 14:00
Quantitative Models for Chromatin and Transcription Regulation
Computer Science seminar
Lecturer : Dr. Eran Segal
Affiliation : Department of Computer Science and Applied Mathematics, Weizmann Institute of Science
Location : 202/37
Host : Dr. Barash
Precise control of gene activation lies at the heart of nearly all biological processes. However, despite enormous advances in understanding this process from both experimental and theoretical perspectives, we still lack a quantitative description of the underlying transcriptional control mechanisms, and even basic questions, such as how DNA sequence elements 'compute' expression from the inputs they receive, remain open. In this talk, I will present our progress towards the ultimate goal of developing integrated quantitative models for transcription regulation. I will describe a novel thermodynamic model that computes gene activation patterns as a function of DNA sequence, and show that this model accurately predicts gene activation patterns in the fly from DNA sequence alone.
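As a schematic of the kind of computation such a thermodynamic model performs, here is a generic two-state Boltzmann-weight calculation (not the model from the talk; energies and names are illustrative):

import numpy as np

def configuration_probabilities(energies, beta=1.0):
    # Boltzmann weights over configurations of bound transcription factors;
    # expression is then modeled from the probability of the activating ones.
    weights = np.exp(-beta * np.asarray(energies, dtype=float))
    return weights / weights.sum()

# Toy example: one site, activator bound (energy -2.0) vs. unbound (0.0).
p_bound, p_unbound = configuration_probabilities([-2.0, 0.0])
print(p_bound)  # fraction of time the activating configuration occurs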
January 10, Thursday
11:00 – 13:00
Learning From Related Sources
Computer Science seminar
Lecturer : Dr. Koby Crammer
Affiliation : Department of Computer and Information Science, University of Pennsylvania
Location : 202/37
Host : Prof. Ronen Brafman
We often like to build a model for one scenario based on data from similar or nearby cases. For example, consider the problem of building a model that predicts sentiment about books from short reviews, using reviews and sentiment labels for DVDs. Another example is learning the movie preferences of one viewer from ratings provided by other, similar users. There is a natural tradeoff between using data from more users and using data from only similar users.
In this talk, I will discuss the problem of learning good models using data from multiple related or similar sources. I will present a theoretical approach that extends the standard probably approximately correct (PAC) learning framework, and show how it can be applied in order to determine which sources of data should be used and how. The bounds explicitly model the inherent tradeoff between building a model from many but inaccurate data sources and building it from a few accurate ones. The theory shows that optimal combinations of sources can improve performance bounds on some tasks.
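Schematically, bounds of this kind often take a form like the following (an illustrative shape, not the talk's exact statement): for a pool $S$ of sources with $n_i$ examples from source $i$ and hypothesis class $H$,
$$\mathrm{err}(h) \lesssim \max_{i \in S} \mathrm{disparity}(i, \mathrm{target}) + \sqrt{\frac{\mathrm{complexity}(H)}{\sum_{i \in S} n_i}},$$
so enlarging $S$ shrinks the estimation (variance) term while potentially inflating the disparity (bias) term, and optimizing over $S$ captures the tradeoff.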
January 9, Wednesday
12:00 – 14:00
Cryptanalysis of the Windows Random Number Generator
Bio-Informatics seminar
Lecturer : Mr. Leo Dorrendorf
Affiliation : Hebrew University, Dept. of Computer Sciences
Location : 202/37
Host : Student Seminar
Random numbers are essential in every cryptographic protocol, so the quality of a system's random number generator (RNG) is vital to its security. In Microsoft Windows, the operating system provides an RNG for security purposes through an API function named CryptGenRandom. This is the cryptographic RNG used by the operating system itself and by important applications such as Internet Explorer, and it is the only RNG recommended for security purposes on Windows. This is the most common security RNG in the world, yet its exact algorithm was never published until now. We provide a description of the Windows RNG based on an examination of the binary code of Windows 2000. We reconstructed the RNG's algorithm, and present its exact description and an analysis of the design. Our analysis reveals a number of weaknesses in the design and implementation, and we demonstrate practical attacks on the RNG. We offer recommendations for users and implementers of RNGs on the Windows platform. In addition, we describe the reverse-engineering process that led to our findings.
January 8, Tuesday
16:00 – 18:00
Live Distributed Objects
Computer Science seminar
Lecturer : Prof. Ken Birman
Affiliation : Department of Computer Science, Cornell University
Location : 202/37
Host : Prof. Shlomi Dolev
Although we've been building distributed systems for decades, it remains remarkably difficult to get them right. Distributed software is hard to design, and the tools available to developers have lagged far behind the options for building and debugging non-distributed programs targeting desktop environments. At Cornell, we're trying to change this dynamic. The first part of this talk will describe "Live Distributed Objects", a new and remarkably easy way to create distributed applications, with little or no programming required (in fact, the talk will include a demo of how this works). Supporting these kinds of objects forced us to confront a number of scalability, security and performance questions not addressed by prior research on distributed computing platforms. The second part will look at Cornell's Quicksilver system and the approach it uses to solve these problems.
This research is joint with PhD candidate Krzys Ostrowski (the "real" leader on the effort) and with Danny Dolev.
12:00 – 14:00
Making large problems seem small: Convex duality in learning and inference
Computer Science seminar
Lecturer : Dr. Amir Globerson
Affiliation : Computer Science and Artificial Intelligence Laboratory, MIT
Location : 202/37
Host : Prof. Ronen Brafman
Machine learning algorithms are often used to extract rules and structure from large datasets, and have been successfully used in fields such as machine vision, natural language processing and computational biology. With the growing availability of text and images on the web, and high-throughput experiments in biology, learning algorithms are becoming a key tool for organizing, searching, and interpreting this data.
Learning algorithms are typically required to work with very large datasets of high dimensional signals. Thus scalability of algorithms is a key issue that must be addressed. In this talk I will show how the concept of convex duality can be used to design simple and scalable algorithms, and to help understand convergence of previously existing ones.
I will first address the problem of probabilistic inference in multivariate distributions, and show how convex duality results in simple convergent message passing algorithms for this problem. Specifically, I will show that approximate inference may be cast as a geometric program, where coordinate descent yields message passing updates. These results resolve a long-standing problem regarding the convergence of message passing algorithms for approximate inference.
I will next address the problem of learning classifiers from labeled data, and present an exponentiated gradient algorithm that is also derived using convex duality. The algorithm can be shown to converge, and its convergence rate can be analyzed. I will conclude by showing an application of the algorithm to a large-scale natural language parsing task, and demonstrate that it converges significantly faster than previous state-of-the-art algorithms.
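For reference, here is a minimal sketch of the generic exponentiated-gradient update (the general technique named above, not the paper's derivation; the learning rate and loss are illustrative):

import numpy as np

def exponentiated_gradient_step(w, grad, eta=0.1):
    # Multiplicative update w_i <- w_i * exp(-eta * grad_i), renormalized
    # so that w stays on the probability simplex.
    w_new = w * np.exp(-eta * grad)
    return w_new / w_new.sum()

# Toy usage: minimize the linear loss <c, w> over the simplex.
c = np.array([0.3, 0.1, 0.6])
w = np.ones(3) / 3
for _ in range(100):
    w = exponentiated_gradient_step(w, c)
print(w)  # mass concentrates on the coordinate with the smallest cost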
January 7, Monday
14:00 – 16:00
Worst-case complexity vs. average-case complexity
Computer Science seminar
Lecturer : Dr. Dan Gutfreund
Affiliation : Departments of Mathematics and Computer Science, MIT
Location : 202/37
Host : Dr. Amos Beimel
Worst-case complexity, which measures the performance of algorithms with respect to the most difficult instances of the problem at hand, is a standard and convenient complexity measure. On the other hand, average-case complexity, which measures the performance of algorithms with respect to typical (or simply random) instances, may be a more realistic complexity measure for instances that actually appear in practice. Furthermore, it seems to be the right measure in settings where computational hardness is beneficial (e.g. in cryptography and pseudo-randomness).
A central question in theoretical computer science is for which classes of computational problems (and against which families of algorithms) there is an equivalence between their worst-case and average-case complexities.
This question and the techniques to attack it are related to many other areas in computer science such as cryptography, pseudo-randomness, error-correcting codes, program checking and learning.
In this survey talk I will describe some recent results on worst-case/average-case equivalence. I will concentrate on two important classes of computational problems: NP and EXPTIME.
January 1, Tuesday
12:00 – 14:00
Computational methods for RNA secondary structure determination
Computer Science seminar
Lecturer : Prof. Michael Zuker
Affiliation : Department of Mathematical Sciences, Rensselaer Polytechnic Institute
Location : 202/37
Host : Dr. Danny Barash
RNA secondary structure is defined, along with three different ways to display it. Two distinct approaches are presented for determining secondary structure from sequence data. The comparative method requires a multiple sequence alignment of a collection of homologous RNA sequences; it uses phylogeny to determine common, conserved base pairs that are more likely to be the result of evolution than to exist by chance. On the other hand, recursive algorithms may be used on single sequences to compute minimum free energy structures, partition functions and other biophysical quantities. These algorithms ignore evolution and use empirically derived energy parameters based on physical chemistry. Examples will be given for both methods.
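As a toy instance of the single-sequence recursive approach, here is the classic Nussinov-style recursion, which maximizes the number of base pairs rather than minimizing free energy (a well-known simplification of the algorithms discussed, not Zuker's method itself):

def max_base_pairs(seq, min_loop=3):
    # N[i][j] = maximum number of nested complementary base pairs in seq[i..j].
    # Real MFE folding replaces the pair count with empirical energy terms.
    pairs = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    N = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = N[i][j - 1]  # case: j is unpaired
            for k in range(i, j - min_loop):  # case: j pairs with some k
                if (seq[k], seq[j]) in pairs:
                    left = N[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + N[k + 1][j - 1])
            N[i][j] = best
    return N[0][n - 1]

print(max_base_pairs("GGGAAAUCCC"))  # 3 pairs for this toy hairpin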
The latter part of the lecture will include recent work on computing the entropy of the Boltzmann distribution of foldings and how this quantity may be used to judge the overall reliability of free-energy-based methods for particular molecules.