Time  Speaker  Talk  

09:30  Gathering and Coffee  
10:00  Welcome Remarks  
10:15  Yaacov Choueka  
10:25  Ian Munro  Optimal search trees with 2way comparisons  
This talk is about finding a polynomial time algorithm that you probably thought was known almost a half century ago, but it wasn’t. The polynomial time algorithm is still rather slow and requires a lot of space to solve, so we also look at extremely good and fast approximate solutions. More specifically … In 1971, Knuth gave an O(n^{2})time algorithm for the now classic problem of finding an optimal binary search tree. Knuth’s algorithm works only for search trees based on 3way comparisons, but most modern programming languages and computers support only 2way comparisons (<, = and >). Until this work, the problem of finding an optimal search tree using 2way comparisons remained open — polynomial time algorithms were known only for restricted variants. We solve the general case, giving
This is joint work with Marek Chrobak, Mordecai Golin, and Neal E. Young 

10:50  Rao Kosaraju  RealTime Pattern Matching  
RealTime pattern matching algorithm is developed when the input is the text followed by the pattern. We first develop a new lineartime algorithm for constructing the suffix tree for the text. We deamortize it into a near realtime algorithm. We then answer in realtime the pattern match query. This approach is an expansion of the outline of an algorithm we presented two decades ago at an ACM Symp. on Theory of Computing.  
11:15  Coffee Break  
11:45  Gary Benson  BitParallel Alignment with Substitution and Affine Gap Scoring  
Large sequence data sets generated by highthroughput biology experiments have motivated development of more efficient sequence comparison algorithms. Bitparallel alignment offers one approach to dealing with these large data sets. Bitparallel alignment algorithms use bits in computer words to represent alignment scores within a single row of the alignment scoring matrix and use logic operations and addition to compute the scores in the next row. For global alignment with substitution scoring we present a new approach and add an extension for affine gap scoring. Substitution scoring assigns a potentially different substitution weight to every pair of alphabet characters. Commonly used substitution scoring schemes include the BLOSUM table for aligning protein sequences, and transitiontransversion tables for aligning DNA sequences. Affine gap scoring includes separate weights for initiating a gap and for extending a gap by one character. Our approach is based on partial sums of adjacent score differences. It runs in time $O(mn\log W/W + n\Sigma)$ for both the simple and affine gap implementations, where $W$ is the number of values that are stored in a computer word (in our implementation, $w/8$ where $w$ is the length of a computer word) and $\Sigma$ is the size of the alphabet.  
12:10  Mike Goodrich  Learning Character Strings via Mastermind Queries, with Case Studies  
We study the degree to which a character string, Q, leaks details about itself any time it engages in comparison protocols with a strings provided by a querier, Bob, even if those protocols are cryptographically guaranteed to produce no additional information other than the scores that assess the degree to which Q matches strings offered by Bob. We show that such scenarios allow Bob to play variants of the game of Mastermind with Q so as to learn the complete identity of Q. We show that there are a number of efficient implementations for Bob to employ in these Mastermind attacks, depending on knowledge he has about the structure of Q, which show how quickly he can determine Q. Indeed, we show that Bob can discover Q using a number of rounds of test comparisons that is much smaller than the length of Q, under reasonable assumptions regarding the types of scores that are returned by the cryptographic protocols. We also provide the results of case studies we performed on some example string databases.  
12:35  Gonzalo Navarro  Range Majorities in Arrays  
Karpinski and Nekrich (2008) introduced the problem of parameterized range majority, which asks to preprocess
a string of length n such that, given the endpoints of a range, one can quickly find all the distinct elements whose
relative frequencies in that range are more than a threshold τ . Subsequent authors have reduced their time and
space bounds such that, when τ is fixed at preprocessing time, we need either O(n lg(1/τ )) space and optimal
O(1/τ ) query time or linear space and O((1/τ ) lg lg σ) query time, where σ is the alphabet size. In this talk I'll give
the first linearspace solution with optimal O(1/τ ) query time, even with variable τ (i.e., specified with the query).
(joint work with D. Belazzougui, T. Gagie, Y. Nekrich, and I. Munro) 

13:00  Lunch  
15:00  Gadi Landau  My Time with Ami  
15:10  Edward Kaplan  Life with Dr. Goldman  
15:35  Dov Gabbay  Talmudic logic project  
The Talmud is the most comprehensive and fundamental work of Jewish religious law, employing a large number of logical components centuries ahead of their time. In many cases the basic principles are not explicitly formulated, which makes it difficult to formalize and make available to the modern student of Logic. This book series on Talmudic Logic,aims to present logical analysis of Talmudic reasoning using modern logical tools. The series begins with the systematic analysis of Talmudic inference rules. The first book shows that we can present Talmudic reasoning intuitions as a systematic logical system basic to modern nondeductive reasoning , such as Argumentum A Fortiori, Abduction and Analogy. Forthcoming books will deal with additional topics like Deontic logic , and Temporal logic , Agency and processes in the Talmud and more. So far we published 14 books. The current book deals with changes through time. The aims of the series are twofold:


16:00  Coffee Break  
16:35  Richard Cole  Fast sparse convolution  
The task to to efficiently compute the convolution w of two vectors with positive integer entries in the case that the output is sparse. We give a Las Vegas algorithm with expected run time O(w log^2 w), where w denotes the number of nonzero entries in w. This is joint work with Ramesh Hariharan. The result was stated in a STOC'02 paper (Verifying candiate matches in sparse and wildcard matching), but to date the algorithm has been published only on Cole's website (see the last page of the extended STOC publication). 

17:00  Esko Ukkonen  Identifiability of a String from Its Substrings  
The genome assembly problem asks one to reconstruct a DNA sequence from short fragments (called reads) that a sequencing instrument samples from the original DNA sequence. The talk focuses on the exact version of the problem in which the reads are noiseless. Given a set F of reads sampled from an unknown target string T, the problem is to reconstruct T from F. If T has repeating substrings that are longer than the reads, as is often the case for DNA sequences, then the reconstruction becomes ambiguous because several different reconstructions may be consistent with the reads. We demonstrate that F determines a unique reconstruction, if F is the full kmer spectrum of T and T has no 3repeats of length k1 and no interleaved pairs of 2repeats of length k1. A finitestate automatonlike representation of the pairwise overlaps of the reads is introduced such that the unique identifiability of T reduces to the uniqueness of the Eulerian path in this representation.  
17:25  Ely Porat  A Glimpse into Ami's Research  
17:50  End of the lectures  
18:30  Reception On the lawn between the Jakobovits building (1002) and the Language Studies building (1004) (map). 

19:15  Dinner In the Language Studies building (1004) foyer. 