Time | Speaker | Talk
---|---|---
09:30 | Gathering and Coffee |
10:00 | Welcome Remarks |
10:15 | Yaacov Choueka |
10:25 | Ian Munro | Optimal search trees with 2-way comparisons
This talk is about a polynomial-time algorithm that you probably thought had been known for almost half a century, but wasn't. The polynomial-time algorithm is still rather slow and requires a lot of space, so we also look at extremely good and fast approximate solutions. More specifically: in 1971, Knuth gave an O(n²)-time algorithm for the now-classic problem of finding an optimal binary search tree. Knuth's algorithm works only for search trees based on 3-way comparisons, but most modern programming languages and computers support only 2-way comparisons (<, =, and >). Until this work, the problem of finding an optimal search tree using 2-way comparisons remained open; polynomial-time algorithms were known only for restricted variants. We solve the general case, giving …
This is joint work with Marek Chrobak, Mordecai Golin, and Neal E. Young.
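For context, the 3-way-comparison setting that Knuth solved can be sketched with the textbook O(n³) dynamic program over subranges of keys (Knuth's monotonicity speedup brings this down to O(n²), and the 2-way case discussed in the talk is harder still). The probabilities below are illustrative; this is a minimal sketch, not the algorithm from the talk.

```python
def optimal_bst_cost(p):
    """Textbook O(n^3) DP for the expected cost of an optimal binary
    search tree over keys with access probabilities p[0..n-1],
    assuming 3-way comparisons and successful searches only."""
    n = len(p)
    # w[i][j]: total probability of keys i..j-1
    # cost[i][j]: optimal expected search cost for keys i..j-1
    cost = [[0.0] * (n + 1) for _ in range(n + 1)]
    w = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i in range(n):
        w[i][i + 1] = p[i]
        cost[i][i + 1] = p[i]
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length
            w[i][j] = w[i][j - 1] + p[j - 1]
            # Try every key r in i..j-1 as the root of this subtree.
            best = min(cost[i][r] + cost[r + 1][j] for r in range(i, j))
            cost[i][j] = best + w[i][j]
    return cost[0][n]
```

For four equally likely keys the optimum is a balanced tree with depths 1, 2, 2, 3, giving expected cost 2.0.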
10:50 | Rao Kosaraju | Real-Time Pattern Matching
We develop a real-time pattern-matching algorithm for the setting in which the input is the text followed by the pattern. We first give a new linear-time algorithm for constructing the suffix tree of the text, then de-amortize it into a near real-time algorithm, and finally answer pattern-match queries in real time. This approach expands on the outline of an algorithm we presented two decades ago at an ACM Symposium on Theory of Computing.
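To illustrate why a suffix structure answers pattern queries so quickly, here is a naive O(n²)-space suffix trie; it is only a stand-in for the linear-time, de-amortized suffix tree construction in the talk, but it shows how a query reduces to a single root-to-node walk of length |pattern|.

```python
class SuffixTrie:
    """Naive suffix trie: one root-to-leaf path per suffix of the text.
    Construction is O(n^2) time and space; queries are O(|pattern|)."""

    def __init__(self, text):
        self.root = {}
        for i in range(len(text)):
            # Insert the suffix text[i:] character by character.
            node = self.root
            for c in text[i:]:
                node = node.setdefault(c, {})

    def contains(self, pattern):
        """Return True iff pattern occurs as a substring of the text."""
        node = self.root
        for c in pattern:
            if c not in node:
                return False
            node = node[c]
        return True
```

Every substring of the text is a prefix of some suffix, so membership is just a walk from the root.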
11:15 | Coffee Break |
11:45 | Gary Benson | Bit-Parallel Alignment with Substitution and Affine Gap Scoring
Large sequence data sets generated by high-throughput biology experiments have motivated the development of more efficient sequence comparison algorithms. Bit-parallel alignment offers one approach to dealing with these large data sets. Bit-parallel alignment algorithms use bits in computer words to represent alignment scores within a single row of the alignment scoring matrix, and use logic operations and addition to compute the scores in the next row. For global alignment with substitution scoring we present a new approach and add an extension for affine gap scoring. Substitution scoring assigns a potentially different substitution weight to every pair of alphabet characters. Commonly used substitution scoring schemes include the BLOSUM tables for aligning protein sequences and transition-transversion tables for aligning DNA sequences. Affine gap scoring includes separate weights for initiating a gap and for extending a gap by one character. Our approach is based on partial sums of adjacent score differences. It runs in time $O(mn\log W/W + n|\Sigma|)$ for both the simple and affine gap implementations, where $W$ is the number of values stored in a computer word (in our implementation, $w/8$ where $w$ is the length of a computer word) and $\Sigma$ is the alphabet.
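The partial-sums technique with substitution and affine gap scoring is beyond a short sketch, but the underlying bit-parallel idea can be illustrated with Myers' classic bit-vector algorithm for unit-cost edit distance, a simpler scoring scheme than the one in the talk. Python's arbitrary-precision integers stand in for machine words here.

```python
def edit_distance(p, t):
    """Myers' bit-parallel Levenshtein distance between p and t.
    One column of the DP matrix is encoded as two bit vectors of
    vertical +1/-1 deltas, updated with O(1) word operations per
    text character (O(|t|) words when |p| fits in a machine word)."""
    m = len(p)
    if m == 0:
        return len(t)
    all_ones = (1 << m) - 1
    high = 1 << (m - 1)
    # peq[c]: bitmask of pattern positions holding character c.
    peq = {}
    for i, c in enumerate(p):
        peq[c] = peq.get(c, 0) | (1 << i)
    pv, mv, score = all_ones, 0, m   # column deltas; score = D[m][j]
    for c in t:
        eq = peq.get(c, 0)
        xv = eq | mv
        xh = (((eq & pv) + pv) ^ pv) | eq
        ph = mv | (~(xh | pv) & all_ones)   # horizontal +1 deltas
        mh = pv & xh                        # horizontal -1 deltas
        if ph & high:
            score += 1
        if mh & high:
            score -= 1
        ph = ((ph << 1) | 1) & all_ones     # boundary delta is +1
        pv = ((mh << 1) & all_ones) | (~(xv | ph) & all_ones)
        mv = ph & xv
    return score
```

The same word-level parallelism is what the talk's partial-sums representation generalizes to weighted substitution and affine gap costs.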
12:10 | Mike Goodrich | Learning Character Strings via Mastermind Queries, with Case Studies
We study the degree to which a character string, Q, leaks details about itself any time it engages in comparison protocols with strings provided by a querier, Bob, even if those protocols are cryptographically guaranteed to produce no information other than the scores assessing how well Q matches the strings Bob offers. We show that such scenarios allow Bob to play variants of the game of Mastermind with Q so as to learn the complete identity of Q. We show that there are a number of efficient implementations of these Mastermind attacks for Bob to employ, depending on his knowledge of the structure of Q, and we analyze how quickly each determines Q. Indeed, we show that Bob can discover Q using a number of rounds of test comparisons that is much smaller than the length of Q, under reasonable assumptions about the scores returned by the cryptographic protocols. We also provide the results of case studies performed on some example string databases.
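As a toy illustration of such an attack (far weaker than the sublinear-round strategies in the talk), a querier who only ever sees match counts can recover a secret string of length n over alphabet Σ with n(|Σ|−1)+1 queries by probing one position at a time. The names `score` and `learn_string` are assumptions for this sketch, not from the talk.

```python
def learn_string(score, n, alphabet):
    """Recover a secret string Q of length n, given only an oracle
    score(q) = number of positions where q and Q agree."""
    base = alphabet[0] * n
    base_score = score(base)
    result = []
    for i in range(n):
        found = alphabet[0]
        for c in alphabet[1:]:
            # Flip position i; the score rises iff c is Q[i].
            if score(base[:i] + c + base[i + 1:]) > base_score:
                found = c
                break
        result.append(found)
    return "".join(result)

# Usage: an honest-but-curious querier against a hidden string.
secret = "abcab"
oracle = lambda q: sum(x == y for x, y in zip(q, secret))
```

If no substitution raises the score, position i must already hold `alphabet[0]`, which is why one baseline query suffices.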
12:35 | Gonzalo Navarro | Range Majorities in Arrays
Karpinski and Nekrich (2008) introduced the problem of parameterized range majority, which asks to preprocess a string of length n such that, given the endpoints of a range, one can quickly find all the distinct elements whose relative frequencies in that range exceed a threshold τ. Subsequent authors have reduced the time and space bounds such that, when τ is fixed at preprocessing time, we need either O(n lg(1/τ)) space and optimal O(1/τ) query time, or linear space and O((1/τ) lg lg σ) query time, where σ is the alphabet size. In this talk I'll give the first linear-space solution with optimal O(1/τ) query time, even with variable τ (i.e., specified with the query). (Joint work with D. Belazzougui, T. Gagie, Y. Nekrich, and I. Munro.)
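The query semantics can be pinned down with a naive per-query scan, which takes O(j − i) time per query rather than the O(1/τ) of the linear-space structure in the talk; this is only a specification sketch.

```python
from collections import Counter

def range_majorities(a, i, j, tau):
    """Return, sorted, every element whose relative frequency in
    a[i..j] (inclusive) strictly exceeds tau. Naive linear scan;
    the talk's data structure answers this in optimal O(1/tau)."""
    window = a[i:j + 1]
    threshold = tau * len(window)
    return sorted(x for x, f in Counter(window).items() if f > threshold)
```

Note that at most ⌊1/τ⌋ elements can qualify, which is why O(1/τ) query time is optimal.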
13:00 | Lunch |
15:00 | Gadi Landau | My Time with Ami
15:10 | Edward Kaplan | Life with Dr. Goldman
15:35 | Dov Gabbay | Talmudic logic project
The Talmud is the most comprehensive and fundamental work of Jewish religious law, employing a large number of logical components centuries ahead of their time. In many cases the basic principles are not explicitly formulated, which makes them difficult to formalize and to make available to the modern student of logic. This book series on Talmudic logic aims to present a logical analysis of Talmudic reasoning using modern logical tools. The series begins with a systematic analysis of Talmudic inference rules. The first book shows that we can present Talmudic reasoning intuitions as a systematic logical system basic to modern non-deductive reasoning, such as argumentum a fortiori, abduction, and analogy. Forthcoming books will deal with additional topics such as deontic logic, temporal logic, and agency and processes in the Talmud. So far we have published 14 books; the current book deals with changes through time. The aims of the series are twofold: …
16:00 | Coffee Break |
16:35 | Richard Cole | Fast sparse convolution
The task is to efficiently compute the convolution w of two vectors with positive integer entries in the case that the output is sparse. We give a Las Vegas algorithm with expected running time O(||w|| log^2 ||w||), where ||w|| denotes the number of non-zero entries in w. This is joint work with Ramesh Hariharan. The result was stated in a STOC'02 paper ("Verifying candidate matches in sparse and wildcard matching"), but to date the algorithm has been published only on Cole's website (see the last page of the extended STOC publication).
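To fix the problem being solved, here is the obvious dictionary-based computation, which pairs every non-zero of one input with every non-zero of the other; it runs in O(||u||·||v||) time, and the point of the talk is a Las Vegas algorithm whose expected cost depends only on the sparsity of the output w.

```python
from collections import defaultdict

def sparse_convolve(u, v):
    """Convolution w[k] = sum over i+j=k of u[i]*v[j], with sparse
    vectors given as {index: positive integer} dicts. Naive pairing,
    not the O(||w|| log^2 ||w||) algorithm from the talk."""
    w = defaultdict(int)
    for i, x in u.items():
        for j, y in v.items():
            w[i + j] += x * y
    return dict(w)
```

With positive entries there is no cancellation, so ||w|| can still be as large as ||u||·||v||, but it is often far smaller, which is what the sparse algorithm exploits.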
17:00 | Esko Ukkonen | Identifiability of a String from Its Substrings
The genome assembly problem asks one to reconstruct a DNA sequence from short fragments (called reads) that a sequencing instrument samples from the original DNA sequence. The talk focuses on the exact version of the problem in which the reads are noiseless. Given a set F of reads sampled from an unknown target string T, the problem is to reconstruct T from F. If T has repeating substrings that are longer than the reads, as is often the case for DNA sequences, then the reconstruction becomes ambiguous because several different reconstructions may be consistent with the reads. We demonstrate that F determines a unique reconstruction if F is the full k-mer spectrum of T and T has no 3-repeats of length k-1 and no interleaved pairs of 2-repeats of length k-1. A finite-state automaton-like representation of the pairwise overlaps of the reads is introduced such that the unique identifiability of T reduces to the uniqueness of the Eulerian path in this representation.
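The reduction can be illustrated with a small de Bruijn graph sketch: each (k−1)-mer becomes a node, each read an edge, and T is spelled by an Eulerian path (found here with Hierholzer's algorithm). This sketch assumes the spectrum is given with multiplicities and does form a single Eulerian path; it does not check the repeat conditions from the talk that make the path unique.

```python
from collections import defaultdict

def reconstruct(kmers):
    """Rebuild a string from its full k-mer spectrum (a list with
    multiplicities) via an Eulerian path in the de Bruijn graph."""
    graph = defaultdict(list)
    indeg, outdeg = defaultdict(int), defaultdict(int)
    for kmer in kmers:
        u, v = kmer[:-1], kmer[1:]          # overlap of length k-1
        graph[u].append(v)
        outdeg[u] += 1
        indeg[v] += 1
    # The path starts at the node with one more out-edge than in-edges.
    start = next((n for n in set(outdeg) | set(indeg)
                  if outdeg[n] - indeg[n] == 1), None)
    if start is None:                        # circular case: start anywhere
        start = next(iter(graph))
    # Hierholzer's algorithm: walk edges, backtracking when stuck.
    stack, path = [start], []
    while stack:
        v = stack[-1]
        if graph[v]:
            stack.append(graph[v].pop())
        else:
            path.append(stack.pop())
    path.reverse()
    return path[0] + "".join(node[-1] for node in path[1:])
```

For "banana" with k = 3 the repeats "an"/"na" are nested rather than interleaved, so the Eulerian path, and hence the reconstruction, is unique.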
17:25 | Ely Porat | A Glimpse into Ami's Research
17:50 | End of the lectures |
18:30 | Reception | On the lawn between the Jakobovits building (1002) and the Language Studies building (1004) (map).
19:15 | Dinner | In the Language Studies building (1004) foyer.