Abstracts:


Dan Alistarh

Computing with Soup: A Frugal Introduction to Population Protocols

Population protocols are a model of distributed computing in which n agents with limited local state interact randomly, cooperating to collectively compute global predicates. An extensive series of papers, across different communities, has examined the computability and complexity characteristics of this model, while applied work has shown that these algorithms can actually be implemented using synthetic DNA molecules. This talk will serve as a brief survey of recent results in this area, focusing on the use of probabilistic and combinatorial techniques, in the context of a fundamental task called majority. Majority (consensus) requires agents to collectively reach a decision as to which one of two initial states A or B had a higher initial count. Two complexity metrics are important: the time that a protocol requires to stabilize to an output decision, and the size of the state space that each agent requires. We will discuss recent results which almost tightly characterize the space-time trade-offs when solving majority in population protocols, with an eye towards techniques and open problems.
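For concreteness, here is a minimal simulation sketch of the classic four-state exact-majority protocol (strong states A, B; weak states a, b), not of the near-optimal protocols discussed in the talk; it assumes the two initial counts differ.

    import random

    def four_state_majority(num_a, num_b, rng=random.Random(0)):
        # Agents hold one of four states: 'A', 'B' (strong) or 'a', 'b' (weak).
        # The difference #A - #B among strong states is invariant, so if
        # num_a != num_b the population stabilizes on the initial majority.
        agents = ['A'] * num_a + ['B'] * num_b
        n = len(agents)
        interactions = 0
        while len({s.upper() for s in agents}) > 1:      # opinions still mixed
            i, j = rng.sample(range(n), 2)               # uniformly random pair
            u, v = agents[i], agents[j]
            if {u, v} == {'A', 'B'}:                     # strong opposites cancel
                agents[i], agents[j] = 'a', 'b'
            elif u == 'A' and v == 'b':                  # strong A flips weak b
                agents[j] = 'a'
            elif u == 'B' and v == 'a':                  # strong B flips weak a
                agents[j] = 'b'
            elif v == 'A' and u == 'b':
                agents[i] = 'a'
            elif v == 'B' and u == 'a':
                agents[i] = 'b'
            interactions += 1
        return agents[0].upper(), interactions / n       # winner, parallel time

    print(four_state_majority(600, 400))                 # ('A', <parallel time>)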


Lukas Barth

A Stroll Through the Woods: Engineering Balancing Binary Search Trees

Balanced binary search trees are among the most widely used data structures. Aside from maintaining ordered sets, they also serve as the basis for many other data structures, such as dynamic segment trees (DSTs). DSTs, introduced by van Kreveld and Overmars [1], can play an important role, e.g., in scheduling algorithms, as I will briefly outline. Motivated by this, I will present a comprehensive experimental evaluation of various binary search trees and their usefulness as a foundation for DSTs. For classic red-black trees, I will present a way of speeding up operations by reducing the number of branch mispredictions. For weight-balanced trees, I took a closer look at a top-down variant originally introduced by Lai and Wood [2], not only demonstrating that in practice the advantages of a top-down approach are significant, but also gaining the insight that abandoning theoretical guarantees might be beneficial in practice. The third tree I looked at is the novel Zip Tree by Tarjan et al. [3], which proves to be performant, especially as the basis for small DSTs.
[1] van Kreveld, Marc J. and Mark H. Overmars. “Union-Copy Structures and Dynamic Segment Trees.” J. ACM 40 (1993): 635-652.
[2] Tony W. Lai and Derick Wood. "A top-down updating algorithm for weight-balanced trees." Int. J. Found. Comput. Sci., 4(4) (1993):309–324.
[3] Tarjan, Robert E., Caleb C. Levy, and Stephen Timmel. "Zip Trees." arXiv preprint arXiv:1806.06726 (2018).
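As a flavor of the zip trees mentioned above, here is a minimal recursive insertion sketch (a treap-style reformulation of the iterative unzip from [3]); the Node class and geometric ranks are assumptions of this sketch, not the engineered implementations evaluated in the talk.

    import random

    class Node:
        def __init__(self, key):
            self.key = key
            self.rank = self._geometric_rank()   # rank ~ Geometric(1/2)
            self.left = self.right = None

        @staticmethod
        def _geometric_rank():
            r = 0
            while random.random() < 0.5:
                r += 1
            return r

    def unzip(t, key):
        # Split subtree t into two trees: keys < key and keys > key.
        if t is None:
            return None, None
        if t.key < key:
            lo, hi = unzip(t.right, key)
            t.right = lo
            return t, hi
        lo, hi = unzip(t.left, key)
        t.left = hi
        return lo, t

    def insert(root, node):
        if root is None:
            return node
        # node goes above nodes of smaller rank (rank ties broken by key)
        if node.rank > root.rank or (node.rank == root.rank and node.key < root.key):
            node.left, node.right = unzip(root, node.key)
            return node
        if node.key < root.key:
            root.left = insert(root.left, node)
        else:
            root.right = insert(root.right, node)
        return root

    # usage: root = None; for k in keys: root = insert(root, Node(k))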


Michael A. Bender

Bloom Filters, Adaptivity, and the Dictionary Problem

A Bloom filter (or alternative) maintains a compact, probabilistic representation of a set S of keys from a universe U. The price of being small is that there is a (bounded) probability of false positives. This talk presents alternatives to Bloom filters that are faster, more space efficient, and support a wider range of operations. We show how these filters can adapt based on the results of past queries.
Joint work with Martin Farach-Colton, Mayank Goswami, Rob Johnson, Sam McCauley, and Shikha Singh.
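For reference, a minimal textbook Bloom filter sketch (m bits, k hash functions); the adaptive, faster, and more space-efficient filters from the talk are not shown here.

    import hashlib

    class BloomFilter:
        # Compact probabilistic set: no false negatives, and a false-positive
        # probability of roughly (1 - e^{-kn/m})^k after n insertions.
        def __init__(self, m_bits, k_hashes):
            self.m, self.k = m_bits, k_hashes
            self.bits = bytearray((m_bits + 7) // 8)

        def _positions(self, key):
            for i in range(self.k):
                digest = hashlib.blake2b(f"{i}:{key}".encode()).digest()
                yield int.from_bytes(digest[:8], "little") % self.m

        def add(self, key):
            for p in self._positions(key):
                self.bits[p // 8] |= 1 << (p % 8)

        def might_contain(self, key):
            return all(self.bits[p // 8] & (1 << (p % 8))
                       for p in self._positions(key))

    bf = BloomFilter(m_bits=8192, k_hashes=4)
    bf.add("algorithm")
    print(bf.might_contain("algorithm"), bf.might_contain("workshop"))  # True, (almost certainly) False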


Allan Borodin

Online Algorithms in Rounds

We introduce a variant of online algorithms where online input items are first partitioned into k priority classes, say {C1, . . . , Ck}, for some k ≥ 1. Input items in C1 arrive in an adversarially determined order, then items in C2 arrive in adversarial order, and so on. (This model was called prioritization algorithms in our AAMAS 2019 paper.) Clearly, when k = 1 this is the standard online model. Our goal is to understand the limitations of this model when k is small compared to n, the number of input items. Similar to online algorithms with advice, the determination of these classes might be a function of the entire input instance. Hence, when k = n, prioritization algorithms can create an optimal ordering and, for most optimization problems, can then achieve an optimal solution. We illustrate the application of the prioritization model to online maximum bipartite matching algorithms, which can be viewed as mechanisms without money. We derive a number of results for the cases where the graph is known and where it is unknown to the mechanism.
This talk is based on the AAMAS 2019 paper “Efficient Allocation of Free Stuff”, co-authored with Yossi Azar, Michal Feldman, Amos Fiat and Kineret Segal.
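To make the model concrete, here is a small sketch of greedy online bipartite matching in the prioritization model; the class partition and neighbor lists are placeholders of this sketch, and this is not the mechanism analyzed in the paper.

    def prioritized_greedy_matching(classes, neighbors):
        # classes: [C1, C2, ..., Ck], processed in this order; within each
        # class the items arrive in the given (adversarial) order.
        # neighbors[item]: iterable of right-hand-side vertices of item.
        # With k = 1 this is ordinary greedy online matching.
        matched_right = set()
        matching = {}
        for cls in classes:
            for item in cls:
                for r in neighbors[item]:
                    if r not in matched_right:       # match to any free neighbor
                        matching[item] = r
                        matched_right.add(r)
                        break
        return matching

    # usage on a hypothetical instance with two priority classes:
    print(prioritized_greedy_matching([["x"], ["y", "z"]],
                                      {"x": [1, 2], "y": [1], "z": [2]}))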


Keren Censor-Hillel

Dynamic distributed computations

I will present new results on distributed computing in dynamic environments. The talk will focus on clique detection and a family of problems that we characterize combinatorially, which allows fast fixing even when the system is highly dynamic.
Based on joint papers with Matthias Bonne, Neta Dafni, Victor Kolobov, Ami Paz, and Gregory Schwartzman.


Artur Czumaj

Sublinear-time approximation of the cost of a metric k-nearest neighbor graph

Consider an n-point metric space (X,d) with distance oracle access. A k-nearest neighbor (kNN) graph for (X,d) is a directed graph G that has, for every point v in X, an edge from v to each of v's k nearest neighbors; let cost(G) denote the sum of edge weights of G. In this talk we will show how to approximate cost(G) in sublinear time.
Joint work with Christian Sohler.
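To fix notation, the quantity being approximated is the following (here computed exactly, with a quadratic number of distance evaluations, which is precisely what the sublinear-time algorithm avoids):

    def knn_graph_cost(points, k, dist):
        # cost(G) = sum, over every point v, of the distances from v to its
        # k nearest neighbors (exact, Theta(n^2) distance evaluations).
        total = 0.0
        for i, v in enumerate(points):
            d = sorted(dist(v, u) for j, u in enumerate(points) if j != i)
            total += sum(d[:k])
        return total

    # usage with a (hypothetical) Euclidean metric:
    pts = [(0, 0), (1, 0), (0, 2), (3, 1)]
    eucl = lambda p, q: ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) ** 0.5
    print(knn_graph_cost(pts, 2, eucl))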


Daniel Delling

Route Planning Algorithms in the Real World


Michal Dory

Fast Approximate Shortest Paths in the Congested Clique

I will discuss our recent shortest paths algorithms in the distributed congested clique model. Our results significantly improve upon the state-of-the-art for approximate all-pairs shortest paths (APSP), multi-source shortest paths and diameter, as well as exact single-source shortest paths (SSSP). Our main techniques are new distance tools that are obtained via improved algorithms for sparse matrix multiplication, which we leverage to construct efficient hopsets and shortest paths.
Joint work with Keren Censor-Hillel, Janne H. Korhonen and Dean Leitersdorf.


Faith Ellen

Competitive Analyses of Distributed Move-to-Front

List Accessing is a benchmark problem for the competitive analysis of self-adjusting data structures in sequential settings. Existing models for competitive analysis in distributed settings are not relevant for a distributed version of List Accessing. We present a distributed version of Move-to-Front, which is a simple example of a self-adjusting data structure for solving List Accessing in an asynchronous shared memory setting when each process has its own request sequence. Competitive analyses of this data structure are performed using new, more relevant models. Our analyses show that the effects of the adversarial scheduler can be quite significant, dominating the usual cost that arises from lack of information about the future.
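For context, the sequential Move-to-Front rule underlying the distributed version looks as follows (a sketch of the classic list-accessing algorithm only; the asynchronous shared-memory construction from the talk is not shown):

    def move_to_front_cost(initial_list, requests):
        # Serve each request by scanning from the front, paying the (1-based)
        # position of the requested key, then move that key to the front.
        lst = list(initial_list)
        total_cost = 0
        for key in requests:
            pos = lst.index(key)             # linear scan
            total_cost += pos + 1
            lst.insert(0, lst.pop(pos))      # self-adjust: move to front
        return total_cost

    print(move_to_front_cost(["a", "b", "c", "d"], ["d", "d", "a", "d"]))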


Rati Gelashvili

Why Extension-based Proofs Fail

It is impossible to solve consensus without using locks or randomization in an asynchronous system where 2 processes communicate by reading from and writing to shared memory. The celebrated proof of this claim by Fischer, Lynch and Paterson builds an infinite execution, by repeatedly extending a finite execution, in which no process decides a value. In contrast, all known proofs of the impossibility of solving (n-1)-set agreement among n ≥ 3 processes rely on complex topological arguments, either directly or indirectly. (n-1)-set agreement requires n processes to decide at most n-1 different values and is equivalent to consensus when n=2. We define a class of extension-based proofs and show that no such proof can prove the unsolvability of (n-1)-set agreement among n ≥ 3 processes. To do so, we view a proof as an interaction between a prover, who is trying to construct a bad execution (in which n values are decided or some process takes an infinite number of steps without deciding a value), and an adversarial algorithm which is constructed adaptively. This, for the first time, sheds some light on why the conventional techniques have spectacularly failed for n>2.


Roberto Grossi

Reverse Search and Maximal Subgraphs with Strongly Accessible Properties

This talk describes an algorithmic framework for listing the inclusion-maximal subgraphs satisfying a given property (clique, cut, cycle, etc.). Among the existing algorithms, the best-known ones take time proportional to the number of such subgraphs but may require exponential space. The framework employs the Avis-Fukuda reverse search to extend the class of problems that can be solved efficiently to strongly accessible set systems, reducing the additional space from exponential to polynomial.
Joint work with Alessio Conte, Andrea Marino, and Luca Versari.


Torben Hagerup

Fast breadth-first search in little space

It is shown that a breadth-first search in a directed or undirected graph with n vertices and m edges can be carried out in O(n+m) time with n log_2 3+O((log n)^2) bits of working memory.
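For comparison, the textbook O(n+m)-time BFS below uses Theta(n) words, i.e. Theta(n log n) bits, for the queue and the visited/distance information; the point of the result above is matching this running time with only n log_2 3 + O((log n)^2) bits.

    from collections import deque

    def bfs_distances(adj, source):
        # adj: dict mapping each vertex to an iterable of its out-neighbors.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:            # first visit = shortest distance
                    dist[v] = dist[u] + 1
                    queue.append(v)
        return dist

    print(bfs_distances({0: [1, 2], 1: [3], 2: [3], 3: []}, source=0))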


John Iacono

External memory priority queues with decrease-key and applications to graph algorithms

We present priority queues in the external memory model with block size $B$ and main memory size $M$ that support on $N$ elements, operation Update (a combination of operations Insert and Decrease-Key) in $O\left({\frac{1}{B}\log_{\frac{M}{B}} \frac{N}{B}}\right)$ amortized I/Os, operation Extract-Min in $O\left({\frac{M^{\varepsilon}}{B}\log_{\frac{M}{B}} \frac{N}{B}}\right)$ amortized I/Os and operation Delete in $O\left({\frac{M^{\varepsilon}}{B}\log^2_{\frac{M}{B}} \frac{N}{B}}\right)$ amortized I/Os, for any real $\varepsilon \in \left(0,1\right)$, using $O\left({\frac{N}{B}\log_{\frac{M}{B}} \frac{N}{B}}\right)$ blocks. Previous I/O-efficient priority queues either support these operations in $O\left({\frac{1}{B}\log_2 \frac{N}{B}}\right)$ amortized I/Os [Kumar and Schwabe, SPDP '96], or support only operations Insert, Delete and Extract-Min in optimal $O\left({\frac{1}{B}\log_{\frac{M}{B}} \frac{N}{B}}\right)$ amortized I/Os, however without supporting Decrease-Key [Fadel et al., TCS '99]. We also present buffered repository trees that support on a multi-set of $N$ elements, operations Insert and Extract (on $K$ extracted elements) in $O\left({\frac{1}{B}\log_{\frac{M}{B}} \frac{N}{B}}\right)$ and $O\left({\log_{\frac{M}{B}} \frac{N}{B} + K/B}\right)$ amortized I/Os, respectively, using $O\left({\frac{N}{B}}\right)$ blocks. This improves upon the previous I/O-bounds that achieve a base-$2$ logarithm instead [Buchsbaum et al., SODA '00]. Our results imply improved $O\left({\frac{E}{B}\log_{\frac{M}{B}} \frac{E}{B}}\right)$ I/Os for single-source shortest paths, depth-first search and breadth-first search algorithms on massive directed graphs with an edge-to-node ratio $\frac{E}{V}=\Omega({M^{\varepsilon}})$, which is equal to the I/O-optimal bound for sorting $E$ values in external memory.
Joint work with Riko Jacob and Konstantinos Tsakalidis.


Giuseppe F. Italiano

2-Connectivity in Directed Graphs

I will talk about some recent theoretical and experimental results on 2-edge and 2-vertex connectivity in directed graphs. Despite being complete analogs of the corresponding notions on undirected graphs, in digraphs 2-connectivity has a much richer and more complicated structure. For undirected graphs it has been known for over 40 years how to compute all bridges, articulation points, 2-edge- and 2-vertex-connected components in linear time, by simply using depth first search. In the case of digraphs, however, the very same problems have been much more challenging and have been tackled only recently.


Valerie King

The challenge of broadcasting and a new graph sampling lemma

The problem of broadcasting in a distributed network is to transmit a single message from one node to every other node. We assume each node knows its neighbors, and communication is by message-passing. For many years, the simple flooding algorithm was considered optimal, but it requires a transmission across every edge. Since 2015, there has been substantial progress on this problem. Monte Carlo algorithms for synchronous and asynchronous networks can build a spanning tree (or minimum spanning tree) with a number of messages sublinear in the number of edges, for dense graphs. But Monte Carlo algorithms have a probability of error. Is a deterministic algorithm possible? In this talk I’ll focus on recent work on a simplified version of the problem. It has led to a surprising graph sampling lemma and a new method for determining a spanning tree in an MPC model like Hadoop or MapReduce.
This is joint work with Jacob Holm, Mikkel Thorup, Or Zamir, and Uri Zwick.


Tsvi Kopelowitz

The Optimal (?) Complexity of Set-disjointness Data-structures

In the set-disjointness problem the goal is to preprocess a family of sets $F=\{S_1,S_2,...,S_k\}$, all from universe $U$, so that given two sets $S_i,S_j\in F$, one can quickly establish whether the two sets are disjoint or not. If $N$ is the sum of the sizes of the sets in $F$, then let $N^p$ be the preprocessing time and let $N^q$ be the query time. A folklore combinatorial algorithm has a tradeoff curve of $p+q=2$, which is optimal (up to sub-polynomial terms) for combinatorial algorithms, assuming the combinatorial BMM conjecture. In SODA'16, Kopelowitz, Pettie, and Porat showed that, based on the 3SUM conjecture, there is a conditional lower bound curve of $p+2q \ge 2$, and so there exists a gap between the upper bound curve and the lower bound curve when allowing non-combinatorial techniques. In this talk we will show that both curves can be improved. Specifically, if one assumes that the constant in the exponent of fast matrix multiplication is $\omega=2$, then one can obtain an upper bound curve of $p+2q = 2$ for $q\le 1/3$ (matching the 3SUM-based lower bound for this case), and $2p+q=3$ for $q\ge 1/3$. Moreover, we introduce a new conjecture on the time required for detecting a triangle in an unbalanced tripartite graph, which is closely related to the triangle detection conjecture for general graphs, and is used to show that the new upper bound curve for set-disjointness is tight.
Joint work with Virginia Vassilevska-Williams.
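For orientation, the folklore combinatorial structure achieving p + q = 2 looks roughly as follows (with threshold about N^q; this is only the baseline, not the matrix-multiplication-based upper bounds from the talk):

    def preprocess(family, threshold):
        # Sets with more than `threshold` elements are 'heavy'.  There are at
        # most N / threshold heavy sets, so tabulating disjointness for all
        # heavy pairs costs about O(N^2 / threshold) = O(N^{2-q}) time.
        sets = [set(s) for s in family]
        heavy = [i for i, s in enumerate(sets) if len(s) > threshold]
        table = {(a, b): sets[a].isdisjoint(sets[b])
                 for a in heavy for b in heavy}
        return sets, table

    def query(structure, i, j):
        sets, table = structure
        if (i, j) in table:                          # both heavy: O(1) lookup
            return table[(i, j)]
        small, large = sorted((sets[i], sets[j]), key=len)
        return all(x not in large for x in small)    # scan the lighter set

    ds = preprocess([[1, 2, 3], [3, 4], [5, 6]], threshold=2)
    print(query(ds, 0, 1), query(ds, 1, 2))          # False True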


William Kuszmaul

Achieving Optimal Backlog in Multi-Processor Cup Games

The single- and multi-processor cup games can be used to model natural problems in areas such as processor scheduling, deamortization, and buffer management. At the beginning of the single-processor cup game, $n$ cups are initially empty. In each step of the game, a filler distributes 1 unit of water among the cups, and then an emptier selects a cup and removes $1 + \epsilon$ units from that cup. The goal of the emptier is to minimize the amount of water in the fullest cup, also known as the backlog. It is known that the greedy algorithm (i.e., empty the fullest cup) achieves backlog $O(\log n)$, and that no deterministic algorithm can do better. We show that the performance of the greedy algorithm can be greatly improved with a small amount of randomization: after any step $t$, and for any $k \ge \Omega(\log \epsilon^{-1})$, the emptier achieves backlog at most $O(k)$ with probability at least $1 - O(2^{-2^{k}})$. In each step of the $p$-processor $n$-cup game, the filler distributes $p(1-\epsilon)$ units of water among the cups, with no cup receiving more than $1-\delta$ units, and then the emptier selects $p$ cups and removes $1$ unit of water from each. Proving nontrivial bounds on the backlog for the multi-processor cup game has remained open for decades. We present a simple analysis of the greedy algorithm for the multi-processor cup game, establishing a backlog of $O(\epsilon^{-1} \log n)$, as long as $\delta$, the game's other speed-augmentation constant, is at least $1 / \mathrm{poly}(n)$. Turning to randomized algorithms, we encounter an unexpected phenomenon: when the number of processors $p$ is large, the backlog after each step drops to constant with large probability. Specifically, we show that if $\delta$ and $\epsilon$ satisfy reasonable constraints, then there exists an algorithm that bounds the backlog after a given step by at most three with probability at least $1 - \exp(-\Omega(\epsilon^2 p))$. We further extend the guarantees of our randomized algorithm to larger backlogs.
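As a toy illustration of the single-processor game, here is a small simulation of the greedy emptier against a random filler (the adversarial fillers and the randomized emptiers from the talk are not modeled here):

    import random

    def greedy_single_processor_game(n, steps, eps=0.1, rng=random.Random(0)):
        # Each step: the filler spreads 1 unit over the n cups, then the
        # greedy emptier removes up to 1 + eps units from the fullest cup.
        # Against any filler, greedy keeps the backlog at O(log n).
        cups = [0.0] * n
        worst_backlog = 0.0
        for _ in range(steps):
            weights = [rng.random() for _ in range(n)]   # a random filler
            total = sum(weights)
            for i in range(n):
                cups[i] += weights[i] / total
            fullest = max(range(n), key=lambda i: cups[i])
            cups[fullest] = max(0.0, cups[fullest] - (1 + eps))
            worst_backlog = max(worst_backlog, max(cups))
        return worst_backlog

    print(greedy_single_processor_game(n=16, steps=1000))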


Hung Viet Le

Truly Optimal Euclidean Spanners

Euclidean spanners are important geometric structures, having found numerous applications over the years. Cornerstone results in this area from the late 80s and early 90s state that for any D-dimensional N-point Euclidean space, there exists a (1+Epsilon)-spanner with O(N Epsilon^{-D+1}) edges and lightness (normalized weight) O(Epsilon^{-2D}). Surprisingly, the fundamental question of whether or not these dependencies on Epsilon and D, for small D, can be improved has remained elusive, even for D = 2. In this work, we give answers to several important problems in this area. We nail down the exact dependencies on Epsilon and D and show that the greedy spanner is truly optimal. We show that any (1+Epsilon)-spanner must have Omega(N Epsilon^{-D+1}) edges, implying that the greedy (and other) spanners achieve the optimal size. We show that any (1+Epsilon)-spanner must have lightness Omega(Epsilon^{-D}), and then show that the greedy algorithm achieves optimal lightness. We then complement our negative result for the size of spanners with a rather counterintuitive positive result: Steiner points lead to a quadratic improvement in the size of spanners! Our bound for the size of Steiner spanners is tight as well (up to lower-order terms).
(Based on a joint work with Shay Solomon from Tel Aviv University)
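For reference, a standard sketch of the path-greedy spanner construction discussed above (the implementation details, running-time improvements, and Steiner constructions from the talk are not shown):

    import heapq
    from itertools import combinations

    def greedy_spanner(points, eps, dist):
        # Classic greedy (1+eps)-spanner: scan pairs by increasing distance and
        # add an edge only if the spanner built so far has no path of length
        # <= (1+eps)*dist(u,v).  Slow to build; the talk concerns the size and
        # lightness of the result, which it shows the greedy spanner optimizes.
        n = len(points)
        adj = {i: [] for i in range(n)}

        def spanner_dist(src, dst, cutoff):
            # Dijkstra from src, pruned at `cutoff`
            best = {src: 0.0}
            heap = [(0.0, src)]
            while heap:
                d, u = heapq.heappop(heap)
                if u == dst:
                    return d
                if d > best.get(u, float("inf")):
                    continue
                for v, w in adj[u]:
                    nd = d + w
                    if nd <= cutoff and nd < best.get(v, float("inf")):
                        best[v] = nd
                        heapq.heappush(heap, (nd, v))
            return float("inf")

        edges = []
        for u, v in sorted(combinations(range(n), 2),
                           key=lambda p: dist(points[p[0]], points[p[1]])):
            d = dist(points[u], points[v])
            if spanner_dist(u, v, (1 + eps) * d) > (1 + eps) * d:
                adj[u].append((v, d))
                adj[v].append((u, d))
                edges.append((u, v))
        return edges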


Moshe Lewenstein

Improved Space-Time Tradeoffs for kSUM

In the kSUM problem we are given an array of numbers $a_1,a_2,...,a_n$ and we are required to determine whether there are $k$ different elements in this array whose sum equals 0. This problem is a parameterized version of the well-studied SUBSET-SUM problem, and its special case, the $3$SUM problem, is extensively used for proving conditional hardness. Several works investigated the interplay between time and space in the context of SUBSET-SUM. Recently, improved time-space tradeoffs were proven for $k$SUM using both randomized and deterministic algorithms. In this work we obtain an improvement over the best known results for the time-space tradeoff for $k$SUM. A major ingredient in achieving these results is a general self-reduction from $k$SUM to $m$SUM where $m<k$. In particular, we obtain:
(i) The best known Las Vegas solution to $k$SUM running in approximately $O(n^{k-\delta\sqrt{2k}})$ time and using $O(n^{\delta})$ space, for $0 \leq \delta \leq 1$.
(ii) The best known deterministic solution to $k$SUM running in approximately $O(n^{k-\delta\sqrt{k}})$ time and using $O(n^{\delta})$ space, for $0 \leq \delta \leq 1$.
(iii) A space-time tradeoff for solving $k$SUM using $O(n^{\delta})$ space, for $\delta>1$.
(iv) An algorithm for $6$SUM running in $O(n^4)$ time using just $O(n^{2/3})$ space.
(v) A solution to $3$SUM on random input using $O(n^2)$ time and $O(n^{1/3})$ space, under the assumption of random read-only access to random bits.
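To pin down the base problem, here is the classic sorted two-pointer 3SUM check, which runs in O(n^2) time and O(1) extra space after sorting; the kSUM time-space tradeoffs above are not captured by this sketch.

    def three_sum(values):
        # Is there a triple of entries (at distinct positions) summing to zero?
        a = sorted(values)
        n = len(a)
        for i in range(n - 2):
            lo, hi = i + 1, n - 1
            while lo < hi:
                s = a[i] + a[lo] + a[hi]
                if s == 0:
                    return True
                if s < 0:
                    lo += 1
                else:
                    hi -= 1
        return False

    print(three_sum([8, -2, 5, -3, 9]))   # True: (-2) + (-3) + 5 == 0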


Giorgi Nadiradze

Relaxed Schedulers Can Efficiently Parallelize Incremental Algorithms

Several classic problems in graph processing and computational geometry are solved via incremental algorithms, which split computation into a series of small tasks acting on shared state, which gets updated progressively. While the sequential variant of such algorithms usually specifies a fixed (but sometimes random) order in which the tasks should be performed, a standard approach to parallelizing such algorithms is to relax this constraint to allow for out-of-order parallel execution. This is the case for parallel implementations of Dijkstra’s single-source shortest-paths (SSSP) algorithm, and for parallel Delaunay mesh triangulation. While many software frameworks parallelize incremental computation in this way, it is still not well understood whether this relaxed ordering approach can provide any complexity guarantees. In this talk, we address this problem and analyze the efficiency guarantees provided by a range of incremental algorithms, such as Delaunay mesh triangulation and sorting by insertion, when parallelized via relaxed schedulers. We also show upper bounds on the additional work incurred for SSSP in this case. Finally, we provide lower bounds showing that certain algorithms will inherently incur a non-trivial amount of wasted work due to scheduler relaxation, even for relatively benign relaxed schedulers.
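A toy model of the relaxation in question: a q-relaxed scheduler that, instead of always returning the highest-priority pending task, returns one of the q highest-priority ones (here uniformly at random); the concrete relaxed schedulers and the wasted-work analysis are in the talk.

    import random

    def relaxed_schedule(priorities, q, rng=random.Random(0)):
        # Returns the order in which a q-relaxed scheduler hands out tasks:
        # at each step, one of the q smallest pending priorities is chosen
        # uniformly at random.  q = 1 recovers the exact sequential order.
        pending = sorted(priorities)
        order = []
        while pending:
            window = min(q, len(pending))
            order.append(pending.pop(rng.randrange(window)))
        return order

    print(relaxed_schedule(range(10), q=1))   # exact order 0..9
    print(relaxed_schedule(range(10), q=3))   # slightly out of order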


Liam Roditty

Approximating APSP and Diameter

In this talk I will survey the literature on computing the diameter of general graphs. Then I will present a quadratic-time algorithm that almost achieves a 3/2-approximation.
The talk is based on joint work with Backurs, Segal, V. Williams, and Wein.


Jared Saia

Proof-of-Work Without All the Work

A common tool to defend against Sybil attacks is proof-of-work, whereby computational puzzles are used to limit the number of Sybil participants. Unfortunately, current Sybil defenses require significant computational effort to offset an attack. In particular, good participants must spend computationally at a rate that is proportional to the spending rate of an attacker. In this talk, I will present a Sybil defense algorithm that is asymmetric in the sense that good participants spend at a rate that is asymptotically less than an attacker. In particular, if T is the rate of the attacker’s spending, and J is the rate of joining for good participants, then our algorithm spends at a rate of O(\sqrt{TJ} + J). I'll discuss empirical results that suggest that our algorithm can be significantly more efficient than previous defenses under various attack scenarios. Additionally, I'll discuss a lower bound showing that our algorithm’s spending rate is asymptotically optimal among a large family of algorithms.


Siddhartha Sen

HAIbrid Algorithms

HAIbrid (Human + AI) Algorithms synergize human solutions with AI to enhance the performance and adaptivity of hand-designed data structures and algorithms. These data structures and algorithms underlie our cloud storage, search, and scheduling systems. Rather than avoiding AI or using it blindly, we seek the right combination -- a form of human-AI collaboration. I will describe a paradigm that combines reinforcement learning with the ability to ask counterfactual ("what if") questions about any decision-making algorithm, provided there is sufficient randomness in its decisions. This paradigm can readily be applied to data structures like skip lists and treaps which are naturally randomized. Our ultimate goal is to create a "universal data structure" that delivers the best performance for every phase of a workload, through human + AI co-design.


Robert E. Tarjan

Concurrent Connected Components Algorithms

The problem of finding the connected components of an undirected graph is one of the most basic in graph algorithms. It can be solved sequentially in linear time using graph search or in almost-linear time using a disjoint-set data structure. The latter solves the incremental version of the problem, in which edges are added singly or in batches on-line. With the growth of the internet, computing connected components on huge graphs has become important, and both experimentalists and theoreticians have explored the use of concurrency in speeding up the computation. We shall survey recent work. Even simple concurrent algorithms are hard to analyze, as we discuss. This work is joint with Sixue Liu of Princeton.
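For reference, the sequential disjoint-set (union-find) structure mentioned above, with union by rank and path splitting (a sketch of the classic sequential version; the concurrent variants surveyed in the talk are not shown):

    class DisjointSets:
        def __init__(self, n):
            self.parent = list(range(n))
            self.rank = [0] * n

        def find(self, x):
            # path splitting: redirect each visited node to its grandparent
            while self.parent[x] != x:
                self.parent[x], x = self.parent[self.parent[x]], self.parent[x]
            return x

        def union(self, x, y):
            rx, ry = self.find(x), self.find(y)
            if rx == ry:
                return False                 # edge inside an existing component
            if self.rank[rx] < self.rank[ry]:
                rx, ry = ry, rx
            self.parent[ry] = rx             # link lower-rank root under higher
            if self.rank[rx] == self.rank[ry]:
                self.rank[rx] += 1
            return True

    # incremental connected components: feed edges one by one
    dsu = DisjointSets(5)
    for u, v in [(0, 1), (3, 4), (1, 2)]:
        dsu.union(u, v)
    print(dsu.find(0) == dsu.find(2), dsu.find(0) == dsu.find(3))   # True False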


Jan van den Brand

Algebraic dynamic graph algorithms and data structures

Algebraic techniques are used in the best algorithms for many dynamic graph problems whose complexity is not yet well understood, such as maintaining reachability, distances, or the maximum matching size in a graph. Understanding the complexity of the underlying algebraic problems is key to understanding these graph problems.


Renato Werneck

Massive TSPs on Road Networks

We present practical heuristics for the Traveling Salesman Path (TSP) problem on road networks. Our focus is on scalable solutions for very large inputs, which has applications to the creation of delivery routes. By leveraging exact distance oracles on road networks in non-trivial ways, our approach requires subquadratic time and space in practice. As a result, it can find high-quality solutions to instances with hundreds of thousands of points in minutes.


Or Zamir

Faster k-SAT Algorithms using Biased-PPSZ

The PPSZ algorithm, due to Paturi, Pudlak, Saks and Zane, is currently the fastest known algorithm for the k-SAT problem, for every k>3. For 3-SAT, a tiny improvement over PPSZ was obtained by Hertli. We introduce a biased version of the PPSZ algorithm with which we obtain an improvement over PPSZ for every k>=3. For k=3 we also improve on Hertli's result and get a much more noticeable improvement over PPSZ, though still relatively small. In particular, for Unique 3-SAT, we improve the current bound from 1.308^n to 1.307^n.
Joint work with Thomas Dueholm Hansen, Haim Kaplan and Uri Zwick.