Bas Ketsman

MSc · PhD student in Computer Science

Short Bio

I'm a 4th year PhD student in Computer Science at Hasselt University and a proud member of its Databases and Theoretical Computer Science group, where I work with professor Frank Neven (my advisor). I'm also a PhD Fellow of the Research Foundation - Flanders (FWO). I conduct research in the area of principles of database management and enjoy working on topics related to cloud computing and big data. Prior to starting a PhD I obtained an MSc degree in computer science, databases.

During the months February till April 2016 I visited UWDB, the database group of the University of Washington in Seattle, to work with professor Dan Suciu. I also collaborate with professor Thomas Schwentick from tu Dortmund.

Travel schedule: DBDBD'16 in Mons, Belgium; SIGMOD/PODS'17 in Raleigh, NC, USA.

Contact Information

Hasselt University, Agoralaan, Building D, B-3590 Diepenbeek, Belgium Room: D6 Tel: +3211268209 E-mail: Firstname.Lastname (at) uhasselt.be

Publications

Note: authors are in alphabetical order.

Conference Publications

  • A Worst-Case Optimal Multi-Round Algorithm for Parallel Computation of Conjunctive Queries, PODS 2017 (accepted for publication). With Dan Suciu.
  • Parallel-Correctness and Containment for Conjunctive Queries with Union and Negation, ICDT 2016. With Gaetano Geck, Frank Neven and Thomas Schwentick. {abstract · drops · arXiv}

    Single-round multiway join algorithms first reshuffle data over many servers and then evaluate the query at hand in a parallel and communication-free way. A key question is whether a given distribution policy for the reshuffle is adequate for computing a given query, also referred to as parallel-correctness. This paper extends the study of the complexity of parallel-correctness and its constituents, parallel-soundness and parallel-completeness, to unions of conjunctive queries with and without negation. As a by-product it is shown that the containment problem for conjunctive queries with negation is coNEXPTIME-complete.

  • Parallel-Correctness and Transferability for Conjunctive Queries, PODS 2015. With Tom Ameloot, Gaetano Geck, Frank Neven and Thomas Schwentick. {abstract · acm · arXiv · slides · poster}

    A dominant cost for query evaluation in modern massively distributed systems is the number of communication rounds. For this reason, there is a growing interest in single-round multiway join algorithms where data is first reshuffled over many servers and then evaluated in a parallel but communication-free way. The reshuffling itself is specified as a distribution policy. We introduce a correctness condition, called parallel-correctness, for the evaluation of queries w.r.t. a distribution policy. We study the complexity of parallel- correctness for conjunctive queries as well as transferability of parallel-correctness between queries. We also investigate the complexity of transferability for certain families of distribution policies, including, for instance, the Hypercube distribution.

  • Datalog Queries Distributing over Components, ICDT 2015. With Tom Ameloot, Frank Neven and Daniel Zinn. {abstract · drops}

    We investigate the class D of queries that distribute over components. These are the queries that can be evaluated by taking the union of the query results over the connected components of the database instance. We show that it is undecidable whether a (positive) Datalog program distributes over components. Additionally, we show that connected Datalog with negation (the fragment of Datalog with negation where all rules are connected) provides an effective syntax for Datalog with negation programs that distribute over components under the stratified as well as under the well-founded semantics. As a corollary, we obtain a simple proof for one of the main results in previous work, namely, that the classic win-move query is in F2 (a particular class of coordination-free queries).

  • Optimal Broadcasting Strategies for Conjunctive Queries over Distributed Data, ICDT 2015. With Frank Neven. {abstract · drops · slides}
    • Journal version invited to a special issue of TOCS

    In a distributed context where data is dispersed over many computing nodes, monotone queries can be evaluated in an eventually consistent and coordination-free manner through a simple but naive broadcasting strategy which makes all data available on every computing node. In this paper, we investigate more economical broadcasting strategies for full conjunctive queries without self-joins that only transmit a part of the local data necessary to evaluate the query at hand. We consider oblivious broadcasting strategies which determine which local facts to broadcast independent of the data at other computing nodes. We introduce the notion of broadcast dependency set (BDS) as a sound and complete formalism to represent locally optimal oblivious broadcasting functions. We provide algorithms to construct a BDS for a given conjunctive query and study the complexity of various decision problems related to these algorithms.

  • Weaker Forms of Monotonicity for Declarative Networking: a more fine-grained answer to the CALM-conjecture, PODS 2014. With Tom Ameloot, Frank Neven and Daniel Zinn. {abstract · acm · slides · poster}

    The CALM-conjecture, first stated by Hellerstein and proved in its revised form by Ameloot et al. within the framework of relational transducer networks, asserts that a query has a coordination-free execution strategy if and only if the query is monotone. Zinn et al. extended the framework of relational transducer networks to allow for specific data distribution strategies and showed that the non-monotone win-move query is coordination-free for domain-guided data distributions. In this paper, we complete the story by equating increasingly larger classes of coordination-free computations with increasingly weaker forms of mono-tonicity and make Datalog variants explicit that capture each of these classes. One such fragment is based on stratified Datalog where rules are required to be connected with the exception of the last stratum. In addition, we characterize coordination-freeness as those computations that do not require knowledge about all other nodes in the network, and therefore, can not globally coordinate. The results in this paper can be interpreted as a more fine-grained answer to the CALM-conjecture.

Journal Publications

  • Data partitioning for single-round multi-join evaluation in massively parallel systems, Sigmod Record - Special Issue on 2015 ACM SIGMOD Research Highlights, volume 45 issue 1. With Tom Ameloot, Gaetano Geck, Frank Neven and Thomas Schwentick.
  • Weaker Forms of Monotonicity for Declarative Networking: a more fine-grained answer to the CALM-conjecture, ACM TODS 2016, volume 40 issue 4. With Tom Ameloot, Frank Neven and Daniel Zinn.

Theses

Talks

  • A Worst-Case Optimal Multi-Round Algorithm for Parallel Computation of Conjunctive Queries on Graphs, DBDBD 2016, Mons, Belgium, 28 October 2016. {abstract}
  • Single-Round Multi-Join Evaluation, University of Washington, Seattle, Washington, USA, 8 April 2016. {slides}
  • Parallel-Correctness and Transferability for Conjunctive Queries, PODS 2015, Melbourne, VIC, Australia, 1 June 2015. {slides · poster}
  • Optimal Broadcasting Strategies for Conjunctive Queries over Distributed Data, ICDT 2015, Brussels, Belgium, 25 March 2015. {slides}
  • How to Stay CALM While Seeing What is not There?, DBDBD 2014, Antwerp, Belgium, 17 October 2014. {abstract · slides}
  • Weaker Forms of Monotonicity for Declarative Networking: a more fine-grained answer to the CALM-conjecture, PODS 2014, Snowbird, Utah, USA, 22 June 2014. {slides · poster}