Abstract:
Linked Data Fragments (LDFs) refer to interfaces that allow for publishing and querying Knowledge Graphs on the Web. These interfaces primarily differ in their expressivity and allow for exploring different trade-offs when balancing the workload between clients and servers in decentralized SPARQL query processing. To devise efficient query plans, clients typically rely on heuristics that leverage the metadata provided by the LDF interface, since obtaining fine-grained statistics from remote sources is a challenging task. However, because they are based on this metadata, these heuristics are prone to estimation errors, which can lead to inefficient query executions with a high number of requests, large amounts of data transferred, and, consequently, excessive execution times.
In this work, we investigate robust query processing techniques for Linked Data Fragment clients to address these challenges.
We first focus on robust plan selection by proposing CROP, a query plan optimizer that explores the cost and robustness of alternative query plans.
Then, we address robust query execution by proposing a new class of adaptive operators: Polymorphic Join Operators. These operators adapt their join strategy in response to possible cardinality estimation errors. The results of our first experimental study show that CROP outperforms state-of-the-art clients by exploring alternative plans based on their cost and robustness. In our second experimental study, we investigate to what extent different planning approaches can benefit from polymorphic join operators and find that they enable more efficient query execution in the majority of cases.