From 475dae3f1926db5673e70b08b1f677b6848ce284 Mon Sep 17 00:00:00 2001 From: Marcin Chrzanowski Date: Wed, 24 Nov 2021 15:29:37 +0100 Subject: Rewrite parts of introduction --- mgr.tex | 67 ++++++++++++++++++++++++++++++++++------------------------------- 1 file changed, 35 insertions(+), 32 deletions(-) diff --git a/mgr.tex b/mgr.tex index 1a1d924..89f6c2d 100644 --- a/mgr.tex +++ b/mgr.tex @@ -66,43 +66,46 @@ in which we are tasked with answering whether a given structure $S$ satisfies a sentence $\varphi$ of some logic. Several other interesting computational problems arise when we generalize to formulae with free variables. -A natural computational problem one could consider is \definedterm{query -evaluation} in which, when given a formula $\varphi(\vec{X})$ and a structure -$S$, the expected output is all the valuations of $\vec{X}$ satisfying -$\varphi$. While in many respects this seems like a good generalization of model -checking, the size of the entire output could be exponential in the size of the -structure and query, thus waiting for an algorithm to output all the answers -might not always be feasible or desirable. +As an analogy to databases, formulae with free variables are often referred to +as \definedterm{queries} in the literature. We consider the \definedterm{query +answering} problem: given a query $\varphi(X_1, ..., X_k)$ and a structure $S$, +we want to answer questions of the form ``does assigning the set of elements +$W_1$ to $X_1$, ..., $W_k$ to $X_k$, satisfy $\varphi$?''. A naive solution +would be to transform the formula $\varphi$ into a sentence by adding $k$ unary +relations $U_1, \ldots, U_k$ and labeling elements of $W_i$ with $U_i$. Now we +can answer an individual question in the same time as model checking. Already model checking is known to be computationally difficult for various interesting logics, but the situation can improve when restricting the class of -structures considered. For example, MSO model checking is NP-hard in general, -but \textcite{courcelle1990} showed that deciding if an MSO sentence holds in a -structure of bounded treewidth can be done in time linear in the size of the -structure. - -Similarly, query evaluation of MSO queries on bounded treewidth structures was -shown to be polynomial in the size of the structure and the output by -\textcite{courcelle1992}. Their result was improved by \textcite{flum2001} to -linear time. - -In this work, instead of evaluation, we will instead consider \definedterm{query -answering}. In this setting, we are insterested in alrogithms that first -preprocess the structure $S$ and formula $\varphi(\vec{X})$, and later are able -to quickly answer individual queries of the form ``does the valuation $\vec{W}$ -of $\vec{X}$ satisfy $\varphi$?'' +structures considered. In this work we will focus on the case of MSO logic over +trees. While MSO model checking is NP-hard in general, \textcite{courcelle1990} +showed that deciding if an MSO sentence holds in a structure of bounded +treewidth can be done in time linear in the size of the structure. This, using +the naive method described above, gives us an algorithm for MSO query answering +over trees that answers each question in time $O(n)$. + +In our setting we suppose that we want to answer multiple such questions and +thus allow a preprocessing step over the formula and tree. In this case, the +above naive algorithm could be improved with Amarilli et al.'s +\cite{amarilli2019} data structure that allows updates of the underlying tree. +Let $m$ be the size of a given question, i.e. $|W_1 \cup ... \cup W_k|$. +Amarilli's result allows us to preprocess the tree in linear time, label it with +a choice of $W_1, \ldots, W_k$ in time $O(m \log n)$, then answer model checking +in constant time, thus greatly improving over the naive algorithm for questions +that talk about a small number of vertices compared to the size of the whole +tree. \section{Main result} -Our main result is that, for trees, after a preprocessing step linear in the -size of the tree, we can answer queries in time linearithmic with respect to the -query size (in particular, the time to answer a query does not depend on the -size of the tree). +We will improve on the above method even further showing an algorithm that, +after linear preprocessing of the tree can answer questions in time linearithmic +with respect to the question size. In particular, we eliminate any dependence on +tree size during question answering. More explicitly, we will show how to solve Problem \ref{mso-query-answering-problem}, with a preprocessing algorithm working in -time $O(n)$ (where $n$ is the size of the input tree), and then answer queries -in time $O(m \log m)$ (where $m$ is the size of the query): +time $O(n)$ (where $n$ is the size of the input tree), and then answer questions +in time $O(m \log m)$ (where $m$ is the size of the question): \queryproblem[% an MSO formula $\varphi(\vec{X})$ over trees with $k$ free @@ -117,10 +120,10 @@ in time $O(m \log m)$ (where $m$ is the size of the query): $\vec{X}$? In other words, does $T \models \varphi(\vec{W})$? } -Though we speak about MSO formulae with second-order variables only, first-order -variables are also supported simply by restricting to queries in which sets -assigned to single-order variables are singletons. Note that in the case when -the input formula has only first-order free variables, queries are answered in +Note that though we speak of MSO formulae with second-order variables only, +first-order variables are supported by simply restricting to questions in which +sets assigned to single-order variables are singletons. Also note, that for +formulae with first-order free variables only, question answering happens in constant time. Through a series of reductions, we will show that MSO query answering reduces to -- cgit v1.2.3