From 3b27436f177e54fd88d5c1603c5c6b039957f41b Mon Sep 17 00:00:00 2001
From: Marcin Chrzanowski <mc370754@students.mimuw.edu.pl>
Date: Thu, 14 Oct 2021 13:08:02 +0200
Subject: Expand introduction

---
 mgr.tex | 93 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 82 insertions(+), 11 deletions(-)

(limited to 'mgr.tex')

diff --git a/mgr.tex b/mgr.tex
index 899986f..fd09164 100644
--- a/mgr.tex
+++ b/mgr.tex
@@ -61,7 +61,47 @@
 \tableofcontents
 
 \chapter{Introduction}
-% \addcontentsline{toc}{chapter}{Introduction}
+
+The model checking problem is a classic problem in theoretical computer science
+in which we are tasked with answering whether a given structure $S$ satisfies a
+sentence $\varphi$ of some logic. Several other interesting computational
+problems arise when we generalize to formulae with free variables.
+
+A natural computational problem one could consider is \definedterm{query
+evaluation} in which, when given a formula $\varphi(\vec{X})$ and a structure
+$S$, the expected output is all the valuations of $\vec{X}$ satisfying
+$\varphi$. While in many respects this seems like a good generalization of model
+checking, the size of the entire output could be exponential in the size of the
+structure and query, thus waiting for an algorithm to output all the answers
+might not always be feasible or desirable.
+
+Already model checking is known to be computationally difficult for various
+interesting logics, but the situation can improve when restricting the class of
+structures considered. For example, MSO model checking is NP-hard in general,
+but Courcelle showed that deciding if an MSO sentence holds in a structure of
+bounded treewidth can be done in time linear in the size of the structure.  % TODO: referencja do Courcella
+
+Similarly, query evaluation of MSO queries on bounded treewidth structures was
+shown to be polynomial in the size of the structure and the output by Courcelle
+and Mosbah. Their result was improved by Frick, Flum, and Grohe to linear time.
+
+In this work, instead of evaluation, we will instead consider \definedterm{query
+answering}. In this setting, we are insterested in alrogithms that first
+preprocess the structure $S$ and formula $\varphi(\vec{X})$, and later are able
+to quickly answer individual queries of the form ``does the valuation $\vec{W}$
+of $\vec{X}$ satisfy $\varphi$?''
+
+\section{Main result}
+
+Our main result is that, for trees, after a preprocessing step linear in the
+size of the tree, we can answer queries in time linearithmic with respect to the
+query size (in particular, the time to answer a query does not depend on the
+size of the tree).
+
+More explicitly, we will show how to solve Problem
+\ref{mso-query-answering-problem}, with a preprocessing algorithm working in
+time $O(n)$ (where $n$ is the size of the input tree), and then answer queries
+in time $O(m \log m)$ (where $m$ is the size of the query):
 
 \queryproblem[%
     an MSO formula $\varphi(\vec{X})$ over trees with $k$ free
@@ -74,11 +114,17 @@
     given a $k$-tuple of subsets of $T$'s vertices $\vec{W}
     \in \mathcal{P}(V(T))^{k}$, is $\vec{W}$ a satisfying assignment to
     $\vec{X}$? In other words, does $T \models \varphi(\vec{W})$?
-}
+}\label{mso-query-answering-problem}
 
 Though we speak about MSO formulae with second-order variables only, first-order
 variables are also supported simply by restricting to queries in which sets
-assigned to single-order variables are singletons.
+assigned to single-order variables are singletons. Note that in the case when
+the input formula has only first-order free variables, queries are answered in
+constant time.
+
+Through a series of reductions, we will show that MSO query answering reduces to
+the following problem about tree automata, which we solve in Chapter
+\ref{mso-query-answering}:
 
 \queryproblem[%
     a deterministic bottom-up tree automaton $A$ over ranked alphabet $\Sigma$.
@@ -93,13 +139,38 @@ assigned to single-order variables are singletons.
     label modified to the corresponding $a_i$.
 }
 
-\chapter{Preliminaries}\label{r:pojecia}
-
-% \begin{quote}
-  % Blaba, który jest blaba, nie jest prawdziwym blaba.
-
-% \raggedleft\slshape tłum. z~chińskiego Robert Blarbarucki
-% \end{quote}
+\section{Related Work}
+
+A different generalization of model checking to the case of formulae with free
+variables is \definedterm{query enumeration}. In query enumeration, as in query
+evaluation, we are interested in outputing all the valuations satisfying a
+formula. However, instead of considering the global time of the algorithm to
+output every single result, we are interested in enumerating the valuations one
+by one with as small a delay between individual answers as possible. In other
+words, we split the entire algorithm into a preprocessing algorithm which builds
+an indexing structure, and an enumeration algorithm that, given the indexing
+structure, outputs the next answer and prepares the indexing structure for the
+next step.
+
+The complexity class \constantdelaylin\ refers to those enumeration
+problems in which the preprocessing step takes linear time and the delay between
+each successive answer (i.e. the time complexity of the second algorithm) is
+constant. \lineardelaylin\ is a generalization of \constantdelaylin\ where the
+time complexity of the second algorithm is linear in the size of the answer
+being output. \textcite{bagan2006} introduced this second notion and showed that
+enumerating MSO queries is \lineardelaylin\ on trees. This in particular means
+that enumerating MSO queries for formulae whose free variables are all first
+order is in \constantdelaylin, and \textcite{kazana2013} revisit this result,
+proving it using Colcombet's deterministic factorization forests rather than
+Bagan's intricate indexing structure.
+
+Colcombet's result could likely be used to solve the problem of branch infix
+regular queries, which we introduce as a subproblem of MSO query answering in
+Chapter \ref{branchinfix}. However, Colcombet's factorization depends on fairly
+deep and complex applications of semigroup theory. We present a much more
+straightforward algorithmic approach.
+
+\chapter{Preliminaries}
 
 \section{Definitions}
 \subsection{Trees}
@@ -539,7 +610,7 @@ down. Handling each jump takes constant time, and the number of jumps is bounded
 by $|Q|$ since we can only jump to a lower color. Thus we can answer branch
 infix regular queries in constant time.
 
-\chapter{MSO Query Answering on Trees}
+\chapter{MSO Query Answering on Trees}\label{mso-query-answering}
 
 We now have all the pieces necessary to present our \qptime{$O(n)$}{$O(m \log
 m)$} algorithm for MSO query answering on trees. We fix an MSO formula
-- 
cgit v1.2.3