m-chrzan.xyz
aboutsummaryrefslogtreecommitdiff
path: root/mgr.tex
diff options
context:
space:
mode:
authorMarcin Chrzanowski <mc370754@students.mimuw.edu.pl>2021-08-09 20:50:02 -0400
committerMarcin Chrzanowski <mc370754@students.mimuw.edu.pl>2021-08-09 20:50:02 -0400
commitfce0c20992e6235680b53621cf749fd142f81dd1 (patch)
tree189595f6bb8469cf02603d8cbdbeb6bf4cf50b07 /mgr.tex
parentd8637ef9adee6a033e311ba2b71ad9a6089149a2 (diff)
Expand query problems section
Diffstat (limited to 'mgr.tex')
-rw-r--r--mgr.tex84
1 files changed, 59 insertions, 25 deletions
diff --git a/mgr.tex b/mgr.tex
index a7bb567..8929626 100644
--- a/mgr.tex
+++ b/mgr.tex
@@ -170,8 +170,34 @@ We note that the converse of this theorem is also true (i.e. that for every tree
automaton, there is a corresponding MSO formula), however we will use only the
MSO to automata direction in this work.
+\section{Query answering problems}
+
+Consider a computational problem whose inputs are of the form $(S, q) \in
+\mathcal{S} \times Q$ for some set of \definedterm{structures} $\mathcal{S}$ and
+some set of \definedterm{queries} $Q$. This induces a \definedterm{query
+answering problem} which is divided into two phases:
+
+\begin{description}
+ \item[preprocessing] An input structure $S \in \mathcal{S}$ is given and a
+ \definedterm{preprocessing algorithm} outputs an indexing structure
+ $S'$.
+ \item[queries] On-line queries $q_1, \ldots$ from $Q$ about $S$ need to be
+ handled, with access to the preprocessing output $S'$.
+\end{description}
+
+We are interested in the time complexities of both phases. We use the following
+notation for algorithms that have both a preprocessing and query phase: If
+it takes $f(n)$ time to complete the preprocessing step for an input of size
+$n$, and $g(n, m)$ time to then handle a query of size $m$, we say the algorithm
+has time complexity \qptime{$f(n)$}{$g(n, m)$}.
+
+We turn to a discussion of several query problems with known solutions, both to
+serve as examples and because we will be using them in our algorithm.
+
+\subsection{Lowest Common Ancestor}
+
\queryproblem{%
- Least Common Ancestor (LCA) Queries
+ Lowest Common Ancestor (LCA) Queries
}{%
a tree $T$.
}{%
@@ -180,23 +206,28 @@ MSO to automata direction in this work.
ancestor.
}
-\subsection{Range Minimum Query}
-f
-
-\chapter{Branch Infix Regular Queries}\label{r:branchinfix}
+\textcite{tarjan1984} were the first to show an optimal \qptime{$O(n)$}{$O(1)$}
+algorithm for LCA. \textcite{schieber1988} used a similar approach but
+simplified the indexing structure, keeping the same time complexities.
+\textcite{berkman1993} showed a completely new approach to the problem, which
+relies on answering range minimum queries (see below) about an array of properly
+ordered tree vertices. \textcite{bender2000} offer a simpler presentation of the
+algorithm in \cite{berkman1993} and note the equivalence between the LCA and RMQ
+problems.
-In this chapter we will present a solution to the following query problem:
+\subsection{Range Minimum Query (RMQ)}
-\queryproblem[%
- regular language $L$ over alphabet $\Sigma$.
-]{Branch Infix Regular Queries}{%
- a $\Sigma$-labeled tree $T$.
+\queryproblem{Range Minimum Queries}{%
+ an array $A$ of integers.
}{%
- given a vertex $x$ and its descendant $y$, does the word given by labels on
- the path from $x$ to $y$ belong to $L$?
+ given indices $i$ and $j$, return the index of the smallest
+ element in the subarray $A[i, j]$.
}
-\section{Word infix regular queries}
+As mentioned above, \textcite{bender2000} show an \qptime{$O(n)$}{$O(1)$}
+algorithm for the RMQ problem.
+
+\subsection{Word infix regular queries}
\queryproblem[%
regular language $L$ over alphabet $\Sigma$.
@@ -211,7 +242,7 @@ There is a simple and well-known data structure that, given a fixed automaton
$A$, preprocesses an input word $w$ in time $O(n)$, and afterwards is able to
answer in time $O(1)$ questions of the form ``Is the infix $w[i, j]$ accepted by
$A$?''. We present the full construction as we will be generalizing its
-internals for the tree case.
+internals for the tree case in \Ref{branchinfix}.
Take $A$ that is deterministic. We begin by replacing each letter of $w$ with
the set of states of $A$, $Q$.
@@ -254,6 +285,19 @@ The construction is as follows:
Additionally, in each vertex, we store the index of the next copy of $Q$ in
which the path of this vertex's color is broken by a lower color.
+\chapter{Branch Infix Regular Queries}\label{branchinfix}
+
+In this chapter we will present a solution to the following query problem:
+
+\queryproblem[%
+ regular language $L$ over alphabet $\Sigma$.
+]{Branch Infix Regular Queries}{%
+ a $\Sigma$-labeled tree $T$.
+}{%
+ given a vertex $x$ and its descendant $y$, does the word given by labels on
+ the path from $x$ to $y$ belong to $L$?
+}
+
\section{Generalizing to trees}
We generalize the above construction from regular infix queries on a
@@ -280,17 +324,7 @@ to handle such queries in constant time. The structure is heavily inspired by
\textcite{bender2000}, where a simple algorithm for computing LCA queries is
presented.
-First, let's define an auxillary problem we will use, that of Range Minimum
-Queries (RMQ):
-
-\queryproblem{Range Minimum Queries}{%
- an array $A$ of integers.
-}{%
- given indices $i$ and $j$, return the index of the smallest
- element in the subarray $A[i, j]$.
-}
-
-Fact: RMQ queries can be answered in constant time after linear preprocessing of
+Recall that RMQ queries can be answered in constant time after linear preprocessing of
the input array.
We turn to describing the index structure for our tree problem.