m-chrzan.xyz
aboutsummaryrefslogtreecommitdiff
diff options
context:
space:
mode:
-rw-r--r--mgr.tex63
1 files changed, 35 insertions, 28 deletions
diff --git a/mgr.tex b/mgr.tex
index ee1ce7b..ce27d02 100644
--- a/mgr.tex
+++ b/mgr.tex
@@ -504,7 +504,8 @@ its own.
The question answering problem we will solve is:
\queryproblem[%
- regular language $L$ over alphabet $\Sigma$.
+ regular language $L$ over alphabet $\Sigma$, given by deterministic
+ automaton $A$.
]{Branch Infix Regular Questions}{%
a $\Sigma$-labeled tree $T$.
}{%
@@ -512,19 +513,23 @@ The question answering problem we will solve is:
the path from $x$ to $y$ belong to $L$?
}
-We begin with a similar construction as in the word case, i.e. we replace each
-vertex of the tree with a copy of the states of $A$. Our new graph has vertex
-set $V(T) \times Q$, and we can refer to a specific vertex as $x.q$ for $x \in
-V(T)$, $q \in Q$.
+Let $n$ be the size of $V(T)$. We begin with a similar construction as in the
+word case, i.e. we create a graph of $n+1$ copies of $A$'s states $Q$. The
+vertex set of the graph will be $(V(T) \cup \{ \mathsf{root} \}) \times Q \}$
+and we can refer to a specific vertex as $x.q$ for $x \in V(T) \cup \{
+\mathsf{root} \}, q \in Q$.
Again, each letter $a \in \Sigma$ defines a function $a : Q \to Q$, and these
functions induce edges in our ``fattened'' tree: consider a vertex $x$ of $T$,
-labeled with letter $a$ and having children $y$ and $z$. If $A$ in state $q$,
-reading letter $a$ goes to $q'$, then there will be edges from $x.q$ to $y.q'$
-and $z.q'$.
+labeled with letter $a$ and let $y$ be its parent (or the new vertex
+$\mathsf{root}$ when $x$ is the root of $T$). If $A$ in state $q$, reading
+letter $a$ goes to $q'$, then there will be an edge from $y.q$ to $x.q'$. In the
+word case we had placed copies of $Q$ around each letter and replaced the
+letters with edges implied by $A$'s transition function, here we performed an
+analogous transformation on a tree shaped input.
We also color all vertices with colors $1, \ldots, |Q|$ in this graph,
-analogously to how we did so in the word case: begin by coloring an arbitrary
+analogously to how we did in the word case: begin by coloring an arbitrary
vertex in the root with $1$, then follow edges downwards, coloring all visited
vertices with $1$. Then begin the same process with the next color, restarting
with a different vertex in a given copy of $Q$ if we run into an already colored
@@ -542,8 +547,8 @@ Now let's consider how we could answer the question ``is the branch infix from
$x$ down to $y$ in $L$?''. Here we can't proceed exactly as in the word case. We
don't have a $\breaktable$ table since a vertex in an internal node can have
arbitrarily many points below it where its color is broken by a lower one.
-Somehow we need to be able to find such a color break point that is ``in the
-direction of $y$ from $x$''.
+Ideally we would like to be able to find such a color break point that is ``in
+the direction of $y$ from $x$''.
In the next section we formulate this question formally and solve it as a
subproblem.
@@ -640,15 +645,16 @@ check).
\section{Answering branch infix regular questions}
With the above problem solved, we are ready to finish our algorithm for branch
-infix regular questions. Recall that we have replaced the input tree $T$'s
-vertices with copies of $Q$, then connected and colored them in an analogous way
-as in the word case from Section \ref{wordinfix}. We additionally remember in
-each vertex $v$ a unique identifier of the connected component of vertices of
-the same color that it is in (the identifiers can be the root vertices of each
-such component), call this data $\componenttable[v]$.
+infix regular questions. Recall that we have created a graph of copies of $Q$
+shaped similar to $T$ where edges between vertices correspond to transitions of
+$A$ along corresponding vertices of $T$. During preprocessing we additionally
+need to remember in each vertex $x.q$ a unique identifier of the connected
+component of vertices of the same color that it is in (the identifiers can be
+the root vertices of each such component), call this data
+$\componenttable[x.q]$.
For each color $c$, we will preprocess the above graph for answering questions
-of the form ``if we start in a vertex $v$ colored with $c$, which is the first
+of the form ``if we start in a vertex $z.q$ colored with $c$, which is the first
copy of $Q$ on the path towards the copy of $Q$ that contains $y$?''. To achieve
this, we create a copy of $T$ in which we mark the vertices where an edge from a
$c$-colored vertex points to a vertex with a lower color. This tree we
@@ -656,16 +662,17 @@ preprocess for highest marked descendant on path questions, solved in the
previous section.
Now question answering can proceed similar to the word case. Given a question
-``does the word on vertices from $x$ down to $y$ belong to $L$?'' look at the
-vertex $x.q_0$ and consider its color $c$. In the copy of $Q$ corresponding to
+``does the word on vertices from $x$ down to $y$ belong to $L$?'', let $x'$ be
+$x$'s parent in $T$ (or $\mathsf{root}$ if $x$ was $T$'s root), look at the
+vertex $x'.q_0$ and consider its color $c$. In the copy of $Q$ corresponding to
$y$ consider the vertex $y.q$, the unique vertex here of color $c$. If it is in
-the same connected component as $x.q_0$ (which we check by comparing
-$\componenttable[x.q_0]$ with $\componenttable[y.q]$), we can immediately answer
-whether or not $A$ will accept the word on this path based on whether this state
-is accepting. Indeed, if that is the case then we have not changed connected
-components, thus there is a single-color path from $x.q_0$ down to $y.q$ and $q$
-is indeed the state $A$ would have ended in after running on the word given by
-the labels from $x$ down to $y$.
+the same connected component as $x'.q_0$ (which we check by comparing
+$\componenttable[x'.q_0]$ with $\componenttable[y.q]$), we can immediately
+answer whether or not $A$ will accept the word on this path based on whether
+this state is accepting. Indeed, if that is the case then we have not changed
+connected components, thus there is a single-color path from $x'.q_0$ down to
+$y.q$ and $q$ is indeed the state $A$ would have ended in after running on the
+word given by the labels from $x$ down to $y$.
Otherwise, we need to jump down to a lower copy of $Q$. We can find the first
such copy where the color on our path changes from $c$ to something lower by
@@ -676,7 +683,7 @@ color $c'$ on the path towards $y$. From here we continue as at the beginning,
checking $\componenttable$ to see if we can answer immediately, or taking
another jump down. Handling each jump takes constant time, and the number of
jumps is bounded by $|Q|$ since we can only jump to a lower color. Thus we can
-answer branch infix regular questions in constant time.
+answer branch infix regular questions in time constant with respect to $|T|$.
\chapter{MSO Query Answering on Trees}\label{mso-query-answering}