From 57d83c228f31b0fbe751575ff0ff0a435f27d17b Mon Sep 17 00:00:00 2001 From: Marcin Chrzanowski Date: Sat, 11 Dec 2021 16:11:38 +0100 Subject: Add missing root vertex in tree case --- mgr.tex | 63 +++++++++++++++++++++++++++++++++++---------------------------- 1 file changed, 35 insertions(+), 28 deletions(-) (limited to 'mgr.tex') diff --git a/mgr.tex b/mgr.tex index ee1ce7b..ce27d02 100644 --- a/mgr.tex +++ b/mgr.tex @@ -504,7 +504,8 @@ its own. The question answering problem we will solve is: \queryproblem[% - regular language $L$ over alphabet $\Sigma$. + regular language $L$ over alphabet $\Sigma$, given by deterministic + automaton $A$. ]{Branch Infix Regular Questions}{% a $\Sigma$-labeled tree $T$. }{% @@ -512,19 +513,23 @@ The question answering problem we will solve is: the path from $x$ to $y$ belong to $L$? } -We begin with a similar construction as in the word case, i.e. we replace each -vertex of the tree with a copy of the states of $A$. Our new graph has vertex -set $V(T) \times Q$, and we can refer to a specific vertex as $x.q$ for $x \in -V(T)$, $q \in Q$. +Let $n$ be the size of $V(T)$. We begin with a similar construction as in the +word case, i.e. we create a graph of $n+1$ copies of $A$'s states $Q$. The +vertex set of the graph will be $(V(T) \cup \{ \mathsf{root} \}) \times Q \}$ +and we can refer to a specific vertex as $x.q$ for $x \in V(T) \cup \{ +\mathsf{root} \}, q \in Q$. Again, each letter $a \in \Sigma$ defines a function $a : Q \to Q$, and these functions induce edges in our ``fattened'' tree: consider a vertex $x$ of $T$, -labeled with letter $a$ and having children $y$ and $z$. If $A$ in state $q$, -reading letter $a$ goes to $q'$, then there will be edges from $x.q$ to $y.q'$ -and $z.q'$. +labeled with letter $a$ and let $y$ be its parent (or the new vertex +$\mathsf{root}$ when $x$ is the root of $T$). If $A$ in state $q$, reading +letter $a$ goes to $q'$, then there will be an edge from $y.q$ to $x.q'$. In the +word case we had placed copies of $Q$ around each letter and replaced the +letters with edges implied by $A$'s transition function, here we performed an +analogous transformation on a tree shaped input. We also color all vertices with colors $1, \ldots, |Q|$ in this graph, -analogously to how we did so in the word case: begin by coloring an arbitrary +analogously to how we did in the word case: begin by coloring an arbitrary vertex in the root with $1$, then follow edges downwards, coloring all visited vertices with $1$. Then begin the same process with the next color, restarting with a different vertex in a given copy of $Q$ if we run into an already colored @@ -542,8 +547,8 @@ Now let's consider how we could answer the question ``is the branch infix from $x$ down to $y$ in $L$?''. Here we can't proceed exactly as in the word case. We don't have a $\breaktable$ table since a vertex in an internal node can have arbitrarily many points below it where its color is broken by a lower one. -Somehow we need to be able to find such a color break point that is ``in the -direction of $y$ from $x$''. +Ideally we would like to be able to find such a color break point that is ``in +the direction of $y$ from $x$''. In the next section we formulate this question formally and solve it as a subproblem. @@ -640,15 +645,16 @@ check). \section{Answering branch infix regular questions} With the above problem solved, we are ready to finish our algorithm for branch -infix regular questions. Recall that we have replaced the input tree $T$'s -vertices with copies of $Q$, then connected and colored them in an analogous way -as in the word case from Section \ref{wordinfix}. We additionally remember in -each vertex $v$ a unique identifier of the connected component of vertices of -the same color that it is in (the identifiers can be the root vertices of each -such component), call this data $\componenttable[v]$. +infix regular questions. Recall that we have created a graph of copies of $Q$ +shaped similar to $T$ where edges between vertices correspond to transitions of +$A$ along corresponding vertices of $T$. During preprocessing we additionally +need to remember in each vertex $x.q$ a unique identifier of the connected +component of vertices of the same color that it is in (the identifiers can be +the root vertices of each such component), call this data +$\componenttable[x.q]$. For each color $c$, we will preprocess the above graph for answering questions -of the form ``if we start in a vertex $v$ colored with $c$, which is the first +of the form ``if we start in a vertex $z.q$ colored with $c$, which is the first copy of $Q$ on the path towards the copy of $Q$ that contains $y$?''. To achieve this, we create a copy of $T$ in which we mark the vertices where an edge from a $c$-colored vertex points to a vertex with a lower color. This tree we @@ -656,16 +662,17 @@ preprocess for highest marked descendant on path questions, solved in the previous section. Now question answering can proceed similar to the word case. Given a question -``does the word on vertices from $x$ down to $y$ belong to $L$?'' look at the -vertex $x.q_0$ and consider its color $c$. In the copy of $Q$ corresponding to +``does the word on vertices from $x$ down to $y$ belong to $L$?'', let $x'$ be +$x$'s parent in $T$ (or $\mathsf{root}$ if $x$ was $T$'s root), look at the +vertex $x'.q_0$ and consider its color $c$. In the copy of $Q$ corresponding to $y$ consider the vertex $y.q$, the unique vertex here of color $c$. If it is in -the same connected component as $x.q_0$ (which we check by comparing -$\componenttable[x.q_0]$ with $\componenttable[y.q]$), we can immediately answer -whether or not $A$ will accept the word on this path based on whether this state -is accepting. Indeed, if that is the case then we have not changed connected -components, thus there is a single-color path from $x.q_0$ down to $y.q$ and $q$ -is indeed the state $A$ would have ended in after running on the word given by -the labels from $x$ down to $y$. +the same connected component as $x'.q_0$ (which we check by comparing +$\componenttable[x'.q_0]$ with $\componenttable[y.q]$), we can immediately +answer whether or not $A$ will accept the word on this path based on whether +this state is accepting. Indeed, if that is the case then we have not changed +connected components, thus there is a single-color path from $x'.q_0$ down to +$y.q$ and $q$ is indeed the state $A$ would have ended in after running on the +word given by the labels from $x$ down to $y$. Otherwise, we need to jump down to a lower copy of $Q$. We can find the first such copy where the color on our path changes from $c$ to something lower by @@ -676,7 +683,7 @@ color $c'$ on the path towards $y$. From here we continue as at the beginning, checking $\componenttable$ to see if we can answer immediately, or taking another jump down. Handling each jump takes constant time, and the number of jumps is bounded by $|Q|$ since we can only jump to a lower color. Thus we can -answer branch infix regular questions in constant time. +answer branch infix regular questions in time constant with respect to $|T|$. \chapter{MSO Query Answering on Trees}\label{mso-query-answering} -- cgit v1.2.3