diff options
| -rw-r--r-- | mgr.tex | 63 | 
1 files changed, 35 insertions, 28 deletions
| @@ -504,7 +504,8 @@ its own.  The question answering problem we will solve is:  \queryproblem[% -    regular language $L$ over alphabet $\Sigma$. +    regular language $L$ over alphabet $\Sigma$, given by deterministic +    automaton $A$.  ]{Branch Infix Regular Questions}{%      a $\Sigma$-labeled tree $T$.  }{% @@ -512,19 +513,23 @@ The question answering problem we will solve is:      the path from $x$ to $y$ belong to $L$?  } -We begin with a similar construction as in the word case, i.e. we replace each -vertex of the tree with a copy of the states of $A$. Our new graph has vertex -set $V(T) \times Q$, and we can refer to a specific vertex as $x.q$ for $x \in -V(T)$, $q \in Q$. +Let $n$ be the size of $V(T)$. We begin with a similar construction as in the +word case, i.e. we create a graph of $n+1$ copies of $A$'s states $Q$. The +vertex set of the graph will be $(V(T) \cup \{ \mathsf{root} \}) \times Q \}$ +and we can refer to a specific vertex as $x.q$ for $x \in V(T) \cup \{ +\mathsf{root} \}, q \in Q$.  Again, each letter $a \in \Sigma$ defines a function $a : Q \to Q$, and these  functions induce edges in our ``fattened'' tree: consider a vertex $x$ of $T$, -labeled with letter $a$ and having children $y$ and $z$. If $A$ in state $q$, -reading letter $a$ goes to $q'$, then there will be edges from $x.q$ to $y.q'$ -and $z.q'$. +labeled with letter $a$ and let $y$ be its parent (or the new vertex +$\mathsf{root}$ when $x$ is the root of $T$). If $A$ in state $q$, reading +letter $a$ goes to $q'$, then there will be an edge from $y.q$ to $x.q'$. In the +word case we had placed copies of $Q$ around each letter and replaced the +letters with edges implied by $A$'s transition function, here we performed an +analogous transformation on a tree shaped input.  We also color all vertices with colors $1, \ldots, |Q|$ in this graph, -analogously to how we did so in the word case: begin by coloring an arbitrary +analogously to how we did in the word case: begin by coloring an arbitrary  vertex in the root with $1$, then follow edges downwards, coloring all visited  vertices with $1$. Then begin the same process with the next color, restarting  with a different vertex in a given copy of $Q$ if we run into an already colored @@ -542,8 +547,8 @@ Now let's consider how we could answer the question ``is the branch infix from  $x$ down to $y$ in $L$?''. Here we can't proceed exactly as in the word case. We  don't have a $\breaktable$ table since a vertex in an internal node can have  arbitrarily many points below it where its color is broken by a lower one. -Somehow we need to be able to find such a color break point that is ``in the -direction of $y$ from $x$''. +Ideally we would like to be able to find such a color break point that is ``in +the direction of $y$ from $x$''.  In the next section we formulate this question formally and solve it as a  subproblem. @@ -640,15 +645,16 @@ check).  \section{Answering branch infix regular questions}  With the above problem solved, we are ready to finish our algorithm for branch -infix regular questions. Recall that we have replaced the input tree $T$'s -vertices with copies of $Q$, then connected and colored them in an analogous way -as in the word case from Section \ref{wordinfix}. We additionally remember in -each vertex $v$ a unique identifier of the connected component of vertices of -the same color that it is in (the identifiers can be the root vertices of each -such component), call this data $\componenttable[v]$. +infix regular questions. Recall that we have created a graph of copies of $Q$ +shaped similar to $T$ where edges between vertices correspond to transitions of +$A$ along corresponding vertices of $T$. During preprocessing we additionally +need to remember in each vertex $x.q$ a unique identifier of the connected +component of vertices of the same color that it is in (the identifiers can be +the root vertices of each such component), call this data +$\componenttable[x.q]$.  For each color $c$, we will preprocess the above graph for answering questions -of the form ``if we start in a vertex $v$ colored with $c$, which is the first +of the form ``if we start in a vertex $z.q$ colored with $c$, which is the first  copy of $Q$ on the path towards the copy of $Q$ that contains $y$?''. To achieve  this, we create a copy of $T$ in which we mark the vertices where an edge from a  $c$-colored vertex points to a vertex with a lower color. This tree we @@ -656,16 +662,17 @@ preprocess for highest marked descendant on path questions, solved in the  previous section.  Now question answering can proceed similar to the word case. Given a question -``does the word on vertices from $x$ down to $y$ belong to $L$?'' look at the -vertex $x.q_0$ and consider its color $c$. In the copy of $Q$ corresponding to +``does the word on vertices from $x$ down to $y$ belong to $L$?'', let $x'$ be +$x$'s parent in $T$ (or $\mathsf{root}$ if $x$ was $T$'s root), look at the +vertex $x'.q_0$ and consider its color $c$. In the copy of $Q$ corresponding to  $y$ consider the vertex $y.q$, the unique vertex here of color $c$. If it is in -the same connected component as $x.q_0$ (which we check by comparing -$\componenttable[x.q_0]$ with $\componenttable[y.q]$), we can immediately answer -whether or not $A$ will accept the word on this path based on whether this state -is accepting. Indeed, if that is the case then we have not changed connected -components, thus there is a single-color path from $x.q_0$ down to $y.q$ and $q$ -is indeed the state $A$ would have ended in after running on the word given by -the labels from $x$ down to $y$. +the same connected component as $x'.q_0$ (which we check by comparing +$\componenttable[x'.q_0]$ with $\componenttable[y.q]$), we can immediately +answer whether or not $A$ will accept the word on this path based on whether +this state is accepting. Indeed, if that is the case then we have not changed +connected components, thus there is a single-color path from $x'.q_0$ down to +$y.q$ and $q$ is indeed the state $A$ would have ended in after running on the +word given by the labels from $x$ down to $y$.  Otherwise, we need to jump down to a lower copy of $Q$. We can find the first  such copy where the color on our path changes from $c$ to something lower by @@ -676,7 +683,7 @@ color $c'$ on the path towards $y$. From here we continue as at the beginning,  checking $\componenttable$ to see if we can answer immediately, or taking  another jump down. Handling each jump takes constant time, and the number of  jumps is bounded by $|Q|$ since we can only jump to a lower color. Thus we can -answer branch infix regular questions in constant time. +answer branch infix regular questions in time constant with respect to $|T|$.  \chapter{MSO Query Answering on Trees}\label{mso-query-answering} |