From 0acf29d29a54dcf9f927bd806c2bef128bb66cbb Mon Sep 17 00:00:00 2001 From: Marcin Chrzanowski Date: Thu, 7 Oct 2021 13:33:08 +0200 Subject: Elaborate on regular infix algorithm --- mgr.tex | 30 +++++++++++++++++++++--------- 1 file changed, 21 insertions(+), 9 deletions(-) (limited to 'mgr.tex') diff --git a/mgr.tex b/mgr.tex index 6f6a55e..cfb7058 100644 --- a/mgr.tex +++ b/mgr.tex @@ -267,26 +267,32 @@ construction as we will be generalizing its internals for the tree case in Chapter \ref{branchinfix}. We begin by replacing each letter of $w$ with the set of states of $A$, $Q$. +More precisely, we will be working with a graph whose vertex set is $\{1, +\ldots, |w|\} \times Q$. We will refer to a vertex in the $i$th copy of $Q$ +labeled with state $q$ as $i.q$. Since $A$ is deterministic, each letter $a \in \Sigma$ can be seen as a function $a : Q \to Q$, and we draw these functions as directed edges between successive -copies of $Q$. For example, if $A$ in state $q$, reading $a$, would move to -state $q'$, then, for any copy of $Q$ corresponding to an instance of $a$ in the -original word, there will be an edge from $q$ there to $q'$ in the successive -copy of $Q$. +copies of $Q$. For example, suppose the $i$th letter of $w$ is $a$. If $A$ in +state $q$, reading $a$, would move to state $q'$, then there will be an edge +from $i.q$ to $(i+1).q'$. + +Note that by determinism, each vertex has exactly one outgoing edge (except for +vertices in the last copy of $Q$ which have no outgoing edges). Now we will color the vertices of the graph we just constructed with the colors $1, 2, \ldots, |Q|$ in such a way that \begin{enumerate} \item every copy of $Q$ has one vertex of each of the $|Q|$ colors; - \item when a vertex of color $i$ has an edge to a vertex of color $j$ in a - successive copy of $Q$, then $i \geq j$. + \item when $i.q$ has color $c$ and there is an edge to $(i+1).q'$, which is + colored with color $c'$, then $c \geq c'$. \end{enumerate} The second point basically means that we will be trying to draw single-color paths, but when paths need to join, it is the higher-colored path that gets cut -off. +off (think of the colors as priorities, with lower numbers representing more +important priorities). The construction is as follows: @@ -304,9 +310,15 @@ The construction is as follows: \item Repeat steps 3.-5. for each successive color up to $|Q|$. \end{enumerate} -Additionally, for each vertex $v$, we store the index of the next copy of $Q$ in +Additionally, for each vertex $i.q$, we store the index of the next copy of $Q$ in which the path of this vertex's color is broken by a lower color. Put this -information in table \breaktable. +information in table \breaktable. For example, if $i.q$ is colored with color +$c$, and when following the deterministic path from $i.q$, all encountered +vertices are colored $c$ until $j.q'$ which is colored $c'$, then we set +$\breaktable[i.q] = j - 1$. + +We can compute \breaktable\ in linear time with a single backwards pass through +the colored graph. To handle the query ``does $w[i, j] \in L$'', we look at vertex $v$, which is the vertex of $A$'s initial state in the $i$th copy of $Q$ and note its color -- cgit v1.2.3