From 81fdd3f346ed286cbf413515e66f21fcd3c28d27 Mon Sep 17 00:00:00 2001
From: Marcin Chrzanowski <mc370754@students.mimuw.edu.pl>
Date: Tue, 7 Dec 2021 13:21:08 +0100
Subject: Formally show automaton

---
 mgr.tex | 90 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 81 insertions(+), 9 deletions(-)

(limited to 'mgr.tex')

diff --git a/mgr.tex b/mgr.tex
index c2d3405..1c1644b 100644
--- a/mgr.tex
+++ b/mgr.tex
@@ -961,15 +961,87 @@ path from the hole $w$ up to the root $v$. What we claim is that the question
 ``if $A$'s state in $w$ is $q$, is the state in $v$ going to be $q'$?'' can be
 answered with a branch infix regular query.
 
-This is the case because a finite automaton can compute $A$'s states along the
-path from $w$ to $v$ when given access to each of the node's label, state of its
-children in the run of $A$ on the original labeling of $T$, and the information
-whether this node is a left or right child of its parent. Thus the language
-$L_{q, q'}$ over alphabet $Q \times \Sigma \times Q \times \{ left, right \}$ of
-proper semgents of a run of $A$ that start in state $q$ and end in state $q'$ is
-a regular language. Our algorithm for branch infix regular queries dealt with
-top-down queries, but regular languages are reversible. So we preprocess $T$ for
-branch infix regular queries for each language $L_{q, q'}^R$.
+\subsection{Computing the root state of a subtree with a hole}
+
+Consider the subtree of $x$ with hole $y$. It can be divided into the following
+components:
+
+\begin{itemize}
+    \item $y$'s subtree
+    \item A path from $y$ up to $x$, notate it as $x_0 = y, x_1, \ldots, x_p =
+        x$.
+    \item For each $i$ from 1 to $p$, the subtree of $x_i$'s child that doesn't
+        lie on the path from $y$ to $x$. Let $x_i'$ be the child of $x_i$ that
+        doesn't lie on the $y$ to $x$ path.
+\end{itemize}
+
+With our bottom-up computation, we've already computed the new state in $y$.
+There have been no relabelings in the subtrees of the $x_i'$'s, so we can get
+their states from the precomputed run of $A$ on the original tree. Now consider
+how, using all this information, we could easily compute $x$'s new state.
+
+To compute the state in $x_1$, all we need is $x_1$'s label and the states of
+$x_0$ and $x_1'$ -- $x_1$'s children. This is all information we already have.
+So going up the path from $y$ to $x$, we can compute each $x_i$'s new state
+from its label and the states of $x_{i-1}$ and $x_i'$. The computation as just
+described would take time linear in $p$, the length of the path, but the
+computation is simple enough that it could be performed by a finite state word
+automaton, allowing us to use our result from Chapter \ref{branchinfix}. To show
+this formally, consider the alphabet $\Sigma' := Q \times \Sigma \times Q \times
+\{ \leftsymbol, \rightsymbol \}$. A word over this alphabet can be used to
+represent a subtree with a hole. A subtree with hole as described above would be
+represented by word $a_0 a_1 \ldots a_p$ where
+
+\begin{itemize}
+    \item Each $a_i$ is a quadruple from $\Sigma'$, $\langle p_i, b_i, q_i, d_i
+        \rangle$ (with $p_i, q_i \in Q$, $b_i \in \Sigma$, $d_i$ either
+        $\leftsymbol$ or $\rightsymbol$).
+    \item $b_i$ is the original label of $x_i$.
+    \item $p_i$ and $q_i$ are the states of $x_i$'s children in the precomputed
+        run of $A$ over $T$.
+    \item $d_i$ encodes the information of whether $x_i$ is the left or right
+        child of its parent.
+\end{itemize}
+
+Note that this is slightly more information than we previously mentioned: in our
+description of the computation above we only cared about the state of the child
+not on the $y$ to $x$ path. The information about the other child is redundant
+when considering just one subtree with hole and can be ignored when doing
+computation on that particular subtree with hole. But we will keep information
+about both children so that we can handle questions about any subtree with hole.
+
+Now consider the language $L_{q,q'} \subseteq \Sigma'^*$ of words $a_0 \ldots
+a_p$ such that the subtree with hole encoded by such a word, if the hole's state
+were set to $q$, would lead to $A$ arriving at state $q'$ after running up it.
+
+\begin{lemma}
+    $L_{q,q'}$ is a regular language.
+\end{lemma}
+
+Let's explicitly construct the automaton recognizing $L_{q,q'}$. Its state
+space will be $\{ \mathsf{initial} \} \cup Q \times \{ \leftsymbol, \rightsymbol
+\}$, states other than $\mathsf{initial}$ encoding, for the vertex represented
+by the previously read letter, what state we arrived in at it and whether it was
+the current vertex's left or right child. The automaton works as follows:
+
+\begin{enumerate}
+    \item In the initial state $\mathsf{initial}$, read $a_0 = \langle p_0, b_0,
+        q_0, d_0 \rangle$, set the state to $\langle q, d_0 \rangle$.
+    \item Now when reading letter $\langle p_i, b_i, q_i, d_i \rangle$ in state
+        $\langle r, d \rangle$, if $d$ if $\leftsymbol$, use $A$'s transition
+        function $\delta$ to compute the new tree state as $\delta(r, b_i, q_i)$
+        (the previously computed tree state is of our left child's, and our
+        right child's comes from the current letter). If $d$ is $\rightsymbol$,
+        the new tree state will instead be $\delta(p_i, b_i, r)$. Store this and
+        $d_i$ as the new state.
+    \item Accept if the final tree state is $q'$.
+\end{enumerate}
+
+Our algorithm for branch infix regular queries dealt with top-down queries, but
+regular languages are reversible. So we preprocess $T$ for branch infix regular
+queries for each language $L_{q, q'}^R$. Now a query about whether the word on
+the path from $x$ down to $y$ belongs to $L_{q, q'}^R$ is the same as asking if
+the word from $y$ up to $x$ belongs to $L{q,q'}$.
 
 Now to compute $v$'s state after the relabeling of $T$, given we've already
 computed $w$'s new state $q$, take $v'$ -- $v$'s child in the direction of $w$,
-- 
cgit v1.2.3