m-chrzan.xyz
aboutsummaryrefslogtreecommitdiff
path: root/presentation
diff options
context:
space:
mode:
Diffstat (limited to 'presentation')
-rw-r--r--presentation/presentation.md198
1 files changed, 198 insertions, 0 deletions
diff --git a/presentation/presentation.md b/presentation/presentation.md
new file mode 100644
index 0000000..ec491e9
--- /dev/null
+++ b/presentation/presentation.md
@@ -0,0 +1,198 @@
+% MSO Query Answering on Trees
+% Marcin Chrzanowski
+% December 20, 2022
+
+## Question Answering
+
+* Fix a function $f : \mathcal{S} \times \mathcal{Q} \to \mathcal{A}$
+* A *preprocessing* algorithm that on input $S \in \mathcal{S}$ outputs an
+ indexing structure $S'$ (doesn't need to be in $\mathcal{S}$).
+* An *answering* algorithm that on input $Q \in \mathcal{Q}$, together with
+ $S'$, outputs $f(S, Q)$.
+
+## Question Answering - Time Complexity
+
+* Let $n = |S|$, $m = |Q|$.
+* If the preprocessing algorithm works in time $f(n)$ and the answering
+ algorithm works in time $g(n, m)$, we notate the full algorithm's time
+ complexity as $\langle f(n), g(n, m) \rangle$.
+
+## Example: LCA
+
+### Least Common Ancestor problem (LCA)
+
+**Given**: a tree $T$.
+
+**Questions**: for two vertices $x$ and $y$, what is the lowest (furthest from the
+root) vertex that is an ancestor of both $x$ and $y$?
+
+\
+
+Classic Harel and Tarjan result: this can be solved in $\langle O(n), O(1)
+\rangle$.
+
+## MSO Query Answering on Trees
+
+**Fixed**: an MSO formula $\varphi(\vec{X})$ over trees with $k$ free
+second-order variables.
+
+**Given**: a tree T.
+
+**Questions**: is a given $k$-tuple of subsets of $T$'s vertices $\vec{W}$, a
+satisfying assignment to $\vec{X}$? I.e., does $T \models \varphi(\vec{W})$.
+
+Note: first-order variables are supported by restricting input sets to
+singletons.
+
+# Prior Work
+
+## Naïve Solutions - Model Checking
+
+* Introduce unary relations $U_i$.
+* For question $\vec{W}$, label elements of $W_i$ with $U_i$.
+* If we can answer MSO model checking on trees in time $f(n)$, we can answer
+ questions in $O(f(n))$.
+
+## Naïve Solution - Courcelle
+
+* Courcelle: MSO model checking over structures of bounded treewidth is
+ $O_{\varphi}(n)$.
+* So we can solve MSO query answering in $\langle O(1), O_{\varphi}(n) \rangle$.
+
+## Naïve Solution - Amarilli et al.
+
+* Amarilli et al.: algorithm that preprocess a tree in time $O_{\varphi}(n)$,
+ then allows relabeling in time $O_{\varphi}(\log{n})$, answers model checking
+ in $O(1)$ for the updated tree.
+* Preprocess tree according to Amarilli.
+* For a question $\vec{W}$, run an update for each vertex in $\vec{W}$, get the
+ answer.
+* So an $\langle O_{\varphi}(n), O_{\varphi}(m \log{n}) \rangle$ solution.
+
+## Kazana
+
+* In his PhD thesis, solved MSO query answering (there called query *testing*)
+ in $\langle O_{\varphi}(n), O_{\varphi}(1) \rangle$.
+* But limited to formulae whose free variables are all first-order.
+* Uses Colcombet's factorization forests.
+
+# Our solution
+
+##
+
+* I show an $\langle O_{\varphi}(n), O_{\varphi}(m \log{m}) \rangle$ solution to
+ MSO query answering.
+* Answering is $O_{\varphi}(1)$ when restricting to only first-order free
+ variables, matching Kazana's result.
+* Should be understandable by a CS student who has taken undergraduate
+ algorithmics and automata theory courses.
+
+## Reductions
+
+* Reduce from MSO to tree automaton (nonelementary wrt to $\varphi$, linear wrt
+ to tree).
+* Transform to a binary tree.
+
+## Relabel Regular Questions on Trees
+
+**Fixed**: a deterministic bottom-up tree automaton $A$ over $\Sigma$.
+
+**Given**: a tree T labeled with $\Sigma$.
+
+**Questions**: given $m$ relabelings $v_1 \mapsto a_1, \ldots, v_m \mapsto a_m$,
+for $v_i \in V(T)$ and $a_i \in \Sigma$, what state does $A$ arrive at in the
+root of $T'$, where $T'$ is $T$ with each $v_i$'s label modified to the
+corresponding $a_i$?
+
+## Answering Questions
+
+Let $W$ be the set of relabeled vertices, $m := |W|$.
+
+We'll partition the tree such that:
+
+* There will be $O(m)$ rooted parts.
+* Each element of $W$ will be the root of some part (there may be other parts
+ not rooted in an element of $W$).
+* Working bottom-up, we can compute $A$'s state in the root of each part in
+ $O(1)$.
+
+Note: computing the partition takes time $O(m \log{m})$ as it uses a sort.
+
+## Types of Parts
+
+* Subtree.
+* Singleton.
+* Subtree with a hole.
+
+## Types of Parts
+
+![](figures/partition.pdf)
+
+## Computing Roots of Parts - Subtree
+
+* During preprocessing, we precomputed a run of $A$ over $T$.
+* In a subtree part, no vertices other than the root were relabeled.
+* So we can apply $A$'s transition function to the precomputed states of the
+ root's children, and the root's new label.
+
+## Computing Roots of Parts - Singleton
+
+* Both of the singleton's children are roots of parts.
+* Since we're working bottom-up, we've already computed the states in the roots
+ of those.
+* Apply $A$'s transition function to those and the singleton's new label.
+
+## Computing Roots of Parts - Subtree With a Hole
+
+* Nontrivial case.
+* Idea: can be computed by a DFA walking up from hole to root.
+* Chapter 3: Branch Infix Regular Questions
+ * Show how to solve this in $\langle O_A(n), O_A(1) \rangle$.
+ * Generalization of a clever algorithm on words.
+
+## Computing Roots of Parts - Subtree With a Hole
+
+![Information needed for subtree of $v$ with hole $w$](figures/subtree-with-hole.svg){height=80%}
+
+# Branch Infix Regular Questions
+
+## Branch Infix Regular Questions
+
+**Fixed**: regular language $L$ over alphabet $\Sigma$, given by DFA $A$.
+
+**Given**: a tree T labeled with $\Sigma$.
+
+**Questions**: given a vertex $x$ and its descendant $y$, does the word given by
+labels on the path from $x$ to $y$ belong to $L$?
+
+## The Word Case
+
+![](figures/word.svg)
+
+## Generalizing to Trees
+
+
+![](figures/tree.pdf){width=40%} ![](figures/tree-graph.pdf){width=40%}
+
+## Jumping Down in a Tree
+
+* Need to be able to decide which node to jump down to when color path breaks.
+* For each color, mark nodes where the color breaks.
+* We can compute the highest marked descendant on a path between two nodes in
+ $\langle O(n), O(1) \rangle$
+ * Method inspired by RMQ algorithm by Bender and Farach-Colton.
+
+## Highest Marked Descendant on Path
+
+* `pre`: pre-order numbers of $T$'s vertices, arranged in post-order.
+* `index[`$v$`]`: $v$'s index in `pre`.
+* For $x$ and its descendant $y$, consider the range
+ `pre[index[`$y$`], index[`$x$`] - 1]`:
+ * All the values correspond to descendants of $x$.
+ * Values smaller than `pre[index[`$y$`]]` correspond to ancestors of $y$.
+* For unmarked nodes, set their value in `pre` to $\infty$.
+* Now a range minimum query over the above range gives us the answer.
+
+## Highest Marked Descendant on Path
+
+![`pre` for example tree](figures/tree-marked.pdf){height=70%}