m-chrzan.xyz
aboutsummaryrefslogtreecommitdiff
path: root/presentation/presentation.md
blob: 9652d3d25310b6a487db4a2d77754d90b14c4b3d (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
% MSO Query Answering on Trees
% Marcin Chrzanowski
% December 20, 2022

## Question Answering

* Fix a function $f : \mathcal{S} \times \mathcal{Q} \to \mathcal{A}$
* *preprocessing*: on input $S \in \mathcal{S}$ output an indexing structure
  $S'$.
* *answering*: on input $Q \in \mathcal{Q}$, $S'$ outputs $f(S, Q)$.

## Question Answering - Time Complexity

* Let $n = |S|$, $m = |Q|$.
* Preprocessing algorithm works in time $f(n)$
* Answering algorithm works in time $g(n, m)$
* Notate the full algorithm's time complexity as $\langle f(n), g(n, m)
  \rangle$.

## Example: LCA

### Least Common Ancestor problem (LCA)

**Given**: a tree $T$.

**Questions**: for two vertices $x$ and $y$, what is the lowest (furthest from the
root) vertex that is an ancestor of both $x$ and $y$?

\

Classic Harel and Tarjan result: this can be solved in $\langle O(n), O(1)
\rangle$.

## MSO Query Answering on Trees

**Fixed**: an MSO formula $\varphi(\vec{X})$ over trees with $k$ free
second-order variables.

**Given**: a tree T.

**Questions**: is a given $k$-tuple of subsets of $T$'s vertices $\vec{W}$, a
satisfying assignment to $\vec{X}$? I.e., does $T \models \varphi(\vec{W})$.

Note: first-order variables are supported by restricting input sets to
singletons.

# Prior Work

## Reduce to Model Checking

* Courcelle: MSO model checking over structures of bounded treewidth is
  $O_{\varphi}(n)$.
    * MSO query answering in $\langle O(1), O_{\varphi}(n) \rangle$.
* Amarilli et al.: $O_{\varphi}(n)$ preprocessing, $O_{\varphi}(\log{n})$
  relabeling.
    * MSO query answering in $\langle O_{\varphi}(n), O_{\varphi}(m \log{n})
      \rangle$.

## Kazana

* In his PhD thesis, solved MSO query answering (there called query *testing*)
  in $\langle O_{\varphi}(n), O_{\varphi}(1) \rangle$.
* But limited to formulae whose free variables are all first-order.
* Uses Colcombet's factorization forests.

# Our solution

##

* I show an $\langle O_{\varphi}(n), O_{\varphi}(m \log{m}) \rangle$ solution to
  MSO query answering.
* Answering is $O_{\varphi}(1)$ when restricting to only first-order free
  variables, matching Kazana's result.
* Should be understandable by a CS student who has taken undergraduate
  algorithmics and automata theory courses.

## Reductions

* Reduce from MSO to tree automaton (nonelementary wrt to $\varphi$, linear wrt
  to tree).
* Transform to a binary tree.

## Relabel Regular Questions on Trees

**Fixed**: a deterministic bottom-up tree automaton $A$ over $\Sigma$.

**Given**: a tree T labeled with $\Sigma$.

**Questions**: given $m$ relabelings $v_1 \mapsto a_1, \ldots, v_m \mapsto a_m$,
for $v_i \in V(T)$ and $a_i \in \Sigma$, what state does $A$ arrive at in the
root of $T'$, where $T'$ is $T$ with each $v_i$'s label modified to the
corresponding $a_i$?

## Answering Questions

Let $W$ be the set of relabeled vertices, $m := |W|$.

We'll partition the tree such that:

* There will be $O(m)$ rooted parts.
* Each element of $W$ will be the root of some part (there may be other parts
  not rooted in an element of $W$).
* Working bottom-up, we can compute $A$'s state in the root of each part in
  $O(1)$.

Note: computing the partition takes time $O(m \log{m})$ as it uses a sort.

## Types of Parts

* Subtree.
* Singleton.
* Subtree with a hole.

## Types of Parts

![](figures/partition.pdf)

## Computing Roots of Parts - Subtree

* During preprocessing, we precomputed a run of $A$ over $T$.
* In a subtree part, no vertices other than the root were relabeled.
* So we can apply $A$'s transition function to the precomputed states of the
  root's children, and the root's new label.

## Computing Roots of Parts - Singleton

* Both of the singleton's children are roots of parts.
* Since we're working bottom-up, we've already computed the states in the roots
  of those.
* Apply $A$'s transition function to those and the singleton's new label.

## Computing Roots of Parts - Subtree With a Hole

* Nontrivial case.
* Idea: can be computed by a DFA walking up from hole to root.
* Chapter 3: Branch Infix Regular Questions
    * Show how to solve this in $\langle O_A(n), O_A(1) \rangle$.
    * Generalization of a clever algorithm on words.

## Computing Roots of Parts - Subtree With a Hole

![Information needed for subtree of $v$ with hole $w$](figures/subtree-with-hole.svg){height=80%}

# Branch Infix Regular Questions

## Branch Infix Regular Questions

**Fixed**: regular language $L$ over alphabet $\Sigma$, given by DFA $A$.

**Given**: a tree T labeled with $\Sigma$.

**Questions**: given a vertex $x$ and its descendant $y$, does the word given by
labels on the path from $x$ to $y$ belong to $L$?

## The Word Case

![](figures/word.svg)

## Generalizing to Trees


![](figures/tree.pdf){width=40%} ![](figures/tree-graph.pdf){width=40%}

## Jumping Down in a Tree

* Need to be able to decide which node to jump down to when color path breaks.
* For each color, mark nodes where the color breaks.
* We can compute the highest marked descendant on a path between two nodes in
  $\langle O(n), O(1) \rangle$
    * Method inspired by RMQ algorithm by Bender and Farach-Colton.

## Highest Marked Descendant on Path

* `pre`: pre-order numbers of $T$'s vertices, arranged in post-order.
* `index[`$v$`]`: $v$'s index in `pre`.
* For $x$ and its descendant $y$, consider the range
    `pre[index[`$y$`], index[`$x$`] - 1]`:
    * All the values correspond to descendants of $x$.
    * Values smaller than `pre[index[`$y$`]]` correspond to ancestors of $y$.
* For unmarked nodes, set their value in `pre` to $\infty$.
* Now a range minimum query over the above range gives us the answer.

## Highest Marked Descendant on Path

![`pre` for example tree](figures/tree-marked.pdf){height=70%}