Determinization Construction: NSA ↬ DNA

ω-Automata


Definition 1

A Non Deterministic Street Automaton (NSA) is a 5-tuple: A = <Q, Σ, δA, q0, CA>, consisting of:

  • a finite set of states (Q)
  • a finite set of input symbols called the alphabet (Σ)
  • a transition function (δA : Q × Σ → 2Q)
  • an initial or start state (q0Q)
  • a set of pairs CA = {(Ri, Gi)}i ∈ {0, ..., l - 1}, for Ri, GiQ

Let w = a1a2 ... be an infinite string over the alphabet Σ. The automaton A accepts the string w if there exists a run π,π12,... ∈ Qω with the following conditions:

  • π0 = q0
  • πm+1 = δAm, am) for all m ∈ {0, 1, 2, ...}
  • i ∈ {0, ..., l - 1} if πmRi infinetly often, then πmGi infinetly often.

In words, the first condition says that the machine starts in the start state q0. The second condition says that given each character of string w, the machine will transition from state to state according to the transition function δ. The last condition says that the machine accepts w if for every Ri in the infinity set of the run, there must be a nonempty intersection of the infinity set with Gi for all i. Otherwise, it is said that the automaton rejects the string. The set of strings that A accepts is the language recognized by A and this language is denoted by L(A).


Definition 2

A Deterministic Numbered Automaton (DNA) is a 6-tuple: D = <Q, Σ, δD, q0, N, k>, consisting of:

  • a finite set of states (Q)
  • a finite set of input symbols called the alphabet (Σ)
  • a transition function (δD : Q × Σ → Q)
  • an initial or start state (q0Q)
  • a numbering function (N : Q × Σ → [k])

Let w = a1a2 ... be an infinite string over the alphabet Σ. The automaton A accepts the string w if there exists a run π,π12,... ∈ Qω with the following conditions:

  • π0 = q0
  • πm+1 = δAm, am) for all m ∈ {0, 1, 2, ...}
  • mini[Nm, am) = i for infinetle many m's] % 2 = 1

In words, the first condition says that the machine starts in the start state q0. The second condition says that given each character of string w, the machine will transition from state to state according to the transition function δ. The last condition says that the machine accepts w if the minimal numbering of the transitions the run goes through infinitely is odd. Otherwise, it is said that the automaton rejects the string. The set of strings that D accepts is the language recognized by D and this language is denoted by L(D).
Observe that the number associated with the transition leading to πm is completely ignored by the transition function (that is, numbering is memoryless).
Note: this is a special case of the parity-acceptance condition for ω-automata in terms of expression. That is, one can associate the number assigned to the transition, to the resulting state instead, multiplying the number of states by k, and resulting in a parity automaton. This however could be more efficient in terms of the size of the resulting automaton as well as the complexity of deciding non-emptiness, which is only logarithmic in k.




Determinization Construction

Given an n-states, l-acceptance-pairs Street automaton NSA A, the determinized NBA automaton D has states that are labeled and annotated trees: the labels are subsets of A's states, while every vertex of the tree is annotated by either an index i (one of the pairs {0, ..., l − 1} in the acceptance condition CA) or by one of the special symbols ✠ or Φ.
Intuitively, the annotation i corresponds to "waiting for Gi to be visited". The special symbol ✠ corresponds to runs that may never visit Gi - these still may be accepting, if they refrain from visiting Ri. Runs whose only chance of being accepting is if they avoid visits to all Ri-s, have their states occur in vertices annotated by Φ.
The one special child annotated by ✠ corresponds to those runs that, since the vertex was last green, have not visited Gi and therefore have to be considered for being accepting by not visiting Ri more than finitely many times.
The rest of the children correspond (much like in the NBA to DNA construction) to runs for which we have decided to move on, deserting those states in the child annotated by ✠ to their misery. These extended trees satisfy further properties as described next.

Denote by n' =def n(l + 1).

The trees utilized for determinizing Streett automata are in Tn', namely, have ≤ n' vertices.

Subsets'-Labeling, Index-Annotation

Assume without loss of generalit A's states are {1, ..., n}, and the starting state is 1. Let us now present the subsets'-labeled, index-annotated trees, in which vertices are labeled by subsets of the states, and, in addition, every vertex is annotated by an index. In other words, each vertex is associated with a pair comprised of a subset of A's states and an index: more formally, an item in 2{1,...,n} × {0, ..., l − 1, ✠, Φ}.
The subsets' labeling of the tree, combined with the tree itself must satisfy some properties which will be specified momentarily. The actual representation of the labeling consequently becomes quite efficient, as will be specified below.


Trees

Let us represent an n'-vertex tree by an n'-dimensional vector. Each entry in the vector corresponds to a (not necessarily active) vertex in the tee, where the entry points to its parent:

An n'-vertex tree T is a function T:{1,2,...n' - 1} → {0, ..., n' - 2} such that T[u] < u for all 1 ≤ un - 1. T[u] is reffered to as u's parent. parent's index is always smaller than its children.
This function is interpreted as a tree in which 0 is the root, and each vertex represented at entry u, has T[u] pointing to its parent entry.

Let us denote by Tn' the set of all such trees (vectors). Clearly, since T[u] can take < u values, the number of such tress is (n' - 1)! at most.

The following terminology will be used throughout this paper:

  • T*[u] is the set of ancestors of u, namely: T*[u] =def { Tj[u] | j > 0 }
  • u is a child of v if T[u] = v.
  • u is a senior sibling of v if T[v] = T[u] and u < v.
  • u is a descendant of v if T*[u] = v.

Definition 3

A labeled, annotated tree is a triplet comprised of:

  • a tree TTn'
  • a labeling S which assigns a subset of the states {1, ..., n} to every vertex u of T;
  • an annotation of every active vertex u by an acceptance-pair index or the special symbol:
    τ: {0, ..., n' − 1} → {0, ..., l − 1, ✠, Φ}

It is referred to as NSA-proper if it satisfies the following properties:

  • Hereditary: the label of a vertex u is a subset of the label of its parent T[u]:

    u, S[u] ∈ S[T[u]]

  • Excluding: any vertex has at most one child annotated by ✠. Moreover, a vertex's annotation is excluded from annotating all of that child's descendants:

    u s.t. τ[u] = ✠ and v s.t. uT*[v] ≠ τ[T[u]]

    Furthermore, a vertex annotated by ✠ has no child annotated by ✠.
    Lastly, a vertex is annotated by Φ exactly if all indices have been excluded by its ancestors, and it is not annotated by ✠.
  • Partition: children's labels are pairwise non-intersecting:

    u,v s.t. T[u] = T[v] : S[u] ∩ S[v] = ∅

  • Packed: there is a 1 ≤ tn' so that all vertices u > t are labeled by the empty set S[u] = ∅ and T[u] = 0 (namely, inactive vertices arbitrary point to the root as their parent), and all ut are labeled by a non-empty subset of states.

It is hence possible to specify for each state, the highest indexed vertex u in the tree in whose labeling it appears. The following definition captures the fact that if a triplet (T, S, τ) is proper, then S can be represented efficiently.


Definition 4

An n'-state–labeling of a tree TTn' is a partial function s: {0, ..., n' - 1} which maps every reachable state q to a vertex u in the tree T.


Definition 5

Excluding set: Let us denote by τ^ the set of indices forbidden by the Exclcuding rule, due to those indices annotating the parent of a vertex annotated by ✠:

τ^[u] = {τ[T[v]] | vT*[u] ∧ τ[v] = ✠};

less formally, a special child annotated by ✠ is where runs that supposedly avoid Gi (i is the annotation of the parent vertex) are considered. Its descendants must therefore avoid visiting Ri, otherwise such runs are not accepting. That special child itself does allow visits to Ri, so as to allow finitely many such visits. A run that indeed stops visiting Ri point on, can find itself in a child of that special child.


Definition 6

A labeling/annotation (s, τ) is proper (with respect to a tree T) if

t is then referred to as the actual size of the tree T.

The subset S[u] associated with a vertex u is the set of states assigned by s to any of its descendants:

S[u] =def {q | s[q] = uuT*[s[q]] }

Let us denote the set of proper n–state-labeling by Ln and annotations by τn,l. Combined, the tree, labeling and annotation specify:


Definition 7

An (n,l)-labeled-annotated tree is a triplet (T, s, τ) ∈ Tn' × Ln × τn,l where (s, τ) is proper (for T).

Determinization of a Streett-ω-automaton with n states

Given an NSA A with n states and l-sized acceptance condition, A = < {1, ..., n}, Σ, δA, 1, CA >, the states of the DNA, D, are of the form

QTn' × Ln × τn,l

and in particular are the following subset of the above:

Q =def {(T, s, τ)| T is an n'-vertex tree, s is a proper labeling, τ is a proper annotation }

The number-range k = 2n'.

Initial State

D's initial state is the tree that has only the root. The set {1} (the singleton of the initial state of A) is the label at the root so s[1] = 0, and t = 0. Namely, s is undefined for all other states and T[u] = 0 for all u. Furthermore, τ[0] = 0, so that the root is annotated by 0.

The Transition Function

Given a state (T, s, τ) ∈ Q and a letter a ∈ Σ, one has to describe the outcome state δD((T, s, τ), a) of the deterministic automaton D, plus the number N((T, s, τ), a) by which the transition is labeled. This will be carried out via a few steps and few intermediate constructs. One starts with a state (T, s, τ) and constructs δD((T, s, τ), a) = (T''', s''', τ''') andN((T, s, τ), a) for any letter a read.





Delta and Oops

For every vertex u in T, apply A's transition relation δA to obtain those reachable states that stem from S[u]. Then, immediately remove any state q that belong to Ri the forbidden set for u:

S'[u] =def qS[u] δA(q, a) \ iτ^[u] Ri

One better note here that a special child of a vertex annotated i still allows visit to Ri, as it is not forbidden. Its descendant already include i in their forbidden set.





Spawn: Special

For every vertex u with no children in T, annotated by a real index τ[u] ∈ {0, ..., l − 1}, add to the tree T' a vertex at u + 2n' with u as its parent, annotated by ✠ and labeled by the same subset of states as u:

T'[u + 2n'] =def u ; S'[u + 2n'] =def S'[u]




Specials' Child

For every vertex u in T, annotated by ✠, add to the tree T' a vertex at u + n' with u as its parent, labeled by the same subset of states as u and annotated by the first available index

τ'[u + n'] =def min[ {0, ..., l - 1} - τ^[u] - {τ[u]}]

In case τ^[u] = {0, ..., l - 1} (in other words, there is no real index by which vertex u + n' can be annotated) annotate u + n' by Φ.





Spawn: Green

For every vertex u in T, annotated by a real index τ[u] ∈ {0, ..., l − 1}, add to the tree T' a vertex at u + n' with u as its parent (namely T'[u + n'] =def u) and label the new vertex with all states of u's labe that are in Gτ[u]:

S'[u + n'] =def S[u] Gτ[u]

The annotation of such a new vertex is the next available index:

τ'[u + n'] =def min{i | (i mod l ∉ τ^[u]) } mod l

Moving on

The outcome of these steps is the triplet, comprised of the tree T' ∈ T3n'; S'[u] which assigns subsets to vertices of T'; and the annotation τ which assigns index to every vertex.

Note that this triplet does not necessarily satisfy several of the properties that need to be kept. For starters, there are too many vertices in the tree. Clearly, the hereditary property still holds. The partition condition, however, is not necessarily maintained. This is fixed by a couple of adjustments presented next.





Seniority Rules

The goal of this adjustment is to restore the partition condition. This is achieved by the following recursive process: when two siblings‘ labels share a state, only the senior of them is allowed to keep it, where the special child, annotated by ✠ is considered the least senior regardless of its position in the tree.

Formally, the new labeling S'' is defined as follows. For every vertex u in the tree for which τ'[u] ≠ ✠:

S''[u] = S'[u] \ v < u: T[v] = T[u], τ[v] ≠ ✠ S'[v]

where the states removed from u's label

v < u: T[v] = T[u], τ[u] ≠ ✠ S'[v]

are also removed from all of u's descendants in order to retain hereditary.

While, for every vertex u in the tree for which τ'[u] = ✠:

S''[u] = S'[u] \ vu: T[v] = T[u] S'[v]

again, the states removed from u's label:

vu: T[v] = T[u] S'[v]

are also removed from all of u's descendants in order to retain hereditary.

In other words, s'' assigns to every reachable state q the one vertex whose children's label do not contain it at the end of a recursive search, where one always chooses the most senior (smallest indexed) child among those not annotated by ✠, whose label contains q.


Claim 0.1

S'' has the partition property. Namely, if u, v are siblings then S''[u] ∩ S''[u] = ∅.

Observe that these steps may exhaust all of a vertex’s states, leaving it with the empty set as its label. In that case, we refer to that vertex as dead, and indeed below, it will be removed from the tree.




Going Green

For any vertex u in T'' that is not labeled by ✠, if the union of the labels of all its children that are not annotated by ✠ equals its own label

S''[u] = v: T[v] = u ∧ τ[v] ∈ { 0, ..., l - 1 } S''[v]

then eliminate all of u's descendants, resulting in T'''. In that case, annotate u by the next available index:

τ'''[u] =def min{i | (i mod l ∉ τ^[u]) } mod l

Furthermore, adjust state-labeling accordingly (namely, let s'''[q] of all states q so that s''[q] is a descendants to u) with the labeling S''' as the outcome.

Note: a vertex labeled by Φ never has a child, hence no change is applied to it in this step.

Denote by g the smallest index of a vertex u such that either:


Claim 0.2

S''' has the partition property. τ''' has the excluding property.





Packing the Tree

In the course of these steps, some of the vertices may so be eliminated. So as to amend that, assign to every vertex u, the entry u' which is its order among non eliminated vertices. Adjust pointers and labeling accordingly.

This restores all properties of the original tree: there is some t' so all vertices up to there are alive, namely, have a nonempty set as their labeling.

Since every active vertex u has at least one state qS[u], which does not occur in any of its children annotated not by ✠, and the child annotated by ✠ excludes an index, there are at most n' vertices to the tree.

Let b be the index of the first eliminated vertex. Namely, all vertices u < b have stayed intact. b is set to 2n' if no vertices were eliminated. The number assigned to the new state is calculated as follows: Let k be the minimum between g and b. The number is 2k if b is not larger than g, and 2k + 1 otherwise.

This ends the description of the transition function, and thus the construction.




Proof of Correctness


Claim 0.3

At any time m (that is, the state reached by the automaton after reading a1, ..., am) the subset S0 is exactly the reachable states in A at time m.


Soundness

Let us first prove that if a string w ∈ Σ ω is accepted by D, then w is accepted by A.

Given an accepting run of D, let the smallest number that occurs on the transition (determined by N) infinitely many times be 2b + 1. It must then be the case that, past some prefix of the word w, all first b vertices stay intact and, furthermore, the vertex b, for infinitely many times, becomes green (in the transition, where all of its children are removed) due to all the states in its label, belonging to the labels of those of its children annotated by a real index i ≠ ✠ (that is, g = b infinitely often). Let m0 be the first such time that g = b past that prefix, m1 the second, and so on.


Claim 0.4

For any vertex u, τ^[u] does not change (throughout the life of u).

Proof

For any vertex v, τ[v] change only once it becomes green, thus, all its descendants are killed. Since u is fixed in the tree, all of its ancestors have a smaller index in T, and thus are fixed as well and their annotation is fixed as they would otherwise become green and kill u. ∎


As a corollary, past m0 the set of indices τ^[b] is fixed throughout. Consequently, any state qRi for i ∈ τ^[b] is removed from S[b] throughout the run of D.

Now, between any mi and mi + 1 all states in S[b] at time mi + 1 have a run segments of A, over ami + 1, ..., ami + 1 , which stem from the label S[b] at mi and have visited in between in Gτ[b] (the annotation of b over the entire segment between mi + 1 and mi + 1 remain constant). All through these segments they have also have avoided Ri for all i ∈ τ^[b] (otherwise b would have been killed).

This implies, by Konig's lemma, that there exists a run that visits each Gi for i ∉ τ^[b] infinitely many times, while avoiding Ri for i ∈ τ^[b] beyond m0 (hence, visits in those only finitely many times).


Completeness

Now, let us prove that


Lemma 0.5

if w is accepted by A, then it is accepted by D.


Proof

Given an accepting A-run, π ∈ [n] over w ∈ Σω, let us follow πm on the sequence of states (Tm, sm, τm) of the run of D on w. That is, look at the sequence of states of D on the run over w, consider at what vertices' labels πm appears, and what that implies to the the numbers associated with the transitions by N, and in turn to whether the run is accepting.
The proof proceeds recursively, in search of a vertex that will be green infinitely many times, with no preceding even index preventing the run from accepting.
Observe first, that πmS[0] - the labeling of the root 0 induced by the state Tm, sm, τm) - for every m.
If the number associated by N to the transitions is 1 infinitely many time (namely the root is green infinitely often), D accepts w and there is nothing more to prove.
Otherwise, let m0 be the last time in which g = 0. The annotation of the root, τ[0] is fixed thereafter (for all mm0), as the annotation can change for a vertex only when it becomes green.
There are now two options for which child to follow, in search for a vertex that causes acceptance:

  • π goes Gi i.o. The run π may visit Gτ[0] past m0, in which case it will appear in one of the normal children of the root (those not annotated by ✠) - "spawning-rule: green" implies the state πm is copied to a child just born to a vertex u whenever πmGτ[u].
    It may nevertheless move to a more senior child, or have its index decrease when packing the tree but eventually, for an m larger than some m1, πm will appear forever in the label of a fixed entry in the tree - let us denote it by b.
    Since b is fixed for eternity, past m1, the first b entries in the tree stay intact henceforth, and consequently, no even number ≤ 2b is going to be associated with any transition subsequently, ergo, if 2b + 1 is infinitely often green D accepts w.
    Proceed thus to recursively follow πm on the subtree of b, from m1 forward.
  • Waiting for Godot: Alternatively, π may never get into Gτ[0] past m0.Observe that in that case, the state πm appears in the child annotated ✠ of the root, for all m > m0. Denote that vertex u.
    Observe also that, since π is accepting, Rτ[0] is visited only finitely many times.
    Let m1 denote the last time that red set is visited, more formally, where
    mm1: πmRτ[0]

    The state πm appears in a child of u - denote it v - for all m > m1, and the added restriction, for all children of u - never to visit Rτ[0]- does not affect πm, beyond m > m1.
    Now, πm may move to a more senior child of u than v; moreover, the index of the vertex it is in may decrease when packing the tree. Nevertheless, past some time m2, there is a normal child of u, vertex b, so that the first b vertices of the tree Tm remains intact, and πmS[b], for all m > m2.
    Since π never visits Rτ[0], one may proceed recursively, on the subtree of b.

D must thus accepts w. ∎