Formal Languages

Overview

Formal languages. We fix a finite alphabet $\Sigma$ (for example $\{0,1\}$). A word is a finite sequence of symbols from $\Sigma$, and a language is a set of words. The central question: how to describe and recognize a language? Two complementary tools are available. Regular expressions describe languages through formulas (union, concatenation, Kleene star). Finite automata recognize them via a computation mechanism: one reads the word letter by letter, changing state, and accepts or rejects based on the final state.

Deterministic and nondeterministic automata. A deterministic automaton (DFA) has exactly one transition per symbol from each state: the computation is entirely determined by the word. A nondeterministic automaton (NFA) allows multiple possible transitions (or none) and accepts if there exists at least one path leading to an accepting state. One can also add $\varepsilon$-transitions: spontaneous transitions that consume no symbol. Despite this extra flexibility, NFAs recognize exactly the same languages as DFAs.

Regular languages and their limits. The languages recognized by finite automata are called regular languages. Kleene's theorem shows that these are exactly those described by regular expressions. Regular languages are closed under union, concatenation, Kleene star, intersection, and complement. But they have limits: the pumping lemma provides a criterion to prove that a language is not regular. For example, $\{a^n b^n : n \geq 0\}$ escapes finite automata because it would require memorizing an arbitrary number of $a$'s.

Minimality. The Myhill-Nerode theorem characterizes regular languages via an algebraic condition (finitely many equivalence classes) and shows that every regular language has a unique minimal DFA. It can be effectively constructed.

Course

Words and languages

Definition (Alphabet, word, language). An alphabet is a finite nonempty set $\Sigma$. A word over $\Sigma$ is a finite sequence $w = a_1 a_2 \cdots a_n$ with $a_i \in \Sigma$. The integer $n$ is the length of $w$, denoted $|w|$. The empty word, denoted $\varepsilon$, is the unique word of length $0$. We write $\Sigma^*$ for the set of all words over $\Sigma$. A language over $\Sigma$ is a subset $L \subseteq \Sigma^*$.

Definition (Concatenation). The concatenation of two words $u = a_1 \cdots a_m$ and $v = b_1 \cdots b_n$ is the word $uv = a_1 \cdots a_m b_1 \cdots b_n$. This operation is associative, with identity element $\varepsilon$. The monoid $(\Sigma^*, \cdot, \varepsilon)$ is the free monoid on $\Sigma$.

Definition (Operations on languages). Let $L, L' \subseteq \Sigma^*$.

Deterministic finite automata

Definition (Deterministic finite automaton). A deterministic finite automaton (DFA) is a quintuple $\mathcal{A} = (Q, \Sigma, \delta, q_0, F)$ where: $Q$ is a finite set of states, $\Sigma$ is an alphabet, $\delta : Q \times \Sigma \to Q$ is the transition function, $q_0 \in Q$ is the initial state, $F \subseteq Q$ is the set of accepting states.

Definition (Extended transition function, recognized language). We extend $\delta$ to $Q \times \Sigma^*$ by: $\hat{\delta}(q, \varepsilon) = q$ and $\hat{\delta}(q, wa) = \delta(\hat{\delta}(q, w), a)$. A word $w$ is accepted by $\mathcal{A}$ if $\hat{\delta}(q_0, w) \in F$. The recognized language of $\mathcal{A}$ is $L(\mathcal{A}) = \{w \in \Sigma^* : \hat{\delta}(q_0, w) \in F\}$.

Definition (Regular language). A language $L \subseteq \Sigma^*$ is regular if there exists a DFA $\mathcal{A}$ such that $L = L(\mathcal{A})$.

Example. Over $\Sigma = \{0,1\}$, the language of words containing an even number of $1$'s is recognized by the DFA with two states $Q = \{q_0, q_1\}$, $F = \{q_0\}$, $\delta(q_0, 0) = q_0$, $\delta(q_0, 1) = q_1$, $\delta(q_1, 0) = q_1$, $\delta(q_1, 1) = q_0$. The current state stores the parity of the number of $1$'s read so far.

Nondeterministic finite automata

Definition (Nondeterministic finite automaton). A nondeterministic finite automaton (NFA) is a quintuple $\mathcal{A} = (Q, \Sigma, \Delta, I, F)$ where: $Q$ is a finite set of states, $\Sigma$ is an alphabet, $\Delta \subseteq Q \times \Sigma \times Q$ is the transition relation, $I \subseteq Q$ is the set of initial states, $F \subseteq Q$ is the set of accepting states. Equivalently, one can view $\Delta$ as a function $\delta : Q \times \Sigma \to \mathcal{P}(Q)$. The difference with a DFA: there can be $0$, $1$, or several transitions from a state for the same symbol, and there can be multiple initial states.

Definition (Acceptance by an NFA). We extend $\delta$ to sets of states: for $S \subseteq Q$ and $a \in \Sigma$, $\delta(S, a) = \bigcup_{q \in S} \delta(q, a)$, then to words: $\hat{\delta}(S, \varepsilon) = S$ and $\hat{\delta}(S, wa) = \delta(\hat{\delta}(S, w), a)$. A word $w$ is accepted if $\hat{\delta}(I, w) \cap F \neq \varnothing$. The recognized language is $L(\mathcal{A}) = \{w \in \Sigma^* : \hat{\delta}(I, w) \cap F \neq \varnothing\}$.

Definition (NFA with $\varepsilon$-transitions). An $\varepsilon$-NFA is an NFA where the transition relation is $\Delta \subseteq Q \times (\Sigma \cup \{\varepsilon\}) \times Q$. For $S \subseteq Q$, the $\varepsilon$-closure $\mathrm{Cl}_\varepsilon(S)$ is the smallest set containing $S$ and closed under $\varepsilon$-transitions. Acceptance is defined by taking the $\varepsilon$-closure at each step.

Theorem (DFA-NFA equivalence). For every NFA $\mathcal{N}$, there exists a DFA $\mathcal{D}$ such that $L(\mathcal{D}) = L(\mathcal{N})$. More precisely, if $\mathcal{N}$ has $n$ states, $\mathcal{D}$ has at most $2^n$ states.

Proof. Subset construction. Let $\mathcal{N} = (Q, \Sigma, \delta, I, F)$. We build the DFA $\mathcal{D} = (\mathcal{P}(Q), \Sigma, \delta', I, F')$ with $\delta'(S, a) = \bigcup_{q \in S} \delta(q, a)$ and $F' = \{S \in \mathcal{P}(Q) : S \cap F \neq \varnothing\}$. The state set of $\mathcal{D}$ is $\mathcal{P}(Q)$: each DFA state encodes the set of NFA states reachable after reading a given prefix. We show by induction on $|w|$ that $\hat{\delta}'(I, w) = \hat{\delta}(I, w)$, where $\hat{\delta}'$ denotes the extended transition of $\mathcal{D}$ (with values in $\mathcal{P}(Q)$) and $\hat{\delta}$ the transition of $\mathcal{N}$ extended to subsets of $Q$. Base case. $\hat{\delta}'(I, \varepsilon) = I = \hat{\delta}(I, \varepsilon)$. Inductive step. Let $w = ua$ with $a \in \Sigma$. Setting $S = \hat{\delta}'(I, u) = \hat{\delta}(I, u)$ (induction hypothesis), we get $\hat{\delta}'(I, ua) = \delta'(S, a) = \bigcup_{q \in S} \delta(q, a) = \hat{\delta}(S, a) = \hat{\delta}(I, ua)$. Conclusion. $w \in L(\mathcal{D})$ iff $\hat{\delta}'(I, w) \in F'$ iff $\hat{\delta}(I, w) \cap F \neq \varnothing$ iff $w \in L(\mathcal{N})$.

Corollary. $\varepsilon$-NFAs recognize exactly the regular languages.

Proof. We first transform an $\varepsilon$-NFA into an NFA (without $\varepsilon$-transitions): set $\delta'(q, a) = \mathrm{Cl}_\varepsilon(\delta(\mathrm{Cl}_\varepsilon(\{q\}), a))$ and $I' = \mathrm{Cl}_\varepsilon(I)$, then apply the subset construction.

Regular expressions

Definition (Regular expression). Regular expressions over $\Sigma$ are defined inductively: (i) $\varnothing$ is a regular expression (denotes $\varnothing$), (ii) $\varepsilon$ is a regular expression (denotes $\{\varepsilon\}$), (iii) for every $a \in \Sigma$, $a$ is a regular expression (denotes $\{a\}$), (iv) if $r$ and $s$ are regular expressions, then $(r + s)$, $(r \cdot s)$ and $(r^*)$ are also regular expressions, denoting $L(r) \cup L(s)$, $L(r) \cdot L(s)$ and $L(r)^*$ respectively.

Theorem (Kleene). A language is regular if and only if it is denoted by a regular expression.

Proof. $(\Leftarrow)$ We show that for every regular expression $r$, $L(r)$ is recognized by an $\varepsilon$-NFA, by structural induction. Base cases. $\varnothing$: one state, no accepting state. $\varepsilon$: one state, both initial and accepting. $a$: two states $q_0, q_1$, one transition $q_0 \xrightarrow{a} q_1$, $q_0$ initial, $q_1$ accepting. Union ($r + s$). Take the $\varepsilon$-NFAs $\mathcal{A}_r$ and $\mathcal{A}_s$ (with disjoint states), add a new initial state $q_0$ with $\varepsilon$-transitions to the initial states of $\mathcal{A}_r$ and $\mathcal{A}_s$. The accepting states are those of $\mathcal{A}_r$ and $\mathcal{A}_s$. Concatenation ($r \cdot s$). Add $\varepsilon$-transitions from each accepting state of $\mathcal{A}_r$ to each initial state of $\mathcal{A}_s$. The initial states are those of $\mathcal{A}_r$, the accepting states are those of $\mathcal{A}_s$. Star ($r^*$). Add a new state $q_0$, initial and accepting, with an $\varepsilon$-transition to each initial state of $\mathcal{A}_r$, and $\varepsilon$-transitions from each accepting state of $\mathcal{A}_r$ to $q_0$. In each case, the resulting $\varepsilon$-NFA recognizes the desired language, so it is regular. $(\Rightarrow)$ We use the state elimination method. Let $\mathcal{A}$ be a DFA with $n$ states. We generalize transition labels to carry regular expressions (a generalized automaton). We eliminate states one by one: when removing a state $q_k$, for every pair $(q_i, q_j)$ of remaining states, we replace the label from $q_i$ to $q_j$ by $r_{ij} + r_{ik} \cdot r_{kk}^* \cdot r_{kj}$, where $r_{ij}$, $r_{ik}$, $r_{kj}$, $r_{kk}$ are the current labels. When only the initial state $q_0$ and an accepting state $q_f$ remain, the resulting regular expression denotes the recognized language. If $F$ contains multiple states, we take the union.

Closure properties

Proposition (Closure under union). If $L_1$ and $L_2$ are regular, then $L_1 \cup L_2$ is regular.

Proof. By Kleene's theorem, $L_1$ and $L_2$ are denoted by regular expressions $r_1$ and $r_2$. Then $r_1 + r_2$ is a regular expression denoting $L_1 \cup L_2$, which is therefore regular.

Proposition (Closure under complement). If $L$ is regular, then $\Sigma^* \setminus L$ is regular.

Proof. Let $\mathcal{A} = (Q, \Sigma, \delta, q_0, F)$ be a DFA recognizing $L$. The DFA $(Q, \Sigma, \delta, q_0, Q \setminus F)$ recognizes $\Sigma^* \setminus L$.

Proposition (Closure under intersection). If $L_1$ and $L_2$ are regular, then $L_1 \cap L_2$ is regular.

Proof. By De Morgan: $L_1 \cap L_2 = \overline{\bar{L_1} \cup \bar{L_2}}$, and we use closure under complement and union. One can also directly construct the product automaton: if $\mathcal{A}_i = (Q_i, \Sigma, \delta_i, q_{0,i}, F_i)$ for $i = 1,2$, set $\mathcal{A} = (Q_1 \times Q_2, \Sigma, \delta, (q_{0,1}, q_{0,2}), F_1 \times F_2)$ with $\delta((q_1, q_2), a) = (\delta_1(q_1, a), \delta_2(q_2, a))$. Then $\hat{\delta}((q_{0,1}, q_{0,2}), w) = (\hat{\delta}_1(q_{0,1}, w), \hat{\delta}_2(q_{0,2}, w))$ (by immediate induction), so $w \in L(\mathcal{A})$ iff $w \in L_1 \cap L_2$.

Proposition (Closure under morphism image and preimage). Let $\varphi : \Sigma^* \to \Gamma^*$ be a monoid morphism. If $L$ is a regular language over $\Gamma$, then $\varphi^{-1}(L)$ is regular over $\Sigma$. If $L$ is a regular language over $\Sigma$, then $\varphi(L)$ is regular over $\Gamma$.

Proof. For $\varphi^{-1}(L)$: let $(Q, \Gamma, \delta, q_0, F)$ be a DFA for $L$. We build the DFA $(Q, \Sigma, \delta', q_0, F)$ with $\delta'(q, a) = \hat{\delta}(q, \varphi(a))$. Then $\hat{\delta}'(q_0, w) = \hat{\delta}(q_0, \varphi(w))$ (by induction on $|w|$), so $L(\mathcal{A}') = \varphi^{-1}(L)$. For $\varphi(L)$: let $r$ be a regular expression for $L$. Replace each letter $a$ by $\varphi(a)$ in $r$. The resulting expression denotes $\varphi(L)$.

Pumping lemma

Theorem (Pumping Lemma). Let $L$ be a regular language. There exists an integer $p \geq 1$ (the pumping length) such that every word $w \in L$ with $|w| \geq p$ can be decomposed as $w = xyz$ with: (i) $|y| \geq 1$, (ii) $|xy| \leq p$, (iii) $\forall k \geq 0,\; xy^k z \in L$.

Proof. Let $\mathcal{A} = (Q, \Sigma, \delta, q_0, F)$ be a DFA recognizing $L$, and $p = |Q|$. Let $w = a_1 \cdots a_n \in L$ with $n \geq p$. Reading $w$ visits states $r_0 = q_0, r_1, \ldots, r_n$ with $r_i = \hat{\delta}(q_0, a_1 \cdots a_i)$. Among the first $p+1$ states $r_0, r_1, \ldots, r_p$, there are $|Q| = p$ possible states, so by the pigeonhole principle, there exist $0 \leq i < j \leq p$ such that $r_i = r_j$. Set $x = a_1 \cdots a_i$, $y = a_{i+1} \cdots a_j$, $z = a_{j+1} \cdots a_n$. Then $|y| = j - i \geq 1$, $|xy| = j \leq p$, and for all $k \geq 0$: reading $xy^k z$ starts from $q_0$, reaches $r_i$ after $x$, loops $k$ times on the cycle $r_i \to \cdots \to r_j = r_i$ by reading $y^k$, then reaches $r_n$ after $z$. Since $r_n \in F$, we have $xy^k z \in L$.

Myhill-Nerode theorem and minimal automaton

Definition (Nerode congruence). Let $L \subseteq \Sigma^*$. The Nerode congruence of $L$ is the relation $\sim_L$ defined on $\Sigma^*$ by: $u \sim_L v$ if and only if $\forall w \in \Sigma^*,\; (uw \in L \iff vw \in L)$.

Proposition. The relation $\sim_L$ is an equivalence relation on $\Sigma^*$, right-compatible with concatenation: if $u \sim_L v$, then for all $a \in \Sigma$, $ua \sim_L va$.

Proof. Reflexivity, symmetry, and transitivity are immediate. For compatibility: if $u \sim_L v$, then for all $a \in \Sigma$ and all $w \in \Sigma^*$, $uaw \in L \iff vaw \in L$ (taking $aw$ as suffix in the definition), so $ua \sim_L va$.

Theorem (Myhill-Nerode). Let $L \subseteq \Sigma^*$. The following are equivalent: (1) $L$ is regular. (2) $\sim_L$ has finitely many equivalence classes. Moreover, the number of classes of $\sim_L$ is exactly the number of states of the smallest DFA recognizing $L$.

Proof. $(1 \Rightarrow 2)$. Let $\mathcal{A} = (Q, \Sigma, \delta, q_0, F)$ be a DFA recognizing $L$. Define $\equiv_\mathcal{A}$ by $u \equiv_\mathcal{A} v$ iff $\hat{\delta}(q_0, u) = \hat{\delta}(q_0, v)$. If $u \equiv_\mathcal{A} v$, then for all $w$, $\hat{\delta}(q_0, uw) = \hat{\delta}(\hat{\delta}(q_0, u), w) = \hat{\delta}(\hat{\delta}(q_0, v), w) = \hat{\delta}(q_0, vw)$, so $uw \in L \iff vw \in L$, giving $u \sim_L v$. Thus $\equiv_\mathcal{A}$ refines $\sim_L$, and $\sim_L$ has at most $|Q|$ classes. $(2 \Rightarrow 1)$. Suppose $\sim_L$ has finitely many $n$ classes. We build the DFA $\mathcal{A}_L = (Q_L, \Sigma, \delta_L, [\varepsilon], F_L)$ with $Q_L = \Sigma^* / {\sim_L}$ (the set of classes), $\delta_L([u], a) = [ua]$ (well-defined by right compatibility), $[\varepsilon]$ as initial state, and $F_L = \{[u] : u \in L\}$ (well-defined since $u \sim_L v$ and $u \in L$ implies $v \in L$ by taking $w = \varepsilon$). We verify $\hat{\delta}_L([\varepsilon], w) = [w]$ by induction on $|w|$ (base: $\hat{\delta}_L([\varepsilon], \varepsilon) = [\varepsilon]$; step: $\hat{\delta}_L([\varepsilon], wa) = \delta_L([w], a) = [wa]$). So $w \in L(\mathcal{A}_L)$ iff $[w] \in F_L$ iff $w \in L$. Minimality. We saw that for any DFA $\mathcal{A}$ recognizing $L$, $\equiv_\mathcal{A}$ refines $\sim_L$, so $|Q| \geq n$. Since $\mathcal{A}_L$ has exactly $n$ states, $\mathcal{A}_L$ is a DFA of minimal size. Uniqueness. If $\mathcal{A}$ is a DFA recognizing $L$ with $n$ states, then $\equiv_\mathcal{A}$ has exactly $n$ classes and refines $\sim_L$ which also has $n$, so $\equiv_\mathcal{A}$ and $\sim_L$ coincide. The map $\hat{\delta}(q_0, u) \mapsto [u]$ is an isomorphism between $\mathcal{A}$ and $\mathcal{A}_L$ (if all states of $\mathcal{A}$ are reachable).

Corollary. Every regular language has, up to isomorphism, a unique minimal DFA (among DFAs whose states are all reachable). It is the Nerode automaton $\mathcal{A}_L$.

Proposition (Effective construction of the minimal automaton). Let $\mathcal{A} = (Q, \Sigma, \delta, q_0, F)$ be a DFA whose states are all reachable. The minimal automaton is obtained by identifying equivalent states: $q \sim q'$ if $\forall w \in \Sigma^*$, $\hat{\delta}(q, w) \in F \iff \hat{\delta}(q', w) \in F$. This relation is computed by successive refinement: start with the partition $\{F, Q \setminus F\}$ and refine by distinguishing $q$ and $q'$ if there exists $a \in \Sigma$ such that $\delta(q, a)$ and $\delta(q', a)$ are not in the same block. Iterate until stabilization.

Proof. Define $\sim_k$ by: $q \sim_0 q'$ iff $(q \in F \iff q' \in F)$, and $q \sim_{k+1} q'$ iff $q \sim_k q'$ and $\forall a \in \Sigma,\;\delta(q, a) \sim_k \delta(q', a)$. Clearly ${\sim_{k+1}} \subseteq {\sim_k}$. Since $Q$ is finite, there exists $k_0$ such that ${\sim_{k_0}} = {\sim_{k_0+1}}$. We show that ${\sim_{k_0}} = {\sim}$. The inclusion ${\sim} \subseteq {\sim_k}$ for all $k$ is immediate by induction. For the other direction, if $q \sim_{k_0} q'$, then for all $a$, $\delta(q, a) \sim_{k_0} \delta(q', a)$, so by induction on $|w|$, $\hat{\delta}(q, w) \sim_{k_0} \hat{\delta}(q', w)$ for all $w$, which implies $\hat{\delta}(q, w) \in F \iff \hat{\delta}(q', w) \in F$, so $q \sim q'$. The quotient $Q / {\sim}$ gives the minimal automaton.

Exercises

Exercise. Show that the language $L = \{w \in \{a,b\}^* : |w|_a \equiv 0 \pmod{3}\}$ is regular.

Proof. We build the DFA $\mathcal{A} = (\{q_0, q_1, q_2\}, \{a,b\}, \delta, q_0, \{q_0\})$ with $\delta(q_i, a) = q_{(i+1) \bmod 3}$ and $\delta(q_i, b) = q_i$. The state $q_i$ means the number of $a$'s read is congruent to $i$ modulo $3$. So $\hat{\delta}(q_0, w) = q_{|w|_a \bmod 3}$, and $w \in L(\mathcal{A})$ iff $|w|_a \equiv 0 \pmod{3}$.

Exercise. Show that $L = \{a^n b^n : n \geq 0\}$ is not regular.

Proof. Suppose for contradiction that $L$ is regular, with pumping length $p$. The word $w = a^p b^p \in L$ satisfies $|w| = 2p \geq p$. By the lemma, $w = xyz$ with $|y| \geq 1$, $|xy| \leq p$. Since $|xy| \leq p$, $x$ and $y$ consist only of $a$'s, so $y = a^m$ with $m \geq 1$. Then $xy^0 z = xz = a^{p-m} b^p \notin L$ since $p - m \neq p$. Contradiction.

Exercise. Show that $L = \{w \in \{0,1\}^* : w \text{ is a palindrome}\}$ is not regular.

Proof. Let $p$ be the pumping length. The word $w = 0^p 1 0^p \in L$ satisfies $|w| \geq p$. By the lemma, $w = xyz$ with $|xy| \leq p$ and $|y| \geq 1$, so $y = 0^m$ with $1 \leq m \leq p$, in the first block of $0$'s. Then $xy^2 z = 0^{p+m} 1 0^p$ is not a palindrome, so $xy^2 z \notin L$. Contradiction.

Exercise. Show that $L = \{a^n : n \text{ is prime}\}$ is not regular.

Proof. Suppose $L$ is regular with pumping length $p$. Let $q$ be a prime $\geq p$. The word $w = a^q \in L$ satisfies $|w| = q \geq p$. By the pumping lemma, $w = xyz$ with $|y| = m \geq 1$, $|xy| \leq p$. Then $|xy^{q+1} z| = q + qm = q(1 + m)$. This is a product of two integers $\geq 2$ ($q$ is prime $\geq 2$ and $1 + m \geq 2$), so it is not prime. Thus $xy^{q+1} z \notin L$, contradiction.

Exercise. Show that $L = \{0^n 1^n 0^n : n \geq 0\}$ is not regular.

Proof. Suppose $L$ is regular with pumping length $p$. The word $w = 0^p 1^p 0^p \in L$ satisfies $|w| = 3p \geq p$. By the pumping lemma, $w = xyz$ with $|xy| \leq p$ and $|y| \geq 1$, so $y = 0^m$ with $1 \leq m \leq p$ (in the first block). Then $xy^0 z = 0^{p-m} 1^p 0^p \notin L$ since $p - m \neq p$. Contradiction.

Exercise. Show that $L = \{ww : w \in \{a,b\}^*\}$ is not regular.

Proof. Suppose $L$ is regular with pumping length $p$. The word $w = a^p b a^p b \in L$ satisfies $|w| \geq p$. By the lemma, $w = xyz$ with $|xy| \leq p$, $|y| \geq 1$, so $y = a^m$ ($1 \leq m \leq p$). The word $xy^2 z = a^{p+m} b a^p b$ has length $2p + 2 + m$. If $m$ is odd, this length is odd, so $xy^2 z \notin L$ (every element of $L$ has even length). If $m$ is even, $m = 2\ell$ with $\ell \geq 1$, each half has length $p + 1 + \ell$. Since $p + 1 + \ell \leq p + 2\ell$, the cut falls in the first block of $a$'s. The first half is $a^{p + 1 + \ell}$ (only $a$'s), the second is $a^{\ell - 1} b a^p b$ (contains $b$'s). They differ, so $xy^2 z \notin L$. Contradiction.

Exercise. Show that $L = \{a^{n^2} : n \geq 0\}$ is not regular.

Proof. Suppose $L$ is regular with pumping length $p$. The word $w = a^{p^2} \in L$ satisfies $|w| = p^2 \geq p$. By the pumping lemma, $w = xyz$ with $|y| = m \geq 1$ and $|xy| \leq p$. Then $|xy^2 z| = p^2 + m$. Since $1 \leq m \leq p$, we have $p^2 < p^2 + m \leq p^2 + p < p^2 + 2p + 1 = (p+1)^2$. So $p^2 + m$ lies strictly between two consecutive perfect squares, hence is not a perfect square, and $xy^2 z \notin L$. Contradiction.

Exercise. For a word $w = a_1 \cdots a_n$, we write $w^R = a_n \cdots a_1$ for the mirror word of $w$. Let $L$ be a regular language. Show that $L^R = \{w^R : w \in L\}$ is regular.

Proof. Let $\mathcal{A} = (Q, \Sigma, \delta, q_0, F)$ be a DFA recognizing $L$. We build the NFA $\mathcal{A}^R = (Q, \Sigma, \delta^R, F, \{q_0\})$ by reversing all transitions: $(q', a, q) \in \delta^R$ iff $\delta(q, a) = q'$. The initial states of $\mathcal{A}^R$ are the accepting states of $\mathcal{A}$, and the sole accepting state is $q_0$. We show that $L(\mathcal{A}^R) = L^R$. A word $a_1 \cdots a_n \in L$ iff there exists a path $q_0 \xrightarrow{a_1} r_1 \xrightarrow{a_2} \cdots \xrightarrow{a_n} r_n$ in $\mathcal{A}$ with $r_n \in F$. Reversing, $r_n \xrightarrow{a_n} r_{n-1} \xrightarrow{a_{n-1}} \cdots \xrightarrow{a_1} q_0$ is a path in $\mathcal{A}^R$ from $r_n \in F$ (initial state of $\mathcal{A}^R$) to $q_0$ (accepting state of $\mathcal{A}^R$), showing that $a_n \cdots a_1 \in L(\mathcal{A}^R)$. The argument is reversible.

Exercise (Arden's lemma). Let $A, B \subseteq \Sigma^*$ with $\varepsilon \notin A$. Show that the unique solution of the equation $X = AX \cup B$ is $X = A^* B$. Deduce a method for computing the language recognized by a finite automaton: to each state $q_i$ of a DFA $(Q, \Sigma, \delta, q_0, F)$, associate the language $L_i = \{w \in \Sigma^* : \hat{\delta}(q_i, w) \in F\}$, and solve the resulting system of equations. Apply this method to the DFA over $\Sigma = \{a, b\}$ with three states $\{0, 1, 2\}$, initial state $0$, accepting state $\{0\}$, with $\delta(0, a) = 1$, $\delta(0, b) = 0$, $\delta(1, a) = 2$, $\delta(1, b) = 0$, $\delta(2, a) = 2$, $\delta(2, b) = 2$.

Proof. Verification. $A^* B = A(A^* B) \cup B$ since $A^* = AA^* \cup \{\varepsilon\}$. So $X = A^* B$ is a solution. Uniqueness. Let $X$ be a solution. By iterated substitution: $X = AX \cup B = A(AX \cup B) \cup B = A^2 X \cup AB \cup B$. By induction, $X = A^n X \cup \left(\bigcup_{k=0}^{n-1} A^k\right) B$ for all $n \geq 1$. Since $\varepsilon \notin A$, every word in $A$ has length $\geq 1$, so every word in $A^n$ has length $\geq n$. Let $w \in X$: if $w \in A^n X$ for all $n$, then $|w| \geq n$ for all $n$, which is impossible. So there exists $n$ such that $w \in \left(\bigcup_{k=0}^{n-1} A^k\right) B \subseteq A^* B$. Thus $X \subseteq A^* B$. The reverse inclusion follows from the fact that $A^* B$ is a solution and $X = AX \cup B \supseteq B$, then by induction $X \supseteq A^n B$ for all $n$, so $X \supseteq A^* B$. Application. The system of equations is: $L_i = \bigcup_{a \in \Sigma} a \cdot L_{\delta(i, a)}$, with the term $\cup\;\{\varepsilon\}$ added if $i \in F$. Here: $L_0 = a \cdot L_1 \cup b \cdot L_0 \cup \{\varepsilon\}$, $L_1 = a \cdot L_2 \cup b \cdot L_0$, $L_2 = a \cdot L_2 \cup b \cdot L_2$. By Arden on $L_2$: $L_2 = (a \cup b) L_2 \cup \varnothing$, so $L_2 = (a \cup b)^* \varnothing = \varnothing$. Then $L_1 = b \cdot L_0$. Substituting into $L_0$: $L_0 = ab \cdot L_0 \cup b \cdot L_0 \cup \{\varepsilon\} = (ab \cup b) L_0 \cup \{\varepsilon\}$. By Arden: $L_0 = (ab \cup b)^*$.

Exercise. Construct the minimal automaton of the DFA over $\Sigma = \{a, b\}$ with five states $\{1, 2, 3, 4, 5\}$, initial state $1$, accepting states $\{1, 4\}$, with $\delta(1, a) = 2$, $\delta(1, b) = 3$, $\delta(2, a) = 4$, $\delta(2, b) = 5$, $\delta(3, a) = 5$, $\delta(3, b) = 4$, $\delta(4, a) = 2$, $\delta(4, b) = 3$, $\delta(5, a) = 3$, $\delta(5, b) = 2$.

Proof. Initial partition: $P_0 = \{A, B\}$ with $A = \{1, 4\}$ (accepting) and $B = \{2, 3, 5\}$. Refinement. For $a$: $\delta(1, a) = 2 \in B$, $\delta(4, a) = 2 \in B$ (identical). For $b$: $\delta(1, b) = 3 \in B$, $\delta(4, b) = 3 \in B$. So $1$ and $4$ stay together. For the states in $B$: $\delta(2, a) = 4 \in A$, $\delta(3, a) = 5 \in B$, $\delta(5, a) = 3 \in B$. So $2$ is distinguished from $3$ and $5$ on $a$. We get $P_1 = \{\{1, 4\}, \{2\}, \{3, 5\}\}$. Check $\{3, 5\}$: $\delta(3, a) = 5 \in \{3,5\}$, $\delta(5, a) = 3 \in \{3,5\}$; $\delta(3, b) = 4 \in \{1,4\}$, $\delta(5, b) = 2 \in \{2\}$. We distinguish $3$ and $5$ on $b$. We get $P_2 = \{\{1, 4\}, \{2\}, \{3\}, \{5\}\}$. Check $\{1, 4\}$: $\delta(1, a) = 2 \in \{2\}$, $\delta(4, a) = 2 \in \{2\}$; $\delta(1, b) = 3 \in \{3\}$, $\delta(4, b) = 3 \in \{3\}$. Stable. So $P_2$ is the final partition. The minimal automaton has $4$ states: $[1] = \{1,4\}$, $[2] = \{2\}$, $[3] = \{3\}$, $[5] = \{5\}$, with initial state $[1]$, accepting state $[1]$, and transitions $\delta([1], a) = [2]$, $\delta([1], b) = [3]$, $\delta([2], a) = [1]$, $\delta([2], b) = [5]$, $\delta([3], a) = [5]$, $\delta([3], b) = [1]$, $\delta([5], a) = [3]$, $\delta([5], b) = [2]$.

Exercise. Determine the number of classes of the Nerode congruence for $L = \{w \in \{a,b\}^* : w \text{ contains } ab \text{ as a factor}\}$ and deduce the minimal automaton.

Proof. A word $u$ without the factor $ab$ has the form $b^* a^*$ (since after an $a$, every next character must be $a$, otherwise we create a factor $ab$). We distinguish three cases: (1) $u$ contains $ab$ as a factor, (2) $u \in b^* a^+$ (no factor $ab$, ends with $a$), (3) $u \in b^*$ (no factor $ab$, does not end with $a$). Class 1 ($u$ contains $ab$): for all $w \in \Sigma^*$, $uw$ contains $ab$, so $uw \in L$. Class 2 ($u \in b^* a^+$): $u$ ends with $a$, so $uw$ contains $ab$ iff $w$ starts with $b$ or $w$ contains $ab$. That is, $uw \in L \iff w \in b\Sigma^* \cup \Sigma^* ab \Sigma^*$. Class 3 ($u \in b^*$): the concatenation $uw$ does not create a factor $ab$ at the junction (since $u$ ends with $b$ or $u = \varepsilon$), so $uw \in L \iff w$ contains $ab$. The three classes are distinct: with $w = \varepsilon$, class 1 gives $uw \in L$, classes 2 and 3 give $uw \notin L$; with $w = b$, class 2 gives $ub \in L$ (since $u$ ends with $a$), class 3 gives $ub \notin L$. So $\sim_L$ has exactly $3$ classes. The minimal automaton has three states $\{q_0, q_1, q_2\}$: $q_0 = [\varepsilon]$ (class 3), $q_1 = [a]$ (class 2), $q_2 = [ab]$ (class 1). Initial state $q_0$, accepting state $q_2$. Transitions: $\delta(q_0, a) = q_1$, $\delta(q_0, b) = q_0$, $\delta(q_1, a) = q_1$, $\delta(q_1, b) = q_2$, $\delta(q_2, a) = q_2$, $\delta(q_2, b) = q_2$.

Exercise. Let $L$ be a regular language over $\Sigma$ and $M$ an arbitrary language over $\Sigma$. Show that the right quotient $L / M = \{x \in \Sigma^* : \exists y \in M,\; xy \in L\}$ is regular.

Proof. Let $\mathcal{A} = (Q, \Sigma, \delta, q_0, F)$ be a DFA recognizing $L$. We build the DFA $\mathcal{A}' = (Q, \Sigma, \delta, q_0, F')$ with $F' = \{q \in Q : \exists y \in M,\; \hat{\delta}(q, y) \in F\}$. We only change the set of accepting states. Then $x \in L(\mathcal{A}')$ iff $\hat{\delta}(q_0, x) \in F'$ iff $\exists y \in M,\; \hat{\delta}(\hat{\delta}(q_0, x), y) \in F$ iff $\exists y \in M,\; \hat{\delta}(q_0, xy) \in F$ iff $\exists y \in M,\; xy \in L$ iff $x \in L/M$. So $L(\mathcal{A}') = L/M$, which is regular. Note that $M$ need not be regular.

Exercise. Let $L$ be a regular language over $\Sigma$. Show that $\sqrt{L} = \{w \in \Sigma^* : ww \in L\}$ is regular.

Proof. Let $\mathcal{A} = (Q, \Sigma, \delta, q_0, F)$ be a DFA recognizing $L$ with $n = |Q|$. We build a DFA whose states encode both the progress of the first copy of $w$ and the effect of the second copy on every possible state. Set $Q' = Q \times Q^Q$ where $Q^Q$ is the set of functions from $Q$ to $Q$. A state $(q, f)$ means: the first copy has reached state $q$, and for each hypothetical starting state $s$, the second copy would have reached state $f(s)$. The transitions are: $\delta'((q, f), a) = (\delta(q, a),\; s \mapsto \delta(f(s), a))$. The initial state is $(q_0, \mathrm{id}_Q)$: the first copy starts from $q_0$, and the second has not read anything yet. The set of accepting states is $F' = \{(q, f) : f(q) \in F\}$: the second copy starts from the state $q$ reached by the first, and must arrive in $F$. By induction on $|w|$, $\hat{\delta}'((q_0, \mathrm{id}_Q), w) = (\hat{\delta}(q_0, w),\; s \mapsto \hat{\delta}(s, w))$. So $w \in L(\mathcal{A}')$ iff $\hat{\delta}(\hat{\delta}(q_0, w), w) \in F$ iff $ww \in L$.