Chirp, chirp, and chirp again; of ten days, nine bring wind and rain.

# Regular language

A language is a set of strings.

# Finite automata

Definition(Finite automaton)

A finite automaton is a 5-tuple $(Q,\Sigma,\delta,q_0,F)$, where

  • $Q$ is a finite set called the states.
  • $\Sigma$ is a finite set called the alphabet.
  • $\delta:Q\times\Sigma\rightarrow Q$ is the transition function. (A total function, not a partial one!)
  • $q_0\in Q$ is the start state.
  • $F\subseteq Q$ is the set of accept states.

If $A$ is the set of all strings that machine $M$ accepts, we say that $A$ is the language of machine $M$ and write $L(M)=A$.

We say that $M$ recognizes $A$ or that $M$ accepts $A$. Because the term accept has different meanings when we refer to machines accepting strings and machines accepting languages, we prefer the term recognize for languages in order to avoid confusion.

A machine may accept several strings, but it always recognizes exactly one language.

Definition(Computation)

Let $M=(Q,\Sigma,\delta,q_0,F)$ be a finite automaton and let $w=w_1w_2\ldots w_n$ be a string where each $w_i$ is a member of the alphabet $\Sigma$. Then $M$ accepts $w$ if a sequence of states $r_0,r_1,\ldots,r_n$ in $Q$ exists satisfying three conditions:

  • $r_0=q_0$.
  • $\delta(r_i,w_{i+1})=r_{i+1}$, for $i=0,\ldots,n-1$.
  • $r_n\in F$.

We say that $M$ recognizes language $A$ if $A=\{w \mid M \text{ accepts } w\}$.
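The acceptance definition translates directly into a few lines of code. A minimal sketch in Python; the particular DFA below (accepting binary strings that end in 1) is an invented example:

```python
# A DFA as a 5-tuple: states Q, alphabet Sigma, transition function delta,
# start state q0, accept states F. delta is total: defined on every (state, symbol).

def dfa_accepts(delta, q0, F, w):
    """Follow the definition: r_0 = q0, r_{i+1} = delta(r_i, w_{i+1});
    accept iff the final state r_n lies in F."""
    r = q0
    for a in w:
        r = delta[(r, a)]
    return r in F

# Hypothetical DFA: accepts binary strings ending in '1'.
delta = {
    ('q0', '0'): 'q0', ('q0', '1'): 'q1',
    ('q1', '0'): 'q0', ('q1', '1'): 'q1',
}

print(dfa_accepts(delta, 'q0', {'q1'}, '0101'))  # True
print(dfa_accepts(delta, 'q0', {'q1'}, '10'))    # False
```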

Definition(Regular language)

A language is called a regular language if some finite automaton recognizes it.


Definition

Let $A$ and $B$ be languages. We define the regular operations:

  • Union:

    $A\cup B=\{x \mid x\in A \text{ or } x\in B\}$

  • Concatenation:

    $A\circ B=\{xy \mid x\in A, y\in B\}$

  • Star:

    $A^*=\{x_1x_2\ldots x_k \mid k\geq 0, x_i\in A\}$

Theorem:

The class of regular languages is closed under the union operation.

Proof: Let $M_1=(Q_1,\Sigma_1,\delta_1,q_1,F_1)$ recognize $A_1$ and $M_2=(Q_2,\Sigma_2,\delta_2,q_2,F_2)$ recognize $A_2$. We construct $M=(Q,\Sigma,\delta,q,F)$ to recognize $A_1\cup A_2$:

  • $Q=\{(r_1,r_2) \mid r_1\in Q_1, r_2\in Q_2\}$

  • $\Sigma=\Sigma_1\cup\Sigma_2$

  • $\delta((r_1,r_2),a)=(\delta_1(r_1,a),\delta_2(r_2,a))$

  • $q=(q_1,q_2)$

  • $F=\{(r_1,r_2) \mid r_1\in F_1 \text{ or } r_2\in F_2\}$
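The product construction above can be sketched directly. For simplicity this sketch assumes the two DFAs share one alphabet, and the two machines themselves are invented examples:

```python
from itertools import product

def union_dfa(d1, q1, F1, Q1, d2, q2, F2, Q2, alphabet):
    """Product construction: the result recognizes L(M1) ∪ L(M2)."""
    delta = {}
    for r1, r2 in product(Q1, Q2):
        for a in alphabet:
            delta[((r1, r2), a)] = (d1[(r1, a)], d2[(r2, a)])
    start = (q1, q2)
    accept = {(r1, r2) for r1, r2 in product(Q1, Q2) if r1 in F1 or r2 in F2}
    return delta, start, accept

def run(delta, start, accept, w):
    r = start
    for a in w:
        r = delta[(r, a)]
    return r in accept

# M1 accepts strings with an even number of 'a's; M2 accepts strings ending in 'b'.
d1 = {('e', 'a'): 'o', ('e', 'b'): 'e', ('o', 'a'): 'e', ('o', 'b'): 'o'}
d2 = {('x', 'a'): 'x', ('x', 'b'): 'y', ('y', 'a'): 'x', ('y', 'b'): 'y'}
delta, start, accept = union_dfa(d1, 'e', {'e'}, {'e', 'o'},
                                 d2, 'x', {'y'}, {'x', 'y'}, 'ab')

print(run(delta, start, accept, 'ab'))  # True: ends in 'b'
print(run(delta, start, accept, 'a'))   # False: odd 'a's and ends in 'a'
```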

# Nondeterminism

Definition(Nondeterministic finite automata)

A nondeterministic finite automaton is a 5-tuple $(Q,\Sigma,\delta,q_0,F)$, where

  • $Q$ is a finite set called the states.
  • $\Sigma$ is a finite set called the alphabet.
  • $\delta:Q\times(\Sigma\cup\{\varepsilon\})\rightarrow \mathcal{P}(Q)$ is the transition function.
  • $q_0\in Q$ is the start state.
  • $F\subseteq Q$ is the set of accept states.

Definition(Computation)

Let $M=(Q,\Sigma,\delta,q_0,F)$ be a nondeterministic finite automaton and let $w$ be a string over the alphabet $\Sigma$. Then $M$ accepts $w$ if we can write $w$ as $w=y_1y_2\ldots y_m$, where each $y_i\in\Sigma\cup\{\varepsilon\}$, and a sequence of states $r_0,r_1,\ldots,r_m$ exists in $Q$ satisfying three conditions:

  • $r_0=q_0$.

  • $r_{i+1}\in \delta(r_i,y_{i+1})$, for $i=0,\ldots,m-1$.

  • $r_m\in F$.

Theorem:

Every nondeterministic finite automaton has an equivalent deterministic finite automaton.

Proof: Let $N=(Q,\Sigma,\delta,q_0,F)$ be the NFA recognizing some language $A$. We construct a DFA $M=(Q',\Sigma,\delta',q_0',F')$ recognizing $A$. Before doing the full construction, let's first consider the easier case when $N$ has no $\varepsilon$ arrows.

  • $Q'=\mathcal{P}(Q)$

  • $\delta'(R,a)=\bigcup_{r\in R}\delta(r,a)$

  • $q_0'=\{q_0\}$

  • $F'=\{R\in Q' \mid R \text{ contains an accept state of } N\}$

Now we need to consider the $\varepsilon$ arrows. To do so we set up an extra bit of notation. For any set of states $R$ of $N$ we define $E(R)$ to be the collection of states that can be reached from members of $R$ by going only along $\varepsilon$ arrows, including the members of $R$ themselves. We need to replace $\delta'(R,a)$ by

$\delta'(R,a)=\{q\in Q \mid q\in E(\delta(r,a)) \text{ for some } r\in R\}$

and replace $q_0'$ with

$q_0'=E(\{q_0\})$

Example:

[figure]

| $I$ | $I_a$ | $I_b$ |
| --- | --- | --- |
| [1,3] | [1,3] | [2] |
| [2] | [2,3] | [3] |
| [2,3] | [1,2,3] | [3] |
| [3] | [1,3] | [] |
| [1,2,3] | [1,2,3] | [2,3] |
| [] | [] | [] |

Then the DFA is

[figure]
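The subset construction can be sketched in Python. This version builds only the reachable subsets, which is why the table above has six rows rather than all $2^3=8$; the NFA encoded below is reconstructed from that table, so its transitions are an assumption:

```python
def eps_closure(delta, states):
    """E(R): all states reachable from R along ε-arrows, including R itself."""
    stack, seen = list(states), set(states)
    while stack:
        r = stack.pop()
        for q in delta.get((r, ''), ()):   # '' plays the role of ε
            if q not in seen:
                seen.add(q)
                stack.append(q)
    return frozenset(seen)

def nfa_to_dfa(delta, q0, F, alphabet):
    """Subset construction, building only the reachable subsets."""
    start = eps_closure(delta, {q0})
    dtrans, todo, seen = {}, [start], {start}
    while todo:
        R = todo.pop()
        for a in alphabet:
            S = set()
            for r in R:
                S |= set(delta.get((r, a), ()))
            S = eps_closure(delta, S)
            dtrans[(R, a)] = S
            if S not in seen:
                seen.add(S)
                todo.append(S)
    accept = {R for R in seen if R & F}
    return dtrans, start, accept

# NFA reconstructed from the example table: 1 --b--> 2, 1 --ε--> 3,
# 2 --a--> {2,3}, 2 --b--> 3, 3 --a--> 1; start state 1, accept states {1}.
nfa = {(1, 'b'): {2}, (1, ''): {3}, (2, 'a'): {2, 3}, (2, 'b'): {3}, (3, 'a'): {1}}
dtrans, start, accept = nfa_to_dfa(nfa, 1, {1}, 'ab')

print(start)  # the DFA start state E({1}) = {1,3}, matching the table's first row
```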

Theorem:

The class of regular languages is closed under the union operation.

Proof: Since the theorem has been proved before, we give an informal proof here: given two NFAs $N_1$ and $N_2$, we can construct an NFA recognizing $L(N_1)\cup L(N_2)$:

[figure]

Theorem:

The class of regular languages is closed under the concatenation operation.

Proof: Suppose we are given two NFAs $N_1$ and $N_2$; we can construct an NFA recognizing $L(N_1)\circ L(N_2)$:

[figure]

Theorem:

The class of regular languages is closed under the star operation.

Proof:

[figure]

Theorem:

The class of regular languages is closed under the complement.

Proof: swap the accept states and the non-accept states in a DFA recognizing the language.

Theorem:

The class of regular languages is closed under the intersection.

Proof: $L_1\cap L_2=\sim(\sim L_1\cup \sim L_2)$. Since the class of regular languages is closed under union and complement, it is closed under intersection.

# Regular expressions

Definition(regular expression)

Say that $R$ is a regular expression if $R$ is

  • $a$ for some $a$ in the alphabet $\Sigma$
  • $\varepsilon$
  • $\emptyset$
  • $(R_1\cup R_2)$, where $R_1,R_2$ are regular expressions
  • $(R_1\circ R_2)$, where $R_1,R_2$ are regular expressions
  • $(R_1)^*$, where $R_1$ is a regular expression

Example:

$1^*\circ\emptyset=\emptyset$

$\emptyset^*=\{\varepsilon\}$

$(\Sigma\Sigma)^*=\{w \mid \text{the length of } w \text{ is even}\}$

$(01^+)^*=(01^+)(01^+)\ldots(01^+)=\{w \mid \text{every } 0 \text{ in } w \text{ is followed by at least one } 1\}$

Theorem:

A language is regular if and only if some regular expression describes it.

Proof:

One direction: If a language is described by regular expression, then it is a regular language.

We need to construct an NFA recognizing each regular expression, which mirrors the proofs of closure under the regular operations.

Example: building an NFA recognizing $(a\cup b)^*aba$

[figure]

The other direction: if a language is regular, then it is described by a regular expression. I have my own method for this (a state-elimination procedure). Example:

[figure]

  • Step 1. Construct a new accept state, and add $\varepsilon$ arrows from the old accept states to the new state:

[figure]

  • Step 2. Remove self-loops. If a state $q$ has a self-loop labeled $a$, then we prefix every arrow leaving $q$ with $a^*$. Specially, if $q$ has no outgoing arrows, we can simply remove $q$.

[figure]

  • Step 3. Remove states. If we want to remove state $q$, we enumerate all pairs of states $q_1,q_2$ that form transitions:

    $q_1\stackrel{R_1}{\rightarrow}q\stackrel{R_2}{\rightarrow}q_2$

    where $R_1$ and $R_2$ are regular expressions. Then we can remove $q$, and add an arrow from $q_1$ to $q_2$ labeled $R_1\circ R_2$.

[figure]

Since in the example state 2 has only one in-arrow and one out-arrow, it can be removed easily. However, for a state with $n$ in-arrows and $m$ out-arrows, we need to add $n\times m$ new arrows.

# Nonregular languages

Theorem (Pumping Lemma)

If $A$ is a regular language, then there is a number $p$ (the pumping length) where, if $s$ is any string in $A$ of length at least $p$, then $s$ can be divided into three pieces, $s=xyz$, satisfying the following conditions:

  • for each $i\geq 0$, $xy^iz\in A$ (notice that when $i=0$, $xz\in A$)
  • $|y|>0$
  • $|xy|\leq p$

Proof: Let $M=(Q,\Sigma,\delta,q_1,F)$ be a DFA recognizing $A$ and let $p$ be the number of states of $M$: $p=|Q|$.

Let $s=s_1s_2\ldots s_n$ be a string in $A$ of length $n$, where $n\geq p$. Let $r_1,\ldots,r_{n+1}$ be the sequence of states that $M$ enters when processing $s$, so $r_{i+1}=\delta(r_i,s_i)$.

Since $n\geq p$, this sequence has $n+1\geq p+1$ entries but only $p$ distinct states are available, so by the pigeonhole principle two states in the sequence must be the same; say $r_i=r_j$ with $i<j$. Moreover, we may take $j\leq p+1$: if $j>p+1$, there must already be two equal states among $r_1,\ldots,r_{p+1}$.

Then we can divide the string as

$x=s_1s_2\ldots s_{i-1},\quad y=s_is_{i+1}\ldots s_{j-1},\quad z=s_j\ldots s_n$

Since $r_j=r_i$, after reading through $s_{i-1}$ and after reading through $s_{j-1}$, $M$ is in the same state.

So no matter how many times $y$ repeats, $M$ always returns to that same state after each repetition.

So $xy^iz$ is always accepted; and $j\leq p+1$ gives $|xy|=j-1\leq p$, which is condition 3.

Example: Consider the language $\{0^n1^n \mid n\geq 0\}$. We can prove it's not regular with the pumping lemma.

Assume to the contrary that it is regular; then there exists a pumping length $p$.

However, for any $p$, consider the string $0^p1^p$, whose length is at least $p$. Since $|xy|\leq p$, the piece $y$ consists only of 0s, so $xy^2z$ has more 0s than 1s and is not in the language, violating condition 1.

So it's not a regular language.
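For a concrete $p$, this contradiction can be checked mechanically. The sketch below tries every split $s=xyz$ of $0^p1^p$ allowed by conditions 2 and 3 and verifies that pumping always breaks condition 1 (pumping down to $i=0$ already fails, since $y$ consists only of 0s):

```python
def in_A(w):
    """Membership in {0^n 1^n : n >= 0}."""
    n = len(w) // 2
    return w == '0' * n + '1' * n

def pumping_refutes(p):
    """True iff no split s = xyz of 0^p 1^p with |xy| <= p and |y| > 0
    survives pumping, i.e. every such split has some xy^i z outside A."""
    s = '0' * p + '1' * p
    for xy_len in range(p + 1):              # |xy| <= p
        for y_len in range(1, xy_len + 1):   # |y| > 0
            x = s[:xy_len - y_len]
            y = s[xy_len - y_len:xy_len]
            z = s[xy_len:]
            if all(in_A(x + y * i + z) for i in (0, 1, 2)):
                return False   # this split survived: the lemma would not refute
    return True

print(all(pumping_refutes(p) for p in range(1, 8)))  # True
```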

# Context-free language

Definition(context-free grammar)

A context-free grammar is a 4-tuple $(V,\Sigma,R,S)$, where

  • $V$ is a finite set called the variables
  • $\Sigma$ is a finite set, disjoint from $V$, called the terminals
  • $R$ is a finite set of rules, with each rule being a variable and a string of variables and terminals
  • $S\in V$ is the start variable

If $u,v$ and $w$ are strings of variables and terminals, and $A\rightarrow w$ is a rule of the grammar, we say that $uAv$ yields $uwv$, written $uAv\Rightarrow uwv$.

Say that $u$ derives $v$, written $u\stackrel{*}{\Rightarrow} v$, if $u=v$ or if a sequence exists:

$u\Rightarrow u_1\Rightarrow\ldots\Rightarrow u_k\Rightarrow v$

The language of the grammar is $\{w\in\Sigma^* \mid S\stackrel{*}{\Rightarrow} w\}$.

A derivation of a string $w$ in a grammar $G$ is a leftmost derivation if at every step the leftmost remaining variable is the one replaced.

Definition

A string $w$ is derived ambiguously in a context-free grammar $G$ if it has two or more different leftmost derivations. Grammar $G$ is ambiguous if it generates some string ambiguously.

Sometimes when we have an ambiguous grammar, we can construct another, unambiguous grammar generating the same language. But for some languages this is impossible: every grammar generating them is ambiguous. Such languages are called inherently ambiguous.

Example: $\{a^ib^jc^k \mid i=j \text{ or } j=k\}$ is inherently ambiguous.

Definition(Chomsky normal form)

A context-free grammar is in Chomsky normal form if every rule is of the form

$A\rightarrow BC$

$A\rightarrow a$

where $A,B,C$ are any variables except that $B,C$ can't be the start variable, and $a$ is any terminal.

In addition we permit the rule $S\rightarrow\varepsilon$ when $S$ is the start variable.

Theorem

Any context-free language is generated by a context-free grammar in Chomsky normal form.

Proof: Given a context-free grammar, we can construct an equivalent grammar in Chomsky normal form as follows.

Example:

$S\rightarrow ASA\;|\;aB$

$A\rightarrow B\;|\;S$

$B\rightarrow b\;|\;\varepsilon$

  • Step 1, we add a new start variable $S_0$ and the rule $S_0\rightarrow S$, where $S$ was the original start variable.

    $$\begin{cases}S_0\rightarrow S\\ S\rightarrow ASA\;|\;aB\\ A\rightarrow B\;|\;S\\ B\rightarrow b\;|\;\varepsilon\end{cases}$$

  • Step 2, we take care of all $\varepsilon$ rules. We remove $A\rightarrow\varepsilon$ where $A$ is not the start variable. Then for each occurrence of an $A$ on the right side of a rule, we add a new rule with that occurrence deleted.

    $$\begin{cases}S_0\rightarrow S\\ S\rightarrow ASA\;|\;aB\\ A\rightarrow B\;|\;S\\ B\rightarrow b\;|\;\varepsilon\end{cases}\Rightarrow \begin{cases}S_0\rightarrow S\\ S\rightarrow ASA\;|\;aB\;|\;a\\ A\rightarrow B\;|\;S\;|\;\varepsilon\\ B\rightarrow b\end{cases}(remove\;B\rightarrow\varepsilon)\\ \Rightarrow\begin{cases}S_0\rightarrow S\\ S\rightarrow ASA\;|\;aB\;|\;a\;|\;SA\;|\;AS\;|\;S\\ A\rightarrow B\;|\;S\\ B\rightarrow b\end{cases}(remove\;A\rightarrow\varepsilon)$$

  • Step 3, we remove the unit rules $S\rightarrow S$, $S_0\rightarrow S$, $A\rightarrow B$, $A\rightarrow S$:

    $$\begin{cases}S_0\rightarrow S\\ S\rightarrow ASA\;|\;aB\;|\;a\;|\;SA\;|\;AS\\ A\rightarrow B\;|\;S\\ B\rightarrow b\end{cases}(remove\;S\rightarrow S)\\ \Rightarrow\begin{cases}S_0\rightarrow ASA\;|\;aB\;|\;a\;|\;SA\;|\;AS\\ S\rightarrow ASA\;|\;aB\;|\;a\;|\;SA\;|\;AS\\ A\rightarrow B\;|\;S\\ B\rightarrow b\end{cases}(remove\;S_0\rightarrow S)\\ \Rightarrow\begin{cases}S_0\rightarrow ASA\;|\;aB\;|\;a\;|\;SA\;|\;AS\\ S\rightarrow ASA\;|\;aB\;|\;a\;|\;SA\;|\;AS\\ A\rightarrow b\;|\;S\\ B\rightarrow b\end{cases}(remove\;A\rightarrow B)\\ \Rightarrow\begin{cases}S_0\rightarrow ASA\;|\;aB\;|\;a\;|\;SA\;|\;AS\\ S\rightarrow ASA\;|\;aB\;|\;a\;|\;SA\;|\;AS\\ A\rightarrow b\;|\;ASA\;|\;aB\;|\;a\;|\;SA\;|\;AS\\ B\rightarrow b\end{cases}(remove\;A\rightarrow S)$$

  • Step 4, we break up rules whose right-hand side has length $\geq 3$, and replace terminals in long rules with new variables:

    $$\begin{cases}S_0\rightarrow ASA\;|\;aB\;|\;a\;|\;SA\;|\;AS\\ S\rightarrow ASA\;|\;aB\;|\;a\;|\;SA\;|\;AS\\ A\rightarrow b\;|\;AT\;|\;aB\;|\;a\;|\;SA\;|\;AS\\ T\rightarrow SA\\ B\rightarrow b\end{cases}(destruct\;A\rightarrow ASA)\\ \Rightarrow \begin{cases}S_0\rightarrow ASA\;|\;aB\;|\;a\;|\;SA\;|\;AS\\ S\rightarrow AT\;|\;aB\;|\;a\;|\;SA\;|\;AS\\ A\rightarrow b\;|\;AT\;|\;aB\;|\;a\;|\;SA\;|\;AS\\ T\rightarrow SA\\ B\rightarrow b\end{cases}(destruct\;S\rightarrow ASA)\\ \Rightarrow \begin{cases}S_0\rightarrow AT\;|\;aB\;|\;a\;|\;SA\;|\;AS\\ S\rightarrow AT\;|\;aB\;|\;a\;|\;SA\;|\;AS\\ A\rightarrow b\;|\;AT\;|\;aB\;|\;a\;|\;SA\;|\;AS\\ T\rightarrow SA\\ B\rightarrow b\end{cases}(destruct\;S_0\rightarrow ASA)\\ \Rightarrow \begin{cases}S_0\rightarrow AT\;|\;UB\;|\;a\;|\;SA\;|\;AS\\ S\rightarrow AT\;|\;UB\;|\;a\;|\;SA\;|\;AS\\ A\rightarrow b\;|\;AT\;|\;UB\;|\;a\;|\;SA\;|\;AS\\ T\rightarrow SA\\ U\rightarrow a\\ B\rightarrow b\end{cases}(destruct \;S_0,S,A\rightarrow aB)$$

Done.

# Pushdown automata

Definition(Pushdown automaton)

A pushdown automaton is a 6-tuple $(Q,\Sigma,\Gamma,\delta,q_0,F)$, where $Q,\Sigma,\Gamma$ and $F$ are all finite sets, and

  • $Q$ is the set of states

  • $\Sigma$ is the input alphabet

  • $\Gamma$ is the stack alphabet

  • $\delta:Q\times(\Sigma\cup\{\varepsilon\})\times(\Gamma\cup\{\varepsilon\})\rightarrow \mathcal{P}(Q\times(\Gamma\cup\{\varepsilon\}))$ is the transition function

  • $q_0\in Q$ is the start state

  • $F\subseteq Q$ is the set of accept states

Definition(Computation)

A pushdown automaton $M=(Q,\Sigma,\Gamma,\delta,q_0,F)$ computes as follows. It accepts input $w$ if $w$ can be written as $w=w_1w_2\ldots w_m$, where each $w_i\in\Sigma\cup\{\varepsilon\}$, and sequences of states $r_0,r_1,\ldots,r_m\in Q$ and strings $s_0,s_1,\ldots,s_m\in\Gamma^*$ exist that satisfy the following three conditions:

  • $r_0=q_0$, $s_0=\varepsilon$.
  • $(r_{i+1},b)\in\delta(r_i,w_{i+1},a)$, where $s_i=at$ and $s_{i+1}=bt$ for some $a,b\in\Gamma\cup\{\varepsilon\}$ and $t\in\Gamma^*$.
  • $r_m\in F$.

Intuitively, $s_0,s_1,\ldots,s_m$ are the stack contents at each step. Initially $s_0=\varepsilon$ means that the stack is empty.

Example:

[figure]

Notice that a transition is written "$q_1\stackrel{w,a\rightarrow b}{\longrightarrow}q_2$", which means that in state $q_1$ we read character $w$ from the input, pop character $a$ from the stack, and push character $b$ onto the stack. Each of $a,b$ can be $\varepsilon$, so we can choose not to pop/push characters.
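A nondeterministic PDA can be simulated by searching over configurations (state, input position, stack contents). The sketch below encodes the standard PDA for $\{0^n1^n \mid n\geq 0\}$; the state names and the dictionary encoding of transitions are invented:

```python
from collections import deque

def pda_accepts(trans, start, accepts, w):
    """Breadth-first search over PDA configurations (state, input position, stack).
    trans maps (state, read, pop) -> set of (new_state, push); '' stands for ε."""
    seen = {(start, 0, '')}
    queue = deque(seen)
    while queue:
        q, i, stack = queue.popleft()
        if i == len(w) and q in accepts:
            return True
        reads = [''] + ([w[i]] if i < len(w) else [])
        for read in reads:
            pops = [''] + ([stack[-1]] if stack else [])
            for pop in pops:
                for q2, push in trans.get((q, read, pop), ()):
                    cfg = (q2, i + len(read), stack[:len(stack) - len(pop)] + push)
                    if cfg not in seen:
                        seen.add(cfg)
                        queue.append(cfg)
    return False

# PDA for {0^n 1^n | n >= 0}: push a bottom marker '$', push one '0' per 0 read,
# pop one '0' per 1 read, and accept when the marker is exposed again.
trans = {
    ('q1', '', ''): {('q2', '$')},
    ('q2', '0', ''): {('q2', '0')},
    ('q2', '1', '0'): {('q3', '')},
    ('q3', '1', '0'): {('q3', '')},
    ('q3', '', '$'): {('q4', '')},
}

print(pda_accepts(trans, 'q1', {'q1', 'q4'}, '0011'))  # True
print(pda_accepts(trans, 'q1', {'q1', 'q4'}, '001'))   # False
```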

Theorem

A language is context free if and only if some pushdown automaton recognizes it.

Proof: we can transform any CFG to pushdown automaton and any pushdown automaton to CFG.

  • One direction: from CFG to pushdown automaton. Given a CFG and an input string $w$, we need to construct a pushdown automaton that accepts $w$ if and only if the CFG can yield $w$.

[figure]

    The pushdown automaton works as follows:

    1. Place the marker symbol $ and the start variable on the stack,
    2. Repeat the following steps forever:
      • If the top of the stack is a variable $A$, then nondeterministically select one of the rules for $A$ and substitute $A$ by the string on the right side of the rule.
      • If the top of the stack is a terminal symbol $a$, read the next symbol from the input and compare it to $a$. If they match, repeat. If they do not match, reject on this branch of the nondeterminism.
      • If the top of the stack is the symbol $, enter the accept state.

    We can give the algorithm:

    • Construct four states as follows, and put all the rules as self-loops on the $q_{loop}$ state:

[figure]

    • Decompose the derivation rules into stack operations:

[figure]

      In fact this step pops $S$ and pushes its derivation string. Pay attention that the string is pushed from right to left.

    • Add the compare of terminals:

[figure]

      If the top of the stack and the symbol read in are the same, then pop it.

    Now we can transform any CFG to PDA. Another example:

[figure]

  • The other direction: transform PDA to CFG.

    Given a PDA, for each pair of its states $p,q$, we create a variable $A_{pq}$ in the grammar. This variable generates all the strings that can take the PDA from $p$ with an empty stack to $q$ with an empty stack. Observe that such strings can also take the PDA from $p$ to $q$ with the same stack contents at $q$ as at $p$.

    Firstly, we need to modify the PDA slightly to give it the following three features:

    1. It has a single accept state, $q_{accept}$.
    2. It empties its stack before accepting.
    3. Each transition either pushes a symbol onto the stack or pops one off the stack, but does not do both at the same time.

    Then, we construct the grammar as follows:

    • For each $p,q,r,s,t,a,b$, if we have transitions of the form:

      $p\stackrel{a,\varepsilon\rightarrow t}{\longrightarrow}r\rightarrow\ldots\rightarrow s\stackrel{b,t\rightarrow\varepsilon}{\longrightarrow}q$

      then we add the rule $A_{pq}\rightarrow aA_{rs}b$.

    • For each $p,q,r$, add the rule $A_{pq}\rightarrow A_{pr}A_{rq}$.

    • For each $p$, add $A_{pp}\rightarrow \varepsilon$.

    • $A_{q_{start}q_{accept}}$ is the start variable.

    You may get some insight for the construction from the following figures:

[figure]

    Example:

[figure]

Theorem

Every regular language is context free.

Proof: a PDA can simulate any DFA simply by ignoring its stack.

Theorem

The class of context-free languages is closed under union.

Proof: Given grammars with start variables $S_1,S_2$, add a new start variable $S$ and the rule $S\rightarrow S_1\;|\;S_2$.

Theorem

The class of context-free languages is closed under concatenation.

Proof: Given grammars with start variables $S_1,S_2$, add a new start variable $S$ and the rule $S\rightarrow S_1S_2$.

Theorem

The class of context-free languages is closed under the star operation.

Proof: Add a new start variable $S$ and the rule $S\rightarrow S_1S\;|\;\varepsilon$, where $S_1$ is the original start variable.

Theorem

The class of context-free languages is NOT closed under intersection.

Proof: $\{a^nb^nc^m \mid n,m\geq 0\}$ and $\{a^nb^mc^m \mid n,m\geq 0\}$ are both context free, but their intersection is $\{a^nb^nc^n \mid n\geq 0\}$, which is not context free.

Theorem

The class of context-free languages is NOT closed under complement.

Proof: Assume to the contrary that it were. Since $L_1\cap L_2=\sim(\sim L_1\cup\sim L_2)$ and CFLs are closed under union, closure under complement would imply closure under intersection, which is false.

Theorem

The intersection of a CFL and a regular language is context free.

Proof: For a PDA $P$ and an NFA $Q$, we can construct a PDA recognizing $L(P)\cap L(Q)$.

States: pairs $(p_1,q_1)$ where $p_1$ is a state of $P$ and $q_1$ is a state of $Q$.

Transitions:

$(p_1,q_1)\stackrel{a,b\rightarrow c}{\longrightarrow}(p_2,q_2)$

if and only if $p_1\stackrel{a,b\rightarrow c}{\longrightarrow}p_2$ in the PDA and $q_1\stackrel{a}{\longrightarrow}q_2$ in the NFA (when $a=\varepsilon$, the NFA component stays put).

# Non-context-free languages

Theorem(Pumping lemma for context-free languages)

If $A$ is a context-free language, then there is a number $p$ (the pumping length) where, if $s$ is any string in $A$ of length at least $p$, then $s$ may be divided into five pieces $s=uvxyz$ satisfying:

  • for each $i\geq 0$, $uv^ixy^iz\in A$
  • $|vy|>0$
  • $|vxy|\leq p$

Proof: Say $|V|$ is the number of variables in $G$, and $b$ is the maximum number of symbols on the right side of a rule. We set $p$ to be $b^{|V|}+1$.

If a parse tree's height is $h$, then the string it yields is at most $b^h$ long.

So if the length of the string is greater than $b^{|V|}$, then the height of the parse tree is greater than $|V|$; in other words, some variable definitely appears twice on the longest path in the parse tree.

[figure]

Then we can substitute the smaller subtree by the bigger subtree (their roots are both $R$) as in the figure above, and condition 1 is satisfied since $uv^ixy^iz$ can always be parsed.

Condition 2 is satisfied if we pick a smallest parse tree for $s$: the derivation from the upper $R$ to the lower $R$ must produce something besides $R$ itself, so $v$ and $y$ are not both empty.

Condition 3 is a bit tricky. We need to modify $p$ to $b^{|V|+2}$. Then the longest path in the parse tree still has more than $|V|+1$ variables. We choose $R$ so that both occurrences fall within the bottom $|V|+1$ variables on the path. So the subtree rooted at the upper $R$ is at most $|V|+2$ high (the bottom level is terminals). Then we can guarantee that $|vxy|\leq b^{|V|+2}=p$.


Example: $\{a^nb^nc^n \mid n\geq 0\}$ is not context free: for any $p$, consider $a^pb^pc^p$. Since $|vxy|\leq p$, the piece $vxy$ cannot contain both $a$s and $c$s, so pumping unbalances the counts and some $uv^ixy^iz$ leaves the language.

# The Church-Turing thesis

# Turing machines

Definition(Turing machine)

A Turing machine is a 7-tuple $(Q,\Sigma,\Gamma,\delta,q_0,q_{accept},q_{reject})$, where $Q,\Sigma,\Gamma$ are all finite sets and

  • $Q$ is the set of states.
  • $\Sigma$ is the input alphabet, not containing the blank symbol $\sqcup$.
  • $\Gamma$ is the tape alphabet, where $\sqcup\in\Gamma$ and $\Sigma\subseteq\Gamma$.
  • $\delta:Q\times\Gamma\rightarrow Q\times\Gamma\times\{L,R\}$ is the transition function.
  • $q_0\in Q$ is the start state.
  • $q_{accept}\in Q$ is the accept state.
  • $q_{reject}\in Q$ is the reject state, where $q_{accept}\neq q_{reject}$.

The configuration consists of

  • the current state $q$
  • a string $u$ over the tape alphabet
  • a string $v$ over the tape alphabet

The configuration means that the current state of the Turing machine is $q$, the content of the tape is $uv$, and the head is located at the first symbol of $v$.

The configuration $C_1=(q,u,v)$ yields $C_2=(q',u',v')$ if:

  • $\delta(q,a)=(q',b,L)$ and

    $u=wc,\quad v=aw',\quad u'=w,\quad v'=cbw'$

  • or $\delta(q,a)=(q',b,R)$ and

    $u=wc,\quad v=aw',\quad u'=wcb,\quad v'=w'$

Intuitively, it just replaces the symbol $a$ under the head with $b$ and moves the head.

Remark (important!): When the head is already at the leftmost cell of the tape and we try to move it left, it stays still. Thanks to this property, we can easily move the head to the leftmost cell.

If we enter the accept configuration (the state in the configuration is $q_{accept}$), then the Turing machine accepts the input string. Similarly we have the reject configuration. Both of them are called halting configurations.

Definition

Call a language Turing-recognizable if some Turing machine recognizes it.

A TM recognizes language $L$ if it accepts every string in $L$, while on strings not in $L$ it either rejects or does not halt.

Definition

Call a language Turing-decidable if some Turing machine decides it.

A TM decides language $L$ if it accepts every string in $L$ and rejects every string not in $L$. The TM must halt on every input.

Example: Here we describe a Turing machine that decides $A=\{0^{2^n} \mid n\geq 0\}$.

$M=$ "On input string $w$:

  1. Sweep left to right across the tape, crossing off every other 0.
  2. If in stage 1 the tape contained a single 0, accept.
  3. If in stage 1 the tape contained more than a single 0 and the number of 0s was odd, reject.
  4. Return the head to the left-hand end of the tape.
  5. Go to stage 1."

Stage 1 cuts the number of 0s in half.
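The staged algorithm can be mirrored at the tape level in code. This sketch decides the same language by the same repeated halving, though the cell-by-cell details of any particular Turing machine differ:

```python
def is_power_of_two_zeros(w):
    """Decide A = {0^(2^n) | n >= 0}, mirroring the machine's stages:
    cross off every other 0 until a single 0 remains."""
    if w == '' or set(w) != {'0'}:
        return False
    tape = list(w)
    while True:
        zeros = [i for i, c in enumerate(tape) if c == '0']
        if len(zeros) == 1:
            return True                    # stage 2
        if len(zeros) % 2 == 1:
            return False                   # stage 3
        for i in zeros[1::2]:              # stage 1: cross off every other 0
            tape[i] = 'x'

print(is_power_of_two_zeros('0000'))    # True  (4 = 2^2)
print(is_power_of_two_zeros('000000'))  # False (6 is not a power of 2)
```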

# Variants of Turing machines

A multitape Turing machine is like an ordinary Turing machine with several tapes. Each tape has its own head for reading and writing.

If we have $k$ tapes, then the transition function looks like:

$\delta:Q\times\Gamma^k\rightarrow Q\times \Gamma^k\times \{L,R,S\}^k$

$S$ means "stay", so each of the $k$ heads can move or pause independently.

Theorem

Every multitape Turing machine has an equivalent single-tape Turing machine.

Proof: Say that $M$ has $k$ tapes; we construct a single-tape Turing machine $S$ to simulate its behavior.

$S$ = "On input $w=w_1\ldots w_n$:

  1. First $S$ puts its tape into the format that represents all $k$ tapes of $M$. The formatted tape contains:

    $@\dot{w_1}w_2\ldots w_n@\dot{\sqcup}@\dot{\sqcup}@\ldots@\sqcup\sqcup\sqcup\ldots$

  2. To simulate a single move, $S$ scans its tape from the first @, which marks the left-hand end, to the $(k+1)$st @, which marks the right-hand end, in order to determine the symbols under the virtual heads (the dotted symbols). Then $S$ makes a second pass to update the tapes according to the way that $M$'s transition function dictates.

  3. If at any point $S$ moves one of the virtual heads to the right onto a @, this action signifies that $M$ has moved the corresponding head onto the previously unread blank portion of that tape. So $S$ writes a blank symbol on this tape cell and shifts the tape contents, from this cell until the rightmost @, one unit to the right:

    $@\dot{w_1}w_2\ldots w_n@a\dot{\sqcup}\underbrace{@\dot{\sqcup}@\ldots@}_{\text{shifted right}}$

    Then repeat the simulation."


A nondeterministic Turing machine can proceed according to several different possibilities:

$\delta:Q\times\Gamma\rightarrow\mathcal{P}(Q\times\Gamma\times\{L,R\})$

and it accepts the input string if any branch accepts it.

Theorem

Every nondeterministic Turing machine has an equivalent deterministic Turing machine.

Proof: We can use a three-tape (deterministic) Turing machine $D$ to simulate the nondeterministic Turing machine $N$.

[figure]

Let's first consider the data representation on tape 3. Every node in the search tree can have at most $b$ children, where $b$ is the size of the largest set of possible choices given by $N$'s transition function.

To every node in the tree we assign an address that is a string over $\{1,2,\ldots,b\}$.

For example, we assign the address 231 to the node we arrive at by starting at the root, going to its 2nd child, going to that node's 3rd child, and finally going to that node's 1st child. Each symbol in the string tells us which choice to make next when simulating a step in one branch of $N$'s nondeterministic computation.

If too few choices are available for a configuration, then the symbol is invalid.

Now we are ready to describe $D$.

  1. Initially tape 1 contains the input $w$, and tapes 2 and 3 are empty.

  2. Copy tape 1 to tape 2.

  3. Use tape 2 to simulate $N$ with input $w$ on one branch of its nondeterministic computation. Before each step of $N$, consult the next symbol on tape 3 to determine which choice to make.

    If no more symbols remain on tape 3, or if this nondeterministic choice is invalid, abort this branch by going to stage 4. Also go to stage 4 if a rejecting configuration is encountered. If an accepting configuration is encountered, accept the input.

  4. Replace the string on tape 3 with the lexicographically next string. Example:

    $123\rightarrow 124,\quad 12b\rightarrow 131,\quad bbb\rightarrow 1111$

    Then repeat the simulation.

Theorem

A language is Turing-recognizable/decidable if and only if some nondeterministic Turing machine recognizes/decides it.


Definition(Enumerator)

An enumerator is a TM with two tapes: the work tape and the printer.

  • Initially both tapes are empty.

  • On the work tape it can read, write and move just like an ordinary TM.

  • On the printer, it can only print one of the symbols of the language alphabet, or @ serving as the "end of string" symbol. So a transition looks like:

    $q_1\stackrel{\stackrel{c}{a\rightarrow b},D}{\longrightarrow}q_2$

    meaning "if in state $q_1$ you see symbol $a$ on the work tape, then replace it with $b$, move in direction $D$, go to state $q_2$ and print $c$ to the printer". Every time it prints a symbol, the head of the printer moves right. If $c$ is absent, then nothing happens to the printer.

  • There is a start state, but no accept or reject states. Entering a configuration from which there is no transition causes the enumerator to halt.

  • The language enumerated by an enumerator is the set of strings that it prints on the printer.

Example:

[figure]

Theorem

A language is Turing-recognizable if and only if some enumerator enumerates it.

Proof: First we show that if we have an enumerator $E$ that enumerates a language $A$, then a TM $M$ recognizes it.

$M$ = "On input $w$:

  1. Run $E$. Every time that $E$ outputs a string, compare it with $w$.
  2. If $w$ ever appears in the output of $E$, accept."

Now we do the other direction.

$E$ = "Ignore the input.

  1. Repeat the following for $i=1,2,3,\ldots$
  2. Run $M$ for $i$ steps on each input $s_1,s_2,\ldots,s_i$.
  3. If any computations accept, print out the corresponding $s_j$."

where $s_1,s_2,s_3,\ldots$ is a list of all possible strings in $\Sigma^*$.

In fact, if $M$ accepts some string $w$, then $w$ will appear on the printer infinitely many times, so it is certainly printed at least once.
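The dovetailing in the second direction can be sketched concretely. Since we have no real TM simulator here, `accepts_within(s, i)` is an assumed stand-in for "$M$ accepts $s$ within $i$ steps":

```python
from itertools import count, product

def all_strings(alphabet):
    """s1, s2, s3, ...: every string over the alphabet, shortest first."""
    yield ''
    for n in count(1):
        for tup in product(alphabet, repeat=n):
            yield ''.join(tup)

def enumerate_language(accepts_within, alphabet, rounds):
    """Dovetailing: in round i, run M for i steps on each of s1..si.
    Strings may be printed many times, exactly as in the proof."""
    printed = []
    for i in range(1, rounds + 1):
        gen = all_strings(alphabet)
        strings = [next(gen) for _ in range(i)]
        for s in strings:
            if accepts_within(s, i):
                printed.append(s)
    return printed

# Toy stand-in machine: accepts even-length strings, taking len(s)+1 "steps".
accepts_within = lambda s, i: len(s) % 2 == 0 and len(s) + 1 <= i
out = enumerate_language(accepts_within, '01', 10)
print(sorted(set(out)))  # ['', '00', '01', '10', '11']
```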


Church-Turing thesis: the intuitive notion of algorithms = algorithms defined by $\lambda$ calculus = algorithms defined by Turing machines.

# Decidability

# Decidable languages

Theorem

$A_{DFA}=\{\langle B,w\rangle \mid B \text{ is a DFA that accepts input string } w\}$ is a decidable language.

Proof:

$M$ = "On input $\langle B,w\rangle$, where $B$ is a DFA and $w$ is a string:

  1. Simulate $B$ on input $w$.
  2. If the simulation ends in an accept state, accept. If it ends in a nonaccepting state, reject."

Theorem

$A_{NFA}=\{\langle B,w\rangle \mid B \text{ is an NFA that accepts input string } w\}$ is a decidable language.

Proof:

$M$ = "On input $\langle B,w\rangle$, where $B$ is an NFA and $w$ is a string:

  1. Convert $B$ to an equivalent DFA. Since that algorithm was described precisely before, a TM can carry it out.
  2. Run the $A_{DFA}$ decider on the result."

Theorem

$A_{REX}=\{\langle B,w\rangle \mid B \text{ is a regular expression that generates string } w\}$ is a decidable language.

Proof:

$M$ = "On input $\langle B,w\rangle$, where $B$ is a regular expression and $w$ is a string:

  1. Convert $B$ to its corresponding NFA.

  2. Run the $A_{NFA}$ decider."

Theorem

$E_{DFA}=\{\langle A\rangle \mid A \text{ is a DFA and } L(A)=\emptyset\}$ is decidable.

Proof:

TT = "On input A\langle A\rangle where AA is a DFA:

  1. Mark the start state of AA.
  2. Repeat until no new states get marked:
  3. Mark any state that has a transition coming into it from any state that is already marked.
  4. If no accept state is marked, accept; otherwise reject."
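
The marking loop is a plain reachability computation from the start state. A sketch under the same hypothetical dict encoding of a DFA:

```python
def dfa_language_empty(dfa):
    """Decider for E_DFA: mark every state reachable from the start state;
    accept (return True) iff no accept state gets marked.
    `dfa` is a hypothetical dict encoding with delta as {(q, a): q'}."""
    marked = {dfa["q0"]}
    changed = True
    while changed:                      # repeat until no new states marked
        changed = False
        for (q, _a), r in dfa["delta"].items():
            if q in marked and r not in marked:
                marked.add(r)
                changed = True
    return not (marked & dfa["F"])      # empty iff no accept state reachable

# Example: the only accept state "dead" is unreachable, so L(A) is empty.
empty_dfa = {
    "q0": "s",
    "delta": {("s", "0"): "s", ("s", "1"): "s", ("dead", "0"): "dead"},
    "F": {"dead"},
}
```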

Theorem

EQDFA={A,BAEQ_{DFA}=\{\langle A,B\rangle|A and BB are DFAs and L(A)=L(B)}L(A)=L(B)\} is decidable.

Proof:

FF = "On input A,B\langle A,B\rangle, where AA and BB are DFAs:

  1. Construct DFA CC, where

    L(C)=(L(A)L(B))(L(A)L(B))L(C)=(L(A)\cap\sim L(B))\cup(\sim L(A)\cap L(B))

    L(C)L(C) is called the symmetric difference of L(A)L(A) and L(B)L(B).

  2. Run the EDFAE_{DFA} decider on C\langle C\rangle; if it accepts (i.e., L(C)=L(C)=\emptyset, so L(A)=L(B)L(A)=L(B)), accept. Otherwise reject."
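
The construction of CC and the emptiness test can be fused into one reachability search over the product automaton. A sketch, again assuming a hypothetical dict encoding and a shared alphabet:

```python
def dfas_equivalent(A, B, alphabet):
    """Decider for EQ_DFA: explore the product DFA C whose accept condition
    is the symmetric difference L(A) XOR L(B); the DFAs are equivalent iff
    no reachable product state exhibits a difference."""
    start = (A["q0"], B["q0"])
    seen, stack = {start}, [start]
    while stack:
        qa, qb = stack.pop()
        # C "accepts" here iff exactly one of A, B accepts.
        if (qa in A["F"]) != (qb in B["F"]):
            return False                # L(C) nonempty, so L(A) != L(B)
        for a in alphabet:
            nxt = (A["delta"][(qa, a)], B["delta"][(qb, a)])
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return True                         # no difference reachable

# Two different-looking DFAs for "even number of 1s" over {0,1}.
M1 = {"q0": "e",
      "delta": {("e", "0"): "e", ("e", "1"): "o",
                ("o", "0"): "o", ("o", "1"): "e"},
      "F": {"e"}}
# M2 also tracks the (irrelevant) parity of 0s: states 0..3 = (p0, p1).
M2 = {"q0": 0,
      "delta": {(0, "0"): 1, (1, "0"): 0, (2, "0"): 3, (3, "0"): 2,
                (0, "1"): 2, (2, "1"): 0, (1, "1"): 3, (3, "1"): 1},
      "F": {0, 1}}
```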


Theorem

ACFG={G,wGA_{CFG}=\{\langle G,w\rangle|G is a CFG that generates string w}w\} is decidable.

Proof:

SS = "On input G,w\langle G,w\rangle where GG is a CFG and ww is a string:

  1. Convert GG to an equivalent grammar in Chomsky normal form.
  2. List all derivations with 2n12n-1 steps, where nn is the length of ww, except if n=0n=0, then check if we have the rule SεS\rightarrow\varepsilon.
  3. If any of these derivations generate ww, accept; if not, reject."

Notice that if a grammar in Chomsky normal form generates a string of length n1n\geq 1, then every derivation of that string takes exactly 2n12n-1 steps, so checking derivations of that length suffices.
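
Instead of listing all 2n12n-1-step derivations, a practical decider for grammars already in Chomsky normal form is the CYK dynamic-programming algorithm; it answers the same membership question. A sketch with a hypothetical rule encoding (a rule is either a terminal string or a pair of variables):

```python
def cyk_accepts(grammar, start, w):
    """CYK decider for A_CFG on a grammar in Chomsky normal form.
    `grammar` maps each variable to a list of rules; a rule is a terminal
    string or a (Var, Var) tuple (hypothetical encoding)."""
    n = len(w)
    if n == 0:                          # empty string needs S -> epsilon
        return "" in grammar.get(start, [])
    # T[(i, j)] = set of variables that derive the substring w[i:j]
    T = {}
    for i in range(n):                  # length-1 substrings: terminal rules
        T[(i, i + 1)] = {A for A, rules in grammar.items() if w[i] in rules}
    for length in range(2, n + 1):      # longer substrings: binary rules
        for i in range(n - length + 1):
            j = i + length
            T[(i, j)] = set()
            for k in range(i + 1, j):   # split point
                for A, rules in grammar.items():
                    for rule in rules:
                        if (isinstance(rule, tuple)
                                and rule[0] in T[(i, k)]
                                and rule[1] in T[(k, j)]):
                            T[(i, j)].add(A)
    return start in T[(0, n)]

# CNF grammar for {0^n 1^n | n >= 1}:
#   S -> XY | XB,  B -> SY,  X -> 0,  Y -> 1
g = {"S": [("X", "Y"), ("X", "B")],
     "B": [("S", "Y")],
     "X": ["0"], "Y": ["1"]}
```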

Theorem

ECFG={GGE_{CFG}=\{\langle G\rangle|G is a CFG and L(G)=}L(G)=\emptyset\} is decidable.

Proof:

RR = "On input G\langle G\rangle where GG is a CFG:

  1. Mark all terminal symbols in GG.
  2. Repeat until no new variables get marked:
  3. Mark any variable AA where GG has a rule AU1U2...UkA\rightarrow U_1U_2...U_k and each symbol U1,...,UkU_1,...,U_k has already been marked.
  4. If the start variable is not marked, accept; otherwise reject."
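
The fixed-point marking of generating variables is easy to sketch. The rule encoding below (bodies as lists of symbols; a terminal is any symbol that is not a grammar key) is an assumption for illustration:

```python
def cfg_language_empty(grammar, start):
    """Decider for E_CFG: mark the variables that can generate some
    terminal string, as a fixed point. `grammar` maps each variable to a
    list of bodies; each body is a list of symbols."""
    variables = set(grammar)
    marked = set()                      # terminals count as already marked
    changed = True
    while changed:
        changed = False
        for A, bodies in grammar.items():
            if A in marked:
                continue
            for body in bodies:
                # A gets marked once every symbol in some body is marked.
                if all(s in marked or s not in variables for s in body):
                    marked.add(A)
                    changed = True
                    break
    return start not in marked          # empty iff start variable unmarked

# Example: S -> A B, A -> "a", B -> B "b"; B never terminates, so L(G) = {}.
g = {"S": [["A", "B"]], "A": [["a"]], "B": [["B", "b"]]}
```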

Theorem

EQCFG={A,BA,BEQ_{CFG}=\{\langle A,B\rangle|A,B are CFGs and L(A)=L(B)}L(A)=L(B)\} is NOT decidable.

The proof requires techniques we have not developed yet, so we omit it here.

Theorem

Every context-free language is decidable.

Proof: Let GG be a CFG for the context-free language AA. Then on input string ww, we run the ACFGA_{CFG} decider on G,w\langle G,w\rangle and output its answer.


# The halting problem

Theorem

ATM={M,wMA_{TM}=\{\langle M,w\rangle| M is a TM and MM accepts w}w\} is undecidable.

Proof: We assume to the contrary that ATMA_{TM} is decidable and has a decider HH:

H(M,w)={acceptif  M  accepts  wrejectif  M  does  not  accept  wH(\langle M,w\rangle)=\begin{cases}accept&if\;M\;accepts\;w\\ reject&if\;M\;does\;not\;accept\;w \end{cases}

Now we construct a new TM:

DD = "On input M\langle M\rangle, where MM is a TM:

  1. Run HH on input M,M\langle M,\langle M\rangle\rangle.
  2. If HH accepts, then reject. If HH rejects, then accept."

Now what would happen if we run DD on input D\langle D\rangle?

D(D)={acceptif  D  does  not  accept  Drejectif  D  accepts  DD(\langle D\rangle)=\begin{cases}accept&if\;D\;does\;not\;accept\;\langle D\rangle\\ reject&if\;D\;accepts\;\langle D\rangle\end{cases}

This is a contradiction.

So HH does not exist, and ATMA_{TM} is undecidable.


Theorem

A language is decidable if and only if it is Turing-recognizable and co-Turing-recognizable (its complement is Turing-recognizable).

Proof:

One direction: if AA is decidable, we can easily see that both AA and its complement Aˉ\bar{A} are Turing-recognizable.

The other direction: if both AA and Aˉ\bar{A} are Turing-recognizable, we let M1M_1 be the recognizer for AA and M2M_2 for Aˉ\bar{A}. And we construct a Turing machine:

MM = "On input ww:

  1. Run both M1M_1 and M2M_2 on input ww in parallel (two tapes).
  2. If M1M_1 accepts, accept; if M2M_2 accepts, reject."
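
On a single machine, "run in parallel" becomes dovetailing: give each recognizer ii steps, for increasing ii. A toy sketch, with the step-bounded simulators assumed (they stand in for step-limited runs of M1M_1 and M2M_2):

```python
from itertools import count

def decide(accepts1_within, accepts2_within, w):
    """Decider built from a recognizer for A and one for its complement.
    accepts*_within(w, steps) are assumed step-bounded simulators; exactly
    one of the two recognizers eventually accepts w, so the loop halts."""
    for i in count(1):
        if accepts1_within(w, i):   # M1 accepts => w in A
            return True
        if accepts2_within(w, i):   # M2 accepts => w in complement of A
            return False

# Toy pair: A = even-length strings, complement = odd-length strings;
# each simulated run "needs" len(w) steps.
even = lambda w, steps: len(w) <= steps and len(w) % 2 == 0
odd  = lambda w, steps: len(w) <= steps and len(w) % 2 == 1
```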

Theorem

ATMˉ\bar{A_{TM}} is not Turing-recognizable.

Proof: We already know that ATMA_{TM} is Turing-recognizable but not decidable. If ATMˉ\bar{A_{TM}} were also Turing-recognizable, the previous theorem would make ATMA_{TM} decidable. So ATMˉ\bar{A_{TM}} is NOT Turing-recognizable.

# Reducibility

# Undecidable problems from language theory

Theorem

HALTTM={M,wMHALT_{TM}=\{\langle M,w\rangle|M is a TM and MM halts on input w}w\} is undecidable.

Proof: Assume to the contrary that HALTTMHALT_{TM} is decidable, with decider RR. Then we can construct a decider for ATMA_{TM}:

SS = "On input M,w\langle M,w\rangle, an encoding of a TM and a string:

  1. Run RR on input M,w\langle M,w\rangle.
  2. If RR rejects, reject.
  3. If RR accepts, simulate MM on ww until it halts.
  4. If MM accepts, then accept; otherwise reject."

Since ATMA_{TM} is undecidable, RR cannot exist, so HALTTMHALT_{TM} is undecidable.

Theorem

ETM={MME_{TM}=\{\langle M\rangle|M is a TM and L(M)=}L(M)=\emptyset\} is undecidable.

Proof:

Like the last theorem, we can prove that, if we have the decider for ETME_{TM}, then ATMA_{TM} will be decidable too.

We assume to the contrary, RR is a decider for ETME_{TM}.

We construct a decider for ATMA_{TM}:

SS = "On input string M,w\langle M,w\rangle where MM is a Turing machine and ww is a string:

  1. Construct a new Turing machine MM':

    MM' = "On input string xx:

    1. If xwx\neq w, reject.
    2. If x=wx=w, run MM on ww; if MM accepts, accept."
  2. Run RR on input M\langle M'\rangle.

  3. If RR accepts, reject; otherwise accept."

Remark: SS never actually runs MM on ww; it only constructs the description M\langle M'\rangle and passes it to RR.

Obviously, L(M)L(M')\neq\emptyset if and only if MM accepts ww.

Theorem

REGULARTM={MMREGULAR_{TM}=\{\langle M\rangle|M is a TM and L(M)L(M) is a regular language}\} is undecidable.

Proof: We let RR be a decider that decides REGULARTMREGULAR_{TM}. Then we can construct a decider for ATMA_{TM}:

SS = "On input M,w\langle M,w\rangle, where MM is a TM and ww is a string:

  1. Construct the following TM MM':

    MM' = "On input xx:

    1. If xx is of the form 0n1n0^n1^n, accept.
    2. If xx does not have the form, run MM on input ww and accept if MM accepts ww."
  2. Run RR on input M\langle M'\rangle.

  3. If RR accepts, accept; otherwise reject."

Remark: Again, SS never actually runs MM on ww; it only constructs the description M\langle M'\rangle and passes it to RR.

Obviously, if MM accepts ww, then L(M)=ΣL(M')=\Sigma^*, because MM' accepts every input xx. If MM does not accept ww, then L(M)={0n1n}L(M')=\{0^n1^n\}, since MM' then accepts only strings of the form 0n1n0^n1^n. So L(M)L(M') is regular \Leftrightarrow MM accepts ww.

So if RR accepts M\langle M'\rangle, then MM accepts ww, and SS decides ATMA_{TM}, a contradiction.

Theorem

EQTM={M1,M2M1,M2EQ_{TM}=\{\langle M_1,M_2\rangle|M_1,M_2 are TMs and L(M1)=L(M2)}L(M_1)=L(M_2)\} is undecidable.

Proof:

If we have the decider RR for EQTMEQ_{TM}, then we can construct the decider for ETME_{TM}:

SS = "On input M\langle M\rangle, where MM is a TM:

  1. Run RR on M,M1\langle M,M_1\rangle, where M1M_1 is a Turing machine that rejects every input.
  2. If RR accepts, accept; otherwise reject."

Obviously, L(M)=L(M)=L(M1)L(M)=\emptyset\Leftrightarrow L(M)=L(M_1).

Theorem

ALLCFG={GGALL_{CFG}=\{\langle G\rangle|G is a CFG and L(G)=Σ}L(G)=\Sigma^*\} is undecidable.

# Mapping reducibility

Definition

A function f:ΣΣf:\Sigma^*\rightarrow\Sigma^* is a computable function if some Turing machine MM, on every input ww, halts with just f(w)f(w) on its tape.

Definition

Language AA is mapping reducible to language BB, written AmBA\leq_m B, if there is a computable function f:ΣΣf:\Sigma^*\rightarrow\Sigma^*, where for every ww:

wAf(w)Bw\in A\Leftrightarrow f(w)\in B

The function ff is called the reduction of AA to BB.


Theorem

If AmBA\leq_m B and AA is undecidable, then BB is undecidable.

Proof:

Suppose, for contradiction, that BB is decidable with decider RR. Then on any input ww we can compute f(w)f(w) and run RR on it; since wAf(w)Bw\in A\Leftrightarrow f(w)\in B, this decides AA, contradicting the undecidability of AA. So BB must be undecidable.
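
The proof is essentially one line of glue code: a decider for BB plus a computable reduction ff yields a decider for AA. A toy sketch (the particular AA, BB, and ff below are assumptions for illustration):

```python
def make_decider_for_A(f, decide_B):
    """If A <=_m B via computable f and B is decidable, then so is A:
    decide whether w is in A by computing f(w) and asking B's decider.
    (Contrapositive: if A is undecidable, no such decide_B can exist.)"""
    return lambda w: decide_B(f(w))

# Toy instance: A = binary strings with an even number of 1s,
# B = even natural numbers; f maps a string to its count of 1s.
f = lambda w: w.count("1")            # a computable reduction
decide_B = lambda n: n % 2 == 0       # decider for B
decide_A = make_decider_for_A(f, decide_B)
```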