Non-regular Languages

Reading

Section 1.4

Practice problems (page 85): 1.29, 1.30, 1.47, 1.48, 1.49

Challenge: 1.44, 1.45, 1.54, 1.55, 1.56, 1.63

Non-regular Languages

It is important to understand the limitations of any system we use, and this includes finite automata. So a natural question there is: Can we easily say if a language is regular or not?

A language is called non-regular, if there is no DFA/NFA that recognizes it.

So how can we tell if a language is non-regular? We can show a language is regular by providing a DFA/NFA for it, but we cannot say a language is non-regular simply because we have not been able to find a DFA for it.

Our inability to find a proof for a statement is not a proof that the statement is false.

We therefore need something more concrete. But before we do that, let us state a question that we will return to at a later time:

Is there a process/algorithm/method, which given a language can determine if that language is regular or non-regular?

Let’s attempt a partial answer:

Let us leave it at that for now.

But let us exemplify some of our difficulties with the following two examples:

The language \[C = \left\{w\mid w\textrm{ has equal number of 0s and 1s }\right\}\] is not regular. But the language \[D = \left\{w\mid w\textrm{ has equal number of 01s and 10s }\right\}\] in fact is regular.

Take a moment to ponder why that is.

The Pumping Lemma for regular languages

In this section we will establish a nice relatively easy-to-test requirement for a language to be regular. This would allow us to exhibit many languages as non-regular. It is known as the pumping lemma. It essentially says that if the language has a string of a certain length, then we can produce infinitely many strings of a certain form that must also be in the language.

In this section we will make extensive use of the notation \(|x|\) to denote the length of the string \(x\).

Pumping Lemma for regular languages.

If \(A\) is a regular language, then there is a number \(p\), called the pumping length, so that if \(s\) is any string in \(A\) with length at least \(p\), then we can break \(s\) in three parts as \[s = xyz\] so that:

  1. for each \(i\geq 0\) we have that \(xy^iz\in A\),
  2. \(|y| > 0\) (\(y\) is nonempty),
  3. \(|xy| \leq p\).

Here’s the idea of the proof:

Now let us look at a formal proof:

Question: Can it happen that \(x\) or \(z\) is empty? Can they both be empty?

Using the pumping lemma

Let us see how the pumping lemma can help us show that a language is non-regular. As an example we will take the language \(A=\{ 0^n1^n\mid n\geq 0 \}\) described earlier. The idea applies in many situations:

  1. Suppose \(A\) was regular. Then it would have a pumping length \(p\).
  2. Consider an appropriate string from the language. In this case we will consider the string \(s=0^p1^p\).
  3. Apply the pumping lemma: This string \(s\) can be written as \(s=xyz\) with the aforementioned conditions.
  4. The condition \(|xy|\leq p\) can help us say something about \(y\). In this case \(y\) must consist of all zeros.
  5. The fact that the strings \(xy^iz\) must all be in the language usually leads to a contradiction now. In our case, \(xy^2z\) has more zeros than \(xyz\), but the same number of ones. That is not possible given that \(xy^z\) is supposed to also be in \(A\).
  6. The fact that we arrived at a contradiction tells us that the language \(A\) must not be regular.

Note that the exact same proof can be used also for the language \(C = \left\{w\mid w\textrm{ has equal number of 0s and 1s }\right\}\).

The book has a lot of excellent examples. Study them!