The last post was about strings, so now it’s the right time to
learn something something about **languages** too.

From the theoretical point of view a *language* is a set of words or more
precisely sentences (strings). Possibly and usually an infinite set. Only the
sentences that are a part the set belong to the language, nothing else. Neat
feature of languages is that they actually interconnect two separate formal
theories/models into one beautiful concept. Let me explain.

Language is a **set**. As in mathematical
set. So anything that
applies to sets applies to languages as well (like union, intersection,
complement, Cartesian product). Members of any language are only
**strings**. And
there are some operations defined over strings (that is concatenation,
reversal, sub-strings etc.). We can take advantage of this fact and abstract
those, so they work with language as well. There is generally **a lot** you can
do with a formal language either from the set or string point of view.

**Definition 1.** A **formal language** *L* over an alphabet Σ is a subset of
Σ^{*}. Formally, L ⊆ Σ^{*}. The magical Σ^{*} denotes
a set of all existing words and sentences over Σ. Sometimes it’s called an
alphabet iteration.

Like I said earlier, all operations defined for sets work with languages as
well. For example **language union** L_{1} ∪ L_{2} =
{s; s ∈ L_{1} ∨ s ∈ L_{2}} etc. Even more interesting are the
string operations that can be abstracted so they work on languages too. There
is **language concatenation**:

**Definition 2.** Let L_{1} be a language over Σ_{1} and
L_{2} be a language over Σ_{2}. **Concatenation** of these
two L = L_{1}⋅_{2} = {xy; x ∈ L_{1}, y ∈ L_{2}}
is a language over alphabet Σ = Σ_{1} ∪ Σ_{2}.

Language concatenation as well as the string equivalent is not commutative operation. There is a couple of special cases:

- L ⋅ {ε} = L
- L ⋅ ∅ = ∅

The first, {ε} is a language with a single word – an empty string, in the
second case we’re talking about an empty set. Concatenation can be looked at a
sort of multiplication between the two languages (remember Cartesian product?).
As a result of this thought, we define **language power**.

**Definition 3.** Let *L* be a language over Σ, then *Ln* denotes n-power of
language and

- L
_{0}= {ε} - L
_{n+1}= L_{n}⋅L

With language power defined, we can advance to the last operation called
**language iteration**. It’s basically an abstraction of language power.

**Definition 4**. L+ denotes positive iteration of language L defined
accordingly

- L
^{+}= L ∪ L_{2}∪ … ∪ L_{∞}

**Definition 5**. L* denotes iteration of language L defined accordingly

- L
^{*}= {ε} ∪ L ∪ L_{2}∪ … ∪ L_{∞}

## Summary

In this post were introduced some of the most basic and most common operations
that are defined on languages. The most important thing to remember is, that
language is a **set** of **sentences** (or words) and the operations that we
define originate from these two areas. We have some set operations like
**union**, **intersection**, **complement** as for generic sets. And we have
abstractions for
string operations like **concatenation** (notice, that it’s
somewhat similar to cartesian product), **language power** and **iteration**.
Anyway, these couple of operations is nowhere near to the complete list of all
existing functions for languages! After all, if you dive into some algebraic
structures a little you can easily define your own :-).