Easy Foundations for Programming Languages IX — Universal Algebra and Algebraic Data Types

In this article, we introduce a new mathematical framework which is simpler than typed lambda calculus, called universal algebra (or equational logic). This framework will be useful when we model imperative programs in the next article. In universal algebra, we consider types as sets, and focus on studying the algebraic datatype.

An algebraic datatype consists of one or more sets of values, such as natural numbers, booleans, characters or strings, together with a collection of functions on these sets. The adjective “algebraic” restricts that none of these functions should have function type parameters.

Unlike typed lambda calculus, which has its root in logic, universal algebra comes from the set theory. Although these two mathematical frameworks have different origins, they are like two man looking at the exact same “mountain” just from different angles. As a result, you may find many of the concepts that will be covered today have their counterparts in our previous articles, with only the expression style changed!

The denotational semantics of algebraic datatype is given by algebras, and this is where we begin today’s discussion from.

Algebras, Signatures and Terms

Algebras

An algebra consists of one or more sets, called carriers, together with a collection of distinguished elements and first-order (or algebraic) functions.

\[f:A_1 \times ... \times A_k \rightarrow A\]

over carriers of the algebra.

First-order or algebraic simply means that the function cannot accept functions as parameters.
The word “first-order” comes from the first-order logic, where quantification over predicates is not allowed. (See Easy Foundations for Programming Languages II — Notation and Mathematical Conventions.) And predicates can be regarded as functions. The word “algebraic” has a similar meaning in mathematics.

An example is the algebra

\[\mathcal{N} = \langle \mathbb{N},0,1,+,\ast \rangle ,\]

with carrier $\mathbb{N}$ the set of natural numbers, distinguished elements $0,1 \in \mathbb{N}$, and functions $+,\ast:\mathbb{N} \times \mathbb{N} \to \mathbb{N}$.

The difference between a “distinguished” element of a carrier and any other element is that we have a name for each distinguished element in the language of algebraic terms. To simplify, we will consider distinguished elements like $0,1 \in \mathbb{N}$ as “zero-ary” functions, or “functions of zero arguments.” This is not a deep idea at all, just a convenient way to treat distinguished elements and function uniformly.

An example with more than one carrier is the algebra

\[\mathcal{A}_{pcf} = \langle \mathbb{N},\mathbb{B},0,1,...,+,true,false,Eq?,... \rangle ,\]

where $\mathbb{N}$ is the set of natural numbers, $\mathbb{B}$ is the set of booleans, $0,1,…$ are natural numbers, $+$ is the addition function, and so on. These are the mathematical values that we usually think of when we write basic numeric and boolean expressions of PCF.

Syntax of Algebraic Terms

The syntax of algebraic terms depends on the basic symbols and their types. This information is collected together in a signature. A signature $\sigma=\langle S, F \rangle$ consists of

A set $S$ whose elements are called sorts.
A collection $F$ of pairs $\langle f,s_1 \times … \times s_k \to s \rangle$, with $s_1 , … , s_k,s \in S$ and no $f$ occuring in two distinct pairs.

In algebra, is traditional to call the basic type symbol sorts. Because in algebraic datatype theory, people differentiate type from sort by saying that “a type comes equipped with specific operations, while a sort does not.”
This distinction is actually consistent with the terminology we used in PCF articles, since function types and product types come equipped with specific operations, namely, lambda abstraction, application, pairing and projection.

In the definition of signature, a symbol $f$ occuring in $F$ is called a typed function symbol (or function symbol for short), and an expression $s_1 \times … \times s_k \to s$ a first-order (or algebraic) function type over $\boldsymbol{S}$. We usually write $f:\tau$, instead of $\langle f,\tau \rangle \in F$.

A sort is a name for a carrier of an algebra, and a function symbol $f:s_1 \times … \times s_k \to s$ is a name for a $k$-ary function. We allow $k=0$, so that a “function symbol” $f:s$ may be a name for an element of the carrier of sort $s$. A symbol $f:s$ is called a constant symbol, and $f:s_1 \times … \times s_k \to s$ with $k \geq 1$ is called a proper function symbol.

As an example, a simple signature for writing natural number expression $\Sigma_\mathcal{N} = \langle S,F \rangle$ may be written in the following format.

sorts: $\:\mathbb{N}$
fctns: $\:0,1 \in \mathbb{N}$
$\quad \quad \: \:+,\ast : \mathbb{N} \times \mathbb{N} \to \mathbb{N}$

The purpose of a signature is to allow us to write algebraic terms. We assume we have some infinite set $\mathcal{V}$ of symbols we called variables, and assume that these are different from all the symbols we will ever consider using as constants, function symbols, or sort symbols. A sort assignment is a finite set

\[\Gamma = \{ x_1:s_1,...,x_k:s_k \}\]

of ordered $variable:sort$ pairs.

Given a signature $\Sigma = \langle S,F \rangle$ and sort assignment $\Gamma$ using sorts from $S$, we define the set $Terms^s(\Sigma,\Gamma)$ of algebraic terms of sort $\boldsymbol{s}$ over signature $\boldsymbol{\Sigma}$ and variables $\boldsymbol{\Gamma}$ as follows.

\[\begin{split} x \in Terms^s(\Sigma,\Gamma) \text{ if }& x:s \in \Gamma,\\ f \: M_1 \:... \: M_k \in Terms^s(\Sigma,\Gamma) \text{ if }& f:s_1 \times ... \times s_k \to s \text{ and}\\ & M_i \in Terms^{s_i}(\Sigma,\Gamma) \text{ for } i=1,..,n. \end{split}\]

In the special case $k=0$, the second clause should be interpreted as

\[f \in Terms^s(\Sigma,\Gamma) \text{ if } f:s \in \Gamma.\]

For example, $0 \in Terms^{nat}(\Sigma_\mathcal{N},\emptyset)$ and $+01$ is a term in $Terms^{nat}(\Sigma_\mathcal{N},\emptyset)$, since $+$ is a binary numeric function symbol in $\Sigma_\mathcal{N}$. It is also easy to see that $+01 \in Terms^{nat}(\Sigma_\mathcal{N},\Gamma)$ for any sort assignment $\Gamma$.

To match the syntax of PCF, we may treat $0+1$ as syntactic sugar for $+01$.

We use the notation $\Gamma,x:s’$ for the sort assignment

\[\Gamma,x:s' = \Gamma \cup \{x:s'\}\]

where $x$ does not occur in $\Gamma$.

In algebra, the substitution $[N/x]M$ of $N$ for all occurrences of $x$ in $M$ is accomplished by straightforward replacement of each occurrence of $x$ by $N$, since there are no bound variables in algebraic terms. It is easy to see that if we replace a variable by a term of the correct sort, we obtain a well-formed term of the same sort we started with. This is stated more precisely in the following lemma.

(Lemma. Substitution) If $M\in Terms^s(\Sigma,\Gamma,x:s’)$ and $N\in Terms^{s’}(\Sigma,\Gamma)$, then $[N/x]M\in Terms^s(\Sigma,\Gamma)$.

(Proof.) The proof is by induction on the structure of terms in $Terms^s(\Sigma,\Gamma,x:s’)$.

Since every term is either a variable or the application of a function symbol to one or more arguments, induction on algebraic terms has one base condition (variables) and one induction step (function symbols). In the base case, we must prove the lemma directly. In the induction step, we assum that the lemma holds for all the subterms. This assumption is called the induction hypothesis.

For the base case, we consider a variable $y \in Terms^s(\Sigma,\Gamma,x:s’)$. If $y$ is different from $x$, then $[N/x]y$ is just $y$. In this case, we must have $y:s \in \Gamma$, and hence $y \in Terms^s(\Sigma,\Gamma)$. If our term is the variable $x$ itself, then the result $[N/x]x$ is the term $N$, which belongs to $Terms^s(\Sigma,\Gamma)$ by similar assumption. This proves the base case of the induction.

For the induction step, we assume that $[N/x]M_i\in Terms^{s_i}(\Sigma,\Gamma)$, for $1 \leq i \leq k$, and consider a term of the form $fM_1,..,M_k$. The result of substitution $N$ for $x$ in this term may be written $f[N/x]M_1,..,[N/x]M_k$. Since we assume $fM_1,..,M_k\in Terms^{s}(\Sigma,\Gamma,x:s’)$, the function symbol must have type $s_1 \times … \times s_k \to s$. It follows that $f[N/x]M_1,..,[N/x]M_k \in Terms^s(\Sigma,\Gamma)$, which finishes the induction step and proves the lemma. $\blacksquare$

Algebras and the Interpretation of Terms

Algebras are mathematical structures that provide meaning (denotational semantics) for algebraic terms. To make the mapping from symbols to algebras explicit, we will use a form of algebra that gives a value to every sort and function symbol of some signature.

If $\Sigma$ is a signature, then a $\boldsymbol{\Sigma}$-algebra $\boldsymbol{\mathcal{A}}$ consists of

Exactly one carrier $\mathcal{A}^s$ for each sort symbol $s \in S$,
An interpretation map $\mathcal{I}$ assigning a function $\mathcal{I}(f): A^{s_1} \times...\times A^{s_k} \to A^s$ to each proper function symbol $f:s_1 \times … \times s_k \to s \in F$ and an element $\mathcal{I}(f) \in A^s$ to each constant symbol $f:s \in F$.

If $\Sigma$ has only one sort, then we say that $\Sigma$-algebra is single-sorted; otherwise, we say an algebra is many-sorted or multi-sorted.

If $\mathcal{A} = \langle \{ A^s \}_{s \in S}, \mathcal{I} \rangle$ is a $\Sigma$-algebra, and $f$ is a function symbol of $\Sigma$, it is often convenient to write $f^\mathcal{A}$ for $\mathcal{I}(f)$. Using this convention, we may rewrite the $\Sigma_{\mathcal{N}}$-algebra $\mathcal{N}$ as

\[\mathcal{N} = \langle \mathbb{N}, 0^\mathcal{N},1^\mathcal{N} , +^\mathcal{N}, \ast^\mathcal{N}\rangle.\]

This gives us all the information we need to interpret terms over $\Sigma_\mathcal{N}$, since the interpretation map $\mathcal{I}$ is given by the way the functions of the algebra are named.

We define an environment $\eta$ for $\mathcal{A}$ as a mapping $\eta:\mathcal{V} \to \cup_s A^s$ from variables to elements of the carriers of $\mathcal{A}$. We say an environment $\eta$ satisfies $\Gamma$ if $\eta(x) \in A^s$ for each $x:s \in \Gamma$.

Here, $\Gamma$ is more like variable declarations, and $\eta$ represents the actual values assigned to these variables.

Given an environment $\eta$ for $\mathcal{A}$ that satisfies $\Gamma$, we define the meaning $\mathcal{A}[\![ x ]\!] \eta$ of any $M \in Terms(\Sigma, \Gamma)$ in $\eta$ as follows

\[\begin{split} \mathcal{A}[\![ x ]\!] \eta &= \eta(x)\\ \mathcal{A}[\![ fM_1 \:... M_k ]\!] \eta &= f^\mathcal{A}(\mathcal{A}[\![ M_1 ]\!] \eta, ..., \mathcal{A}[\![ M_k ]\!] \eta). \end{split}\]

It is common to omit $\mathcal{A}$ and write $\mathcal{A}[\![ x ]\!] \eta$ if the algebra $\mathcal{A}$ is clear from context. If $M$ has no variables, then $\mathcal{A}[\![ x ]\!] \eta$ does not depend on $\eta$. In this case, we may omit the environment and write $\mathcal{A}[\![ x ]\!]$ for the meaning of $M$ in algebra $\mathcal{A}$.

Proof System for Algebras

The axiomatic semantics of an algebraic datatype is given by a set of equations between terms over a signature. Together, a signature and set of equations is called an algebraic specification. Based on equations, we can build a proof system to get some useful properties, just like what we have done in Easy Foundations for Programming Languages VIII — Proof Systems.

From an algebraic specification, we may either use the equational proof system to derive additional equations between terms, or ask which algebras satisfy the requirements imposed by these equations. These two are closely related — the equation we derive from a specification should hold in any algebra that satisfies the specification. This property is called soundness of the algebraic proof system.

The converse is that every equation that holds in all algebras satisfying the specification is provable from the specification. This is called completeness.

First, let us look at some concepts of equations in universal algebras.

Equations

An equation is a formula $M=N[\Gamma]$ with $M,N \in Terms^s(\Sigma, \Gamma)$ for some $s$. Note that an equation has three parts: two terms and a set of typed variables. If $\eta$ satisfies $\Gamma$, then we say that an algebra $\mathcal{A}$ satisfies $M=N[\Gamma]$ at environment $\eta$, written

\[\mathcal{A}, \eta \vDash M=N[\Gamma],\]

whenever $[\![ M ]\!] \eta = [\![ N ]\!] \eta$.

If $\mathcal{A}$ satisfies an equation at $\eta$, we also say the equation holds at $\eta$. For terms containing variables, we are usually more interested in whether an equation holds for all values of the variables than at a specific environment. We say $\mathcal{A}$ satisfies $M=N[\Gamma]$, and write

\[\mathcal{A} \vDash M=N[\Gamma],\]

if $\mathcal{A}, \eta \vDash M=N[\Gamma]$ for every $\eta$ satisfying $\Gamma$.

Satisfication may also be extended to sets of equations or algebras. If $\mathcal{E}$ is a set of $\Sigma$-equations, then $\mathcal{A}$ satisfies $\mathcal{E}$ if $\mathcal{A}$ satisfies every equations in $\mathcal{E}$. Similarly, if $\mathcal{C}$ is a class of algebras, then $\mathcal{C} \vDash M=N[\Gamma]$ if we have $\mathcal{A} \vDash M=N[\Gamma]$ for every $\mathcal{A} \in \mathcal{C}$.

Algebraic Specification and Semantic Implication

A pair $Spec=\langle \Sigma, \mathcal{E} \rangle$ consisting of a signature $\Sigma$ and a set $\mathcal{E}$ of $\Sigma$-equations is called an algebraic specification.

We generally think of an algebraic specification $Spec=\langle \Sigma, \mathcal{E} \rangle$ as specifying a set of algebras, namely, the set of $\Sigma$-algebras that satisfy $\mathcal{E}$. This leads us to the notion of semantic implication.

A set $\mathcal{E}$ of equations over signature $\Sigma$ semantically implies another $\Sigma$-equation $M = N[\Gamma]$, written

\[\mathcal{E} \vDash M = N[\Gamma]\]

if every $\Sigma$-algebra $\mathcal{A}$ satisfying $\mathcal{E}$ also satisfies the equation $M = N[\Gamma]$. It is easy to see that the equations that hold in all algebras satisfying a specification $Spec=\langle \Sigma, \mathcal{E} \rangle$ are the $\Sigma$-equations semantically implied by $\mathcal{E}$.

A set of equations closed under semantic implication is called a theory. More precisely, a set $\mathcal{E}$ of equations is called a semantic theory if $\mathcal{E} \vDash M = N[\Gamma]$ implies $M = N[\Gamma] \in \mathcal{E}$. The theory $Th(\mathcal{A})$ of an algebra $\mathcal{A}$ is the set of all equations which hold in $\mathcal{A}$. It is easy to verify that the theory of an algebra is a semantic theory.

Proof System

Some properties of equality are “universal” and do not depend on particular properties of algebras. We select these properies as axioms and inference rules of our proof system as listed below. Considering that we have already seen them in typed lambda calculus, the detail descriptions are omitted.

\[\phantom{\stackrel{\text{def}}{=}g} M = M[\Gamma] \tag*{($ref$)}\] \[\frac{\phantom{\stackrel{\text{def}}{=}} M = N[\Gamma]}{N = M[\Gamma]} \tag*{($sym$)}\] \[\frac{\phantom{\stackrel{\text{def}}{=}} M = N[\Gamma], \: N = P[\Gamma]}{ N = P[\Gamma]} \tag*{($trans$)}\] \[\frac{\phantom{\stackrel{\text{def}}{=}} M = N[\Gamma]}{ M = N[\Gamma,x:s]} \quad x \text{ not in } \Gamma \tag*{($add$ $var$)}\] \[\frac{\phantom{\stackrel{\text{def}}{=}} M = N[\Gamma,x:s], \: P = Q[\Gamma]}{ [P/x]M = [Q/x]N[\Gamma]} \tag*{($subst$)}\]

We say $M = N[\Gamma]$ is provable from a set $\mathcal{E}$ of equations, and write

\[\mathcal{E} \vdash M=N[\Gamma]\]

if we can derive $M=N[\Gamma]$ from the equations in $\mathcal{E}$ and instances of the axiom ($ref$) using the inference rules ($sym$), ($trans$), ($subst$), and ($add$ $var$).

More formally, a proof of $E$ from $\mathcal{E}$ is a sequence of equations such that each equation is either an axiom, an equation from $\mathcal{E}$, or follows from one or more equations that occur earlier in the sequence by a single inference rule.

If $\mathcal{E}$ is closed under provability, then we say $\mathcal{E}$ is a syntactic theory. Put another way, $\mathcal{E}$ is a syntactic theory if $\mathcal{E} \vdash M=N[\Gamma]$ implies $M=N[\Gamma] \in \mathcal{E}$.

If $\mathcal{E}$ is any set of equations, the syntactic theory of $\mathcal{E}$, written $Th(\mathcal{E})$, is the set of all equations provable from $\mathcal{E}$.

There are two main theorems of this equational proof system. First one says that the proof system is sound, which means that if we can prove an equation $E$ from a set $\mathcal{E}$ of equational hypotheses, then $\mathcal{E}$ semantically implies $E$. And the second one says that the proof system is complete, which has the converse meaning of soundness. Combining these two theorems, we can know that syntactic and semantic theories are identical!

(Theorem 1. Soundness) If $\mathcal{E} \vdash M=N[\Gamma]$, then $\mathcal{E} \vDash M=N[\Gamma]$.

(Proof) We prove soundness by induction on the length of proof. More specifically, suppose there is a proof of $E$ from $\mathcal{E}$. We show then $\mathcal{E} \vDash E$ by induction on the length of the proof of $E$ from $\mathcal{E}$.

The base case of the induction of length 1, which is either an axiom or an equation from $\mathcal{E}$. In either case, it is easy to see that any algebra satisfying $\mathcal{E}$ must satisfy this equation as well.

For the induction step, we assume that $E$ follows from equations $E_1,…,E_n$ by a single proof rule, and $E_1,…,E_n$ are provable by shorter proofs. By the inductive hyphthesis, we may assume that if $\mathcal{A} \vDash \mathcal{E}$, then $\mathcal{A}$ satisfies $E_1,…,E_n$. It therefore suffices to show that if $\mathcal{A}$ satisfies the antecedents of the last rule used, then $\mathcal{A}$ satisfies $\mathcal{E}$. This leads us to a case analysis on the set of proof rules. In the following, we will give the proof for the substitution rule and other rules can be proved similarly.

We assume $\mathcal{A} \vDash M=N[\Gamma,x:s]$ and $\mathcal{A} \vDash P=Q[\Gamma]$. We must show that $\mathcal{A} \vDash [P/x]M = [Q/x]N[\Gamma]$. To do this, we let $\eta$ be any environment satisfying $\Gamma$. We must show that $[\![[P/x]M]\!]\eta = [\![[Q/x]N]\!]\eta$.

Let $a = [\![P]\!]\eta = [\![Q]\!]\eta$, and note that $\eta[x \mapsto a]$ satisfies $\Gamma,x:s$.

We use the notation $\eta[x \mapsto a]$ for the environment that is identical to $\eta$ on every variable except $x$, and maps $x$ to $a$.

Then we have

\[[\![[P/x]M]\!]\eta = [\![M]\!](\eta[x \mapsto a])\]

and similarly

\[[\![[Q/x]M]\!]\eta = [\![N]\!](\eta[x \mapsto a]) .\]

But since $\mathcal{A} \vDash M=N[\Gamma]$, we also have

\[[\![M]\!](\eta[x \mapsto a]) = [\![N]\!](\eta[x \mapsto a]) .\]

It follows that $[\![[P/x]M]\!]\eta$ and $[\![[Q/x]M]\!]\eta$ are equal, which proves the theorem. $\blacksquare$

(Theorem 2. Completeness) If $\mathcal{E} \vDash M=N[\Gamma]$, then $\mathcal{E} \vdash M=N[\Gamma]$.

(Proof) Suppose the equation $M=N[\Gamma]$ is not provable from $\mathcal{E}$. We will prove this theorem by showing that there is an algebra satisfying $\mathcal{E}$ but not satisfying this equation.

Since the proof of this theorem involves more in-depth concepts, we will not expand on them here. To see the full proof, readers may refer to the Chapter 3.4.5 Congruence, Quotients, and Deductive Completeness of the book Foundations for Programming Languages, John C. Mitchell. $\blacksquare$

Easy Foundations for Programming Languages IX — Universal Algebra and Algebraic Data Types

Algebras, Signatures and Terms

Algebras

Syntax of Algebraic Terms

Algebras and the Interpretation of Terms

Proof System for Algebras

Equations

Algebraic Specification and Semantic Implication

Proof System

Further Reading

Easy Foundations for Programming Languages VII — Simply-Typed Lambda Calculus

Easy Foundations for Programming Languages VIII — Proof Systems

Easy Foundations for Programming Languages X — Imperative Programs