Easy Foundations for Programming Languages IV — PCF Semantics

In the last article, we have seen all of the structure of PCF, now we will take a step back and discuss the meaning, or semantics, in a general way. We want our discussion can be applied to a variety of programming languages.

The three forms of semantics we will consider are axiomatic semantics, given by a proof system, operational semantics arising from a set of reduction rules and denotational semantics. Each form of semantics has advantages and disadvantages for understanding properties of programs. In addition, there are standard connections between the three forms of semantics that should hold for any programming language. We will discuss them at the end of this article.

PCF Programs

In any programming language, not all well-formed syntactic entities can be executed or evaluated. Therefore, we distinguish programs from other syntactic forms that may be used only as parts of programs. A program has two characteristics: (1) a program should not refer to any undeclared or unbound variables; (2) a program should have the appropriate type or form to yield a printable value or observable effect.

PCF has two syntactic categories: types and terms (expressions that have types). In PCF, we consider closed boolean and numeric expressions programs, because we can expect the interpreter to directly print their values. If we give an open natural number term such as $x+5$ to an interpreter, there is no correct value since the value to $x$ is not given.

Recall: “closed” means that the expression does not contain free variables.

In addition, we cannot expect the value of a function expression to be printed with the same degree of accuracy as a natural number or boolean. No interpreter can print function expressions in a way that would tell us exactly which mathematical function is defined by the function expression typed in. This is because, which we will explain in future articles, equality for partial recursive functions is undecidable and every partial recursive function is definable in PCF. For this reason, we say that natural number and boolean values are observable, but $nat \rightarrow nat$ values are not.

Formally put, in PCF, we say $\tau$ is an observable type if $\tau$ is either $nat$ or $bool$. A PCF program is a well-formed, closed term of observable type.

One potentially confusing aspect of this definition is that it does not consider input or output; it might be more accurate to call these “programs that require no additional input and produce one output.” This allows us to more easily compare axiomatic, operational, and denotational semantics.

Another general term we use in comparing semantics is result. Intuitively, a result is an observable effect of evaluating or executing a program. For PCF, we make this precise by saying that a result is a closed normal form of observable type. In other words, the PCF results are the numerals $0,1,2,…$ and boolean constants $true$ and $false$.

Recall: “normal form” means that the expression cannot be simplified further.

With this terminology, now we can give a general definition of semantics: a semantics of programs is a relation between programs and results.

Axiomatic Semantics

In general, an axiomatic semantics consists of a proof system for deducing properties of programs and their parts. These properties may be equations, assertions about the output of programs, or other properties.

There are three general properties of axiomatic semantics that hold for PCF. The first is that the axiomatic semantics defines program behavior in some way. The other two are relations between the axiomatic and operational or denotational semantics.

The axiomatic semantics determine the result (or output or observable effect, in general) of any program that has one.
When two expressions are equivalent, according to the axiomatic semantics, we may safely substitute one for the other in any program without changing the operational semantics of that program. This could be called “soundness of the axiomatic semantics with respect to the operational semantics.”
The axiomatic semantics are sound with respect to the denotational semantics. Specifically, if we can prove any pair of PCF terms equal, then these terms must have the same denotation, regardless of which values we give to their free variables.

The axiomatic semantics of PCF is given by the proof system shown in this table.

In the table, congruence means that provable equality is an equivalence relation (reflexive, symmetric and transitive), and that equality is preserved if we replace any subexpression by an equivalent one.

All the congruence rules except two about “Functions” are in fact redundant!
For example, suppose we can prove $M=N$ and $P=Q$, for four natural number expressions $M,N,P,Q$, then we can derive $M+N=P+Q$ as follows.
By the reflexivity axiom, we have the equation
$\lambda x:nat.\lambda y:nat.x+y = \lambda x:nat.\lambda y:nat.x+y.$
Therefore, by the congruence rule for function application, we can prove
$(\lambda x:nat.\lambda y:nat.x+y)\:M = (\lambda x:nat.\lambda y:nat.x+y)\:N.$
Then, by axiom scheme $(\beta)$ and transitivity,
$\lambda y:nat.M+y = \lambda y:nat.N+y.$
Repeating the application, $(\beta)$, and transitivity, we may complete the proof of $M+N=P+Q$.

It is worth mentioning that although the axiomatic semantics is powerful enough to determine the meaning of programs, this proof system is not as powerful as one might initially expect. For example, we cannot even prove that addition is commutative. Nor can we prove many interesting equivalences between recursive functions.

A binary operation is commutative if changing the order of the operands does not change the result.

To adapt these properties, the most natural approach would be to add induction rules. Induction on natural numbers would let us prove commutativity quite easily, and a form of induction called “fixed-point induction” would allow us to prove a great many more equations between recursively defined functions.

Denotational Semantics

The PCF denotational semantics assigns a natural number value to each expression of type $nat$, a boolean value to each expression of type $bool$, a mathematical function to any function expression, and a pair of values to each expression of cartesian product type. The mathematical value of an expression is called its denotation.

Notice that if a term has free variables, its denotation will generally depend on the values assumed for the free variables.

Denotation

We give denotations to terms by first choosing a set of values for each type. In the standard semantics, the set of mathematical values for type $nat$ will include all of the natural numbers, plus a special element $\perp_{nat}$ representing nontermination.

This nontermination value is needed since we have PCF expressions of type $nat$ such as

\[\text{letrec } f(x:nat):nat = f(x+1) \text{ in } f(0)\]

which do not terminate under the standard interpreter, and do not denote any standard natural number.

For a similar reason, the set of denotations of type $bool$ includes $true$, $false$, and an extra value $\perp_{bool}$.

The mathematical values of a product type $\sigma \times \tau$ are ordered pairs, as you would expect. The mathematical values of a type $\sigma \rightarrow \tau$ are functions from $\sigma$ to $\tau$. If $\sigma = \tau$, then this function has a fixed point (We will explain why in the future aritcles).

Environment

Once we have chosen a set of values for each type, we assign meanings to terms by choosing an environment, which is a mapping from variables to values.

If $M$ is an expression, and $\eta$ is an environment, then we use the symbol $[\![ M ]\!] \eta$ to represent the meaning of term $M$ in environment $\eta$. For example, the meaning $[\![ x ]\!] \eta$ is the mathematical value given to variable $x$ by the environment; the meaning of an application $[\![ M \: N ]\!] \eta$ is obtained by applying the function $[\![ M ]\!] \eta$ denoted by $M$ to the argument $[\![ N ]\!] \eta$ denoted by $N$.

Properties

There are two properties of denotational semantics that generally hold. The first is an intrinsic property of denotational semantics that distinguishes it from other forms of semantics.

The denotational semantics is compositional, which means that the meaning of any expression is determined from the meanings of its subexpressions. For example,

\[[\![ \text{if } B \text{ then } M \text{ else } N ]\!] \eta = \begin{cases} [\![ M ]\!] \eta & \text{if } [\![ B ]\!] \eta \text{ is true,}\\ [\![ N ]\!] \eta & \text{if } [\![ B ]\!] \eta \text{ is false,}\\ \perp & \text{otherwise.} \end{cases}\]

Here, $\perp$ is used as the meaning of the expression in the case that evaluation of the test $B$ does not terminate.

If we can prove the same assertions about $M$ and $N$ in the axiomatic semantics, then $M$ and $N$ must have the same meaning in the denotational semantics. This is called soundness, for an equational proof system.
It follows from soundness, by connections between the axiomatic and operational semantics of PCF, that if $M \twoheadrightarrow N$, then $M$ and $N$ must have the same meaning in the denotational semantics.

Operational Semantics

The most common mathematical presentation of operational semantics are proof systems. They can be used either for deducing the final result of evaluation or for transforming an expression through a sequence of steps. In this series, we concentrate on this form of operational semantics.

There are other ways to give an operational semantics. One can define an abstract machine, which is a theoretical computing machine that evaluates programs by progressing through a series of machine states. This way provides more insight into practical implementation because it is close to the structure of interpreters and compilers.

The operational semantics of PCF are given by reduction axioms listed in this table.

The reduction axioms are written with the symbol $\rightarrow$ instead of $=$, to emphasize the direction of reduction. Most of the rules have the “feel” of program execution — it seems that we are making progress towards a simpler expression in some way.

We may define an evaluation partial function from the reduction system by $eval(M) = N$ iff $M$ may be reduced to normal form $N$ in zero or more steps.

A partial function from a set $X$ to a set $Y$ is a function from a subset of $X$ (possibly the whole $X$ itself) to $Y$. We say the evaluation function is partial because some expression may not be able to be evaluated to a normal form.

Equivalence Relations Defined by Each Form of Semantics

We may summarize the basic connections between axiomatic, operational and denotational semantics by comparing the equivalence relations defined by each one.

Equivalence Definitions

For the axiomatic semantics, which is a logical system for deriving equations, the obvious equivalence relation is the provable equality. We will discuss axiomatic equivalence in detail in a later article Easy Foundations for Programming Languages VIII — Proof Systems

For the denotational semantics, we consider the relation of denotational equivalence: two terms are denotationally equivalent if they have the same denotation (for any association of values to free variables).

For the operational semantics, things get more complicated. We say that two programs of PCF, or similar language, are operationally equivalent if they have the same value under the operational semantics. In symbolic form, program $M$ and $N$ are operationally equivalent if $eval(M) \simeq eval(N)$.

The “Kleene equation” $eval(M) \simeq eval(N)$ means that either $M$ and $N$ both evaluate to the same term, or neither evaluation is defined.

However, this equivalence relation is only defined on programs, not on arbitrary terms. We extend this relation to terms that may have free variables or nonobservable types using an important syntactic notion called a context.

A context $\mathcal{C}[\:\:]$ is a term with a “hole” in it. An example is the context

\[\mathcal{C}_0[\:\:] \:\stackrel{\text{def}}{=} \: \lambda x:nat.x + [\:\:]\]

If we insert a term into a context, then this is done without renaming bound variables. For example, $\mathcal{C}_0[x]$ is $\lambda x:nat.x + x$.

We can think of context as an incomplete program, sitting in the buffer of a text editor. Inserting a term into a context corresponds to using the text editor to fill in the rest of the program. A special case is the empty context $[\:\:]$, which corresponds to a text editor containing no program at all.

Using contexts, we define operational equivalence on arbitrary terms as follows. Terms $M$ and $N$ of the same type are operationally equivalent if, for every context $\mathcal{C}[\:\:]$ such that $\mathcal{C}[M]$ and $\mathcal{C}[N]$ are programs, we have $eval(\mathcal{C}[M]) \simeq eval(\mathcal{C}[N])$.

In the literature, operational equivalence is sometimes called “observational equivalence” or “observational congruence”.

Equivalence Comparison

If we write $M =_{ax} N$ for provable equality in the axiomatic semantics, $M =_{den} N$ for denotational equivalence, and $M =_{op} N$ for operational equivalence, then the minimal requirement on the three semantics, called adequacy or computational adequacy, is

\[(\forall \: \: programs \: M) \: \: (\forall \: \: results \: N) \quad M =_{ax} N \text{ iff } M =_{den} N \text{ iff } M =_{op} N.\]

If computational adequacy holds, one can develop correct programs using algebra and analysis, completely ignoring the details of the operational semantics.

We also expect that for arbitrary terms, the axiomatic semantics are sound for the denotational semantics, and the denotational semantics are sound for the operational semantics. These can be written as the following inclusions between relations on terms:

\[=_{ax} \: \subseteq \: =_{den} \: \subseteq \: =_{op}.\]

Note that the adequacy is for programs, and the above inclusion is for terms. Be careful not to confuse them.

In general, operational equivalence is the coarsest of the three, i.e., more terms are operationally equivalent than denotationally equivalent or provably equivalent. This is not an accident fact about operational semantics, but a consequence of the way that operational equivalence is defined.

For example, if two terms $M$ and $N$ are not operationally equivalent in PCF, then there is some context $\mathcal{C}[\:\:]$ with $\mathcal{C}[M]$ and $\mathcal{C}[N]$ reducing to different numerals or boolean constants. Consequently, if we equate $M$ and $N$ in the axiomatic semantics, we would have an inconsistent proof system.

A proof system is inconsistent if every equation $M=N$ between well-formed expressions $M$ and $N$ of the same type is provable. For example, if $true = false$ is provable from the axioms and inference rules of PCF, we can deduce that every equation $M:\sigma=N:\sigma$ holds, such as $1=2$, resulting in an inconsistent proof system.

Similar reasoning applies to the denotational semantics, so in general $=_{ax}$ and $=_{den}$ cannot be any coarser than $=_{op}$.

A denotational semantics with $M =_{den} N \text{ iff } M =_{op} N$, for arbitrary terms, is called fully abstract. A fully abstract denotational semantics may be very useful, since reasoning about the denotational semantics therefore allow us to reason about $=_{op}$. This is important because $=_{op}$ is generally difficult to reason about, yet it is the most useful form of equivalence for program optimization or transformation. However, it is generally a difficult mathematical problem to construct fully abstract denotational semantics.