Denotational Semantics of Typed Lambda Calculus IV — CPO Model for PCF

In this article, we consider the domain-theoretic semantics of PCF. This provides some insight into the properties of CPOs, and also provides the basis for semantic reasoning about PCF.

We show soundness of the PCF equational axioms system, and therefore the reduction rules, for our CPO model, called $\mathcal{A}_{\text{PCF}}$. It is impossible for the PCF equational proof system to be complete for proving all equations between PCF terms that hold in PCF. The reason is that, as shown in Easy Foundations for Programming Languages V — PCF Iteration and Recursion, every partial recursive function is definable in PCF.

By soundness, and the fact that all the numerals denote distinct elements of the model, two expressions of type $nat \to nat$ will be equal in PCF iff they operationally define the same partial recursive function. Since equality between partial recursive functions is not recursively enumerable, no recursive axiom system could be complete for $\mathcal{A}_{\text{PCF}}$.

CPO Model for PCF

The model $\mathcal{A}_{\text{PCF}}$ is the full continuous hierarchy over $\mathcal{A}_{\text{PCF}}^{nat} = \mathbb{N}_\perp$ and $\mathcal{A}_{\text{PCF}}^{bool} = \mathbb{B}_\perp$, with constants of PCF interpreted as described below.

Since the type constants of PCF are interpreted as pointed CPOs, and cartesian product and continuous function constractors preserve pointedness, all of the type of PCF are interpreted as pointed CPOs. This gives us least-fixed-point operators at all types, allowing us to interpret the fixed-point constants of PCF according to Lemma 1 in the last article.

We interpret constants $0, 1, 2, …$ and $true,false$ as the standard natural numbers and boolean elements of the lifted set $\mathbb{N}_\perp$ and $\mathbb{B}_\perp$. If we choose continuous functions for basic PCF operations $+$, $Eq?$ and conditional, then every term of PCF will have a meaning in $\mathcal{A}_{\text{PCF}}$. It follows that every term in PCF has a fixed point.

We interpret $+$ and $Eq?$ in $\mathcal{A}_{\text{PCF}}$ as the lifted versions, $+_\perp$ and $Eq?_\perp$, of the standard functions. In other words, we interpret $+$ as the extension of addition that is strict in both arguments, so that for every $x \in \mathcal{A}_{\text{PCF}}^{nat}$,

\[\perp_{nat} + x = x + \perp_{nat} = \perp_{nat}\]

We treat the PCF equality test similarly, so that the PCF expression $Eq? \: M \: N$ has value $true$ if $M$ and $N$ denote the same element of $\mathbb{N}_\perp$ different from $\perp$, value $false$ if $M$ and $N$ denote different non-$\perp$ elements of $\mathbb{N}_\perp$, and value $\perp$ if either $M$ and $N$ denotes $\perp$.

The interpretation of conditional in $\mathcal{A}_{\text{PCF}}$ is a little subtle. You may guess that we could also interpret conditional as a strict function, with value $\perp$ of the appropriate type whenever any of its arguments is $\perp$ of some type. However, this does not work, since in the case that $M$ denotes $\perp$ but $N$ does not, we would have

\[\text{if } false \text{ then } M \text{ else } N \:=\:\: \perp,\]

which contradicts an equational axiom of PCF.

Recall that we have the following equational axioms for conditional discussed in Easy Foundations for Programming Languages III — The Language PCF & Its Syntax.
$\begin{split} (\text{if } true \text{ then } M \text{ else } N) = M \\ (\text{if } false \text{ then } M \text{ else } N) = N \end{split}$

An interpretation of conditional that satisfies the equational axioms of PCF is the following.

\[\mathcal{A}_{\text{PCF}} [\![\text{if } P \text{ then } M \text{ else } N]\!]\eta = \begin{cases} \mathcal{A}_{\text{PCF}} [\![M]\!]\eta & \text{if } \mathcal{A}_{\text{PCF}} [\![P]\!]\eta = true \\ \mathcal{A}_{\text{PCF}} [\![N]\!]\eta & \text{if } \mathcal{A}_{\text{PCF}} [\![P]\!]\eta = false \\ \perp & \text{otherwise} \end{cases}\]

Since the equational axioms of PCF only mention conditional expressions when the first argument is $true$ or $false$, it is easy to see that the axioms are satisfied. This completes the definition of $\mathcal{A}_{\text{PCF}}$.

Now we have the following immediate consequences of the general soundness theorem for Henkin models.

(Theorem 1.) Let $M$ and $N$ be expressions of PCF over typed variables from $\Gamma$. If $\Gamma \: \triangleright \: M=N :\sigma$ is provable from the axioms for PCF, then the CPO model $\mathcal{A}_\text{PCF}$ satisfies the equation $\Gamma \: \triangleright \: M=N :\sigma$.

(Corollary 1.) If $\Gamma \: \triangleright \: M :\sigma$ is a well-typed term of PCF, and $M \twoheadrightarrow N$, then the CPO model $\mathcal{A}_\text{PCF}$ satisfies the equation $\Gamma \: \triangleright \: M=N :\sigma$.

These soundness results show that the denotational semantics given by $\mathcal{A}_\text{PCF}$ has the minimal properties required of any denotational semantics. Specifically, if we can prove $M=N$, or reduce one term to the other, then these two terms will have the same meaning in $\mathcal{A}_\text{PCF}$. It follows that if two terms have different meanings, then we cannot prove them equal or reduce one to the other. Therefore, we can use $\mathcal{A}_\text{PCF}$ to reason about unprovability or the non-existence of reductions.

Examples

In this section, we will use two examples to illustrate how to work out the meaning of some expressions in $\mathcal{A}_\text{PCF}$ and make some comments about the difference between the least fixed point of a function and other fixed points.

Ex.1 Factorial Function

Recall that the factorial function may be written $fact \: \stackrel{\text{def}}{=} \: fix_{nat \to nat} \: F$, where $F$ is the expression

\[F \: \stackrel{\text{def}}{=} \: \lambda f:nat \rightarrow nat .\: \lambda y:nat .\: (\text{if } Eq? \: y \: 0 \text{ then } 1 \text{ else } y*f\:(y-1)).\]

We will work out the meaning of this closed term assuming that $\ast$ and $-$ denote multiplication and subtraction functions which are strict extensions of the standard ones. In other words, we assume that multiplication and subtraction have their standard meaning on non-least elements of $\mathbb{N}_\perp$, but have value $\perp$ if either of their arguments is $\perp$.

Following the definition of $fix$ in a CPO, we can see that the meaning of factorial is the least upper bound of a directed set. Specifically, $\mathcal{A}_\text{PCF}[\![fact]\!]$ is the least upper bound $\bigvee \{ \bar{F}^n(\perp) \mid n \geq 0 \}$, where $\bar{F} = \mathcal{A}_\text{PCF}[\![F]\!]$. We will work out the first few cases of $\bar{F}^n(\perp)$ to understand what this means.

Since all of the lambda terms involved are closed, we will dispense with the formality of writing $\mathcal{A}_\text{PCF}[\![\:]\!]$ and use lambda terms as names for elements of the appropriate domains.

\[\begin{split} \bar{F}^0 (\perp) &= \perp_{nat \to nat} \\\\ \bar{F}^1 (\perp) &= \lambda y:nat.\: \text{ if } Eq? \: y \: 0 \text{ then } 1 \text{ else } y*(\bar{F}^0 (\perp))\:(y-1) \\ &= \lambda y:nat.\: \text{ if } Eq? \: y \: 0 \text{ then } 1 \text{ else } y*(\perp_{nat \to nat})\:(y-1) \\ &= \lambda y:nat.\: \text{ if } Eq? \: y \: 0 \text{ then } 1 \text{ else } \perp_{nat} \\\\ \bar{F}^2 (\perp) &= \lambda y:nat.\: \text{ if } Eq? \: y \: 0 \text{ then } 1 \text{ else } y*(\bar{F}^1 (\perp))\:(y-1) \\ &= \lambda y:nat.\: \text{ if } Eq? \: y \: 0 \text{ then } 1 \text{ else } y\:*\\ &\quad \quad (\lambda x:nat.\: \text{ if } Eq? \: x \: 0 \text{ then } 1 \text{ else } \perp_{nat})\:(y-1) \\ &= \lambda y:nat.\: \text{ if } Eq? \: y \: 0 \text{ then } 1 \text{ else } y\:*\\ &\quad \quad (\text{ if } Eq? \: (y-1) \: 0 \text{ then } 1 \text{ else } \perp_{nat}) \\ \end{split}\]

Continuing in this way, we can see that $\bar{F}^n (\perp)$ is the function which computes $y!$ whenever $0 \leq y < n$ and has value $\perp_{nat}$ otherwise. Since $\perp_{nat}$ represents “undefined,” this means that $\bar{F}^n (\perp)$ is defined for $0 \leq y < n$ and undefined otherwise.

A property of this collection of functions is that for a particular argument $y \neq \perp$, there are two possible values for $\bar{F}^n (\perp) \: y$ — if $n \leq y$, then the value is $\perp$, while if $n > y$, the value is $y!$.

As spelled out in the statement of Lemma 4 in the CPO article, the least upper bound of $\{ \bar{F}^n (\perp) \mid n \geq 0 \}$ is the function mapping any $y \in \mathbb{N}_\perp$ to the least upper bound of the set $\{ \bar{F}^n (\perp)\:y \mid n \geq 0 \}$. The set contains only $\perp$, if $y = \perp$, and has two elements $\perp$ and $y!$ otherwise. Therefore $\mathcal{A}_\text{PCF}[\![fact]\!]$ is the function from $\mathbb{N}_\perp$ to $\mathbb{N}_\perp$ which maps $\perp$ to $\perp$, and any other $y \in \mathbb{N}$ to $y!$.

Such strictness of factorial agrees with our computational intuition and experience, since we do not expect any computation of $fact$ $M$ to terminate if the expression $M$ cannot be simplified to a numeral. But in the case $M$ reduces to a numeral for $y$, we can reduce $fact$ $M$ to the numeral for $y!$.

In general, the least fixed point of a function is the fixed point that is “computed in practice.” Later we will formally show that the PCF operational and denotational semantics coincide. Simply put, the least fixed point of a function $f$ has all the properties of $fix$ $f$ that we can determine by applying $fix$-reduction some number of times.

Ex.2 Function with Many Fixed Points

This example illustrates the difference between the least fixed point of a function and other fixed points.

Consider the function $F:(nat \to nat) \to (nat \to nat)$ defined by

\[F \: \stackrel{\text{def}}{=} \: \lambda f:nat \rightarrow nat .\: \lambda x:nat .\: (\text{if } Eq? \: x \: 1 \text{ then } 1 \text{ else } f\:(x-2))\]

where subtraction $x-y$ yields $0$ if $y \geq x$.

In contrast to the function used to define factorial, we will see that $F$ has many fixed points. The reason is that $F$ is a function mapping even arguments to $\perp$. This allows us to find greater fixed points (in the point-wise order) which map even arguments to natural numbers other than $\perp$.

Before discussing alternate fixed points further, we will work out the meaning of $fix$ $F$.

As in factorial, we let $\bar{F} = \mathcal{A}_\text{PCF}[\![F]\!]$. The least fixed point of $\bar{F}$ is the least upper bound of the set $\{ \bar{F}^n (\perp) \mid n \geq 0 \}$. The simplest way to understand this set is to begin by working out the first few functions.

\[\begin{split} \bar{F}^0(\perp) &= \perp_{nat \to nat} \\\\ \bar{F}^1(\perp) &= \lambda x:nat .\: \text{if } Eq? \: x \: 1 \text{ then } 1 \text{ else } (\bar{F}^0(\perp))\:(x-2)\\ &= \lambda x:nat .\: \text{if } Eq? \: x \: 1 \text{ then } 1 \text{ else } \perp_{nat} \\\\ \bar{F}^2(\perp) &= \lambda x:nat .\: \text{if } Eq? \: x \: 1 \text{ then } 1 \text{ else } (\bar{F}^1(\perp))\:(x-2)\\ &= \lambda x:nat .\: \text{if } Eq? \: x \: 1 \text{ then } 1 \text{ else } \\ & \quad \quad (\lambda y:nat .\: \text{if } Eq? \: y \: 1 \text{ then } 1 \text{ else } \perp_{nat})\:(x-2)\\ &= \lambda x:nat .\: \text{if } Eq? \: x \: 1 \text{ then } 1 \text{ else } \\ & \quad \quad (\text{if } Eq? \: (x-2) \: 1 \text{ then } 1 \text{ else } \perp_{nat})\\ \end{split}\]

Continuing in this way, we can see that $\bar{F}^n (\perp)$ is the function which maps any odd $x < 2n$ to $1$ and has value $\perp_{nat}$ otherwise. The least upper bound $f$ of all such functions is the function mapping all odd $x$ to $1$, and any other natural number to $\perp$.

This corresponds to the fact that if we try to compute the value of $(fix \: F) \: x$ by reduction, we will succeed after a finite number of reduction steps iff $x$ is odd. If $x$ is even, we can reduce the expression indefinitely, but will never produce a numeral.

It is not hard to see that the fixed points of $F$ above are precisely the functions $g:nat \to nat$ satisfying the two conditions:

\[\begin{split} g(1) &= 1\\ g(x+2) &= g(x) \end{split}\]

Since these two equations do not determine a value for even $x$, any function mapping all even $x$ (and $0$) to some number $n$ will be a fixed point of $F$. All of these alternatives are greater than $f$ since any natural number $n$ is greater than $\perp$ in the ordering on the lifted CPO of natural numbers. However, none of these alternative fixed points is the function computed by reduction.

Fixed-Point Induction

The equational proof system for PCF is sound for the CPO model and other semantic models of PCF. It is adequate to specify the result of any program — if $P:nat$ is closed, and $P$ denotes a number $n$ in the CPO model $\mathcal{A}_\text{PCF}$, then we can prove $P=n$.

However, the equational proof system is not very powerful when it comes to proving equations between parts of programs, such as terms defining functions. This is illustrated in the example below.

In this section, we study an extension to the equational proof system that has been found, in practice, to be sufficient for proving many equations between PCF terms.

Ex. 3 Equation of Function Terms

If $f:D \to D$ and $g:D \to D$ are continuous functions on some CPO $D$, then it is not hard to see that $fix \: (f \circ g) = f\: (fix \: (f \circ g))$. The reason is that

\[\begin{split} fix \: (f \circ g) &= \bigvee \{ (f \circ g)^i\:(\perp) \mid i \geq 0 \} \\ &= \bigvee \{ (f \circ g)\:(\perp), (f \circ g \circ f \circ g)\:(\perp), (f \circ g \circ f \circ g \circ f \circ g)\:(\perp),... \} \\ &= \bigvee \{ (f \circ (g \circ f)^i)\:(\perp) \mid i \geq 0 \} \\ &= f\: (fix \: (f \circ g)) \end{split}\]

by monotonicity, general facts about least upper bound, and the associativity of composition.

In the rest of this example, we show that there is no equational proof of $fix \: (f \circ g) = f\: (fix \: (f \circ g))$, treating $f$ and $g$ as variables of type $\sigma \to \sigma$, for any type $\sigma$, using only the equational proof system of PCF.

There are two facts about PCF:

The reduction rules of PCF are confluent;
The reduction rules of PCF are exactly directed versions of the equational axioms, plus symmetry, transitivity and congruence rules.

Based on these, we know that the equation $fix \: (f \circ g) = f\: (fix \: (f \circ g))$ is provable iff both terms reduce to a common term. However, an easy induction shows that any term obtained by reducing $fix \: (f \circ g)$ will have an equal number of $f$’s and $g$’s, while terms obtained from $f\: (fix \: (f \circ g))$ will have more $f$ than $g$. Therefore, we cannot prove $fix \: (f \circ g) = f\: (fix \: (f \circ g))$ using the equational proof system.

Extended Proof System

The extended proof system is based on the CPO interpretation of terms. In addition to equations, we will use formulas which assert that the meaning of one term is an approximation of another. More specifically, we use assertions of the form $M \leq N$ or, written out with type assignments, $\Gamma \: \triangleright \: M \leq N : \sigma$. To be precise, an approximation $\Gamma \: \triangleright \: M \leq N : \sigma$ is satisfied at environment $\eta \vDash \Gamma$ for CPO model $\mathcal{A}$ if $\mathcal{A}[\![M]\!]\eta \leq \mathcal{A}[\![N]\!]\eta$.

The proof system given in this section is sound for deriving assertions that hold in all CPO models. A common way of reading $M \leq N$ is, “$M$ is an approximation of $N$.”

One obvious inference rule is

\[\frac{\phantom{\stackrel{\text{def}}{=}}\Gamma \: \triangleright \: M = N : \sigma}{\phantom{qq}\Gamma \: \triangleright \: M \leq N : \sigma} \tag*{($eq$)}\]

which allows us to derive approximations from equations. This rule may be used to derive versions of the $(\beta)$ and $(\eta)$ axioms using approximation in place of equality.

A “converse” to this rule is

\[\frac{\Gamma \: \triangleright \: M \leq N : \sigma, \phantom{\stackrel{\text{def}}{=}} \Gamma \: \triangleright \: N \leq M : \sigma}{\Gamma \: \triangleright \: M = N : \sigma} \tag*{($asym$)}\]

which is sound since a partial order is anti-symmetric. This rule lets us derive equations from approximations.

Since $\leq$ is transitive, we also have the rule

\[\frac{\Gamma \: \triangleright \: M \leq N : \sigma,\phantom{\stackrel{\text{def}}{=}} \Gamma \: \triangleright \: N \leq P : \sigma}{\Gamma \: \triangleright \: M \leq P : \sigma \phantom{g}} \tag*{($trans$)}\]

If we add a constant $\perp_\sigma$ to the language, for each type $\sigma$, then we write the axiom scheme

\[\Gamma \: \triangleright \: \perp_\sigma \leq M : \sigma \tag*{($bot$)}\]

since $\perp_\sigma$ is the least element of type $\sigma$. The tag “bot” stands for the “bottom” element.

We also know that the least element of a function type is the function mapping each argument to $\perp$. This gives us the equational axiom scheme

\[\Gamma \: \triangleright \: \perp = \lambda x:\sigma.\: \perp : \sigma \to \tau \tag*{($botf$)}\]

We also have a congruence rule for each syntactic form. For application, the rule

\[\frac{\Gamma \: \triangleright \: M_1 \leq M_2 : \sigma \to \tau,\phantom{\stackrel{\text{def}}{=}} \Gamma \: \triangleright \: N_1 \leq N_2 : \sigma}{\Gamma \: \triangleright \: M_1\:N_1 \leq M_2\:N_2 : \tau\phantom{g}} \tag*{($acong$)}\]

tells us that every continuous function is monotonic. The congruence rule for lambda abstraction

\[\frac{\phantom{\stackrel{\text{def}}{=}}\Gamma,\: x:\sigma \: \triangleright \: M \leq N : \tau}{\Gamma \: \triangleright \: \lambda x:\sigma .\:M\leq \lambda x:\sigma .\:N : \sigma \to \tau\phantom{g}} \tag*{($fcong$)}\]

is sound since we order functions point-wise.

The final rule is an induction rule for fixed points. From the equational axiom for fixed points, we can prove that $f\:(fix \: f) \leq fix \: f$, and similar properties that follow from $fix \: f$ being a fixed point of $f$. However, in the standard equational proof system, there is no obvious way to make use of the fact that $fix \: f$ is the least fixed point of $f$. With approximations instead of equations, we could write a rule

\[\frac{\Gamma \: \triangleright \: M\:N = N : \sigma}{\Gamma \: \triangleright \: fix\: M \leq N : \sigma}\]

saying that $fix \: M$ is an approximation to any fixed point of $M$. This follows from the more powerful rule of “fixed point induction”, also called Scott induction after Dana Scott.

In stating the rule of fixed point induction, we write $\Phi \vdash A$ to mean that the equation or approximation $A$ is provable from the set $\Phi$ of equations and approximations using the axioms and inference rules given here, combined with the equational axioms and inference rules of typed lambda calculus with fixed points and any additional equational hypotheses (such as axioms for algebraic datatypes).

If $A$ has the form $\Gamma, \: x:\sigma \: \triangleright \: M \leq N : \tau$, then we write $[P/x]A$ for the result $\Gamma \: \triangleright \: [P/x]M \leq [P/x]N : \tau$ of removing variable $x$ from the type assignment and substituting $P$ for $x$ in both terms, assuming $\Gamma \: \triangleright \: P:\sigma$. If $A$ is an equation, we define $[P/x]A$ similarly.

Using this notation, the rule to find the fixed point is written as follows.

\[\frac{\Phi \vdash [\perp/x]A,\quad \Phi,\:[c/x]A \vdash[F(c)/x]A}{\Phi \vdash [fix \: F/x]A} \: \text{ constant } c \text{ not in } \Phi \tag*{($fpfind$)}\]

If we think of $A$ as a way of saying that the variable $x$ has some property, then $[\perp/x]A$ says that this property holds for $\perp$. The second hypothesis, $\Phi,\:[c/x]A \vdash[F(c)/x]A$, is a way of saying that if this property holds for some arbitrary $c$. then it holds for $F(c)$. If follows that this property holds for every element of $\{ F^n(\perp) \mid n \geq 0\}$, by induction on $n$. Using the fact that the value of any term depends continuously on each of its free variables, we can easily verify that the property $A$ holds for the least upper bound of this set, namely $fix \: F$. This gives us the conclusion of the fixed point induction rule.

A feature of fixed-point induction is that we reason implicitly about sets of form $\{ F^i(\perp) \mid i \geq 0\}$ without introducing natural numbers into the formal system. While the soundness proof for fixed-point induction uses induction on the integers, the proof system only uses formulas such as approximation or equality.

Ex 4. Fixed-Point Induction Example

To illustrate the use of fixed-point induction, we will show that if $N$ is a fixed point of $M$, then $fix \:M \leq N$.

We assume $\Gamma\: \triangleright \: M\:N = N : \sigma$ (according to the definition of fixed point), which gives us

\[\Gamma\: \triangleright \: M\:N \leq N : \sigma\phantom{g}\]

since equality implies approximation. This will be useful at a later step of the proof.

Let us now think about how we can use the fixed point induction rule. The conclusion of the rule has the form $[fix \: F/x]A$, which matches the statement we wish to prove if we take

\[A \:\equiv \: \Gamma, \: x:\sigma \: \triangleright \: x \leq N:\sigma,\]

with $x$ not free in $N$, and let $F$ be $M$. Reading the fixed point induction rule from the bottom up, we can see that it suffices to prove

\[[\perp/x]A \:\equiv \:\Gamma \: \triangleright \: \perp \leq N:\sigma,\]

which is an axiom, and to show that we can prove

\[[M\:c/x]A \:\equiv \:\Gamma \: \triangleright \: M\:c \leq N:\sigma\]

if we take $[c/x]A \:\equiv \:\Gamma \: \triangleright \: c \leq N:\sigma$ as an additional hypothesis.

If we begin with $c \leq N$, then by monotonicity (the “congruence rule” for application), we have

\[\Gamma\: \triangleright \: M\:c \leq M\:N : \sigma \phantom{\stackrel{\text{def}}{=}g}\]

But since we have already proved $M \: N \leq N$ above, this gives us

\[\Gamma\: \triangleright \: M\:c \leq N : \sigma \phantom{\stackrel{\text{def}}{=}g}\]

which is exactly what we need in order to finish the proof by fixed point induction. $\blacksquare$

Denotational Semantics of Typed Lambda Calculus IV — CPO Model for PCF

CPO Model for PCF

Examples

Ex.1 Factorial Function

Ex.2 Function with Many Fixed Points

Fixed-Point Induction

Ex. 3 Equation of Function Terms

Extended Proof System

Ex 4. Fixed-Point Induction Example

Further Reading

Easy Foundations for Programming Languages VII — Simply-Typed Lambda Calculus

Easy Foundations for Programming Languages VIII — Proof Systems

Easy Foundations for Programming Languages IX — Universal Algebra and Algebraic Data Types