Why are there so many programming languages in the world? What’s the art behind their miscellaneous designs?
Nowadays, there are thousands of high-level programming languages, and new ones continue to emerge. These languages belong to different categories and no single factor determines which category is better than others. In this article, we will go through the design ideas behind different languages and try to reach a landscape of programming languages.
The Programming Language Spectrum
The many existing languages can be classified into families based on their model of computation. The following list shows a common set of families. The top-level division distinguishes between declarative languages, in which the focus is on what the computer is to do, and imperative languages, in which the focus is on how the computer should do it.
- declarative
- functional (Lisp/Scheme, ML/OCaml, Haskell)
- dataflow (Id, Val)
- logic, constraint-based (Prolog, spreadsheets, SQL)
- imperative
- von Neumann (C, Ada, Fortran)
- object-oriented (C++, Java, Rust)
- scripting (Perl, Python, JavaScript)
Note that the programming languages categories are fuzzy, and open to debate. In particular, it is possible for a functional language to be object-oriented, and many authors do not consider functional programming to be declarative.
Declarative languages are in some sense “higher level”; they are more in tune with the programmer’s point of view, and less with the implementor’s point of view. Imperative languages predominate mainly for performance reasons.
There is a tension in the design of declarative languages between the desire to get away from “irrelevant” implementation details and the need to remain close enough to the details at least control the outline of the algorithm. The design of efficient algorithms, after all, is what much of computer science is about.
It is not yet clear to what extent, and in what problem domains, we can expect compilers to discover good algorithms for problems stated at a very high level of abstraction. In any domain in which the compiler cannot find a good algorithm, the programmer needs to be able to specify one explicitly.
Within the declarative and imperative families, there are several important subfamilies:
- Functional Languages employ a computational model based on the recursive definition of functions. They take their inspiration from the lambda calculus, a formal computational model developed by Alonzo Church in the 1930s. In essence, a program is considered a function from inputs to outputs, defined in terms of simpler functions through a process of refinement. Languages in this category include Lisp, ML/OCaml, and Haskell.
If you want to have a comprehensive understanding of lambda calculus, the book Types and Programming Languages by Benjamin C. Pierce is recommended for a beginner.
Dataflow Languages model computations as the flow of information (tokens) among primitive functional nodes. They provide an inherently parallel model: nodes are triggered by the arrival of input tokens, and can operate concurrently. Id and Val are examples of dataflow languages. Sisal, a descendant of Val, is more often described as a functional language.
Logic or constrained-based languages take their inspiration from predicate logic. They model computation as an attempt to find values that satisfy certain specified relationships, using goal-directed search through a list of logic rules. Prolog is the best-known logic language. The term is also sometimes applied to the SQL database language, the XSLT scripting language, and programmable aspects of spreadsheets such as Excel.
The von Neumann languages are probably the most familiar and widely used. They include Fortran, Ada, C and all of others in which the basic means of computation is the modification of variables. Whereas functional languages are based on expressions that have values, von Neumann languages are based on statements (assignments in particular) that influence subsequent computation via the side effect of changing the value of memory.
Object-oriented languages trace their roots to Simula 67. Most are closely related to the von Neumann languages, but have a much more structured and distributed model of both memory and computation. Rather than picture computation as the operation of a monolithic processor on a monolithic memory, object-oriented languages picture it as interactions among semi-independent objects, each of which has both its own internal states and subroutines (methods) to manage that state. C++ and Java are probably the most widely used object-oriented languages.
It is also possible to devise object-oriented functional languages, but they tend to have a strong imperative flavor. The best known of these is OCaml, a dialect of ML.
- Scripting languages are distinguished by their emphasis on coordinating or “gluing together” components drawn from some surrounding context. Several scripting languages were originally developed for specific purposes: bash is the input language of job control (shell) programs; PHP and JavaScript are primarily intended for the generation of dynamic web content; Lua is widely used to control the logic of computer games. Other languages, including Perl, Python, and Ruby, are more deliberately general purpose. Most place an emphasis on rapid prototyping, with a bias toward ease of expression over speed of execution.
One might suspect that concurrent (parallel) languages would form a separate family, but the distinction between concurrent and sequential execution is mostly independent of the classification above.
Most concurrent programs are currently written using special library packages or compilers in conjunction with a sequential language such as C. A few widely used languages, including Java and C#, have explicitly concurrent features.
Example to Illustrate Language Families
As a simple example of the contrast among language families, consider calculating the greatest common divisor (GCD) of two integers using Euclid’s algorithm.
The choice among, say, von Neumann, functional, or logic programming for this problem influences not only the appearance of the code, but how the programmer thinks.
The von Neumann Style
The von Neumann algorithm version of the algorithm is very imperative:
To compute the
gcd
ofa
andb
, check to see ifa
andb
are equal. If so, print one of them and stop. Otherwise, replace the larger one by their difference and repeat.
C code for this algorithm is shown below.
1
2
3
4
5
6
7
int gcd(int a, int b) {
while (a != b) {
if (a > b) a = a - b;
else b = b - a;
}
return a;
}
Functional Style
In a functional language, the emphasis is on the mathematical relationship of outputs to the inputs:
The
gcd
ofa
andb
is defined to be (1)a
whena
andb
are equal, (2) thegcd
ofb
anda - b
whena > b
, and (3) thegcd
ofa
andb - a
whena < b
. To compute thegcd
of a given pair of numbers, expand and simplify this definition until it terminates.
An OCaml version of this algorithm is shown below. The keyword let
introduces a definition; rec
indicates that it is permitted to be recursive (self-referential); arguments for a function come between the name (in this case, gcd
) and the equals sign.
1
2
3
4
let rec gcd a b =
if a = b then a
else if a > b then gcd b (a - b)
else gcd a (b - a)
Logic Style
In a logic language, the programmer specifies a set of axioms and proof rules that allow the system to find desired values:
The proposition
gcd(a, b, g)
is true if (1)a
,b
, andg
are all equal; (2)a
is greater thanb
and there exists a numberc
such thatc
isa - b
andgcd(c, b, g)
is true; or (3)a
is less thanb
and there exists a numberc
such thatc
isb - a
andgcd(c, a, g)
is true. To compute thegcd
of a given pair of numbers, search for a numberg
(and various numbers ofc
) for which these rules allow one to prove thatgcd(a, b, g)
is true.
A Prolog version of this algorithm is shown below. It may be easier to understand if one reads “if” for :-
and “and” for commas.
1
2
3
gcd(A,B,G) :- A = B, G = A
gcd(A,B,G) :- A > B, C is A-B, gcd (C,B,G)
gcd(A,B,G) :- A < B, C is B-A, gcd (C,A,G)
It should be emphasized that the distinctions among language families are not clear-cut. The division between von Neumann and object-oriented languages, for example, is often very fuzzy, and many scripting languages are also object-oriented.