PLDI, the ACM SIGPLAN Conference on Programming Language Design and Implementation, is being held in Copenhagen, Denmark this year. It is one of the premier conferences in the programming languages field. Sadly, I did not get the chance to attend this grand event: my paper was rejected because the reviewers argued that my proof was hard to read. I do not think that assessment is entirely fair. The paper is not hard to read in itself; rather, it introduces a large number of new concepts, which makes it harder to digest, and those concepts are essential to the final conclusion.
Since I cannot attend the conference in person and many of this year's papers have not yet been made public, there is no way for me to offer an effective analysis of this year's program. What we can do instead is look back at last year's PLDI.
PLDI 2023 accepted 83 papers, 8 of which received a distinguished paper award. Among them, 2 are about verification, 3 are about compilers, 1 is about proof repair, and 2 are about synthesis. In terms of the first author's institution, most of the distinguished papers come from North America, with one exception from Taiwan and one from Singapore.
In this note, let's quickly go through all the distinguished papers to see what they have achieved in their fields. If you have read the theory series of blogs I have written, these cutting-edge works should not be hard to understand. (At the very least, you can easily grasp their challenges and high-level ideas.)
Click here to see the PLDI 2023 research papers page.
Distinguished Papers
Papers below are sorted in alphabetical order.
1. An Automata-Based Framework for Verification and Bug Hunting in Quantum Circuits
authors: Yu-Fang Chen, Kai-Min Chung, Ondřej Lengál, Jyun-Ao Lin, Wei-Lun Tsai, Di-De Yen
keywords: quantum computation; tree languages; formal software verification
The first author of this paper, Yu-Fang Chen, is a research fellow at Academia Sinica, Taiwan. He has 17 years of research experience on automata. This time, he applied automata to quantum circuits in a novel way and won his first distinguished paper award; it is also his first paper on quantum circuits.
His team proposed a new paradigm for analyzing and finding bugs in quantum circuits. Their framework is based on binary tree automata, for which you can find a systematic introduction in an earlier book [Comon et al. 2008]. The approach uses tree automata (TA) to represent sets of quantum states and employs TA transformation algorithms to execute quantum gates over the automata. By doing so, they bridge the gap between quantum and classical verification and establish a connection between automata theory and the quantum world.
For curious readers who want to know in detail how automata are used to find bugs, we provide a step-by-step summary here.
2. Covering All the Bases: Type-Based Verification of Test Input Generators
authors: Zhe Zhou, Ashish Mishra, Benjamin Delaware, Suresh Jagannathan
keywords: type theory; tree languages; formal software verification
This work comes from Zhe Zhou, a Ph.D. student at Purdue University under the guidance of Suresh Jagannathan. This is his second published paper. Their group has an active line of work investigating how expressive type systems can facilitate the verification of non-trivial properties of functional programs.
In this paper, they discuss test input generators. One limitation of existing generators is that it is hard to validate that a particular generator's output satisfies a coverage requirement. (Coverage is a metric in software testing that measures the percentage of code actually exercised by specific test inputs.) Their idea is to embed the notion of coverage as an integral part of a test input generator's type specification. By doing so, a generator's type now specifies the set of behaviors the generator is guaranteed to exhibit; a well-typed generator is thus guaranteed to produce every possible value satisfying a desired structural property.
More specifically, this paper introduces the notion of coverage types and formalizes their semantics. Moreover, it develops a bidirectional type-checking algorithm for coverage types and implements it as a tool operating over OCaml programs. In principle, this paper adapts the underapproximate reasoning found in recent work on incorrectness logic (IL) [O’Hearn 2019; Raad et al. 2020].
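To give a feel for the setting, here is a tiny generator in OCaml. The code and all names are my own illustration, not the paper's: the point is that an ordinary type like `int -> int list` only promises what outputs *may* look like, whereas a coverage type additionally promises which outputs *must* be producible.

```ocaml
(* A hypothetical test input generator, in the spirit of the paper's
   examples. A tiny nondeterministic-choice primitive, modeled with Random. *)
let int_range lo hi = lo + Random.int (hi - lo + 1)

(* Generates lists of length n with elements in [0, 2]. A coverage type
   for this generator would assert not just that every output has length
   n (an overapproximate claim), but that EVERY such list can actually be
   produced (an underapproximate, "must" claim). *)
let rec size_list n =
  if n = 0 then []
  else int_range 0 2 :: size_list (n - 1)

let () =
  Random.self_init ();
  List.iter (Printf.printf "%d ") (size_list 5);
  print_newline ()
```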
Check here to see what underapproximate reasoning means.
3. CryptOpt: Verified Compilation with Randomized Program Search for Cryptographic Primitives
authors: Joel Kuepper, Andres Erbsen, Jason Gross, Owen Conoly, Chuyue Sun, Samuel Tian, David Wu, Adam Chlipala, Chitchanok Chuengsatiansup, Daniel Genkin, Markus Wagner, Yuval Yarom
keywords: compilers
Surprisingly, the first author, Joel Kuepper, is the youngest of the twelve authors, and this is his debut work. I could not find any public evidence of who his advisor is.
This paper presents CryptOpt, the first compiler that specializes high-level cryptographic functional programs into assembly code. The authors claim that the generated code runs significantly faster than what GCC or Clang produces. On the formal verification side, they connect to the Fiat Cryptography framework (which translates functional programs into C-like IR code) and extend it with a new formally verified program-equivalence checker, incorporating a modest subset of known features of SMT solvers and symbolic-execution engines.
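CryptOpt's checker is a formally verified symbolic engine built in Coq, which I cannot reproduce here. As a loose, hypothetical illustration of the *interface* of program-equivalence checking, here is a random differential tester in OCaml that compares a reference implementation against a candidate that randomized search might propose.

```ocaml
(* Toy illustration only: a cheap stand-in for an equivalence check,
   comparing two candidate implementations of the same 64-bit arithmetic
   function on random inputs. The real checker proves equivalence, it
   does not merely test it. *)
let equivalent ?(trials = 10_000) (f : int64 -> int64) (g : int64 -> int64) =
  let rec go n =
    n = 0
    || (let x = Random.int64 Int64.max_int in
        Int64.equal (f x) (g x) && go (n - 1))
  in
  go trials

let () =
  Random.self_init ();
  (* Reference code vs. a candidate found by search: x * 8 rewritten as a
     left shift (both wrap modulo 2^64, so they agree everywhere). *)
  let reference x = Int64.mul x 8L in
  let candidate x = Int64.shift_left x 3 in
  Printf.printf "equivalent: %b\n" (equivalent reference candidate)
```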
This is a classical PL paper about compiler design and implementation. Solid work of this kind requires an enormous amount of effort and undoubtedly deserves a distinguished paper award. It is hard to believe that Joel Kuepper, a newcomer to the research field, could lead such a complicated project and land everything so nicely. Awesome!
4. Extensible Metatheory Mechanization via Family Polymorphism
authors: Ende Jin, Nada Amin, Yizhou Zhang
keywords: compilers; language features
Ende Jin is a master's student at the University of Waterloo under the guidance of Yizhou Zhang. Their group aims to design and implement programming languages with rich expressive power, strong guarantees, and efficient implementations. Ende Jin graduated in the summer of 2023, and this is the only paper he has published.
This paper discusses the design and implementation of a language extension, called FPOP, to the Coq proof assistant, which enables extensible inductive types, family polymorphism, late binding of names, and modular mechanization and proof reuse in dependent type theory.
They implement their language design as a Coq plugin. It works by translating programs in FPOP syntax into programs that can be checked and evaluated by Coq.
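FPOP lives in Coq, but OCaml's extensible variants give a rough, non-dependent analogy for the problem it solves: once an inductive type and the functions over it are fixed, adding a constructor later forces invasive edits or catch-all cases, whereas FPOP lets the type and its metatheory grow modularly with proof reuse. The snippet below is purely my illustration, not FPOP code.

```ocaml
(* An "open" inductive type and two initial constructors. *)
type expr = ..
type expr += Lit of int
type expr += Add of expr * expr

let rec eval_base = function
  | Lit n -> n
  | Add (a, b) -> eval_base a + eval_base b
  | _ -> failwith "unhandled constructor"

(* A later "family member" adds a constructor. Note that plain OCaml
   forces us to rewrite the evaluator wholesale; FPOP's point is that the
   extended metatheory should be derived modularly from the base one. *)
type expr += Neg of expr

let rec eval = function
  | Neg a -> -(eval a)
  | Lit n -> n
  | Add (a, b) -> eval a + eval b
  | _ -> failwith "unhandled constructor"

let () = Printf.printf "%d\n" (eval (Add (Lit 1, Neg (Lit 2))))
```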
5. Mosaic: An Interoperable Compiler for Tensor Algebra
authors: Manya Bansal, Olivia Hsu, Kunle Olukotun, Fredrik Kjolstad
keywords: compilers; mathematical software performance
Manya Bansal is currently a Ph.D. student at MIT advised by Saman Amarasinghe and Jonathan Ragan-Kelley. This is the first and, so far, only paper she has published.
This paper introduces Mosaic, a compiler for sparse tensor algebra. Sparse tensor algebra is an important computational language that describes multilinear expressions over dense and sparse tensors. It is used in many domains, including scientific computing, engineering, and machine learning.
No existing sparse tensor algebra compiler can mix generated, specialized, and fused code with calls to libraries or domain-specific hardware (external functions), leaving application developers to write low-level code by hand. To remove this limitation, Mosaic lets users extend the compiler with library functions and specialized hardware that compute whole expressions externally.
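To see what "low-level code by hand" means here, consider the sparse matrix-vector product, a one-line tensor algebra expression $y_i = \sum_j A_{ij} x_j$. A minimal hand-written kernel over the compressed sparse row (CSR) format might look like the OCaml sketch below; this is my illustration of the kind of code a compiler like Mosaic either generates or replaces with a call to an external library or accelerator.

```ocaml
(* Sparse matrix in compressed sparse row (CSR) form. *)
type csr = {
  row_ptr : int array;   (* length rows+1; row i spans row_ptr.(i) .. row_ptr.(i+1)-1 *)
  col_idx : int array;   (* column index of each stored nonzero *)
  values  : float array; (* value of each stored nonzero *)
}

(* y = A * x, iterating only over stored nonzeros. *)
let spmv a x =
  let rows = Array.length a.row_ptr - 1 in
  Array.init rows (fun i ->
    let acc = ref 0.0 in
    for k = a.row_ptr.(i) to a.row_ptr.(i + 1) - 1 do
      acc := !acc +. a.values.(k) *. x.(a.col_idx.(k))
    done;
    !acc)

let () =
  (* 2x3 matrix [[1 0 2]; [0 3 0]] times [1;1;1] gives [3; 3]. *)
  let a = { row_ptr = [|0; 2; 3|]; col_idx = [|0; 2; 1|]; values = [|1.; 2.; 3.|] } in
  spmv a [|1.; 1.; 1.|] |> Array.iter (Printf.printf "%g ");
  print_newline ()
```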
6. Mostly Automated Proof Repair for Verified Libraries
authors: Kiran Gopinathan, Mayank Keoliya, Ilya Sergey
keywords: software evolution; proof repair
Kiran Gopinathan is a fifth-year Ph.D. student at the National University of Singapore, advised by Ilya Sergey. This is the first paper of his Ph.D. career. Their group currently focuses on automated program repair via static analysis and verification.
In this paper, they present the first proof repair methodology for higher-order imperative functions, whose initial versions were verified in the Coq proof assistant and whose specifications remained unchanged. Their proof repair procedure is based on the combination of dynamic program alignment, enumerative invariant synthesis, and a novel technique for efficiently pruning the space of invariant candidates, dubbed proof-driven testing, enabled by the constructive nature of Coq’s proof certificates.
Given an OCaml function verified in Coq and its unverified new version, their tool produces a Coq proof for the new version, discharging most of the new proof goals automatically and suggesting high-confidence obligations for the programmer to prove for the cases when automation fails.
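The following OCaml sketch is my own toy rendering of one ingredient of their pipeline, test-based pruning of invariant candidates: before any expensive proof attempt, candidates falsified by states observed in concrete runs are discarded. All names, candidates, and states here are hypothetical.

```ocaml
(* State of a hypothetical loop computing acc = x * x. *)
type state = { x : int; acc : int }

(* A few enumerated candidate invariants over the loop state. *)
let candidates : (string * (state -> bool)) list = [
  ("acc >= 0",    fun s -> s.acc >= 0);
  ("acc = x * x", fun s -> s.acc = s.x * s.x);
  ("acc < x",     fun s -> s.acc < s.x);
]

(* States observed while running the new, unverified version. *)
let observed = [ { x = 0; acc = 0 }; { x = 2; acc = 4 }; { x = 3; acc = 9 } ]

(* Keep only candidates consistent with every observed state; only the
   survivors are handed to the (expensive) proof engine. *)
let surviving =
  List.filter (fun (_, inv) -> List.for_all inv observed) candidates

let () =
  List.iter (fun (name, _) -> print_endline name) surviving
```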
7. Synthesizing MILP Constraints for Efficient and Robust Optimization
authors: Jingbo Wang, Aarti Gupta, Chao Wang
keywords: syntax-guided synthesis
Jingbo Wang is an incredibly productive student working with professor Chao Wang at the University of Southern California, and she will join Purdue University as an assistant professor this fall. Most of her work focuses on program synthesis.
Program synthesis is the task of constructing a program that provably satisfies a given high-level formal specification. In contrast to program verification, the program is to be constructed rather than given.
Her paper discusses mixed integer linear programming (MILP) solvers. The authors find that writing correct and efficient MILP constraints is a challenging task for end users, especially for problems specified using inherently non-linear Boolean logic operations. To overcome this challenge, they propose a syntax-guided synthesis (SyGuS) method that can generate high-quality MILP constraints from specifications expressed using arbitrary combinations of Boolean logic operations.
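To see why Boolean operations are the pain point, recall the textbook linearization of conjunction: for 0/1 variables, the non-linear relation $z = x \land y$ is replaced by three linear constraints,

$$z \le x, \qquad z \le y, \qquad z \ge x + y - 1, \qquad x, y, z \in \{0, 1\},$$

and disjunction $z = x \lor y$ similarly becomes

$$z \ge x, \qquad z \ge y, \qquad z \le x + y.$$

Hand-crafting such encodings for large formulas is tedious and error-prone; constraints of this flavor are what their SyGuS method synthesizes automatically.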
For basic background of integer and linear programming, you can read my blog Randomized Algorithm IV— Integer Programming.
They implemented and evaluated the method on various benchmark specifications from statistics, machine learning, and data science applications. The results show that the quality of the synthesized MILP constraints is even better than that of manually written constraints.
8. Trace-Guided Inductive Synthesis of Recursive Functional Programs
authors: Yongwei Yuan, Arjun Radhakrishna, Roopsha Samanta
keywords: program synthesis; recursive functional programs
Yongwei Yuan is advised by Suresh Jagannathan. Zhe Zhou, the first author of the second paper above, is also Suresh Jagannathan's student.
In the paper, they propose a novel trace-guided approach to tackle the challenges of ambiguity and generalization in the synthesis of recursive functional programs from examples. The approach augments the search space of programs with recursion traces, consisting of sequences of recursive subcalls of programs. Their method is based on a new version space algebra (VSA) for succinct representation and efficient manipulation of pairs of recursion traces and programs that are consistent with each other. They have implemented the approach in a tool called SyRup and evaluated it on benchmarks from prior work. The evaluation demonstrates that SyRup not only requires fewer examples than existing synthesizers to achieve a given success rate, but is also less sensitive to the quality of the examples.
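As a deliberately naive illustration of the version-space idea (the paper uses a succinct VSA over trace-program pairs, not explicit sets like this), here is an OCaml sketch where the space of programs consistent with all examples is obtained by intersecting the spaces consistent with each example. All programs and examples are my own toy choices.

```ocaml
(* A tiny program space. *)
type prog = Double | Square | Inc

let eval p n = match p with
  | Double -> 2 * n
  | Square -> n * n
  | Inc -> n + 1

let all_progs = [ Double; Square; Inc ]

(* Version space for one input/output example: all programs that agree
   with it. *)
let space (input, output) =
  List.filter (fun p -> eval p input = output) all_progs

(* Intersect the version spaces of all examples. *)
let synthesize examples =
  List.fold_left
    (fun vs ex -> List.filter (fun p -> List.mem p (space ex)) vs)
    all_progs examples

let () =
  (* The example (2, 4) alone is ambiguous (Double and Square both fit);
     adding (3, 9) disambiguates to Square. *)
  match synthesize [ (2, 4); (3, 9) ] with
  | [] -> print_endline "no program"
  | Double :: _ -> print_endline "Double"
  | Square :: _ -> print_endline "Square"
  | Inc :: _ -> print_endline "Inc"
```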
Appendix
How are bugs found using automata?
Problem Specification: The problem is given by a triple $\{P\}\; C\; \{Q\}$, where $P$ represents a set of quantum states on the input of a circuit $C$, and $Q$ represents a set of quantum states on the output of the circuit. The question is whether the set of states actually produced on the output of $C$, when run on inputs from $P$, is equal to (or included in) the set $Q$.
Compact Representation: Tree automata are used to compactly represent sets of quantum states. Tree automata are mathematical structures that recognize or generate tree languages; here, each quantum state is encoded as a tree, and an automaton stands for a set of such trees (see the sketch after these steps).
Semantics of Quantum Gates: Transformers are developed to implement the semantics of quantum gates over the representation provided by tree automata. These transformers define how the quantum gates affect the quantum states.
Algebraic Representation: The technique computes with an algebraic representation of quantum states, avoiding the inaccuracy of working with floating-point numbers. This algebraic representation allows for precise calculations and comparisons.
Verification Process: The proposed approach verifies the correctness of the quantum circuit by comparing the expected output set $Q$ with the actual output set obtained from the circuit $C$. The tree automata representation and the transformers are used to perform this verification.
Bug Detection: If the actual output set does not match the expected output set, it indicates the presence of a bug in the quantum circuit. The technique can identify discrepancies between the expected and actual output sets, allowing for the detection of bugs.
Scalability and Performance: The performance of the approach is evaluated against various benchmarks from the literature. The evaluation shows that the approach is scalable and can handle large-scale circuits with a significant number of qubits and gates. It also outperforms other existing tools in terms of bug detection.
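As promised above, here is a minimal OCaml sketch of the kind of encoding involved: a quantum state over $n$ qubits as a complete binary tree of depth $n$ with amplitudes at the leaves (branching at level $i$ corresponds to qubit $i$ being 0 or 1), and a bottom-up binary tree automaton standing for a set of such trees. The types and the toy automaton are my own illustration of the idea, not the paper's implementation, and integer "amplitudes" stand in for the paper's exact algebraic representation.

```ocaml
(* A quantum state as a complete binary tree with amplitudes at leaves. *)
type 'a qtree =
  | Leaf of 'a                   (* amplitude of one basis state *)
  | Node of 'a qtree * 'a qtree  (* qubit = 0 (left) / qubit = 1 (right) *)

(* A bottom-up binary tree automaton: leaf transitions, binary
   transitions (q_left, q_right) -> q, and accepting states. *)
type ('a, 'q) tree_automaton = {
  leaf_delta : 'a -> 'q option;
  node_delta : 'q -> 'q -> 'q option;
  accepting  : 'q -> bool;
}

let rec run ta = function
  | Leaf a -> ta.leaf_delta a
  | Node (l, r) ->
    (match run ta l, run ta r with
     | Some ql, Some qr -> ta.node_delta ql qr
     | _ -> None)

let accepts ta t =
  match run ta t with Some q -> ta.accepting q | None -> false

(* Example: an automaton accepting the set of 1-qubit basis states,
   |0> = Node (Leaf 1, Leaf 0) and |1> = Node (Leaf 0, Leaf 1). *)
let basis1 = {
  leaf_delta = (fun a ->
    if a = 0 then Some `Zero else if a = 1 then Some `One else None);
  node_delta = (fun l r ->
    match l, r with
    | `Zero, `One | `One, `Zero -> Some `Ok
    | _ -> None);
  accepting = (fun q -> q = `Ok);
}

let () =
  Printf.printf "%b\n" (accepts basis1 (Node (Leaf 1, Leaf 0)))
```

Executing a quantum gate then amounts to transforming one such automaton into another, so that the transformed automaton represents exactly the set of post-gate states.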
What is underapproximate reasoning about a program?
Underapproximate reasoning about a program means reasoning about a subset of the program's possible behaviors: every behavior the analysis captures is guaranteed to be realizable by some actual execution, while some behaviors may be missed. It is the dual of the more familiar overapproximate reasoning (as in classical Hoare logic or abstract interpretation), which covers all behaviors but may also include infeasible ones.
Because everything it reports is real, underapproximate reasoning is especially suited to bug finding: a reported bug is never a false positive. The trade-off is completeness; since the analysis may not cover every behavior, it cannot by itself prove the absence of bugs. Establishing safety properties and proving correctness remain the domain of overapproximate reasoning. This division of labor allows each style to stay scalable and sound for its own purpose.
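To make the contrast concrete, here is how the two kinds of triple are usually written, following the incorrectness logic papers cited above [O’Hearn 2019; Raad et al. 2020]:

$$\{P\}\; C\; \{Q\} \qquad \text{vs.} \qquad [P]\; C\; [Q]$$

The Hoare (overapproximate) triple $\{P\}\, C\, \{Q\}$ says that every state reachable by running $C$ from a state satisfying $P$ satisfies $Q$. The incorrectness (underapproximate) triple $[P]\, C\, [Q]$ says the reverse inclusion: every state satisfying $Q$ is actually reachable by running $C$ from some state satisfying $P$. So if $Q$ describes an error condition, the triple certifies that the error truly occurs.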