CompCert Overview

:: tutorial, coq, compiler correctness

By: Ben Greenman

If you are interested in learning about the internals of the CompCert C compiler but would rather not read its source code, this post is for you.

(This is a public service announcement.)

Last fall, I gave a short lecture on the 2006 paper “Formal Certification of a Compiler Back-End” by Xavier Leroy for Amal Ahmed’s “Special Topics in Programming Languages” class. Rather than present CompCert as it existed in 2006, I read the documentation and source code for CompCert 2.5 (released June 2015). The lecture then focused on three questions:

  • What subset of C does CompCert handle, today?
  • What optimizing passes does CompCert perform?
  • What is the “correctness theorem” for CompCert, and what does this theorem mean?

My notes for the lecture give a “mid-level” summary of the compiler — there are more details than you’ll find in papers, but it’s (hopefully!) easier to read than the source code. The document is also hyperlinked to locations in the CompCert GitHub repository.

Here is the document:

And here is a table-of-contents:

  1. Motivation, details of the source and target languages, high-level guarantees
  2. Compiler pipeline, optimizing passes, links intermediate language grammars and Coq theorems
  3. Background on compiler correctness
  4. CompCert’s correctness, properties that CompCert does not guarantee
  5. Recent (2006 – 2015) work in the CompCert ecosystem

The document ends with a short description of two other research projects that have grown into “industry software” and a link to Xaver Leroy’s OPLSS lectures on certified compilers. Enjoy!