Copyright © 2023 Robert Y. Lewis. See LICENSE.txt.

import Lean
import LoVe.LoVelib

Data structures and processes of a proof assistant

Today will be a break from the normal flow of things. We'll set down some vocabulary for different parts of the proof assistant processing pipeline, and talk about some of the algorithms involved.

Elaboration

There's a big gap between Lean "input syntax" and actual dependent type theory.

#check 1 + 2

set_option pp.all true in
#check 1 + 2

Elaboration is the art of going from input syntax to the abstract syntax of our language. This happens in most languages to some degree, but it's especially complicated in languages with lots of implicit information!

Expressions/abstract syntax

What is the abstract syntax of Lean anyway?

#print Lean.Expr

-- let's write the corresponding expr
#check fun b : Bool => not b

section
open Lean Expr


end

Declarations

A declaration is essentially a triple of

def f : Bool → Bool := fun b => not b

#print Lean.Declaration

The def and theorem commands take in these components. They elaborate the type and body, and check that the type of the body is the same as the type of the declaration.

Environments

At any point in a Lean file, we can access and use everything that was declared before that point.

We call this collection an environment. Essentially, an environment is a set of declarations.

Elaboration has access to the environment!

#print Lean.Environment

Tactics

Where do tactics fit into this picture?

theorem a_silly_theorem : True :=
And.left (And.intro (by simp) trivial)

#print a_silly_theorem

During elaboration, tactics are evaluated to produce expressions. The environment stores no trace of a tactic.

Tactics are, essentially, functions from "proof states" to expressions.

Notice that, to get an "expected type," elaboration must have begun in order for the tactic to execute. But also, tactic evaluation can effect elaboration!

#check id (by exact (5 : ℤ))

There's a very complicated dance here. Also relevant: type class inference.

Type checking

The elaborator, and the tactic execution engine, both seem to have some notion of type correctness.

-- #check Nat.add 4 (-2 : ℤ)

-- example : False := by
--   exact trivial

These algorithms are horrendously complicated. Even worse, tactics are arbitrary programs that can do all kinds of stupid awful things. Why can we trust that they aren't compromising the soundness of our system?

elab "remove_goals" : tactic => Lean.Elab.Tactic.setGoals []

-- example : False := by
--   remove_goals

What does "soundness" even mean?

Theorem (soundness). In an environment in which no new axioms have been declared, there is no well-typed declaration with type False.

The kernel is our last resort against soundness bugs. No matter what the elaborator does, it must submit any potential new declaration to the kernel for inspection.

The kernel implements a slow, simple, reliable type checking algorithm. It is small (in terms of LOC) and well documented.

Furthermore, it can "export" an entire environment. External programs that implement type checking algorithms for the same type theory can read this export and double check that there is no type error.

A language hierarchy

In summary, we have a bunch of different languages here.

input language <---> tactic language | / | elaboration / v v core language | | kernel export v export language

It seems reasonable to have the input and core languages be similar.

What should the export language look like? What about the tactic language?