|~
MENU
CASE STUDY WORK SAMPLES RESUME REFERENCES CONTACT

Proven Ability ∎

My name is Larry Lee and I am a software developer specializing in formal methods. My work includes formally verified hardware designs and real-world examples of applied mathematics.


I am a software developer specializing in formal methods.

Formally Verified Software development is the practice of writing mathematical proofs to verify software. I specialize in using Coq to do this. Over the last 7 years, I've written hundreds of proofs using Coq and have developed formally verified algorithms. I've contributed to the Coq Standard Library and am a package maintainer in the Coq Package Index. My portfolio demonstrates my ability to analyze complex problems and produce results.


Industrial Results

SiFive is a Silicon Valley-based company specializing in the design of RISC-V processors.

In 2018, they suspected that a register used within the floating point unit of their S-series processors was too small to store the intermediate values that it computed during square root and division operations. While the unit passed SiFive's test suite, They feared that a bug lurked in an edge case not covered by their tests.

I successfully derived an analytical proof verifying that the registers allocated by the unit where sufficiently large to store these intermediate values. More importantly, I formalized this proof using Coq; thereby giving SiFive the assurance that they needed to move forward with production.

Click on the down arrows below to view an excerpt from this proof. You can view the full proof here: FPU-Verification

1
Introduction

SiFive developed the floating point unit in a modeling language named Chisel. I began by writing a set of recurrence relations that formalized the square root and division algorithms. I then identified bijective mappings between the variables used in SiFive's model and the variables used in my equations. This allowed me to prove that my equations correctly modeled the unit. Then, I verified that my equations correctly converged onto the correct results, thereby verifying that SiFive's unit was logically correct.

One problem remained however. SiFive's unit defined a register named \(rem\), and they feared that this register was too small to store all of the intermediate values computed during operation. To prove that the \(rem\) register was large enough, I needed to prove that the values stored within \(rem\) were smaller than \(2^w\), where \(w\) represented the number of bits SiFive had allocated.

This register corresponded to a variable defined within my equations named \(error(n)\). Using the mapping between \(rem\) and \(error(n)\), I had to verify that: $$\forall n, error(n) < \frac{8}{2^n}$$ when computing the square root of an odd number, and an analogous result when computing the square root of even numbers. The following is an extract of my proof of this result.

2
The Model

Let \(x\) denote the number that the floating point unit is computing the square root for. Let \(approx(n)\) denote the \(n^{th}\) approximation of the square root of \(x\). And, let \(error(n)\) represent the \(n^{th}\) approximation error. Then \(x\) is given by:

$$x = approx (n)^2 + error (n)$$1

Setting \(approx(0) = 0\), we will append a bit, \(b(n)\), onto the approximation at the end of each iteration.

$$\begin{array}{rcl} approx (0) &=& 0 \\ \\ approx (n + 1) &=& approx (n) + \frac{b (n)}{2^n}\end{array}$$2

where:

$$b(n) = \begin{cases} 1, & \mbox{if } \frac{1}{2^n} (2\ approx (n) + \frac{1}{2^n}) \le error (n) \\ \\ 0, & \mbox{otherwise}\end{cases}$$4

Using back substitution we can solve for \(error\) to derive the following recurrence relation:

$$\begin{array}{rcl} error (0) &=& x \\ \\ error (n + 1) &=& error (n) - \frac{b (n)}{2^n} (2\ approx (n) + \frac{b (n)}{2^n})\end{array}$$5

Using the above equations and the fact that \(1 <= x < 2\), we can derive the following corollaries:

$$\forall n, 0 \le approx(n) \lt 2$$6
$$\begin{array}{rcl} \forall n, b (n) = 0 \implies error (n) \lt \frac{1}{2^n} (2\ approx (n) + \frac{1}{2^n}) \\ \\ \forall n, b (n) = 1 \implies \frac{1}{2^n} (2\ approx (n) + \frac{1}{2^n}) \le error (n) \end{array}$$7
3
A Lower Bound for \(approx\)
$$\mbox{Lemma 1.1:}\quad \forall n, x \lt (approx (n) + \frac{2}{2^n})^2$$

Proof

We proceed using a proof by induction. When n = 0, substitution gives:

$$\begin{array}{rcl} x &\lt& (approx (0) + \frac{2}{2^0})^2 \\ x &\lt& 4\end{array}$$8

When \(n \gt 0\), we have to prove:

$$\begin{array}{l} \forall n, &x \lt (approx (n) + \frac{2}{2^n})^2 \implies \\ &x \lt (approx (n + 1) + \frac{1}{2^n})^2.\end{array}$$9

We proceed by case analysis. \(b (n)\) equals either 0 or 1.

b (n) = 0

When \(b (n) = 0\), our goal becomes: \(x \lt (approx (n) + \frac{1}{2^n})^2\). Expanding and rewriting \(error\) using (7) leads to:

$$\begin{array}{rcl}x &=& approx (n)^2 + error (n) \\x &\lt& approx (n)^2 + \frac{1}{2^n} (2\ approx (n) + \frac{1}{2^n}) \quad \quad \mbox{by (7)} \\x &\lt& (approx (n) + \frac{1}{2^n})^2\end{array}$$

b (n) = 1

When \(b (n) = 1\), our goal becomes: \(x \lt (approx (n) + \frac{1}{2^n} + \frac{1}{2^n})^2\). However, this is equivalent to: \(a \lt (approx (n) + \frac{2}{2^n})^2\), which is identical to our inductive hypothesis. ∎

4
A Constant Upper Bound for \(error\)
$$\mbox{Lemma 1.2:}\quad \forall n, error (n) \lt 4$$

Proof

We proceed using proof by induction.

n = 0

When \(n = 0\), the lemma follows immediately from (5): \(error (0) = x \lt 2\).

n ≥ 1

When \(n \ge 1\), our goal becomes: \(error (n) \lt 4 \implies\)\(error (n + 1) \lt 4\). Assuming that \(error (n) \lt 4\) and expanding \(error (n + 1)\) leads to:

$$\begin{array}{rcl}error (n + 1) &=& error (n) - \frac{b (n)}{2^n} (2\ approx (n) + \frac{b (n)}{2^n}) \\error (n + 1) &\lt& 4 - \frac{b (n)}{2^n} (2\ approx (n) + \frac{b (n)}{2^n}) \qquad \mbox{by the ind. hyp.} \\error (n + 1) &\lt& 4 - 0 \\\end{array}$$

Where we maximized the rhs in the last inequality by taking the case where \(b (n) = 0\). ∎

5
A Scaling Upper Bound for \(error\)
$$\mbox{Corollary 1.3:}\quad \forall n, error (n) \lt \frac{4}{2^n} (approx (n) + \frac{1}{2^n})$$

Derivation

Corollary 1.4 follows algebraically from Lemma 1.1:

$$\begin{array}{rcl}x &\lt& (approx (n) + \frac{2}{2^n})^2 \\x &\lt& approx (n)^2 + \frac{4}{2^n} (approx (n) + \frac{1}{2^n}) \\x - approx (n)^2 &\lt& \frac{4}{2^n} (approx (n) + \frac{1}{2^n}) \\error (n) &\lt& \frac{4}{2^n} (approx (n) + \frac{1}{2^n})\end{array}$$
6
Conclusion
$$\mbox{Theorem 1.4:}\quad \forall n, error (n) \lt \frac{8}{2^n}$$

Proof

We will proceed using induction over \(n\). When \(n = 0\), \(error (0) = x \lt 2\) and Theorem 1.4 follows immediately. Hence, we only need to consider the case where \(n \ge 1\).

Observe that:

$$\begin{array}{rcl}approx (n)^2 &=& x - error (n) \\approx (n)^2 &\lt& 2 - 0 \qquad \mbox{using (6)} \\approx (n) &\lt& \sqrt {2}\end{array}$$

We can substitute this upper bound into the scaling upper bound we derived in Corollary 1.3. Doing so leads to:

$$\begin{array}{rcl}error (n) &\lt& \frac{4}{2^n} (approx (n) + \frac{1}{2^n}) \\error (n) &\lt& \frac{4}{2^n} (\sqrt {2} + \frac{1}{2^n})\end{array}$$

which is always less than \(\frac{8}{2^n}\) when \(n \ge 1\). This observation completes this proof. ∎


Large Scale Developments

While working at SiFive, I was the lead developer for Kami Processor, an Open-Source RISC-V processor written in the modeling language Kami.

Kami Processor is highly configurable and implements a substantial fraction of the RISC-V feature set: including memory virtualization, compressed instruction encodings, floating point operations, and dynamic data widths.

I developed and verified most of the Gallina functions used to generate Kami Processor and contributed proofs to the Kami language project. Click on the down arrows below to read more about my work.

1
Kami Processor

Overview

SiFive started the Kami Processor project in 2018 intending to develop a canonical model of the RISC-V specification. While working at SiFive, I was the primary developer for Kami Processor.

The Kami processor supports the following RISC-V extensions: I, M, A, D, F, C, S, U, Zifencei, and Zcsr. This means that the Kami processor supports the mandatory core set of integer arithmetic and logic instructions prescribed by the RISC-V ISA (I). In addition, it supports integer multiplication (M), atomic memory operations (A), 64 bit floating point operations (D), 32 bit floating point operations (F), compressed instruction encodings (C), multi threaded synchronization support (ZiFencei), and support for RISC-V's control status register operations (Zicsr).

The Kami Processor includes a page table walker which allows applications to translate between physical and virtual memory addresses. Kami Processor includes a translation look-aside buffer that caches address translations in a buffer that uses a pseudo-LRU allocation algorithm.

The RISC-V spec defines several virtual memory modes. These are denoted by a code that indicates the size of the words used to encode virtual addresses. For example, "sv39" indicates that the virtual addresses are 39 bits wide. The Kami Processor supports virtual address modes sv32, sv39, and sv48.

The RISC-V memory protection model prescribes a physical memory protection scheme based around physical protection control status registers (PMP CSRs). The Kami Processor supports the full suite of PMP CSRs and all of the RISC-V address matching schemes, namely: top of range (TOR), naturally aligned four-byte regions (NA4), and naturally aligned power of two regions (NAPOT). Its address granularity is configurable.

I wrote the Kami Processor to use the TileLink interchip communication protocol. The TileLink protocol defines a set of conformance levels. Kami Processor's device interface communicates using the TileLink Uncached Heavyweight (TL-UH) message protocol. This protocol extends the basic read and write messages provided by the TileLink Uncached Lightweight (TL-UL) message protocol by adding atomic memory operations. By supporting the TileLink protocol, the Kami Processor can be integrated into systems using a number of interchip bus protocols either directly or through a conversion bridge.

Read the following report to learn more about the Kami Processor.

2
Kami

Overview

Kami is a modeling language intended to model digital circuits. Its language syntax and semantics are derived from BlueSpec. It is a domain-specific language embedded withing Gallina, Coq's dependently typed programming language. Kami uses Gallina's dependent types to specify component properties. Kami verifies that these component properties are true using mathematical proofs expressed in Coq's Ltac and Gallina. Ltac is a macro language that Coq interprets to generate Gallina terms. These Gallina terms represent mathematical proofs.

A Kami model of a synchronous digital circuit consists of one or more Kami modules and a scheduler. A Kami module comprises a set of registers, rules, and methods. A rule represents a combinatorial circuit that the scheduler tries to execute as frequently as possible. A method is a combinatorial circuit that posses a well defined set of inputs and outputs and which may be shared between circuits provided by other modules. Both rules and methods consist of Kami actions. A Kami action is a combinatorial circuit that may read or write to one or more registers and which may "call" one or more methods. Kami actions consist of a series of Kami expressions, which are combinatorial circuits that do not write to registers or call methods, and Kami statements that read and write to registers and call methods.

The Utila Library

I wrote the Utila library, which defines a set of formally verified hardware components that are used throughout Kami Processor.

The library includes formally verified lookup tables, and useful circuit combinators.


Interactive Code Sample

The sample below presents three Gallina functions: natToBinStr, natToDecStr, and natToHexStr. Each of these functions accepts a number and returns a string that represents the given number in a specific format. In this example, I prove that these functions are injective.

Whereas the print functions provided by the Coq Standard Library only support decimal and hexadecimal formats, the example presented below can support an arbitrary radix. Furthermore, this example demonstrates a number of Coq design patterns, including well founded recursion, structural recursion, and dependent types.

This example was originally defined as part of the Kami project: NatStr.v.

modules/coq/examples/nat_str.v


Open Source Contributions

Over the years, I've contributed to the Coq Standard Library and released packages through the Coq Package Index.

The Coq package index is a curated collection of Coq modules that are released by the Coq maintainers as part of the Coq Opam repository.

I am the package maintainer for two CPI libraries: functional algebra and pigeons.


Endorsements

"I was Larry's supervisor at SiFive. There, he was instrumental in bringing up a formal specification of the RISC-V processor in Kami, which is a framework in Coq to specify hardware circuits using transactional semantics. The task was pretty complex, given the enormity of the RISC-V specification and the ability to understand, Kami, a new framework completely, in order to write programs in that framework. Larry was a quick learner and was highly motivated to perform this task. He paid a lot of attention to detail and had excellent software engineering skills - always coming up with unit test cases even for the smallest change made or feature added to the code base. Overall, I enjoyed working with Larry."

Murali Vijayaraghavan
Proof Engineer
vmurali@csail.mit.edu

"Working with Larry for nearly two years at SiFive I have come to know him as a man with [an] efficient and effective proof writing technique. As evidenced by his work in the ProcKami repository, he also shows the great attention to detail and forward planning one associates with well-written COQ code."

Anthony Machado
Proof Engineer
tj.machado@gmail.com

"Larry Lee [has] great knowledge of theorem proving as well as the ProcKami [RISC-V] processor. He was always the one I talked to when I had a question on how something worked in the ProcKami processor or a question on the Kami language semantics. It was really enjoyable working with Larry Lee while I was at SiFive."

Kenneth Roe
Proof Engineer
http://www.cs.jhu.edu/~roe/


FAQs

search/search//0/10

You can find information about me and my work using the search feature below. Simply enter in a search query, such as "email address," and I will try to answer your question.

search_interface

Results

search_interface
\(\mbox {Q.E.D}\)

Send Me a Message

feedback_form
Back To Top