Mindless-coding = high-mechanism? (was Re: anybody tried msr's f*?)

Wow - thanks for those postings, folks! I’m going viral ;!)

I should probably explain why I think “mindless-coding” has a shot at
being a “high-mechanism”. Sorry for the long post…

In a sufficiently powerful dependently typed language, you can specify
so much of a function’s behavior in its type (the dependent type of its
args and return type combined) that this type constrains anything that
satisfies it to be a fully correct implementation of the function (maybe
not the best performing or most resource conservative - but maybe linear
types could help there). And, specifying things is much easier
(sometimes vastly easier) than implementing them. And, how you specify
doesn’t matter, as long as it is fully constraining, since the
specification won’t appear in the extracted code. You can choose to do
some ridiculous things just to make the specification easier - like
having every node in a tree carry the entire contents of its subtree
flattened into a list as part of its type (which is what I did).

If you then have an interactive “proof assistant” (Coq) or “type
checker” (Agda and Idris), the task of writing the implementation of
such a fully-specified-by-its-type function gets simplified in a very
interesting way. The proof assistant/type checker, equipped with this
fully specified type, is now carrying the burden of making sure the
implementation suits the purpose of the function, that it is correct,
and that there are no missing parts. The programmer is given small
tasks - the “subgoals” in a Coq proof - and just has to make progress.
If you succeed or fail, you find out quickly. Your task as programmer
is fully constrained at each point. Alas - no freedom! But the loss of
freedom is - well - freeing. The metaphor of playing computer Sudoku is
a good one - just focus on a single row, column, or square, and fill in
the missing numbers.

In other words - while it usually gets harder to write a program with
more dependent types - that stops being true at some point, and it
actually gets easier. Much easier (at least for what otherwise would be
hard-to-write programs). Sure, in the case of using Coq’s proof mode,
you need to have a tool-box of tactics to use - but that’s not hard to
learn.

This flies a bit in the face of what some dependent type experts will
tell you. You’re supposed to use dependent types to establish important
properties of functions - not “all” properties. But I was intrigued -
initially because I thought that having the type of a function
completely describe all anyone needs to know about that function was
both cool and could solve a lot of modularization woes.

The first thing I did was implement red-black trees - a full
implementation, including delete - without any reference to existing
implementations, just using the rules describing what a red-black tree
is. I actually did this first in Agda a few years ago - but I couldn’t
swing the nice erasability and readable code extraction in Agda - and so
eventually switched to Coq. If you know red-black trees, you know that
rebalancing them, especially during a delete operation, can make your
brain hurt. I had never tried to do it myself before (the experience
with Agda). But the mindless-coding technique made it relatively easy.
I had to develop a good bit of generic “proof” infrastructure - mostly
about proving things about sorted lists like if list (L1 ++ L2) is
sorted, then L1 is sorted and L2 is sorted - that kind of thing. I then
added AVL trees for good measure, making sure that the proof
infrastructure worked for them as well.

Being able to extract nice readable code (OCaml or whatever) is
important for feedback purposes. Ideally, I would like to do
interactive mindless-coding with a third window: the proof script, the
current subgoal, and the extracted code (getting extracted as the proof
is elaborated, not after) with holes in it corresponding to unproven
subgoals. But, Coq can’t do that yet (although I don’t think there is
any theoretical reason why it couldn’t).

Another rather dramatic demonstration occurred when I tried to come up
with a counter-example. I decided I need to mindlessly-code something
other than just trees - but before I did, I wanted to see what a failure
would look like. So I picked a tree algorithm that I thought was so
easy it couldn’t exist (otherwise it would exist already - after all,
AVL trees are over 50 years old, and red-black trees over 40). And
mindlessly-coded away. But, instead of hitting a failure, it worked -
these are the gap trees described in my project. In this case, not only
did I have absolutely no idea how to implement them - I didn’t think an
implementation would exist. I was pretty certain that I’d hit some
failure trying to mindlessly code up a rebalancing function or something
like that. But it didn’t matter that all I had was the specification of
the gap tree’s structure - that was enough for the proof assistant to
guide me all the way to a complete verified implementation. I had to
look at the extracted OCaml code to see what it looked like - were the
rebalances simple or complex? (simple!) - and then discovered some very
interesting properties about these gap trees (that might make them a
very nice alternative to red-black and AVL trees).

Why Coq: I had looked at ATS (hence why I am lurking in this group), and
am still quite intrigued by its linear types - but it didn’t offer the
combinations of features I needed to jump-start my idea: interactivity
with nice feedback, good readable code extraction, full erasability of
proof parts, unrestricted dependent typing power, good proof
automation. In fact, none of the dependently-typed languages did (I
have also looked at Agda and Idris quite a bit, in addition to Coq), but
Coq seemed to need just a little bit of help (with erasability) to get
things moving. I know it looks strange to see functions written by
tactics - but that’s mostly because Coq isn’t nearly as nice a
programming language as it is a prover (but it’s an awesome prover!).
And Curry-Howard means it doesn’t matter, right? Anyway, using an
unfamiliar way to write programs also helps demonstrate this point: when
you’re mindlessly coding, you’re focused on the subgoals that the proof
assistant presents to you, and not on the way (tactics vs. programming
language constructs) in which you are interacting with the proof
assistant. Even if the proof-script is completely undecipherable, what
matters is the extracted code (I later went back and made the red-black
and AVL proof scripts look nice anyway). Also, using powerful tactics
and proof automation can really simplify a lot of things.

Again, sorry for the long post. Thanks again for the postings. Now,
back to lurking…

– Jonathan

some people will have had very similar experience via
test-driven-development. i know i have. but of course i’d rather it be
via static type checking than a giant wad of test case code.