Getting familiar with dependent types. (This is not a question, but a tutorial)

This includes some pitfalls that the beginner (myself) may incur in getting
familiar with dependent types. I have somewhat linearized my debugging
process to make this a sane post, leaving out a few details; I will start
with a simple example and up to a still simple but slightly more intricate
example. Please do not read unless you are bored or looking for an
introduction to “debugging” in ATS; comments are welcome of course.

Following the main recommendation of George Polya that I can remember: If
you can’t solve a problem, you can find an easier one to solve. That seems
to have been very helpful to me in debugging today.

First, let us look at a simple loop.

This doesn’t check, with error: error(3): unsolved constraint:
C3STRprop(main; S2Eeqeq(S2Eapp(S2Ecst(sub_int_int_int); S2EVar(2424),
S2Eint(1)), S2Eapp(S2Ecst(sub_int_int_int); S2Evar(n(5343)), S2Eint(1))))

fun bloop
{n:nat | n >= 1}
(n: int (n)): int (n-1) =
if n = 1 then n-1
else bloop(n-1)

The intention above was that, clearly, each time bloop is called, the
returned value is the argument value minus one.

This does:

fun bloop
{n:nat | n >= 1}
(n: int (n)): int =
if n = 1 then n-1
else bloop(n-1)

I still haven’t learned to read these type errors very well, but clearly
the constraint: bloop(n) = n-1 is failing. This is because it is a
recursive function, and what actually is returned is being typechecked is
the final value, not what is returned after one recursion (which
technially, would be a function or closure, I suppose). What a mistake! Of
course, if one recalls the chapter on dependent types from the ATS book,
this problem will not occur…

So now, we can guess that what we actually want is:

fun bloop
{n:nat | n >= 1}
(n: int (n)): int (0) =
if n = 1 then n-1
else bloop(n-1)

This also typechecks.

Going to a slight variant with a function of two variables than increases
the latter variable until it is equal to the former:

fun aloop
{n: nat} {i: nat| i < n}
(n: int,i: int): (int,int) =
if i = n then @(n,i) else aloop(n,i+1)

This fails to typecheck with many errors:

warning(3): s3iexp_make_s2exp: s2e0 = S2EVar(2420)
/media/RAID5/share/ATS_learning/ip1.dats: 1207(line=58, offs=29) –
1212(line=58, offs=34): warning(3): the constraint [S2Eapp(S2Ecst(>=); S2EVar(2420), S2Eint(0))] cannot be translated into a form accepted by the
constraint solver.
/media/RAID5/share/ATS_learning/ip1.dats: 1207(line=58, offs=29) –
1212(line=58, offs=34): error(3): unsolved constraint: C3STRprop(main;
S2Eapp(S2Ecst(>=); S2EVar(2420), S2Eint(0)))
warning(3): s3iexp_make_s2exp: s2e0 = S2EVar(2421)
/media/RAID5/share/ATS_learning/ip1.dats: 1207(line=58, offs=29) –
1212(line=58, offs=34): warning(3): the constraint [S2Eapp(S2Ecst(>=); S2EVar(2421), S2Eint(0))] cannot be translated into a form accepted by the
constraint solver.
warning(3): s3iexp_make_s2exp: s2e0 = S2EVar(2421)
/media/RAID5/share/ATS_learning/ip1.dats: 1207(line=58, offs=29) –
1212(line=58, offs=34): warning(3): the constraint [S2Eapp(S2Ecst(<); S2EVar(2421), S2EVar(2420))] cannot be translated into a form accepted by
the constraint solver.
type-checking has failed: there are some unsolved constraints: please
inspect the above reported error message(s) for information.

Simply commenting out the existential constraint will allow a typecheck:
//{n: nat} {i: nat| i < n}
But, this is not really in the spirit of ATS precision.

Breaking up the lines a bit so we can see where the error is using emacs
flymake (or by typechecking manually again), we see it is a problem with
the recursive call to aloop:

fun aloop
{n: nat} {i: nat| i < n}
(n: int,i: int): (int,int) =
if i = n
then @(n,i)
else aloop(n,i+1)

At this point, one may guess that the issue is that the function will never
terminate since i can never get to n, since we require (i<n). But remember
that the constraints are for initial conditions (universal constraints) and
final, recursed conditions (existential constraints). If we change the
constraint from (i<n) to (i<=n) the problem still occurs.

However, if we change the types of the arguments to make use of this
information, we now have a function that typechecks:

fun aloop
{n: nat} {i: nat| i <= n}
(n: int (n) ,i: int(i) ): (int,int) =
if i = n then @(n,i) else aloop(n,i+1)

Without knowing that i and n are of type int (i) and type int (n)
respectively, we cannot guarantee the constraints. This may seem a bit
verbose if one is not used to doing it just yet.

It is good to show that we can say something about the returned value more
specific than they are a 2-tuple of integers: that both elements are equal
to n by changing “(int, int)” to “(int (n),int(n))” in the return value’s
type.

My penultimate inspiration for this example was a problem that looked very
similar now that all the other problems in it have been fixed, the only
difference being that we are making use of tuples

fun cloop
{n: nat} {i: nat| i <= n}
(@(n: int n, i:int i)): (int (n),int(n)) =
if i = n then @(n,i) else cloop @(n,i+1)

//Alternative syntax for arguments: (@(n, i): (int n, int i))
Originally I was somehow getting a type error about a closure function, but
it seems to no longer be here.

Now, my original inspiration for this whole example was to write a simple
function to eat up white space in a lexical analyzer for use in a parser
for a simplified boolean expression grammar (and to even get this far I’d
like to thank others for their help, especially Hongwei and Chris Double
and ousado from IRC; also, I would like to thank emacs (or blame it) for
making it so easy for me to write up these trivial yet lengthy posts…).

I’m sure there are much better ways to write a parser, especially in in
ATS, which may have some facilities to guarantee a specified grammar has a
unique parse tree using types (I speculate based on something I remember
seeingon the ATS mailing list). My goal for now is to just make as many
parse functions tail-recursive as seems convenient. I may improve all these
as an exercise later.

First, some character functions to help with our alphabet:

(* Assumes ASCII *)
fn isGeneCh(c:char): bool = (c >= ‘0’ && c <= ‘9’) || (c >= ‘A’ && c <= ‘Z’)
|| (c >= ‘a’ && c <= ‘z’) || c = ‘_’ || c = ‘.’ || c = ‘-’

fn isBoolCh(c:char): bool = c = ‘a’ || c = ‘A’ || c = ‘n’ || c = ‘N’ || c =
‘d’ || c = ‘D’
|| c = ‘o’ || c = ‘O’ || c = ‘r’ || c = ‘R’ || c = ‘|’ || c = ‘&’

fn isBool1stCh(c:char): bool = c = ‘a’ || c = ‘A’|| c = ‘o’ || c = ‘O’ || c
= ‘|’ || c = ‘&’

fn stopChar(c:char): bool = c = ‘\0’

//This should probably be changed, but we assume these are the above are
the only
//language letters.

fn isWhiteSpace(c:char): bool = not (stopChar© orelse isGeneCh© orelse
isBoolCh© orelse (c = ‘(’) orelse (c = ‘)’) ) // Is this right?

Now, using the principles learned about dependant types above, I was able
to adjust the space eater to look like this:

//Eats whitespace and stops on next character
fun loopWhite
{n:nat} {i:nat | i < n}
(@(astring, i): (string n, int i)): [j:nat | j>=i && j <= n] (string n,
int j) =
let
val isz = size1_of_int1(i)
val notend = string_isnot_atend(astring, isz)
val cur_ch = if notend then astring[isz] else '\0’
in
if isWhiteSpace(cur_ch) then
(print(“a space.\n”); loopWhite(@(astring, i+1)))
else @(astring, i)
end
//end [loopWhite]

This gives the following error occuring on the line with “print”: error(3):
unsolved constraint: C3STRprop(main; S2Eapp(S2Ecst(<); S2EVar(2453),
S2EVar(2452)))
type-checking has failed: there are some unsolved constraints: please
inspect the above reported error message(s) for information.

This looks very familiar, once again, we have a recursive call inovling i+1
but we require i<n. But changing the constraint to i<=n doesn’t help,
because loopWhite will again be called if the last character in the string
is a space. Could this be the problem? It is certainly a problem.

If we change the test condition to “if (isWhiteSpace(cur_ch) andalso
notend)” we still get the error. What is causing the constraint error?
Removing all constraints on the return type other than (string, int)
doesn’t help either.

We’ve probably reached a point where we’re in a bit of trouble because none
of the alphabet checks impose any constraints, nor can they easily since
they are not based on the sorts of “int” and “bool”. What we really want is
a constraint that we are not at n-1:

fn str_notat_endminus1
{n:nat} {i:nat | i <= n}
(astring: string (n), i: size_t i): bool (i < n-1) =
let
val notend = string_isnot_atend(astring, i)
in
if notend then string_isnot_atend(astring, i+1) else false
end

//Eats whitespace and stops on next character
fun loopWhite
{n:nat} {i:nat | i <= n}
(@(astring, i): (string n, int i)): (*[j:nat | j>=i && j <= n] (string n,
int j) *) (string, int) =
let
val isz = size1_of_int1(i)
val notend = string_isnot_atend(astring, isz)
val notatend_m1 = str_notat_endminus1(astring, isz)
val cur_ch = if notend then astring[isz] else '\0’
in
if (isWhiteSpace(cur_ch) andalso notatend_m1) then
(print(“a space.\n”); loopWhite(@(astring, i+1)))
else @(astring, i)
end
//end [loopWhite]

Now we still get a constraint problem. Taking a hint from the section of
the ATS Book on string processing, I realized I was making a terrible
logical fallacy: it could still happen that the last space is white. Also,
we improve the logic to always try to stop at the end of the string:

//Eats whitespace and stops on next character
fun loopWhite
{n:nat} {i:nat | i <= n}
(@(astring, i): (string n, int i)): [j:nat | j>=i && j <= n] (string n,
int j) =
let
val isz = size1_of_int1(i)
val notend = string_isnot_atend(astring, isz)
val notatend_m1 = str_notat_endminus1(astring, isz)
val cur_ch = if notend then astring[isz] else '\0’
in
if notatend_m1 then
if isWhiteSpace(cur_ch) then (print(“a space.\n”);
loopWhite(@(astring, i+1)))
else @(astring, i)
else if notend then @(astring, i+1) else @(astring, i)
end
//end [loopWhite]

Yes, ‘char’ is a predicative sort. It is not used often so far. Most of the
time, one just uses ‘int’ in place of ‘char’.

A couple of corrections:

  1. char does appear to be a predicative sort. My confusion on this point
    was likely caused by it not being listed as a commonly used sort in the ATS
    Book, and also by my other errors masking my attempted use of it.
  2. “Simply commenting out the existential constraint …” should have
    "universal constraint".