Going on with https://groups.google.com/forum/#!topic/ats-lang-users/q0sY4IdTMbg , I
proved each character ranges which valid UTF-8 string can represents. It
was proved based on the UTF-8 specification re‑written for ATS. I though
proving expected properties derived from the specification, was a way to
check the specification it-self, and indeed I could catch two typo errors
this way (two misspelled constants, so two constants substituted for
another).
That’s already nice. What’s nice too, is that it was more easy than I was
afraid it would be; the constraint solver just pleasantly surprised me, I
was not expecting so much of it. Basically, I just told it where about to
look around, and it found it-self. The proof for each range is short. Here
is an example (for UTF-8 strings starting with a head byte in 0xF1…0xF3):
prfn utf8_f1_range
{c:int} (pf:UTF8_F1(4, c)):
[UTF8_F1_LOWER <= c; c <= UTF8_F1_UPPER] void =
case+ pf of
| UTF8_F1_4(pfc3, _) =>
let
prval UTF8_F1_3(pfc2, _) = pfc3
prval UTF8_F1_2(pfc1, _) = pfc2
prval UTF8_F1_1(FIRST) = pfc1
in end
This lead to a question: what can be safely assumed from the constraint
solver? Are its capacities, more or less specified, even informally?
This would be nice to know, as I know there were sometimes issues with
Isabelle/HOL, when from a version to another, some proof were failing and
needed to be rewritten. I think the point is easily seen, and that’s why I
have this question.
for the story, now I will attempt to prove each value in each range are
indeed decodable from UTF-8 data. I still have no idea of how I will do
I took a look at the implementation of constraint-solving in ATS.
I don’t really know what the cause of the problem is. It seems that
integer division (/) is not treated as a primitive function; this means
that UTF8_E0_UPPER/TB_BASE is not simplified to 0x3F.
I am pretty sure that this problem does not exist
if you use Z3.
I know, but as long I can rely on ATS only, I prefer to rely on ATS only.
Any way, I will give it a try.
prval () =
prop_verify_and_add
{UTF8_E0_UPPER/TB_BASE==0x3F}()
// end of [prval]
prval () =
prop_verify_and_add
{temp <= (UTF8_E0_UPPER/TB_BASE)}()
// end of [prval]
Reminds me something. Was not in my notes…
Not a keyword, so instead I will at least make the syntax colorizer,
colorize it as “support function”.
Now, I remember there was other in the same family. Will search the files
to see where it is defined and what come along with it.
The core of the constraint-solver in ATS is based on linear integer
programming.
The solver adds to its core some heuristics for handling non-linear terms
(involving
multiplication). It is a very primitive solver and also a very stable one.
So basically, modulo the computation time (the problem it said to be
NP-hard), whenever I can reduce a numeric problem to linear integer
expressions, it’s OK (I searched the web, and it seems standard), I can
count on ATS alone.
For the non-linear expressions, what are the additions? is there is
reasonable picture of the kind of expressions?
It is a bit hard to describe the added heuristics. Basically, some kind of
algebra involving polynormials. For instance,
(m+n)(m-n) = mm - n*n
Here is an example that can probably give you more feel:
TB_BASE is a #define. I turned multiple things into #define, as I don’t
like magic numbers, while I agree, there are case where it’s more easy to
know the real value.
In the attached file, at line 255. Just above, I copied/pasted the defines
to help me.
The core of the constraint-solver in ATS is based on linear integer
programming.
The solver adds to its core some heuristics for handling non-linear terms
(involving
multiplication). It is a very primitive solver and also a very stable one.
I suggest that you try the Z3 solver for solving constraints generated from
typechecking
ATS code. It is quite good.On Saturday, December 12, 2015 at 5:12:22 PM UTC-5, Yannick Duchêne wrote:
Going on with Redirecting to Google Groups , I
proved each character ranges which valid UTF-8 string can represents. It
was proved based on the UTF-8 specification re‑written for ATS. I though
proving expected properties derived from the specification, was a way to
check the specification it-self, and indeed I could catch two typo errors
this way (two misspelled constants, so two constants substituted for
another).
That’s already nice. What’s nice too, is that it was more easy than I was
afraid it would be; the constraint solver just pleasantly surprised me, I
was not expecting so much of it. Basically, I just told it where about to
look around, and it found it-self. The proof for each range is short. Here
is an example (for UTF-8 strings starting with a head byte in 0xF1…0xF3):
prfn utf8_f1_range
{c:int} (pf:UTF8_F1(4, c)):
[UTF8_F1_LOWER <= c; c <= UTF8_F1_UPPER] void =
case+ pf of
| UTF8_F1_4(pfc3, _) =>
let
prval UTF8_F1_3(pfc2, _) = pfc3
prval UTF8_F1_2(pfc1, _) = pfc2
prval UTF8_F1_1(FIRST) = pfc1
in end
This lead to a question: what can be safely assumed from the constraint
solver? Are its capacities, more or less specified, even informally?
This would be nice to know, as I know there were sometimes issues with
Isabelle/HOL, when from a version to another, some proof were failing and
needed to be rewritten. I think the point is easily seen, and that’s why I
have this question.
for the story, now I will attempt to prove each value in each range are
indeed decodable from UTF-8 data. I still have no idea of how I will do
I took a look at the implementation of constraint-solving in ATS.
I don’t really know what the cause of the problem is. It seems that
integer division (/) is not treated as a primitive function; this means
that UTF8_E0_UPPER/TB_BASE is not simplified to 0x3F.
I am pretty sure that this problem does not exist
if you use Z3.
Anyway, you may want to try the following coding style
(instead of introducing functions like f1(), f2(), …
prval () =
prop_verify_and_add
{UTF8_E0_UPPER/TB_BASE==0x3F}()
// end of [prval]
prval () =
prop_verify_and_add
{temp <= (UTF8_E0_UPPER/TB_BASE)}()
// end of [prval]On Sunday, December 13, 2015 at 7:41:22 PM UTC-5, Yannick Duchêne wrote:
Does it compute constants expressions as constants when it encounters any?
Don’t need more to see the picture. If I comment out the two first line,
it fails on f3. If I delete the two first line and replace UTF8_E0_UPPER / TB_BASE, dividing two constants, by its constant value, 0x03, then it does
not fails. Seems I have to tell it the value of a constants expressions.
I’m surprised, because I though getting constants from constant
expressions, was part of the basics (while keeping the original expression
too, as it may be helpful).
Also, is there a more convenient notation for this kind of things?
Something I would name “note” and “recall”? #define are not OK with that,
ATS don’t want this: #define NOTE(name, formula) prfn name (): formula void = () #define RECALL(name) prval () = name()
Le samedi 12 décembre 2015 23:12:22 UTC+1, Yannick Duchêne a écrit :
This lead to a question: what can be safely assumed from the constraint
solver? Are its capacities, more or less specified, even informally?
Don’t need more to see the picture. If I comment out the two first line, it
fails on f3. If I delete the two first line and replace UTF8_E0_UPPER / TB_BASE, dividing two constants, by its constant value, 0x03, then it does
not fails. Seems I have to tell it the value of a constants expressions.
I’m surprised, because I though getting constants from constant
expressions, was part of the basics (while keeping the original expression
too, as it may be helpful).
Also, is there a more convenient notation for this kind of things?
Something I would name “note” and “recall”? #define are not OK with that,
ATS don’t want this: #define NOTE(name, formula) prfn name (): formula void = () #define RECALL(name) prval () = name()Le samedi 12 décembre 2015 23:12:22 UTC+1, Yannick Duchêne a écrit :
This lead to a question: what can be safely assumed from the constraint
solver? Are its capacities, more or less specified, even informally?
Don’t need more to see the picture. If I comment out the two first line,
it fails on f3. If I delete the two first line and replace UTF8_E0_UPPER / TB_BASE, dividing two constants, by its constant value, 0x03, then it does
not fails. Seems I have to tell it the value of a constants expressions.
I’m surprised, because I though getting constants from constant
expressions, was part of the basics (while keeping the original expression
too, as it may be helpful).
Also, is there a more convenient notation for this kind of things?
Something I would name “note” and “recall”? #define are not OK with that,
ATS don’t want this: #define NOTE(name, formula) prfn name (): formula void = () #define RECALL(name) prval () = name()
Le samedi 12 décembre 2015 23:12:22 UTC+1, Yannick Duchêne a écrit :
This lead to a question: what can be safely assumed from the constraint
solver? Are its capacities, more or less specified, even informally?
The core of the constraint-solver in ATS is based on linear integer
programming.
The solver adds to its core some heuristics for handling non-linear terms
(involving
multiplication). It is a very primitive solver and also a very stable one.
So basically, modulo the computation time (the problem it said to be
NP-hard), whenever I can reduce a numeric problem to linear integer
expressions, it’s OK (I searched the web, and it seems standard), I can
count on ATS alone.
For the non-linear expressions, what are the additions? is there is
reasonable picture of the kind of expressions?
I suggest that you try the Z3 solver for solving constraints generated from
typechecking
ATS code. It is quite good.
If I need it and am stuck otherwise, I will try it.