The source below is not the entire source, it’s just what’s required to
see. The comments comes next to the sample.
dataprop INPUT (i:int, b:int) =
| FIRST(1, b)
| NEXT(i, b) of [pb:int] INPUT(i-1, pb)
sortdef hb00 = {b:int | 0x00 <= b && b <= 0x7F}
sortdef hbC2 = {b:int | 0xC2 <= b && b <= 0xDF}
/* … there is more here … */
sortdef tb80 = {b:int | 0x80 <= b && b <= 0xBF}
/* … there is more here … */
stadef added (c:int, b:int) = (c * 64) + (b - 0x80)
dataprop UTF8_00 (i:int, c:int) =
| {b:hb00; i == 1} UTF8_00_1(i, b) of INPUT(i, b)
dataprop UTF8_C2 (i:int, c:int) =
| {b:hbC2; i == 1} UTF8_C2_1(i, b - 0xC0) of INPUT(i, b)
| {b:tb80; i == 2} UTF8_C2_2(i, added(c, b)) of UTF8_C2(i-1, c) and INPUT(i,
b)
/* … there is more here … */
The issue is on the line where I wanted to say:
[…] of UTF8_C2(i-1, c) and INPUT(i, b)
I get a syntax error here.
I would like to require two proofs to derive UTF8_C2_2. I don’t know it I
just don’t know the syntax or if it’s not feasible at all.
Note for those who want the understand the sample:
The i of INPUT is the index in an input stream abd the b, is the byte read
at that index. The sort definitions, introduces bytes which belongs to some
ranges: hb00 and hbC2 are two head bytes (coming first) and tb80 is a tail
byte (coming after a head byte). UTF8_00 is for ASCII character in an UTF-8
stream, and UTF8_C2 is for two bytes character whose head byte belongs to
0xC2…0xDF. I’m using the data from
http://www.unicode.org/versions/Unicode7.0.0/ch03.pdf at table 3-7.
I first written the specification for the decoding only, and it was OK.
Then I wanted to express the relation between the decoding process and the
order of bytes from a stream, and there, I failed due to an error which is
either a syntax error or an impossible construct, as I don’t know and
that’s the purpose of this thread.
.