Missing something obvious with if statement?

I feel like maybe I’ve run in to this problem before, but I’m not sure …
last_word is not getting the proper type with the if/else.

//Word size:
#define WSZ 8
#define BUFSZ 1024

//loopFILE (tail-recursive function):
//1: check if read byte cnt is 0 or size of buffer;
// if so, read in up to BUFSZ bytes
//2: if # of read bytes mod WSZ is not 0, set a check
// for last word, and test these bytes individually;
// set early stop condition for default algorithm
//3: else if # of read bytes < BUFSZ, set early stop condition

//Does it make sense to change some vars to vals?
fun loopFILE {n:nat} (nl: ulint, buf: &bytes n, n: size_t n, cnt: int
, last_word: int, rb: int, fil: &FILE r): ulint = let

// Any useful way to do dependent types with modulo?
val (rb, last_word) = (if mod_int_int(cnt, BUFSZ) = 0 then let 
  val [m:int] m = fread_byte (file_mode_lte_r_r | buf, n, fil)
  val cnt =  0
  val rb = mod_int_int(int_of_size(m),WSZ)
  val m = int_of_size(m)
  val _ = $showtype m      
  val last_word: int = (if rb > 0 then m-1 else m)
  //val last_word = m*m-1  // this typechecks
  val _ = $showtype last_word
in (rb, last_word) end       //typecheck problem here
else (0,BUFSZ)) // end of [if]

in // in of [loopBUF]
  nl + ulint_of_int(int_of_bool(hb))
end // end of [loopBUF]

SHOWTYPE(/media/RAID5/share/ATS_learning/line_count.dats: 2452(line=69,
offs=25) – 2453(line=69, offs=26)): S2Ecst(int_t0ype)
SHOWTYPE(/media/RAID5/share/ATS_learning/line_count.dats: 2589(line=72,
offs=25) – 2598(line=72, offs=34)): S2EVar(17)
warning(3): s3bexp_make_s2exp: s2e0 = S2Etyleq(0; S2Etyrec(flt; 0;
0=S2Ecst(int_t0ype), 1=S2EVar(17)), S2EVar(13))
/media/RAID5/share/ATS_learning/line_count.dats: 2606(line=73, offs=8) –
2621(line=73, offs=23): warning(3): the constraint [S2Etyleq(0; S2Etyrec(flt; 0; 0=S2Ecst(int_t0ype), 1=S2EVar(17)), S2EVar(13))] cannot be
translated into a form accepted by the constraint solver.
/media/RAID5/share/ATS_learning/line_count.dats: 2606(line=73, offs=8) –
2621(line=73, offs=23): error(3): unsolved constraint: C3STRprop(main;
S2Etyleq(0; S2Etyrec(flt; 0; 0=S2Ecst(int_t0ype), 1=S2EVar(17)),
S2EVar(13)))
type-checking has failed: there are some unsolved constraints: please
inspect the above reported error message(s) for information.

Also:

fil: &FILE r

can be changed to

fil: FILEref

so as to simplify things a bit.

‘cnt’ should be greater than or equal to 8? The typechecker cannot infer it.

Also, when using arrayptr2word, you may need to pay attention to the
alignment issue.

You ‘loopFILE’ should probably use fread (instead of fread_byte) as the
checking you do in the code is already taken care of by fread.

You need to declare fread before using it:

extern fun fread (…) = “mac#fread”

In ATS2, you can make an external cal like this: extfcall (size_t, “fread”,
…).

–Hongwei

extern
praxi arrptr2word {m,n:nat | n-m >= WSZ} {l:agz}
(pf: !array_v (byte, n, l), ofs: int m): bytes_v (WSZ, l+m)

This one is problematic: pf cannot be of the same view after a part of it is taken out.

If you do it primarily for the purpose of learning, then you can use
array_v (array view) to do pointer arithmetic safely in ATS.
If you want to save time, please try some of the unsafe functions in
prelude/unsafe.sats.

pf_buf, I think, is a proof that a byte type exists at each byte from buf
to buf+BUFSZ-1. So it should suffice to have it in scope, I think. My
problem is, I’m not sure how to >>actually make the conversion from a ptr
type to a @[byte][WSZ] (a machine word).

You just add an axiom:

extern
praxi ptr2bytes {l:addr} (pf: ptr @ l): bytes_v (WSZ, l)

We refer to the style of program verification in ATS as
“programmer-centric” program verification: one should first and foremost
convince oneself why one’s code is correct.
One may introduce axioms at one’s wish as long as such axioms can be
readily explained to or understood by other properly trained people.

Please change ‘%{’ to ‘%{^’:

%{^

%}

means to put the code at the beginning of the generated C code.

Thanks!

By the way, what happened to bytes_v_of_b0ytes_v in ATS2?

I think I’m getting there, but not sure if using array_v_split is the
easiest direction to go (though likely the best) in getting the following
code segment to work.

pf_buf, I think, is a proof that a byte type exists at each byte from buf
to buf+BUFSZ-1. So it should suffice to have it in scope, I think. My
problem is, I’m not sure how to actually make the conversion from a ptr
type to a @[byte][WSZ] (a machine word).

staload “libc/SATS/stdio.sats”

//staload
//UN = “prelude/SATS/unsafe.sats”

// Counting newlines in this fashion makes an assumption
// that each newline must be at least 8 byes from the next.
// This isn’t the case for all matrices, but it certainly is
// for any that we consider in this program.
// Ref: http://graphics.stanford.edu/~seander/bithacks.html#ZeroInWord
// To extend to working with arbitrary file sizes it may be possible to
// first find zeros then take a NOT value, finally doing a population count
// http://aggregate.org/MAGIC/#Population Count (Ones Count)

//

// This is all very unsafe for different archs: need to consider things
such as
// non-octet bytes, variable word sizes, and (possibly) endianness.

%{^

#define haszero(v) (((v) - 0x01010101UL) & ~(v) & 0x80808080UL)
#define hasvalue(x,n) (haszero((x) ^ (~0UL/255 * (n)))
#define hasnewline(x) hasvalue(x,0x0a0a0a0a0a0a0a0aLL)

// unsigned long int hasnewline(void* x) {return
hasvalue(x,0x0a0a0a0a0a0a0a0aLL);}

%}

//Word size:
#define WSZ 8
#define BUFSZ 1024
#define sz1 size1_of_int1 1

extern fun malloc_ab {n:nat} (n: size_t): [l:agz] (array_v (byte, n, l) |
ptr l) = "mac#malloc"
extern fun free_ab {n:nat} {l:agz} (pf: array_v (byte, n, l) | p: ptr
l):void = “mac#free”

extern fun fread1 {l:agz} {n:nat} {f:fm} (pf_mod: file_mode_lte (f, r)
| dest: ptr l, sz: size_t 1, count: size_t n, fil: &FILE r)
: [m:nat] size_t m = “mac#fread”

implement main {n} (argc,argv) = () where {
val pargv = &argv
val () = assertloc(argc = 2)
val expInFi = argv.[1]
prval (pf, fpf) = __assert(pargv) where {
extern praxi __assert {l:addr} (p: ptr l): (ptrarr n @ l, ptrarr n @ l
-<lin,prf> void)
}
prval (pf1, pf2) = ptrarr_uncons(pf)
val (pf_inFIEXP | inFIEXP) = fopen_exn (expInFi, file_mode_r)

// val (pfopt | p_buf) = malloc_ngc (BUFSZ)
val (pf_buf | p_buf) = malloc_ab (BUFSZ)
val () = assert_errmsg (p_buf > null, #LOCATION)

// Need to write a function that calls array_v_split twice
// to get interior segment?

// make btCnt, cnt sizeLte(BUFSZ)

//loopFILE (tail-recursive function):
//1: check if read byte cnt is 0 or size of buffer;
// if so, read in up to BUFSZ bytes
//2: if # of read bytes mod WSZ is not 0, set a check
// for last word, and test these bytes individually;
// set early stop condition for default algorithm
//3: else if # of read bytes < BUFSZ, set early stop condition

fun loopFILE {n:nat} {l:agz} (pf_buf: !array_v(byte,n,l) | nl: ulint,
buf: ptr l(* &bytes n *),
n: size_t n, cnt: int, last_word: int, rb: int, fil: &FILE r): ulint =
let

val (rb, last_word) = (if mod_int_int(cnt, BUFSZ) = 0 then let
  // val [m:int] m = fread_byte (pf_buf_byt, file_mode_lte_r_r | buf, 

n, fil)
val [m:int] m = fread1 ( file_mode_lte_r_r | buf, sz1, n, fil)
val cnt = 0
val m = int_of_size(m)
val rb = mod_int_int(m,WSZ)
val last_word = (if rb > 0 then m-1 else m):int
in (rb, last_word) end
else (0,BUFSZ)) // end of [if]

//TODO: a view type for a sequenc of bytes: array_v_split
val _ = $showtype !buf
val buftmp = buf+cnt
prval  pf_buf_bytes = bytes_v_of_b0ytes_v (pf_buf)
val hb = hasnewline (pf_buf_bytes | !buf) where {
  extern fun hasnewline {lw:agz} (pf: bytes(WSZ) @ lw | cw:  &bytes WSZ 

(*cp: ptr lw *)): ulint = “mac#hasnewline”
}
// Increment number of newlines
val nl = nl + hb

// Need to increment cnt
val cnt = cnt + WSZ

// !!! Need to add in check for remainder bytes !!!


in // in of [loopFILE]
 if last_word = BUFSZ then loopFILE(pf_buf | nl, buf, n, cnt, 

last_word, rb, fil)
val numlines = loopFILE(pf_buf | ulint_of_size(size_of_int(0)),p_buf,
BUFSZ
, 0, 0, 0, !inFIEXP)
val () = println!(numlines)

val () = free_ab (pf_buf | p_buf);
val () = fclose_exn(pf_inFIEXP | inFIEXP)
prval pf = ptrarr_cons(pf1, pf2)
prval () = fpf(pf)
}

Sorry for the formatting above. I’ve been working on the 2nd highlighted
section but get an error. Here are my changes:

val cnt = int1_of_int(cnt)
//Replace this with dependent typing later?
val () = assertloc(cnt >= 0)
prval pf_buf_bytes = arrptr2word (pf_buf, cnt)  //typecheck error
val hb = hasnewline (pf_buf_bytes | !buf) where {
  extern fun hasnewline {lw:agz} (pf: bytes(WSZ) @ lw | cw: 

@[byte][WSZ] (* &bytes WSZ *)): ulint = “mac#hasnewline”
}

/media/RAID5/share/ATS_learning/line_count.dats: 3478(line=97, offs=26) –
3489(line=97, offs=37): error(3): unsolved constraint: C3STRprop(main;
S2Eapp(S2Ecst(>=); S2Eapp(S2Ecst(-); S2EVar(22), S2EVar(21)), S2Eint(8)))
type-checking has failed: there are some unsolved constraints: please
inspect the above reported error message(s) for information.

Thanks,
Brandon

I had been wondering what a praxi was.

So an axiom shouldn’t be able to be abused if it is obviously true, but the
one example listed above looks like it may not be true, since there is no
information about how many byte are allocated to ptr. I have tried to take
this into account in the highlighted example below. I guess I still need to
use array_v_split in the mechanism below to actually get a ptr with the
correct address (second highlighted block)?

staload “libc/SATS/stdio.sats”
//staload//UN = “prelude/SATS/unsafe.sats”
// Counting newlines in this fashion makes an assumption// that each
newline must be at least 8 byes from the next.// This isn’t the case for
all matrices, but it certainly is// for any that we consider in this
program.// Ref:
http://graphics.stanford.edu/~seander/bithacks.html#ZeroInWord// To extend
to working with arbitrary file sizes it may be possible to// first find
zeros then take a NOT value, finally doing a population count//
http://aggregate.org/MAGIC/#Population Count (Ones Count)
//
// This is all very unsafe for different archs: need to consider things
such as // non-octet bytes, variable word sizes, and (possibly) endianness.
%{^

#define haszero(v) (((v) - 0x01010101UL) & ~(v) & 0x80808080UL)#definehasvalue
(x,n) (haszero((x) ^ (~0UL/255 * (n)))#define hasnewline(x) hasvalue(x,
0x0a0a0a0a0a0a0a0aLL)
// unsigned long int hasnewline(void* x) {return
hasvalue(x,0x0a0a0a0a0a0a0a0aLL);}
%}

//Word size:#define WSZ 8 #define BUFSZ 1024 #define sz1 size1_of_int1 1
viewdef bytes_v (n:int, l:addr) = bytes n @ l
extern praxi arrptr2word {m,n:nat | n-m >= WSZ} {l:agz} (pf: !array_v (
byte, n, l), ofs: int m): bytes_v (WSZ, l+m)

extern fun malloc_ab {n:nat} (n: size_t): [l:agz] (array_v (byte, n, l) |ptr l
) = "mac#malloc"extern fun free_ab {n:nat} {l:agz} (pf: array_v (byte, n, l)
| p: ptr l):void = "mac#free"
extern fun fread1 {l:agz} {n:nat} {f:fm} (pf_mod: file_mode_lte (f, r) |dest
: ptr l, sz: size_t 1, count: size_t n, fil: &FILE r) : [m:nat] size_t m =
"mac#fread"
implement main {n} (argc,argv) = () where { val pargv = &argv val () =assertloc
(argc = 2) val expInFi = argv.[1] prval (pf, fpf) = __assert(pargv) where
{ extern praxi __assert {l:addr} (p: ptr l): (ptrarr n @ l, ptrarr n @ l
-<lin,prf> void) } prval (pf1, pf2) = ptrarr_uncons(pf)
val (pf_inFIEXP | inFIEXP) = fopen_exn (expInFi, file_mode_r)
// val (pfopt | p_buf) = malloc_ngc (BUFSZ) val (pf_buf | p_buf) =
malloc_ab (BUFSZ) val () = assert_errmsg (p_buf > null, #LOCATION)
// Need to write a function that calls array_v_split twice // to get
interior segment? // make btCnt, cnt sizeLte(BUFSZ)
//loopFILE (tail-recursive function): //1: check if read byte cnt is 0
or size of buffer; // if so, read in up to BUFSZ bytes //2: if #
of read bytes mod WSZ is not 0, set a check // for last word, and
test these bytes individually; // set early stop condition for
default algorithm //3: else if # of read bytes < BUFSZ, set early stop
condition fun loopFILE {n:nat} {l:agz} (pf_buf: !array_v(byte,n,l) | nl:
ulint, buf: ptr l(* &bytes n *), n: size_t n, cnt: int, last_word: int,
rb: int, fil: &FILE r): ulint = let
val (rb, last_word) = (if mod_int_int(cnt, BUFSZ) = 0 then let //
val [m:int] m = fread_byte (pf_buf_byt, file_mode_lte_r_r | buf, n, fil)
val [m:int] m = fread1 ( file_mode_lte_r_r | buf, sz1, n, fil) val
cnt = 0 val m = int_of_size(m) val rb = mod_int_int(m,WSZ) val last_word = (if rb > 0 then m-1 else m):int in (rb, last_word) end else (0,BUFSZ)) // end of [if] //TODO: a view type for a sequenc of bytes: array_v_split
val buftmp = buf+cnt prval pf_buf_bytes = arrptr2word (pf_buf,int1_of_int
(cnt)) val hb = hasnewline (pf_buf_bytes | !buftmp) where { externfun hasnewline
{lw:agz} (pf: bytes(WSZ) @ lw | cw: &bytes WSZ): ulint = “mac#hasnewline”
} // Increment number of newlines val nl = nl + hb //
Increment cnt val cnt = cnt + WSZ // !!! Need to add in check
for remainder bytes !!! in // in of [loopFILE] if
last_word = BUFSZ then loopFILE(pf_buf | nl, buf, n, cnt, last_word, rb,
fil) else nl end // end of [loopFILE] val numlines =
loopFILE(pf_buf | ulint_of_size(size_of_int(0)),p_buf, BUFSZ , 0, 0, 0,
!inFIEXP)
val () = println!(numlines)

val () = free_ab (pf_buf | p_buf);
val () = fclose_exn(pf_inFIEXP | inFIEXP)
prval pf = ptrarr_cons(pf1, pf2)
prval () = fpf(pf)
}

Thank you. Sorry for still not being there quite yet. I copy two versions
below. The first is almost directly based on GNU coreutils’ wc line count,
which uses memchr (which is what makes it fast). I noticed the only use of
memchr in ATS uses it in the unsafe framework. Is it necessary to be so?

// A fast text-file line-counter.
// This code is adapted from wc.c in GNU Coreutils - Brandon Barker
//

/* wc - print the number of lines, words, and bytes in files
Copyright © 1985-2013 Free Software Foundation, Inc.

This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program. If not, see http://www.gnu.org/licenses/. */

/* Written by Paul Rubin, p...@ocf.berkeley.edu
and David MacKenzie, d...@gnu.ai.mit.edu. */

#include <stdio.h>
#include <inttypes.h>
#include <string.h>

#define BUFSZ 16384

int main(int argc, char *argv[]) {

FILE* txtFILE;
txtFILE = fopen(argv[1], “r”);
char buf[BUFSZ];

unsigned long nl = 0;
unsigned long bytes = 0;
// number of bytes read by fread, set to 1 for while
// loop initialization
unsigned long bytes_read = 0;
while ((bytes_read = fread(buf, sizeof(unsigned char), BUFSZ, txtFILE)) >
0) {
char *p = buf;
if (bytes_read == 0) {
break;
}
while ((p = memchr (p, ‘\n’, (buf + bytes_read) - p))) {
++p;
++nl;
}
bytes += bytes_read;
}
printf("%lu\n",nl);
return 0;
}

The second method is what I was trying to use, but is not as nice because
it is very slightly slower and currently assumes more than 8 chars separate
each newline. So I would suggest not using this method, unless it is
somehow “more safe” than memchr.

// A fast text-file line-counter.
// Currently assumes newlines are always at least a
// machine word apart.
//
// Assumes word length is 8 bytes.
// Assumes sizeof(uchar) = sizeof(byte)
//

#include <stdio.h>
#include <inttypes.h>
#include <string.h>

// How to get word size from includes?
#define WSZ 8
#define BUFSZ 16384

//SIMD operations
#define haszero(v) (((v) - 0x0101010101010101UL) & ~(v) &
0x8080808080808080UL)
#define hasvalue(x,n) (haszero((x) ^ (~0UL/255 * (n))))
//#define hasvalue(x,n) (haszero(x ^ n))
#define hasnewline(x) hasvalue(x,0x0a)
//#define hasnewline(x) hasvalue(x,0x0a0a0a0a0a0a0a0aLL)

int main(int argc, char *argv[]) {

FILE* txtFILE;
txtFILE = fopen(argv[1], “r”);
unsigned char buffer[BUFSZ];
unsigned char wbuf[WSZ+1];

//current word
uint64_t * data;
// cnt is byte position in buffer
size_t cnt = 0;
// remaining bytes in buffer that don’t fill a word
int rb = 0;
// # of bytes read in to full words from fread
int lwlb = 0;
// number of newlines
unsigned long nl = 0;
// number of bytes read by fread, set to 1 for while
// loop initialization
size_t m=1;
while (m>0) {
if (cnt % BUFSZ == 0) {
m = fread(buffer, sizeof(unsigned char), BUFSZ, txtFILE);
cnt = 0;
rb = m % WSZ;
lwlb = m-rb;
}
// Find a newline in a machine word
if (m - cnt >= WSZ) {
data = ((uint64_t*) (buffer+cnt));
if (hasnewline(*data)) { nl += 1; }
cnt += WSZ;
}
// Count any newlines in remaining bytes
if (rb>0 && cnt == lwlb) {
for (int i=0; i<rb; i++) {
nl += (uint64_t) (buffer[cnt] == ‘\n’);
cnt += sizeof(unsigned char);
}
m = 0;
}
} // end of [while]
fclose(txtFILE);
printf("%lu\n",nl);
return 0;
}

Ah, I finally remembered, had to annotate the if clause
val last_word: int = (if rb > 0 then m-1 else m):int

Thanks for that performance improvement tip. I guess I will try to change
it after I fix some more fundamental problems.

Are there any examples of using pointer arithmetic with &bytes n, that is,
@[byte][n] ? Or should this be done with arrays?

This may clarify my question a bit better. I want to look at a buffer, one
word (e.g. 8 bytes) at a time:

%{

#define haszero(v) (((v) - 0x01010101UL) & ~(v) & 0x80808080UL)
#define hasvalue(x,n) (haszero((x) ^ (~0UL/255 * (n)))
#define hasnewline(x) hasvalue(x,0x0a0a0a0a0a0a0a0aLL)

%}

// …
fun loopFILE {n:nat} (nl: ulint, buf: &bytes n, n: size_t n, cnt: int
, last_word: int, rb: int, fil: &FILE r): ulint = let

// …

val hb = hasnewline (buf.[cnt]) where {
extern fun hasnewline {l:agz} (cp: &(@[byte?][WSZ])) ): bool =
“mac#hasnewline”
}

I really feel that you are kind of trapped by the complexity of dependent
types.

Could you implement your problem in C? Then I will be happy to translate it
into ATS.
In this way, you get a chance to see some typical ways of using dependent
types.

‘cnt’ should be greater than or equal to 8? The typechecker cannot infer
it.

cnt >= 0; I now have this and others checked in an “if” expression, so arrayptr2word
passes typechecking. But pf_buf_bytes is somehow still not showing that we
have 8 bytes:

val cnt = int1_of_int(cnt)
//Replace this with dependent typing later?
val hb = (if cnt >= 0 && int1_of_size1(n)-cnt >= WSZ  then let
  val _ = $showtype pf_buf
  prval pf_buf_bytes = arrptr2word (pf_buf, cnt)
in
  hasnewline (pf_buf_bytes | !buf) where {
    extern fun hasnewline {lw:agz} (pf: bytes(WSZ) @ lw | cw: 

@[byte][WSZ] (* &bytes WSZ *)): ulint = “mac#hasnewline”}
end
else ulint_of_int(0)):ulint

SHOWTYPE(/media/RAID5/share/ATS_learning/line_count.dats: 3511(line=97,
offs=25) – 3517(line=97, offs=31)): S2Eapp(S2Ecst(at_viewt0ype_addr_view);
S2Etyarr(S2Ecst(byte); S2Evar(n(2706))), S2Evar(l(2707)))
/media/RAID5/share/ATS_learning/line_count.dats: 3614(line=100, offs=34) –
3618(line=100, offs=38): error(3): unsolved constraint: C3STRprop(main;
S2Eeqeq(S2Evar(n(2706)), S2Eint(8)))
type-checking has failed: there are some unsolved constraints: please
inspect the above reported error message(s) for information.

Also, when using arrayptr2word, you may need to pay attention to the
alignment issue.

I was thinking of implementing it to take #words instead of #bytes as an
offset argument to arrayptr2word, but hadn’t done so yet :).

Great, I went with array_v - the ATS Book and Chris’s recent blog helped.

There are probably a few things that can be improved (hah), but I’m getting
a type error when I try to thread a FILE ref through a loopFILE recursively:

staload “libc/SATS/stdio.sats”

typedef two = [b:two] int b
viewdef bytes_v (n:int, l:addr) = bytes n @ l

//staload
//UN = “prelude/SATS/unsafe.sats”

// Counting newlines in this fashion makes an assumption
// that each newline must be at least 8 byes from the next.
// This isn’t the case for all matrices, but it certainly is
// for any that we consider in this program.
// Ref: http://graphics.stanford.edu/~seander/bithacks.html#ZeroInWord
// To extend to working with arbitrary file sizes it may be possible to
// first find zeros then take a NOT value, finally doing a population count
// http://aggregate.org/MAGIC/#Population Count (Ones Count)

%{

#define haszero(v) (((v) - 0x01010101UL) & ~(v) & 0x80808080UL)
#define hasvalue(x,n) (haszero((x) ^ (~0UL/255 * (n)))
#define hasnewline(x) hasvalue(x,0x0a0a0a0a0a0a0a0aLL)

%}

//Word size:
#define WSZ 8
#define BUFSZ 1024
#define sz1 size1_of_int1 1

extern fun malloc_ab {n:nat} (n: size_t): [l:agz] (array_v (byte, n, l) |
ptr l) = "mac#malloc"
extern fun free_ab {n:nat} {l:agz} (pf: array_v (int, n, l) | p: ptr
l):void = “mac#free”

extern fun fread1 {l:agz} {n:nat} {f:fm} (pf_mod: file_mode_lte (f, r)
| dest: ptr l, sz: size_t 1, count: size_t n, fil: &FILE r)
: [m:nat] size_t m = “mac#fread”

implement main {n} (argc,argv) = () where {
val pargv = &argv
val () = assertloc(argc = 2)
val expInFi = argv.[1]
prval (pf, fpf) = __assert(pargv) where {
extern praxi __assert {l:addr} (p: ptr l): (ptrarr n @ l, ptrarr n @ l
-<lin,prf> void)
}
prval (pf1, pf2) = ptrarr_uncons(pf)
val (pf_inFIEXP | inFIEXP) = fopen_exn (expInFi, file_mode_r)

// val (pfopt | p_buf) = malloc_ngc (BUFSZ)
val (pf_buf | p_buf) = malloc_ab (BUFSZ)
val () = assert_errmsg (p_buf > null, #LOCATION)

// Need to write a function that calls array_v_split twice
// to get interior segment?

// make btCnt, cnt sizeLte(BUFSZ)

//loopFILE (tail-recursive function):
//1: check if read byte cnt is 0 or size of buffer;
// if so, read in up to BUFSZ bytes
//2: if # of read bytes mod WSZ is not 0, set a check
// for last word, and test these bytes individually;
// set early stop condition for default algorithm
//3: else if # of read bytes < BUFSZ, set early stop condition

fun loopFILE {n:nat} {l:agz} (pf_buf | nl: ulint, buf: ptr l(* &bytes n
*),
n: size_t n, cnt: int, last_word: int, rb: int, fil: &FILE r): ulint =
let

val (rb, last_word) = (if mod_int_int(cnt, BUFSZ) = 0 then let
  // val [m:int] m = fread_byte (pf_buf_byt, file_mode_lte_r_r | buf, 

n, fil)
val [m:int] m = fread1 ( file_mode_lte_r_r | buf, sz1, n, fil)
val cnt = 0
val m = int_of_size(m)
val rb = mod_int_int(m,WSZ)
val last_word = (if rb > 0 then m-1 else m):int
in (rb, last_word) end
else (0,BUFSZ)) // end of [if]

//Need a view type for a sequenc of bytes: array_v_split
val hb = hasnewline (buf+8) where {
  extern fun hasnewline {lw:agz} (cp: ptr lw  (* @[byte][WSZ] *) ): 

bool = “mac#hasnewline”
}
// Increment number of newlines
val nl = nl + ulint_of_int(int_of_bool(hb))

// Need to increment cnt
val cnt = cnt + WSZ

in // in of [loopBUF]
 if last_word = BUFSZ then loopFILE(pf_buf
   | nl, buf, n, cnt, last_word, rb, fil)
 else
   nl
end // end of [loopBUF]

val numlines = loopFILE(pf_buf | ulint_of_size(size_of_int(0)),!p_buf,
BUFSZ
, 0, 0, 0, !inFIEXP)
val () = println!(numlines)

// val () = free_ngc (pf_ngc, pf_buf | p_buf);
val () = fclose_exn(pf_inFIEXP | inFIEXP)
prval pf = ptrarr_cons(pf1, pf2)
prval () = fpf(pf)
}

I made a minor change in the code segment above ( cw: @[byte][WSZ] —> cw:
&(@[byte][WSZ])) and uploaded the file for convenience. I seem to be quite
close, and all constraints appear to be linear, soI don’t understand why
“S2EVar(28)”, that is, n in bytes_v(n,l), is not known to be 8: this is
established in the praxi.

SHOWTYPE(/media/RAID5/share/ATS_learning/line_count.dats: 3510(line=97,
offs=25) – 3514(line=97, offs=29)): S2Etyarr(S2Ecst(byte); S2Evar(n(2706)))
SHOWTYPE(/media/RAID5/share/ATS_learning/line_count.dats: 3592(line=99,
offs=25) – 3596(line=99, offs=29)): S2Etyarr(S2Ecst(byte); S2EVar(28))
/media/RAID5/share/ATS_learning/line_count.dats: 3640(line=101, offs=34) –
3644(line=101, offs=38): error(3): size mismatch for assignment: the
following two types are expected to be size-equal but they may not be:
S2Etyarr(S2Ecst(byte); S2Eint(8))
S2Etyarr(S2Ecst(byte); S2EVar(28))

line_count_dats.txt (4.36 KB)

It is typechecking. Maybe I’m inlining wrong, i.e. %{ … %}, but it
appears that C function macros aren’t callable using "mac#my_C_macfunc"
like normal C functions (e.g, hasnewline in the code above). Is the best
way to just inline an actual c function?

fun loopFILE {n:nat} {l:agz} (pf_buf | nl: ulint, buf: ptr l(* &bytes n
*),
n: size_t n, cnt: int, last_word: int, rb: int, fil: &FILE r): ulint

‘pf_buf’ not given a type (view).