Compiler warnings

It should probably be emphasized that facilitating inlining is just
a by-product of templates. Moving a template around allows the template
to be re-interpreted in different contexts, and re-interpretation is really
the key to code reuse.

On Friday, October 30, 2015 at 7:59:18 PM UTC-4, gmhwxi wrote:

One consequence of template instantiation is that the definition of a
template is brought to the place where the template is used. This creates
an opportunity for the C compiler to actually perform inlining.

A non-template function stays where it is defined, while a function
template is mobile (in the sense that its code gets moved around by the
ATS compiler).
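
To make the point concrete at the C level, here is a minimal hand-written sketch. It is not actual output of the ATS compiler, and the names clamp_shared and clamp_local are hypothetical. It only illustrates that a C compiler can inline a function whose body is visible at the point of use, while a function known only through a declaration has to be called.

```c
#include <assert.h>

/* Analogue of a non-template function: in a real project callers would
   see only this declaration, with the body in another .c file, so the
   C compiler has to emit a call. */
int clamp_shared(int x, int lo, int hi);

/* Analogue of an instantiated template: the body sits in the same
   translation unit as its use, so the C compiler may inline it
   (it is allowed to, not required to). */
static inline int clamp_local(int x, int lo, int hi)
{
    assert(lo <= hi);  /* precondition of this sketch */
    return x < lo ? lo : (x > hi ? hi : x);
}

/* Defined here only to keep the sketch self-contained. */
int clamp_shared(int x, int lo, int hi)
{
    return clamp_local(x, lo, hi);
}
```

Whether clamp_local is really inlined still depends on the C compiler and its optimization flags, which matches the "enables but does not enforce" reading discussed in this thread.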

On Friday, October 30, 2015 at 7:26:51 PM UTC-4, Mike Jones wrote:

Ok, so you are saying using templates enables inlining, but does not
enforce it. You can always find a compiler setting to prevent it.

I am basing this on a comment in another thread where I had a function
like u73 that clamped a value, and you suggested templates so that it would
be inlined.

In case you want to go the other direction, that is, downsizing your code,
you can make use of higher-order functions in ATS in the absence of dynamic
memory allocation. There is an example in the following paper showing how
this is actually done:

http://www.metasepi.org/doc/metasepi-icfp2015-arduino-ats.pdf

On Sunday, November 1, 2015 at 7:41:21 PM UTC-5, Mike Jones wrote:

Here is a metric from my code:

The original code used 13K of program space. By using templates
everywhere, it increased size by 37% to 18K.

Given the Atmega328 has a 32K program space, this is pretty painful.

If functions in ATS require 2-3 C functions, this may become a practical
limit to the use of ATS in very small embedded systems. So while the
statics go away, this overhead does not.

The non-template version is not limited by code speed, but by IO, most of
the time. But different parts of the code have different needs. Given that
some functions will be shared between time critical and space critical
parts of the code, what is needed is a way to annotate the code to tell
ATS when to instantiate a template, and when to use a shared function.
That would allow templates in time critical areas, and not in others.

Given the above, is there a way to put a wrapper around each template such
that when used, it uses a shared function, but does not add even more C
function call overhead? The idea being a template is instantiated once as a
shared function, with no additional overhead in the generated code, or at
least a very small overhead like one pointer dereference, etc.

That is correct.

On Fri, Oct 30, 2015 at 5:12 PM, Mike Jones proc...@gmail.com wrote:

And if you use templates everywhere, it optimizes globally, but the image
is bigger due to inlining. Correct?


A nice piece of magic.

Is the resulting object code similar in size and performance to writing a
tail recursive loop? The implementation in the prelude looks like it might
have more overhead in terms of generated code, etc.

In my current application context, it might mean the code listens to some
interface, then makes decisions about what interface to frame over, then
instance an interface for it. This means all code that consumes the
interface is the same.

This sounds good, but reality may not be so compliant. Often one interface
cannot fit all cases.
The datatype-based approach (as shown in your Haskell code) is what I
call a “closed-world” approach, which you can certainly do in ATS.
However, the template-based approach is an “open-world” approach, which
I think is far superior.

On Monday, November 2, 2015 at 9:07:31 AM UTC-5, Mike Jones wrote:

wrt Haskell:

See this for background:

24 Days of GHC Extensions: Type Families
24 Days of GHC Extensions: Record Wildcards

A specific example sets up a data type as an interface:

data SMBus = SMBus {
    sendByte   :: Address -> Command -> IO (Either SMBusStatus ()),
    writeByte  :: Address -> Command -> Word8 -> IO (Either SMBusStatus ()),
    readByte   :: Address -> Command -> IO (Either SMBusStatus Word8),
    writeWord  :: Address -> Command -> Word16 -> IO (Either SMBusStatus ()),
    readWord   :: Address -> Command -> IO (Either SMBusStatus Word16),
    readBlock  :: Address -> Command -> Size -> IO (Either SMBusStatus [Char]),
    readString :: Address -> Command -> Size -> IO (Either SMBusStatus String),
    probe      :: () -> IO (Maybe [Address]),
    close      :: () -> IO ()
}

And there are multiple instances with one partial example here:

smbusAardvark = do
    devices <- devices -- From USB API
    -- Don’t get hung up on how dangerous this is to assume the first device, yes it is
    handle <- A.openDevice (head devices)
    return $ SMBus {
        sendByte = \address command -> do
            (status, numWritten) <- i2cWriteExt (fromIntegral handle)
                (fromIntegral address) I2cNoFlags [(fromIntegral command)] 1
            ss <- toStatus status
            if ss == SMBusOk then return $ Right () else return $ Left ss,

And calling code creates an instance for a particular hardware interface,
such that it returns an SMBus datatype with the handle and operations
hidden behind it.

The goal is to instance whatever hardware is being used.

smbus <- smbusAardvark

runI2cTest :: SMBus -> IO ()
runI2cTest smbus@SMBus{..} = do
    s <- writeByte 0x30 0x00 0x00

In my current application context, it might mean the code listens to some
interface, then makes decisions about what interface to frame over, then
instance an interface for it. This means all code that consumes the
interface is the same.

That said, because it is embedded, I would like to avoid boxed types.
Passing in memory from an outer frame, such that it remains on the stack
might be ok, but it would be better to be static. One issue is that
depending on the instance chosen, the memory size changes. So if the choice
was to preallocate space with C, it would be nice to have a way for the
code to statically calculate a maximum so that there are no accidental
overflows, or to specify its size, and use statics to prevent code from
instantiating something bigger than what was pre-allocated.
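
One way the "statically calculate a maximum" idea could look on the C side is sketched below. Everything in it is an assumption made for illustration: the two state structs, their fields, and the 64-byte budget are hypothetical and are not taken from the code discussed here. A union is as large as its largest member, so the compiler computes the maximum itself, and a static assertion rejects any instance that would outgrow the preallocated space.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>

/* Hypothetical per-instance state; the field layouts are made up. */
struct aardvark_state { int handle; unsigned char buf[32]; };
struct bitbang_state  { unsigned char sda_pin, scl_pin; };

/* A union is as large as its largest member, so its size is the
   maximum over all instances, computed at compile time. */
union smbus_state {
    struct aardvark_state aardvark;
    struct bitbang_state  bitbang;
};

/* One statically allocated slot: no heap and no stack growth. */
static union smbus_state smbus_slot;

/* Compile-time guard: adding a larger instance breaks the build
   instead of overflowing at run time. The budget is an assumption. */
#define SMBUS_STATE_BUDGET 64
static_assert(sizeof(union smbus_state) <= SMBUS_STATE_BUDGET,
              "SMBus instance state exceeds the preallocated budget");

/* Accessors so other code can use the slot and query its size. */
void  *smbus_state_slot(void) { return &smbus_slot; }
size_t smbus_state_size(void) { return sizeof smbus_slot; }
```

This only covers the C allocation side; expressing the same bound in ATS statics is a separate question that this sketch does not answer.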

RecordWildCards are not necessary, they just pretty up the code so it
reads out loud better.

Here is some code showing how to control template instantiation.

Try

patscc -ccats foo.dats
patscc -DATS FOO_TEMPLATE_NONE foo.dats

And you will see the difference.

(* ****** ****** *)
//
extern
fun{} foo(int): void
//
extern
fun foo_ : $d2ctype(foo<>)
implement foo_(x) = foo(x)
//
#ifdef
FOO_TEMPLATE_NONE
#define foo foo_
#endif
//
val () = foo(0)
val () = foo(1)
//
(* ****** ****** *)

I took another look at your example.

Here is what I would do:

abstype Aardvark
abstype SMBus(a:type) // abstract type for handles

extern
fun{a:type}
sendByte (handle: !SMBus(a), Address, Command, err: &int): SMBusStatus

implement
sendByte<Aardvark> (Handle, Address, Command) = …

val Handle = SMBus_Aardvark_open(…)

val SMBusStatus = sendByte(Handle, Address, Command, …)

val ((closed)) = SMBus_Aardvark_close(Handle)

On Monday, November 2, 2015 at 10:14:57 AM UTC-5, Mike Jones wrote:

So, I assume there will be one implement statement for each bus type? If
not, then the structure of the code has to be the same for all passed
in types, like the way templates work in C++ where the passed in type, say
int or double, has operators that work for both. In the SMBus case, the
interfaces to the lower SMBus layer, say Aardvark vs. other hardware, may
have significant differences so there needs to be one implementation for
each one.

The next issue is how to call. The Haskell solution has one statement that
picks the type of hardware, then the code is the same. In the template
case, how do you do that?

If the function has to be called as sendByte<Aardvark>(0x30, 0x01, err)
then there will be conditionals everywhere, because the call has to give
the type.

Is there a simple way to pick the instance, and call as sendByte(0x30,
0x01, err)?

The final problem is the instance has to capture a handle, which is
returned from an IO call. Templates are instantiated at compile time, not
run time, so I don’t think they can capture the value. Which leads me to
believe I will still have to pass a moniker to every call.

ATS does not allow currying, correct?

The most you can do is capture a value from the outer context I think. But
if a lambda does capture a value, can you return the lambda from a function
without boxing/malloc? That would allow making a creator function that
returns curried functions. Even better if you can return them in a datatype.

The example I am thinking of is opening a handle, and hiding it in a data
type of lambdas, where the datatype is like an interface, and an open
function returns the datatype with the handle hidden.

This would reuse context and make the code faster because the handle does
not need to be passed on every function call. In a loop context, the lambda
could be used with a map, fold, etc.
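
For comparison, here is roughly how the "datatype of lambdas with a hidden handle" looks when written by hand in C without any allocation. This is a sketch under assumptions, not what ATS generates: the smbus struct, the fake transport, and the status codes are all hypothetical. The captured handle lives in a plain struct that the open function returns by value, so nothing is boxed or malloc'ed.

```c
#include <assert.h>

/* Hypothetical status codes, for illustration only. */
typedef enum { SMBUS_OK = 0, SMBUS_ERR = 1 } smbus_status;

/* The "interface": one operation plus the handle it closes over. */
typedef struct smbus {
    int handle;  /* captured environment */
    smbus_status (*send_byte)(const struct smbus *self,
                              unsigned char address,
                              unsigned char command);
} smbus;

static int last_handle_used = -1;  /* lets the sketch observe the call */

/* Fake transport: records which handle was used and reports success. */
static smbus_status fake_send_byte(const smbus *self,
                                   unsigned char address,
                                   unsigned char command)
{
    (void)address;
    (void)command;
    last_handle_used = self->handle;
    return self->handle >= 0 ? SMBUS_OK : SMBUS_ERR;
}

/* "Open" returns the record by value: no boxing, no malloc. */
smbus smbus_fake_open(int handle)
{
    assert(handle >= 0);
    smbus bus = { handle, fake_send_byte };
    return bus;
}
```

A caller would write something like bus.send_byte(&bus, 0x30, 0x01). Note that the record itself is still passed on every call, so this hides the handle but does not remove the extra argument; it is essentially the moniker mentioned above.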