Compilers suck

right? I mean, as much as I’m excited about ATS and being able to have
the power and speed of ATS+C/++, I’m really not excited about the idea
of having to manage some horrible C toolchain on all the different
platforms I’d ever want to run on. Since I am always thinking of
client stuff, I’m thinking of at least:

Android arm, mips, intel.
iOS arm.
Linux x86, power.
Mac OS power, intel.
Windows XP, Vista, 7, 8.
Windows Mobile Whatever It Is Called Nowadays.
Xboxes, PlayStations, Nintendos. Ideally even homebrew SDKs for the older
game consoles.
HTML5.

My sense is that things like Boehm
aren’t really up to the task of keeping up with that style of
programming, even if they’re fine for programs that allocate less
memory.

Say an execution spends 20% on GC.
Say Boehm-GC takes 2 times the time taken by your favorite GC.
Everything else being the same, the ATS version is only 20% slower.
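
A quick way to sanity-check the arithmetic above is the following Python sketch. The function name `relative_slowdown` and its parameters are made up for illustration. It assumes, as the post does, that only the GC share of the run time changes and that everything else stays the same.

```python
def relative_slowdown(gc_fraction, gc_cost_ratio):
    """Fractional slowdown when only the GC part of a run gets slower.

    gc_fraction   -- share of the baseline run time spent in GC (0.20 = 20%)
    gc_cost_ratio -- how many times slower the replacement GC is (2.0 = twice)
    """
    mutator_time = 1.0 - gc_fraction           # non-GC work, unchanged
    baseline = mutator_time + gc_fraction      # normalized to 1.0
    with_slow_gc = mutator_time + gc_fraction * gc_cost_ratio
    return with_slow_gc / baseline - 1.0


if __name__ == "__main__":
    for share in (0.10, 0.20, 0.50):
        print(share, round(relative_slowdown(share, 2.0), 3))
```

Under these assumptions the slowdown is simply `gc_fraction * (gc_cost_ratio - 1)`, so a GC that is twice as slow costs exactly the original GC share again.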

I am actually quite optimistic about getting back this 20% loss by using
flat types.

By the way, you may also use a non-Boehm GC for running ATS code if you
can find one.

On Thu, Nov 13, 2014 at 4:07 PM, Ian Denhardt i...@zenhack.net wrote:

Quoting gmhwxi (2014-11-13 15:10:59)

  1. Static array bounds checking is different from dynamic array bounds
    checking. For instance, you may want to ensure an index is within a
    subrange of an array. This makes total sense even if you program in
    Java.

My overall point was that most of the “systems-programming” things in
ATS are a bit moot when run on the JVM. There are many benefits to
being able to do bounds checking statically. One of these is you find
bugs sooner (before shipping, hopefully). Another is that you don’t
have to do the check dynamically, which supposedly is important in
some contexts for performance reasons. Similarly, though you could
compile down to something that output the same result, the JVM can’t
actually represent unboxed types in memory, which is something no
compiler can fix.

  2. “my inclination would be to go for something that embraces
    the notion of garbage collectors”
    I am not sure that I understand your point.
    I can embrace GC in ATS too. In fact, I do so in most of my ATS code.

What are you using for a garbage collector? The JVM does a lot of
optimizations that are simply impossible with C, and are therefore
also unavailable to the particular JVM solutions Raoul linked to
(though not necessarily to an ATS-aware solution).

Idiomatic functional programs tend to allocate a lot of memory – which
is acceptable if the garbage collector can clean it up promptly and
efficiently (or if the compiler can figure out how to not actually
allocate it in the first place). My sense is that things like Boehm
aren’t really up to the task of keeping up with that style of
programming, even if they’re fine for programs that allocate less
memory.

I also want to add a line:

With the support of templates in ATS, functions implemented for
one reason become useful in new ways. Here is one of my favorites:
list-mergesort can be used to randomly permute a given list :)

On Friday, November 14, 2014 1:23:39 AM UTC-5, gmhwxi wrote:

Which is why Guile has a far more useful implementation of
bytevectors, uniform vectors, uniform arrays, etc., than is usual in
Scheme implementations. Any of the above can be used as a pointer view
– a feature that I use a lot. Structures that were invented for one
reason become useful in new ways.

I think I totally understand the point :)

On Friday, November 14, 2014 1:10:28 AM UTC-5, Barry Schwartz wrote:

gmhwxi gmh...@gmail.com skribis:

My thinking is simple: The speed of GC rarely matters. I would rather
have the option to expose the pointer to a dynamically allocated object
without worrying about its being changed by GC.

Which is why Guile has a far more useful implementation of
bytevectors, uniform vectors, uniform arrays, etc., than is usual in
Scheme implementations. Any of the above can be used as a pointer view
– a feature that I use a lot. Structures that were invented for one
reason become useful in new ways.
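
The list-mergesort remark at the top of this message can be sketched as follows. This is Python rather than ATS, and the names `merge_sort` and `shuffle_by_mergesort` are hypothetical. The idea it illustrates is that a merge sort parameterized over its comparison (the role a template parameter would play in ATS) turns into a shuffler when the comparison just flips a coin.

```python
import random


def merge_sort(xs, leq):
    """Plain top-down merge sort; `leq(a, b)` decides whether a goes first."""
    if len(xs) <= 1:
        return list(xs)
    mid = len(xs) // 2
    left = merge_sort(xs[:mid], leq)
    right = merge_sort(xs[mid:], leq)
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if leq(left[i], right[j]):
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # One side is exhausted; the rest of the other side follows unchanged.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def shuffle_by_mergesort(xs, rng):
    """Reuse merge_sort with a coin-flip 'comparison' to permute xs."""
    return merge_sort(xs, lambda _a, _b: rng.random() < 0.5)


if __name__ == "__main__":
    print(shuffle_by_mergesort(list(range(10)), random.Random()))
```

One caveat on the design: the result is always a permutation of the input, but with a fair coin it is not a uniformly random one in general, so this is a neat reuse of the sort rather than a replacement for a proper shuffle.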

Could you explain your reasoning? I’ve only really heard arguments from
the perspective of ease of implementation and, of course, the fact that
with a language like C, precise garbage collection is impossible
(determining if something “is a pointer” is undecidable).

My thinking is simple: The speed of GC rarely matters. I would rather have
the option to expose the pointer to a dynamically allocated object without
worrying about its being changed by GC.

precise garbage collection is impossible

If we talk about 64-bit pointers, then the chance of having data mixed with
pointers is extremely small. I wouldn’t worry about it.

On Thursday, November 13, 2014 9:12:42 PM UTC-5, Ian Denhardt wrote:

(There are points in here that are in response to several different
emails; I sometimes wish threads were directed graphs rather than trees.
The post this is actually a reply to is somewhat arbitrary.)

Quoting Hongwei Xi:

Say an execution spends 20% on GC. Say Boehm-GC takes 2 times
the time taken by your favorite GC. Everything else being the
same, the ATS version is only 20% slower.

Do you have a basis for these numbers? I did a bit of searching and
wasn’t able to find credible benchmarks.

I am actually quite optimistic about getting back this 20% loss by
using flat types.

So we’ve drifted from this point in the conversation a bit, but part of
what I meant by “embracing” garbage collection was incorporating its
existence into the execution model – small objects are often times
easier to think about, so if you have a memory system that permits
making lots of them, it can make programming easier.

The JVM doesn’t actually support flat/unboxed types at all – you simply
cannot do that. Your point makes some sense when talking about a
native code environment with a GC, but in the context of the earlier
discussion it would not. The designers of the Go programming language
would agree with you I think.
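
For readers unfamiliar with the flat/boxed distinction, here is a rough analogy in Python. It is neither ATS nor JVM code, and the helper names `flatten_points` and `get_point` are invented for this sketch. A CPython list of tuples holds references to separately allocated objects (boxed), whereas `array.array("d", ...)` stores the raw doubles inline in one contiguous buffer (flat).

```python
from array import array


def flatten_points(points):
    """Store (x, y) pairs inline as consecutive doubles in one buffer."""
    flat = array("d")
    for x, y in points:
        flat.append(x)
        flat.append(y)
    return flat


def get_point(flat, i):
    """Read the i-th (x, y) pair back out of the flat buffer."""
    return (flat[2 * i], flat[2 * i + 1])


if __name__ == "__main__":
    boxed = [(1.0, 2.0), (3.0, 4.0)]   # list -> tuple objects -> float objects
    flat = flatten_points(boxed)       # four doubles, no per-element objects
    print(get_point(flat, 1), len(flat.tobytes()), "bytes of payload")
```

The flat layout gives the collector nothing to trace inside the buffer and needs no per-element allocation, which is the kind of saving the "flat types" remark above is counting on.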

Quoting Raoul Duke:

The conservativeness is a concern. I wish that ATS could also
spit out something that was a “safe” language/AST where
pointers/refs were really typed, so that we could more easily
plug into precise collectors when the chance rears its head. :)

Quoting Hongwei Xi:

By the way, I would like to add that I see conservativeness as a
blessing.

Could you explain your reasoning? I’ve only really heard arguments from
the perspective of ease of implementation and, of course, the fact that
with a language like C, precise garbage collection is impossible
(determining if something “is a pointer” is undecidable).

There was a real, somewhat serious bug in the Go language runtime for a
while where the conservative garbage collector wasn’t always collecting
garbage – they’ve since moved to a precise collector.
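
To make the conservative/precise distinction concrete, here is a toy model in Python. The heap representation (a dict from address to the list of words stored in that object) and the name `conservative_mark` are assumptions made for this sketch. Real collectors are considerably more involved, but the core rule of a conservative mark phase is the same: any word that looks like the address of a heap object is treated as a pointer.

```python
def conservative_mark(roots, heap):
    """Return the set of object addresses a conservative mark phase retains.

    roots -- words found on the stack/registers (pointers and plain integers)
    heap  -- {address: [words stored in that object]}
    """
    marked = set()
    pending = [w for w in roots if w in heap]   # "looks like a pointer"
    while pending:
        addr = pending.pop()
        if addr in marked:
            continue
        marked.add(addr)
        # Scan the object's contents with the same conservative rule.
        pending.extend(w for w in heap[addr] if w in heap)
    return marked


if __name__ == "__main__":
    heap = {0x1000: [0x2000, 42], 0x2000: [7], 0x3000: [0x1000]}
    print(sorted(conservative_mark([0x1000], heap)))
    # An integer that merely happens to equal 0x3000 keeps that object alive.
    print(sorted(conservative_mark([0x1000, 0x3000], heap)))
```

The second call shows the failure mode being discussed: the object at `0x3000` is garbage, but a stray integer with the same bit pattern prevents it from being reclaimed. A precise collector would know that word is not a pointer.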

Quoting Barry Schwartz:

Both GNU Guile and Embeddable Common Lisp use Boehm GC just fine for
languages that are quintessentially ‘garbage collected’. ATS thus is
on the same level as Scheme and Common Lisp, when it comes to garbage
collection.

ATS is also a language targeted at use cases that are highly
performance sensitive. Lisps are not, and tend to be pretty slow for
reasons completely orthogonal to their garbage collectors. There are
fast implementations, but the ones I know of that have a reputation
for their speed are not using Boehm. Chicken Scheme comes to mind,
which has a very neat implementation of call/cc that plays well with
its generational copying garbage collector.

Quoting Raoul Duke (2014-11-13 13:20:28)

I was hoping somebody would chime in with something like one of these
below, and tell me in great detail about their experiences with them
all, and which one they think is the best way to go :)

My take: The JVM is going to render a number of the selling points of
ATS useless – unboxed datatypes, no need for run-time bounds
checking… While there’s sense in having an advanced type system in a
language for building applications where such environments are
acceptable, my inclination would be to go for something that embraces
the notion of garbage collectors (and gets the simplifications to the
programming model as a result), e.g. idris.

The safe C VM presumably matches the expected execution model better,
but it seems more useful for sandboxing existing/legacy C code.

My assumption is that the reason the ATS compiler targets C, as opposed
to generating assembly/machine language itself, is to leverage all of
the optimization work that’s going into existing compilers, and for the
portability. We also end up having to deal with their clumsy build
systems unfortunately. I can’t say that I really expect adding an extra
layer to help much, though.


It requires some work.

To read from a pointer in ATS, you need a proof and the pointer:

ptr_get (pf | p)

The proof part is already erased when the code-emitting stage is reached.

On Thursday, November 13, 2014 6:43:56 PM UTC-5, Raoul Duke wrote:

re: boehm

The conservativeness is a concern. I wish that ATS could also spit out
something that was a “safe” language/AST where pointers/refs were
really typed, so that we could more easily plug into precise
collectors when the chance rears its head. :)


By the way, I would like to add that I see conservativeness as a blessing.
I remember Rick Lavoe wrote a copying collector for ATS and also a
conservative mark-sweep collector. We decided to go with the latter, which
has since been used as the default GC in ATS1.

On Thursday, November 13, 2014 6:43:56 PM UTC-5, Raoul Duke wrote:

re: boehm

The conservativeness is a concern. I wish that ATS could also spit out
something that was a “safe” language/AST where pointers/refs were
really typed, so that we could more easily plug into precise
collectors when the chance rears its head. :)

I thought about writing atscc2llvm. But we already have gcc -dragonegg and
clang -llvm …

On Thu, Nov 13, 2014 at 5:17 PM, Barry Schwartz <chemoe...@chemoelectric.org> wrote:

Ian Denhardt i...@zenhack.net skribis:

Quoting Raoul Duke (2014-11-13 15:37:51)

People seem to have taken my hrefs to mean i want the jvm. That isn’t
my goal. It is one option to consider in the thought experiment of how
to reach the goal: making my life suck less wrt build systems and
compilers across all the various targets I’d like to target. Consider
using one of those VMs-for-C to just get something working at all with
minimal fuss on a new system for doing demos so you could then get
more funds allocated for a real port, etc.

LLVM is probably a much more interesting target than the JVM in this regard.

I vaguely remember hearing/seeing something at some point involving
taking the output of the ATS compiler and running it through Emscripten
by way of clang, to get it running in a browser, does anyone else
remember this?

I was going to mention LLVM.

The support for inlining ATS code in Pure programs, which I recently
contributed to the Pure-lang project, works by compiling the ATS to
LLVM bitcode, which is then fed to the Pure system.

It is possible also to distribute and run LLVM bitcode files. (I
believe the command to run a bitcode program is ‘lli’.)


Idiomatic functional programs tend to allocate a lot of memory – which
is acceptable if the garbage collector can clean it up promptly and
efficiently (or if the compiler can figure out how to not actually
allocate it in the first place). My sense is that things like Boehm
aren’t really up to the task of keeping up with that style of
programming, even if they’re fine for programs that allocate less
memory.

(Taken from “A garbage collector for C and C++”):

Boehm-GC uses a mark-sweep algorithm
(http://www.hboehm.info/gc/complexity.html). It provides incremental and
generational collection under operating systems which provide the right
kind of virtual memory support. (Currently this includes SunOS[45], IRIX,
OSF/1, Linux, and Windows, with varying restrictions.)

In general, the typical memory allocation pattern in a functional program
suits incremental and generational collection very well.


  1. Static array bounds checking is different from dynamic array bounds
    checking. For instance, you may want to ensure an index is within a
    subrange of an array. This makes total sense even if you program in
    Java.

  2. “my inclination would be to go for something that embraces
    the notion of garbage collectors”
    I am not sure that I understand your point.
    I can embrace GC in ATS too. In fact, I do so in most of my ATS code.

On Thursday, November 13, 2014 1:38:55 PM UTC-5, Ian Denhardt wrote:

Quoting Raoul Duke (2014-11-13 13:20:28)

I was hoping somebody would chime in with something like one of these
below, and tell me in great detail about their experiences with them
all, and which one they think is the best way to go :)

java - Running/Interpreting C on top of the JVM? - Stack Overflow

Compiling C to Java Bytecode | Depth-First

My take: The JVM is going to render a number of the selling points of
ATS useless – unboxed datatypes, no need for run-time bounds
checking… While there’s sense in having an advanced type system in a
language for building applications where such environments are
acceptable, my inclination would be to go for something that embraces
the notion of garbage collectors (and gets the simplifications to the
programming model as a result), e.g. idris.

The safe C VM presumably matches the expected execution model better,
but it seems more useful for sandboxing existing/legacy C code.

My assumption is that the reason the ATS compiler targets C, as opposed
to generating assembly/machine language itself, is to leverage all of
the optimization work that’s going into existing compilers, and for the
portability. We also end up having to deal with their clumsy build
systems unfortunately. I can’t say that I really expect adding an extra
layer to help much, though.

re: boehm

The conservativeness is a concern. I wish that ATS could also spit out
something that was a “safe” language/AST where pointers/refs were
really typed, so that we could more easily plug into precise
collectors when the chance rears its head. :)

I could readily write a compiler translating a subset of ATS to JVM.

But who is going to use it?

Such a tool can only be improved if/when it is being used!

On Thursday, November 13, 2014 1:43:13 PM UTC-5, Raoul Duke wrote:

no magic bullet?!?!?!?! :(

Say an execution spends 20% on GC. Say Boehm-GC takes 2 times
the time taken by your favorite GC. Everything else being the
same, the ATS version is only 20% slower.

Do you have a basis for these numbers? I did a bit of searching and
wasn’t able to find credible benchmarks.

If it was based on personal experience, I would use 10%. How about
50%? The truth is that GC time cannot be too high (say, 99%), as
you need the mutator to generate garbage in the first place.
By common sense, the collector runs much faster than the mutator
in terms of the amount of memory reclaimed/allocated.

The JVM doesn’t actually support flat/unboxed types at all.

Then do not use such types if you want to generate JVM code. For instance,
atscc2js does not support flat/unboxed types, either.

ATS is also a language targeted at use cases that are highly
performance sensitive. Lisps are not, and tend to be pretty slow for
reasons completely orthogonal to their garbage collectors. There are
fast implementations, but the ones I know of that have a reputation
for their speed are not using Boehm. Chicken Scheme comes to mind,
which has a very neat implementation of call/cc that plays well with
its generational copying garbage collector.

I see GC in ATS as a potentiometer instead of a fixed resistor.

If GC is supported, then I will almost always use it if it does not prevent
me from accomplishing my task. In most of my own cases, Boehm-GC is more
than enough. If you find a case where it is not, I will be happy to see it.

If GC is not supported, then I will try to use linear types. See, for
instance, this Arduino program:

I can also gradually reduce the amount of GC needed by replacing datatypes
with linear datatypes.