Thursday 22 July 2010

Eta

In the last two weeks what began as a small attempt at writing a work-saving template system for individual-based simulations turned into my first sort-of somewhat working (but not so work-saving) compiler for Eta (or η).
Eta is a project I have been working on/thinking about since a couple of years already. It started off with my increasing dislike for all the syntactic and semantic warts of the bloated mess that is C++. At the time I was desperately looking for an alternative, however everything I found that promised enough performance was either immature or only slightly less warty and bloated (sidenote: I think D is a much better language than C++ and I really hope it catches on. Still, it's far from being a good language IMHO.).
So I did what everybody who should instead really, really work on his PhD thesis (and I'm *not* talking about a PhD in Computer Science) would do - I started thinking about how to design my own language.
As others have said language design tends to be more successful when it tries to scratch a personal itch than when it attempts to solve other people's problems. In my case I really wanted a language that would make it easier and more fun to implement the simulations I am working on. That means the language had to be
  • fast. It *does* make a difference whether I have to wait 5 days or 8 days for a set of simulations to finish.
  • strictly and statically typed. As I said before the major difficulty when writing simulations is that errors are often silent. Every bit of static guarantee the compiler can give helps.
  • interoperable with C/C++. I'm not going to reimplement all the libraries I use.
  • expressive. For reasons of efficiency dynamic operations are often out in simulations. Therefore similar redundant patterns start to show up at lots of different places. E.g. if I add a new trait to an individual it has to be read, set, initialized, mutated, written to the data file, read from a config file, etc. Some of it can be alleviated by some advanced template wizardry but in the end I sooner or later usually fall back to external code generation. My ideal language would have built-in compile-time macros to solve this problem.
  • clear and unambiguous. One problem with C++ is that understanding what *exactly* a particular piece of code does can be non-trivial. Apart from syntactic idiosyncrasies things like silent shadowing of globals and implicit conversion rules make it necessary to be aware of a big amount of context to understand local semantics. This is especially a problem for simulations since (due to lack of external tests) code review plays an essential role in ensuring their correctness. A good language should therefore reduce the amount of context necessary to understand a piece of code as much as possible.
In addition I wanted a *nice* language. This is of course to a large degree a matter of taste, but for me an aesthetically pleasing language has to be concise - I hate the wordiness of Java or Pascal. Also I really like operators, in my opinion they greatly increase readability (I'm constantly tempted by unicode, but until now issues with inputting non-ascii characters have kept me from going that way. If only everybody had APL keyboards...). Furthermore I want a language to be as orthogonal as possible - everything should be the result of the combination of a few simple rules. I really dislike ad-hoc rules just to achieve some convenient effect or pointless duplication of functionality (why does D for example need templates, CTFE, macros"string mixins", traits and conditional compilation?).
Apart from these general principles I had a couple of specific technical ideas about mechanisms I wanted to include in the language. I will leave the details on that, on how I implemented Eta and on how the language looks like currently to the next post, however.