Sunday 15 November 2009

Android, Java, Eclipse

During the weekend I wrote my first little Android application (Leander, a frontend for dict.leo.org). I did it mainly to earn myself an Archos 5, but beyond that I was also just curious. This was the first time I wrote something for a platform other than a general-purpose PC/workstation/server, the first time I really used Eclipse, and the first time in at least ten years that I used Java again.
As is to be expected from a resource-limited device, the SDK feels a bit constrained and decidedly non-fancy. On the other hand, that makes it relatively clear and easy to pick up.
I did not like Java when I first had to use it in a graphics programming course back in the nineties and I still do not really care for it. Its verbosity and redundancy annoy me. I think Java just feels overwhelmingly pedestrian.
All the rough edges and annoyances the whole experience could have had, however, were muffled by Eclipse's constant supervision. I do not know whether it is the progress of technology or the different attitude of the language, but the last time I used an IDE (Visual C++, back in the nineties as well) I did not feel nearly as cared for as this time.
All this hand-holding makes development sort of a brainless activity - just follow the suggestions of the IDE, copy and paste a few things from the docs or online sources... done. Anyway, it is fun, and I am curious to see whether my app is actually going to be used by anyone ;-).

Google Go, OOP, Interfaces and Inheritance

These days, it seems, new programming languages - in particular "systems" programming languages aiming to replace C/C++ - are sprouting like mushrooms. I like programming languages, so for me this is fun; the general public, however, is usually utterly unaware of these small languages, at least until they become old and somewhat established.
Last week, however, a new language was presented by no less than Google itself. Accordingly it made quite a splash. After everybody had cooled down a bit, it turned out that this new language - Go - was mostly quite unremarkable. It consists mainly of an un-daring combination of tried-and-true language features, each of which has been around for quite a while, put together with a strong focus on the simple and non-fancy.
It seems the only feature that received a bit of lasting attention is the lack of classical OOP. Although dissing OOP is sort of a trend at the moment, presenting a new supposedly mainstream language without classes and inheritance still attracts attention.
Instead of inheritance Go promotes composition. Polymorphism is achieved by a very simple mechanism - instead of a class declaring its conformance to an interface at the point where the class itself is declared, any type that has the right combination of methods associated with it automatically conforms to the corresponding interface.
It is this last feature that the designers of Go (and quite a few other people) seem to be most excited about.
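To illustrate, here is a minimal sketch of my own (not taken from the Go documentation - all names are made up):

    package main

    import "fmt"

    // Speaker is an interface. Note that no type anywhere has to
    // declare that it implements it.
    type Speaker interface {
        Speak() string
    }

    // Dog never mentions Speaker...
    type Dog struct{}

    func (d Dog) Speak() string { return "Woof" }

    func main() {
        // ...yet a Dog can be used as a Speaker, simply because it
        // happens to have the right method.
        var s Speaker = Dog{}
        fmt.Println(s.Speak())
    }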
The funny thing is - this has been one of my pet peeves for ages and is actually quite old hat. When I started to learn C++ (coming from Objective-C), one of the things that bugged me most was that which interfaces a class implemented (or, in C++ terminology, which abstract base classes it derived from) was practically part of its implementation details. The feeling that this was a bad idea became even stronger when I later learned Java. There is all this nice polymorphism and reflection, but if you just quickly want to make a new interface for an existing class, you have to jump through all sorts of hoops.
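In Go, by contrast, retrofitting costs nothing - you declare a new interface next to your own code, and any existing type with matching methods conforms to it without modification, even one from the standard library. A small sketch (the interface name is my own invention):

    package main

    import (
        "bytes"
        "fmt"
    )

    // Stringish is a brand-new interface, declared long after
    // bytes.Buffer was written - and without touching it.
    type Stringish interface {
        String() string
    }

    func describe(s Stringish) {
        fmt.Println("value:", s.String())
    }

    func main() {
        var b bytes.Buffer
        b.WriteString("hello")
        describe(&b) // *bytes.Buffer satisfies Stringish automatically
    }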
Then I found out that gcc had a nice C++ extension called 'signatures' which worked more or less exactly like Go's interfaces. At the time I discovered them, the documentation still bravely stated that signatures were being considered as an official part of the language. As we all know, this never happened, so I didn't use them (as far as I know, nobody did) and instead toyed with the idea of auto-generating some kind of template-based interface adaptor.
It turned out that I wasn't the first to do so. A language called Heron was based on the same idea and was even implemented as a front end to C++. It seems, however, that the author lost interest at some point and abandoned the language.
Further, there is the greatly underappreciated and sadly abandoned language Sather, which separated inheritance from polymorphism as well.
Obviously not a new idea, then, but unfortunately one that never caught on. A while ago I even proposed the same feature for the D language on their mailing list - unsuccessfully, of course: the feature was too alien, my explanation too bad and my reputation too non-existent.
At some point my frustration with the rigidity of OOP (and many, many other things) in C++ grew so great that I did what everybody seems to do - I started working on my own programming language. The separation of code reuse and polymorphism is one of its key features (the scope of the project has long since snowballed from a modest redesign of C++ to a complete start from scratch, but that's a different story).
Accordingly, it gives me a bit of a stale feeling to see everybody getting all excited about this supposedly revolutionary and brilliant feature. On the other hand, having the momentum of Google behind it will hopefully finally give the idea enough exposure to find out whether it is indeed a viable alternative to classical OOP.

Tuesday 27 October 2009

The weirdness of male lactation

When our first child was on the way, we had a few interesting lunch discussions with some friends/colleagues over this one. Recently it resurfaced, but in a very different light.

Weird biology

One of the reasons I love biology is that it is the realm of the weird, wonderful and bizarre. For every rule in biology there is usually a weird, obscure exception (e.g. flying or land-walking fish, egg-laying mammals, herbivorous spiders, gliding snakes, diving lizards, ...). Also, in general, every bizarre thing one can think of has evolved at least once (e.g. infectious cancer, tongue-replacing isopods, cartwheeling spiders, parasitic males, sex-changing fish, child birth through the pelvis, ...). (We could call these the Biological Laws of Weirdness.)
Surprisingly, however, there is only one (and not entirely convincing) example of male lactation in mammals. This is even stranger given that a) male care (sans lactation) does occur in mammals, b) male mammals are anatomically absolutely capable of lactating, and c) it is easy to come up with scenarios where it would be quite beneficial for a male mammal to be able to feed its young.
The whole thing is puzzling enough that it even deserved a paper in TREE.

Weird people

The issue came to my attention again recently when I stumbled upon a small article in a Swedish online newspaper about a guy who was trying to train himself to lactate (I think people fulfill the same two Laws of Weirdness I mentioned above). While this shows admirable determination (and, as we will see in a moment, imperviousness to social pressure), it is in my opinion nothing to write home about - as I said, people are weird, and if the guy wants to lactate, be my guest. The real eye-opener came when I started to read the comments on the article. I do not recall the exact numbers, but out of 40-50 comments more than three quarters displayed negativity ranging from ridicule through denial to outright foaming, spittle-spraying rage. This reaction absolutely astonishes me. I mean, I am certainly not eager to try it myself, but come on guys, why does it bother you so much that this one Swedish guy tries to squeeze some milk from his nipples?
I do not want to overinterpret the matter, but I think this might be a symptom of some deep insecurities many men have concerning their gender roles. Maybe I will write a blog post about that...

Epilogue

While looking up the references for this post I found out that I have actually been in good (if not the best) company with my puzzlement. It seems John Maynard Smith asked the same question in his 1978 book "The Evolution of Sex". Thirty years and we are still left to wonder...

Monday 26 October 2009

Scientific versus "regular" programming - part I

A huge part of the effort (and the resulting progress) in computer science is dedicated to making it easier for people to create better programs in a shorter amount of time. To this end new tools and methodologies are developed.
However, if we zoom in a bit, differences between the various areas in which programming is applied become obvious. Consequently, the demands placed on the required tools and methods differ as well.
In my field (theoretical biology), and I think generally in areas of science that require the development of simulation software, programming happens under very special conditions that lead to a unique set of requirements for the process of software creation.

what is a good program?

Many clever people have written whole books on the topic and I am certainly not an expert, but in a nutshell a good program in most situations has to fulfill these criteria:
  • correctness - It has to do the things it is supposed to do (and only those).
  • efficiency - It has to do them using a reasonable amount of resources (time, memory, etc.).
  • maintainability - It has to be reasonably easy to change the program in the future.
Making it easier for people to make programs conform to these criteria (or at least to find out whether they do) with a reasonable amount of effort has been the main driving force behind the development of new languages, platforms, IDEs, coding conventions, etc. Accordingly, it is nowadays a *lot* easier to produce correct, efficient and maintainable code than it was, say, thirty years ago.


Although similar, the criteria for what makes a "good" program in a scientific context differ in important details.


efficiency

It has often been said (in many variations) that Moore's law has made efficiency unimportant. This is certainly true in many areas, as shown by the success of dynamic, interpreted (and horribly slow) languages such as Ruby or Python.
For someone who writes and uses simulation programs, however, time is always a limiting factor (disk space and memory are others, though less so in recent years). Given more time (or higher execution speed) it is possible to test more parameter combinations, build in more details, run more replicates or observe more long-term dynamics - all of which (might) lead to better results, which make for better publications, which in turn bring more fame, fortune and general happiness.

execution speed is important

maintainability

The need to program in a way that makes it easy (or at least possible) to change a program later on has led to the evolution of whole industries.
In the scientific context this is only an issue for library code and tools. Most of the code written by the scientist herself is usually a one-off effort, stored in the virtual attic after publication of the corresponding paper(s).
There is a related issue of understandability and clarity of code, but I will talk about this later.

maintainability is (with certain caveats) a minor problem

correctness

I think program correctness is maybe the aspect where scientific programming differs most from "mainstream" programming.
The correctness of an operating system or a game is determined by how closely the program behaves according to its specifications. Bugs are found by people running into situations where the program behaves in a way it shouldn't (crashes, rendering glitches, hangups, etc.). Observing the program's behavior is therefore ultimately how most bugs are detected in such a situation. Luckily, this also means that the bugs with the strongest effect on program behavior tend to be the easiest to detect.
In a simulation, on the other hand, the behavior *is* the outcome of the program. The program is correct if the behavior is produced according to the specified rules. Of course simulations also have easily observable bugs (e.g. the program crashes), but these are not dangerous. Many errors, however, "just" lead to wrong results. These bugs are very dangerous because they can go entirely undetected while making the whole program (or at least the work done with it) effectively worthless. Especially in more complicated simulations, in principle the only way to find these bugs is by rigorous examination of the source code.
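A made-up example of the species (in Go here, but the trap exists in most languages): an integer division silently truncates a growth rate to zero, and the simulation runs happily to completion, reporting a perfectly stable - and perfectly wrong - population.

    package main

    import "fmt"

    func main() {
        births, deaths, popSize := 3, 2, 1000

        // Intended per-capita growth rate: (3-2)/1000 = 0.001.
        // Integer division truncates it to 0 before the conversion -
        // no crash, no warning, just wrong numbers.
        rate := float64((births - deaths) / popSize)
        // Correct: rate := float64(births-deaths) / float64(popSize)

        pop := 100.0
        for t := 0; t < 1000; t++ {
            pop += pop * rate
        }
        fmt.Println(pop) // prints 100 instead of ~271.7
    }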

correctness is essential, difficult to obtain and even more difficult to prove

clarity

This leads us directly to an additional criterion for program quality, one that is usually seen as a part of maintainability but that in the context of scientific programming deserves, in my opinion, a bullet point of its own - clarity and legibility of the source code.
If some (serious) bugs can only be found by reasoning about the source code, then it becomes of paramount importance to write the code in a way that makes it easy to reason about. In this sense, clarity is a means to fulfill the correctness criterion.
In a scientific context, however, the source code of a program is more than just an intermediate stage towards producing an executable. An essential part of the way science happens is that one scientist's results have to be reproducible by other scientists. In the empirical fields that means that methods are published down to the last onerous detail. In a mathematical paper, enough steps of a calculation are given that it is possible to retrace the authors' steps (for a suitable definition of 'possible'...). Given the notorious dissociation between source code and documentation, the code ultimately is the authoritative source on what a simulation does. (Unfortunately there is no real standard for the publication of source code (yet), although most authors at least offer to provide the source code on request - but that is a different blog post.) Source code is thus also a means of communication between scientists and should be written in a way that makes it as easy to understand as possible.
In my opinion this aspect is vastly underappreciated, at least in those programming courses for scientists that I am aware of.
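A deliberately tiny, invented example of the difference (logistic growth, written twice in Go):

    package main

    import "fmt"

    // Opaque version: correct, but impossible to check against the
    // model description without reverse-engineering the constants.
    func stepOpaque(x float64) float64 {
        return x + x*0.1*(1-x/1000)
    }

    // Clear version: every symbol from the paper has a name.
    const (
        growthRate       = 0.1
        carryingCapacity = 1000.0
    )

    func stepLogistic(population float64) float64 {
        return population + growthRate*population*(1-population/carryingCapacity)
    }

    func main() {
        // Identical numbers, very different legibility.
        fmt.Println(stepOpaque(100), stepLogistic(100))
    }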

clarity of source code is essential


It should be clear by now that, in a scientific context, producing a good program requires a specific approach. In the next part of this post I will explain what consequences the specific "socio-economic environment" of science has for programming. Then I will explore what all this implies for the design of better tools for scientific programming.

update (27/10/09 10:28)

Please also check out the interesting comments on reddit.