Sunday 1 November 2009

The future of programming

Over the last couple of months I've been ruminating about what an ideal next-stage programming language would look like.

For starters it would be a strongly-typed, compiled language (none of this fail-at-runtime, impossible-to-debug nonsense masquerading as easy-to-write code; I spend my time modifying and debugging code written by others, and that's where you need to be efficient).

Also, for ease of adoption, it would have to be a Java-based language so that you could use the existing compilers and Java libraries.

The main thrust would be to avoid focussing too much on technical language features, and instead to move towards eliminating the current need to translate between what's in my head and what's written on the screen. That is, not to concentrate on how fantastic closures may or may not be, but to look forward to what we want code to look like. And to know what to avoid, look at C++ template magic!! The canonical example of what we want is the leap forward of the for-each loop.

So here's my brain dump of the things currently getting in the way when transferring either way between brain and screen:

1. Auto-immutable versions of classes (i.e. a version of "const"). E.g. you can only call accessors, not modifiers on a const reference.

2. Value-types that are as good as built-in-types. (Pass by value, operator overloading, no inheritance).

3. Return multiple values (might be specialisation of automatic n-tuples, e.g. given classes C and D, {C,D} defines a new class containing a C & a D).

4. Expressive syntax as per Linq/Groovy e.g. for(x:set).where(x.isAble)... ("where" is user-definable external to X)

5. Named parameters when calling functions e.g. setPos(x:10,y:20) rather than setPos(10,20)

6. Optional parameters, i.e. possibly null parameters, can be omitted.

7. References are not null unless marked so (to be enforced by compiler) - i.e. remove almost all checks for null.

8. Preconditions / postconditions / invariants supported e.g. by annotation (to be enforced by compiler where possible).

9. Scope of variables inside try{} extends past end of try{} to remove horrendous split of declare and initialisation.

10. Implicit try{} - i.e. declare a catch/finally and it auto-generates a try{} for the preceding code - no more nested try blocks.

11. Auto resource-cleanup. Probably achievable by the previous two points plus making close() etc. throw no exceptions, but I like the auto-pointer resource-acquisition-is-initialisation pattern, which declares at creation that the resource will be closed rather than having to remember to put it in a finally block.

12. Implicit implementation of I/F - no need to implement void fns or fns that can return null, and ability to declare default implementation in I/F.

13. Fns not virtual by default (you need to design for a function to be overridable, and currently you cannot tell which are).

14. Built in support for "copy on modify" - value-types (which can't have references to external classes, only internal value-types) must be efficient, which they wouldn't be without this.
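Point 9 is perhaps easiest to see in code. Here is a minimal sketch of the Java idiom being complained about; the class and method names are made up purely for illustration. The variable has to be declared before the try block, and initialised inside it, only so that it is still in scope afterwards.

```java
import java.io.IOException;
import java.io.StringReader;

// Hypothetical example for point 9: 'firstChar' is declared outside the try
// block purely so that it is still in scope once the block has ended.
class SplitDeclaration {
    static int firstCharOrMinusOne(String text) {
        int firstChar;                                   // declaration...
        try {
            firstChar = new StringReader(text).read();   // ...initialisation
        } catch (IOException e) {
            firstChar = -1;
        }
        return firstChar;                                // use, after the try block
    }
}
```

With the scoping rule suggested in point 9, the declaration and the initialisation could sit on a single line inside the try block.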

All these points would make the code I work with on a daily basis simpler and easier to understand / work with, more explicit, with less brain/screen translation.

I'm sure that I've forgotten some points, but that's a good enough brain dump for now.

Wednesday 10 June 2009

Silence and JavaScript

I recently started a new job - hence the gap in posting. Ironically, considering my post about the importance of failure at compile-time, I have been grappling with some legacy JavaScript. Any error is only detected by the program running incorrectly! It's like working with hand-crafted machine code on my C64 back in the '80s... JavaScript is also a lesson against design by browser-war.

Monday 6 April 2009

Resource Handling in Java: Summary

My last 5 blog entries have covered the thorny issue of how to ensure resources are closed correctly. I recommended two solutions out of the various approaches. If you don't care about close() exceptions eclipsing earlier ones, go for the simple:


void foo() throws IOException {
    SomeResource r = new SomeResource();
    try {
        //do some stuff with r
    }
    finally {
        r.close();
    }
}

But don't forget to nest your try blocks for multiple resources!
If you want to avoid the eclipsing problem, or you wish to avoid the nesting of try blocks, then you need to write a framework API, which would enable you to write:

void foo() throws IOException {
    Closer closer = new Closer();//the new helper class
    SomeResource r1=null;
    SomeResource r2=null;
    try {
        //create and use r1 & r2
        closer.completed();
    }
    finally {
        closer.close(r2,r1);
    }
}

There are many subtle issues to decide upon when writing Closer, so I have purposely refrained from writing a specific implementation so that the argument is about the general design rather than the specifics. (For example, how do we cope with classes that don't implement Closeable?) I may blog an example implementation at a later date.

Java Resource Handling: Java 7

So, how does all this fit in with the proposals for solving this problem for Java 7?

ARM (Automatic Resource Management)

This proposal for Java 7, by Joshua Bloch, is for a slightly enhanced version of C#'s "using" statement, enhanced by allowing multiple resources in a single declaration without nesting. E.g. my alternative #2 above:

void foo() throws IOException {
    Closer closer = new Closer();//the new helper class
    SomeResource r1=null;
    SomeResource r2=null;
    try {
        //create and use r1 & r2
        closer.completed();
    }
    finally {
        closer.close(r2,r1);
    }
}

becomes:

void foo() throws IOException {
    do (SomeResource r1=null;
    SomeResource r2=null;) {
        //create and use r1 & r2
    }
}

(though, obviously, in either example you might create r1&r2 where they are declared). The advantages of alternative #2 all apply to ARM, and porting from this pattern to this Java 7 construct would be trivial.
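For reference, the construct that eventually shipped in Java 7 is try-with-resources. It uses the try keyword rather than do, and each resource must be initialised in the header. The sketch below shows its behaviour when both the body and close() fail. ArmResource and ArmDemo are made-up names standing in for SomeResource and its client code.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Made-up stand-in for SomeResource: records when it is closed, then fails.
class ArmResource implements Closeable {
    static final List<String> log = new ArrayList<>();
    private final String name;

    ArmResource(String name) { this.name = name; }

    @Override public void close() throws IOException {
        log.add("close " + name);
        throw new IOException("close failed: " + name);
    }
}

class ArmDemo {
    // Returns the exception that reaches the code handling the failure.
    static IOException run() {
        ArmResource.log.clear();
        try (ArmResource r1 = new ArmResource("r1");
             ArmResource r2 = new ArmResource("r2")) {
            throw new IOException("failure while using r1 & r2");
        } catch (IOException e) {
            return e;   // the exception from the body, not one from close()
        }
    }
}
```

The resources are closed in reverse order of declaration. The exception from the body is the one propagated, and the close() failures are attached to it (retrievable via getSuppressed()) rather than eclipsing it.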

BGGA

The big rival approach is BGGA Closures, which give handling close() as a prime example:

with(FileReader in : makeReader()) with(FileWriter out : makeWriter()) {
    // code using in and out
}

(where with() is a framework API). This example neatly shows the power of the proposal (no need for an extension specifically for handling close(), and the ability to change how exceptions thrown by close() are handled), but it also shows how the code becomes more complex. As with adding templates to a language, I need to spend hours reading the spec and playing with example code before I can become proficient, and still it is liable to obfuscation (what a wonderfully self-unaware word that is!).

Thoughts

Neither proposal solves the underlying problems - restricted lifetime objects as per C++ is an elegant solution that (like the baby with the bathwater) has been thrown out along with explicit memory management, to the language's detriment. It would be far from easy to retrofit this.

Secondly, neither proposal fixes close() - it is not necessary for this API to throw exceptions - the exceptions it throws should be limited to those associated with flushing the final writes, and so there should be a flush() call (that can throw) which you call before close() (which cannot), hence moving close() to finally.
In fact, both proposals side-step this problem, with ARM leaving it as an open issue, and BGGA leaving it up to you!
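To make the flush()/close() split concrete, here is a minimal sketch of a resource designed that way. FlushThenCloseSink is a hypothetical class invented for this illustration, not part of any library.

```java
import java.io.IOException;

// Hypothetical resource illustrating the split: flush() may throw, close() may not.
class FlushThenCloseSink {
    private final StringBuilder pending = new StringBuilder();
    private final StringBuilder written = new StringBuilder();
    private boolean open = true;

    void write(String s) { pending.append(s); }

    // The only place where a final write can fail.
    void flush() throws IOException {
        if (!open) throw new IOException("sink is closed");
        written.append(pending);
        pending.setLength(0);
    }

    // Never throws, so it is safe to call from a finally block.
    void close() { open = false; }

    boolean isOpen() { return open; }
    String contents() { return written.toString(); }

    static FlushThenCloseSink demo() throws IOException {
        FlushThenCloseSink sink = new FlushThenCloseSink();
        try {
            sink.write("hello");
            sink.flush();   // errors surface here, inside the try block
        } finally {
            sink.close();   // cannot mask an earlier exception
        }
        return sink;
    }
}
```

With this shape the simple try/finally pattern has no eclipsing problem, because nothing in the finally block can throw.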

Let's face it though - neither you nor I are likely to be influencing the choice for Java 7 - so the key thing is to write your code so that it is clear, robust, and capable of porting to whatever proposal is implemented. Both of my recommended approaches do that, and the choice is yours. 

Java Resource Handling: Throwing the correct error

To recap - we want to create a resource, do some processing with it, then close it. If the processing throws an exception, the resource must still get closed and the first exception encountered must be propagated.

Naive Approach

The call to close() in our code must be within a try block (for the occasions where it needs to be suppressed), and then the code gets quite complicated, e.g.:

void foo() throws IOException {
    IOException exception = null;
    SomeResource r1 = null;
    try {
        //create and use r1
    }
    catch(IOException e) {
        exception=e;
    }
    try {
        if(r1!=null) r1.close();
    }
    catch(IOException closeException) {
        if(exception==null) {
            throw closeException;
        }
    }
    if(exception!=null) {
        throw exception;
    }
}

Not nice!

Alternative #1

How about we have more than one call to close()? Then the code can be quite significantly simplified:

void foo() throws IOException {
    SomeResource r1=null;
    try {
        //create and use r1
        r1.close();
    }
    finally {
        try {
            if(r1!=null) r1.close();
        }
        catch(IOException e) { /*swallow error*/ }
    }
}
 
This relies on it being legal to call close() multiple times (we can always have a flag to stop it being called twice, which is easy to do if it's wrapped up in a framework).
The obvious extra simplification is to write a closeSilently function:

void closeSilently(Closeable c) {
    try {
        if(c!=null) c.close();
    }
    catch(IOException e) { /*swallow error*/ }
}

which enables us to write:

void foo() throws IOException {
    SomeResource r1=null;
    try {
        //create and use r1
        r1.close();
    }
    finally {
        closeSilently(r1);
    }
}

So let us critique this solution - slight errors in this code can lead to the worst class of defect (see my first post) - omit the first call to close() and the code will appear to work correctly until close() throws an error which is silently ignored! You don't need nested try blocks to handle multiple resources, but you do need to duplicate all the calls to close() in the same order as the calls to closeSilently(). This can be solved by means of a helper framework, but still the deadly silent error is a real problem...

Alternative #2

We used a try/catch + try/catch set-up in the Naive Approach in order to detect whether the resource-using code threw an exception or not. Catch isn't the only way to do this; note that the last line of the try block runs if and only if no exception was thrown, and we can use this fact instead as follows:

void foo() throws IOException {
    boolean tryBlockCompleted = false;
    SomeResource r1=null;
    try {
        //create and use r1
        tryBlockCompleted = true;
    }
    finally {
        if(tryBlockCompleted) {
            r1.close();
        } else {
            closeSilently(r1);
        }
    }
}

A helper class would make life a lot easier here:

void foo() throws IOException {
    Closer closer = new Closer();//the new helper class
    SomeResource r1=null;
    try {
        //create and use r1
        closer.completed();
    }
    finally {
        closer.close(r1);
    }
}

In Closer::close() we can assert that completed() has been called, hence in the unit tests we can ensure that the code is used correctly :-)
How about multiple resources?

void foo() throws IOException {

    Closer closer = new Closer();//the new helper class
    SomeResource r1=null;
    SomeResource r2=null;
    try {
        //create and use r1 & r2
        closer.completed();
    }
    finally {
        closer.close(r2,r1);
    }
}

Not particularly complex :-)
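For concreteness, here is one possible shape for such a helper. This is a sketch under stated assumptions, not a definitive implementation: it assumes every resource implements java.io.Closeable, that null means "never created", and that once the try block has completed the first close() failure is the one worth reporting.

```java
import java.io.Closeable;
import java.io.IOException;

// One possible sketch of the helper class; the design decisions are assumptions.
class Closer {
    private boolean completed = false;

    // Call as the last statement of the try block.
    void completed() { completed = true; }

    // Closes every resource, in the order given. If the try block completed,
    // the first close() failure is rethrown once all resources have been
    // closed. Otherwise close() failures are swallowed, so the exception
    // already in flight is the one the caller sees.
    void close(Closeable... resources) throws IOException {
        IOException first = null;
        for (Closeable c : resources) {
            if (c == null) continue;      // resource was never created
            try {
                c.close();
            } catch (IOException e) {
                if (first == null) first = e;
            }
        }
        if (completed && first != null) throw first;
    }
}
```

Note that a failing close() never stops the remaining resources from being closed, whichever path is taken.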

Alternative #3

We have a minor variation to alternative 2 above - do we have to make closing in the correct order explicit (as above), or add each closeable item to closer as it is created and rely upon it to get the order correct? E.g.:


void foo() throws IOException {
    Closer closer = new Closer();//the new helper class
    try {
        SomeResource r1 = ...
        closer.add(r1);
        SomeResource r2 = ...
        closer.add(r2);
        //use r1 & r2
        closer.completed();
    }
    finally {
        closer.close();
    }
}

This alternative, whilst attractive, isn't actually any shorter - and omitted calls to Closer::add() could lead to silent runtime errors.
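A sketch of what the registering variant of the helper might look like. It is given the hypothetical name RegisteringCloser here only to distinguish it from the explicit-order version; the error-handling policy (report the first close() failure only if the try block completed) is the same assumption as before.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical variant of the helper for Alternative #3.
class RegisteringCloser {
    private final Deque<Closeable> resources = new ArrayDeque<>();
    private boolean completed = false;

    // Register a resource as soon as it has been created.
    void add(Closeable resource) {
        resources.push(resource);   // the most recently added is closed first
    }

    // Call as the last statement of the try block.
    void completed() { completed = true; }

    // Closes the registered resources in reverse order of registration.
    void close() throws IOException {
        IOException first = null;
        while (!resources.isEmpty()) {
            try {
                resources.pop().close();
            } catch (IOException e) {
                if (first == null) first = e;
            }
        }
        if (completed && first != null) throw first;
    }
}
```

The stack takes care of the ordering, but nothing here can detect a resource that was created and never passed to add().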

Alternative #4

Up until now the helper frameworks discussed have been classes that are used (or functions that are called) from the resource-handling code. There is an alternative approach, inversion of control, where we pass the framework the resource-handling code we wish to run and it calls our code from within its own error-handling and close-calling code. In Java 6 that means creating an anonymous inner class containing our resource-handling code and passing it to the framework, e.g.:

void foo() throws IOException {
    String result = new ResourceHandler {
        SomeResource r1=...;
        add(r1);
        SomeResource r2=...;
        add(r2);
    }.execute();
}

This is quite attractively short, but looks confusing and still has the silent defect problem associated with Alternative #3 if we omit a call to add()
(See link for a paper describing a very similar approach to this).
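To show where the error handling ends up, here is a sketch of what the framework side might look like. The generic parameter and the abstract run() method are assumptions made for this illustration; the listing above only names add() and execute().

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical framework class for Alternative #4.
abstract class ResourceHandler<T> {
    private final Deque<Closeable> resources = new ArrayDeque<>();

    // Client code goes here; it registers each resource with add().
    protected abstract T run() throws IOException;

    protected void add(Closeable resource) { resources.push(resource); }

    // Runs the client code, then closes the resources in reverse order.
    final T execute() throws IOException {
        boolean completed = false;
        IOException closeFailure = null;
        try {
            T result = run();
            completed = true;
            return result;
        } finally {
            while (!resources.isEmpty()) {
                try {
                    resources.pop().close();
                } catch (IOException e) {
                    if (closeFailure == null) closeFailure = e;
                }
            }
            // Only report a close() failure if run() itself succeeded.
            if (completed && closeFailure != null) throw closeFailure;
        }
    }
}
```

A client then subclasses it anonymously, overriding run() to create its resources, call add() on each one and return the result, before calling execute() on the new instance.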

Alternative #5

How about we change the rules, and ask for all of the exceptions to be thrown when more than one occurs; i.e. when an exception in the try block is followed by further exceptions while closing the resources, throw all of them.
See link for an example of this approach. This example is over-complex and the boiler-plate code can easily be hidden as per Alternatives 2, 3 & 4, but this ignores the extra burden laid upon the client. Handling the array of exceptions thrown is rather over-complicated, and to what end?

Summary

Of the various solutions presented, the only one which appears to be safe to use is alternative #2; the others are too prone to silent error, or are pointlessly over-complicated.

Tuesday 31 March 2009

Resource Acquisition Is Initialization

Let's take a moment away from Java to consider how C++ addresses the problem of resource handling. The usual technique is known as Resource Acquisition Is Initialization. The aspect of this pattern relevant to this discussion is that you wrap up the resource handling in a class which closes the resources in its destructor. In C++ you can declare the object as a stack variable, and thus whether the client code throws an exception or returns normally, the destructor is always called and the resource closed. In addition note that destructors cannot throw exceptions, so any exception thrown by close() would be ignored - consequently in C++ close() does not throw an exception. This can be easily achieved with an API e.g. clients calling a flush() function (which can throw) followed by close().

With these two differences from Java (controllable object lifetimes & close() not throwing exceptions) we find making exception-safe client code becomes trivial:

void foo() {
    myResourceWrapper w;
    //do some stuff with w
}

Where we have a class:

class myResourceWrapper
{
public:
    myResourceWrapper() : r()
    {
    }
 
    ~myResourceWrapper()
    {
        r.close();
    }
 
    void bar()
    {
        //function to manipulate r...
    }
 
private:
    myResource r;
 
    // prevent copying and assignment; not implemented
    myResourceWrapper(const myResourceWrapper&);
    myResourceWrapper& operator= (const myResourceWrapper&);
};

Note that using multiple resource wrappers in a single function provides no extra complications (no equivalent to the nested try blocks) - each class that needs closing is wrapped once and then can be used in an exception-safe fashion everywhere.

Simple Resource Handling in Java

Now we've seen how not to do it, let's see the simple solution and its obvious pitfalls.

void foo() throws IOException {
    SomeResource r = new SomeResource();
    try {
        //do some stuff with r
    }
    finally {
        r.close();
    }
}

This is a trivial solution, in all likelihood good enough, but it has an obvious flaw: if close() throws an exception, then this masks any exception that occurred in the try block. This is not a major flaw, since the code still throws an IOException from the correct function, but it may hinder any code handling the error, or give misleading error messages or diagnostics.
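The masking is easy to reproduce. In this small sketch FailingResource is a made-up resource whose close() always throws; the names are illustrative only.

```java
import java.io.Closeable;
import java.io.IOException;

class MaskingDemo {
    // Made-up resource whose close() always fails.
    static class FailingResource implements Closeable {
        @Override public void close() throws IOException {
            throw new IOException("close failed");
        }
    }

    static void foo() throws IOException {
        FailingResource r = new FailingResource();
        try {
            throw new IOException("read failed");   // the real problem
        } finally {
            r.close();   // throws too, and replaces the exception above
        }
    }

    // Returns the message of whichever exception reaches the caller.
    static String messageSeenByCaller() {
        try {
            foo();
            return "no exception";
        } catch (IOException e) {
            return e.getMessage();
        }
    }
}
```

Here messageSeenByCaller() returns "close failed"; the "read failed" exception, which describes the actual cause, is lost.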

For multiple resources we need nested try blocks:

void foo() throws IOException {
    SomeResource r1 = new SomeResource();
    try {
        //use r1
        AnotherResource r2 = new AnotherResource();
        try {
            //use r1 & r2
        }
        finally {
            r2.close();
        }
    }
    finally {
        r1.close();
    }
}

This has the additional problem that the code is looking rather convoluted.

Resource Leak Anti-Patterns in Java

Guarding against resource leaks in Java is fraught with peril. I've seen the following three well-known anti-patterns used by experienced Java developers, so it's well worth starting any discussion on the correct way to manage resources by restating how not to do it!

Anti-Pattern #1: Close resources in your finalizer

You put all your resources as member variables of a class, and then close them in the finalizer, just as if it were a C++ destructor.
The fundamental problem with this approach is that finalizers are only called when the VM is freeing up memory, so you're hoping that the system runs out of a plentiful resource (memory) before it runs out of a scarce resource.
Regardless of what kludges you attempt (you were hoping that System.gc() or System.runFinalization() might help out, weren't you?) this approach is fatally flawed from the outset.

Anti-Pattern #2: Mishandling multiple calls to close()

The standard Java approach is to put your resource handling code in a try block, and the resource closing in a finally block.

void foo() throws IOException {
    SomeResource r1;
    AnotherResource r2;
    try {
        //open r1 & r2 and do some other stuff that might throw an exception
    }
    finally {
        r2.close();
        r1.close();
    }
}

However, close() can throw an exception, and if r2.close() throws an exception, then r1 will never get closed...
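A small sketch that demonstrates the leak. Tracked is a made-up resource that records whether its close() succeeded; the names are illustrative only.

```java
import java.io.Closeable;
import java.io.IOException;

class LeakDemo {
    // Made-up resource that records whether close() succeeded.
    static class Tracked implements Closeable {
        private final boolean failOnClose;
        boolean closed = false;

        Tracked(boolean failOnClose) { this.failOnClose = failOnClose; }

        @Override public void close() throws IOException {
            if (failOnClose) throw new IOException("close failed");
            closed = true;
        }
    }

    // Runs the anti-pattern and reports whether r1 ended up closed.
    static boolean r1ClosedAfterFoo() {
        Tracked r1 = new Tracked(false);
        Tracked r2 = new Tracked(true);   // r2.close() will throw
        try {
            try {
                // open r1 & r2 and do some other stuff
            } finally {
                r2.close();
                r1.close();               // never reached
            }
        } catch (IOException e) {
            // the caller sees r2's failure; r1 has leaked
        }
        return r1.closed;
    }
}
```

r1ClosedAfterFoo() returns false: the exception from r2.close() leaves the finally block before r1.close() is reached.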

Anti-Pattern #3: Suppressing close() exceptions

It's bad form to throw exceptions in a finally clause since it both masks any exception thrown in the try block and also causes the rest of the finally clause not to run, so you often see the following solution:

void foo() throws IOException {
    SomeResource r = new SomeResource();
    try {
        //do some stuff with r
    }
    finally {
        try{ r.close(); } catch(IOException e) { /*swallow e silently*/ }
    }
}


Normally this works nicely - until close() throws an exception after the try block has been successful, which causes the error to be silently ignored!

This leads to the worst class of defect (as per my last post) - one that easily goes undetected until it causes a failure in the real world, and also one that is very difficult to track down.
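A minimal sketch of the silent failure, using a made-up resource whose close() always throws (as it might if the final write to disk failed):

```java
import java.io.Closeable;
import java.io.IOException;

class SwallowDemo {
    // Made-up resource whose close() always fails, e.g. because the final
    // buffered write could not be completed.
    static class FailingOnClose implements Closeable {
        @Override public void close() throws IOException {
            throw new IOException("final write failed");
        }
    }

    // Anti-pattern #3: returns normally even though close() failed.
    static boolean fooReportsSuccess() {
        FailingOnClose r = new FailingOnClose();
        try {
            // do some stuff with r (succeeds)
        } finally {
            try { r.close(); } catch (IOException e) { /* swallowed */ }
        }
        return true;   // the caller has no way to know that data was lost
    }
}
```

Nothing is thrown, nothing is logged, and the method reports success.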

Obvious Bugs in Code

I have been reminded recently of a fundamental rule in programming & design which nevertheless has been sidelined or undervalued so often that it's worth repeating and repeating and repeating again.

You should aim to write/design/structure your code/language/framework so that bugs are obvious.

Bugs can be spotted at various points; a slight over-simplification would be to characterise these stages as:
  • Defect is seen by looking at the code
  • Defect gives rise to a compiler error
  • Defect gives rise to a compiler warning
  • Program falls over at run-time at (or immediately after) the erroneous code.
  • Erroneous behaviour caught by automated testing
  • Erroneous behaviour caught later (or perhaps never)
The earlier a bug is spotted, the less impact it has, so the aim should always be to pick it up as high up this list as possible.

I was reminded of this issue by considering how best to guard against resource leaks in Java. There are any number of suggestions online for how to do this, and most of the solutions make your code rather complex (so errors are easy to introduce, but hard to spot) while at the same time the errors would all fall into the "detect at runtime (hopefully by us rather than our customers)" category.

In my first set of posts I'm going to consider this issue, always keeping this fundamental rule in mind.