ChiMu  
 

Becoming Pessimistic: Changing when an object’s type is verified

SmallJava’s most significant difference from Java is that a message can be sent to an object that is not known to understand that message. If the object has implemented a method matching the message signature then everything executes as expected. If the object has no matching method then SmallJava throws a "DoesNotUnderstandException".

This behavior I call "optimistic" messaging: you assume an object can understand a message and only handle the special cases when the object does not. This is as opposed to "pessimistic" messaging where you make sure an object understands a message before sending it to the object [1]. So let us compare these two approaches.

Optimistic and Pessimistic Messaging

Optimistic and pessimistic messaging have identical behavior if the message will be successfully understood. The main difference between the two approaches is when an unsuccessful message send is recognized. For optimistic messaging it will not be recognized until you send the message and for pessimistic it will be recognized at some time before sending the message. Using our Point#vectorFrom [2] example:

        vectorFrom(point) {
           return new Point(x - point.x(), y - point.y());
        }

If we want to make sure messages like #x and #y are understood by ‘point’ then for an optimistic approach we would have to do something like this:

        vectorFrom(point) {
            try {
               return new Point(x - point.x(), y - point.y());
            } catch (DoesNotUnderstandException e) {
                //Do the right thing
            }
        }

The approach is identical to what you would do for handling any message that could throw an Exception; the only difference is that the message is never actually "received" by the object [3].
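As a rough illustration, the optimistic send can be imitated in plain Java with reflection. This is only a sketch and not a description of how SmallJava is implemented: the `send` helper and the `OptimisticSend` class are hypothetical names, and only public zero-argument methods are treated as messages.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

/** Unchecked, matching the note that DoesNotUnderstandException is a RuntimeException. */
class DoesNotUnderstandException extends RuntimeException {
    DoesNotUnderstandException(String message) { super(message); }
}

/** A plain 2-D point; it declares no shared interface. */
class Point {
    private final double x, y;
    Point(double x, double y) { this.x = x; this.y = y; }
    public double x() { return x; }
    public double y() { return y; }
}

/** Hypothetical helper that imitates an optimistic message send. */
final class OptimisticSend {
    private OptimisticSend() {}

    /** Sends a zero-argument message by name; a failure is only noticed at the send itself. */
    static Object send(Object receiver, String selector) {
        try {
            Method method = receiver.getClass().getMethod(selector);
            return method.invoke(receiver);
        } catch (NoSuchMethodException e) {
            // No matching method: the message is never "received" by the object.
            throw new DoesNotUnderstandException(
                receiver.getClass().getSimpleName() + " does not understand #" + selector);
        } catch (IllegalAccessException | InvocationTargetException e) {
            throw new IllegalStateException(e);
        }
    }

    /** vectorFrom written optimistically: send #x and #y and handle failure only if it occurs. */
    static Point vectorFrom(Point self, Object point) {
        double dx = self.x() - ((Number) send(point, "x")).doubleValue();
        double dy = self.y() - ((Number) send(point, "y")).doubleValue();
        return new Point(dx, dy);
    }
}
```

Calling `OptimisticSend.vectorFrom(new Point(5, 7), "not a point")` fails with a DoesNotUnderstandException at the first send, which a caller can catch like any other exception.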

So what would pessimistic messaging look like for our example? Assume we add to SmallJava a "message-check" with syntax "(#message)" that allows us to check whether an object understands a particular message. A message-check will do nothing if the object understands the message, but will throw a "DoesNotUnderstandException" if the object does not. Then we can change our example to:

        vectorFrom(point) {
            try {
               return new Point(x - ((#x) point).x(), y - ((#y) point).y());
            } catch (DoesNotUnderstandException e) {
                //Do the right thing
            }
        }

Now we have guaranteed that ‘point’ will successfully respond to #x before sending ‘x()’ and will successfully respond to #y before sending ‘y()’. We do not need the optimistic messaging anymore.

OK, that really didn’t provide us with ANY benefit. We still throw the same exception on failure and we throw that exception "just a fraction of a second" before we would actually have sent the message. Why bother?

Advantages of pessimistic messaging

The advantage of pessimistic messaging is that we have more control of when the test is done. For example, we can make sure nothing happens in the method if we didn’t actually get a ‘point’ that responds to #x and #y:

        vectorFrom(point) {
            try {
                (#x,#y) point;
            } catch (DoesNotUnderstandException e) {
                //Do the right thing
            }
            return new Point(x - point.x(), y - point.y());
        }

This is now different behavior than what our optimistic version of the method produced. Our optimistic version would have sent ‘x()’ to the point before checking whether the point responded to #y. This version verifies [4] that ‘point’ understands both #x and #y before sending any messages to it.
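The difference can be sketched in Java by imitating the message-check with reflection. The `check` helper, the `MessageCheck` class and the `XOnly` test receiver are hypothetical names introduced for illustration; the sketch only looks for public zero-argument methods and does not deal with ‘null’.

```java
import java.util.ArrayList;
import java.util.List;

/** Unchecked, matching the note that DoesNotUnderstandException is a RuntimeException. */
class DoesNotUnderstandException extends RuntimeException {
    DoesNotUnderstandException(String message) { super(message); }
}

/** Hypothetical stand-in for the "(#a,#b) receiver" message-check expression. */
final class MessageCheck {
    private MessageCheck() {}

    /**
     * Returns the receiver unchanged if it has a public zero-argument method
     * for every selector; otherwise throws before any message has been sent.
     */
    static <T> T check(T receiver, String... selectors) {
        for (String selector : selectors) {
            try {
                receiver.getClass().getMethod(selector);
            } catch (NoSuchMethodException e) {
                throw new DoesNotUnderstandException(
                    receiver.getClass().getSimpleName() + " does not understand #" + selector);
            }
        }
        return receiver;
    }
}

/** Understands #x but not #y, and records every message it actually receives. */
class XOnly {
    final List<String> received = new ArrayList<>();
    public double x() {
        received.add("x");
        return 1.0;
    }
}
```

With an `XOnly` receiver, `MessageCheck.check(p, "x", "y")` throws while `p.received` is still empty, whereas the optimistic version would already have sent ‘x()’ before failing on #y.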

Forgetting and Remembering verification

SmallJava’s message-check verification happens to the object in a variable at a given time. What happens when that variable changes or when the object moves to a new variable? We lose the verification and must verify again. The following is not completely pessimistic because of the second assignment to ‘temp’:

        vectorFrom(point) {
            try {
                temp = (#x,#y) point;
            } catch (DoesNotUnderstandException e) {
                //Do the right thing
            }
            newX = temp.x();
            temp = this;
            newX = newX-temp.x();
            
            return new Point(newX, y - point.y());
        }

In the example above we do not know whether the second ‘temp.x()’ will be successful. We have to do another check after assigning to the variable a second time:

            try {
                temp = (#x,#y) this;
            } catch (DoesNotUnderstandException e2) {
                //Do the right thing
            }
            newX = newX-temp.x();

Remembering verification

Well, that produces some very noisy methods. It would be better if we could have a little more memory of previous verifications. We can remember verification of an object by keeping track of the object during variable assignments:

        vectorFrom(point) {
            try {
                temp1 = ((#x,#y) point);
            } catch (DoesNotUnderstandException e) {
                //Do the right thing
            }
            temp2 = temp1;
            newX = temp1.x();
            newY = temp2.y();
            
            return new Point(x - newX, y - newY);
        }

Unfortunately this doesn’t provide us with too much benefit in most programs.

Our other option is to insist that all assignments to a particular variable will always be message-checked before the assignment:

        vectorFrom(point) {
            (#x,#y) temp1;
            (#x,#y) temp2;
            try {
                temp1 = (#x,#y) point;
            } catch (DoesNotUnderstandException e) {
                //Do the right thing
            }
            temp2 = temp1;
            newX = temp1.x();
            newY = temp2.y();
            
            return new Point(x - newX, y - newY);
        }

All assignments to ‘temp1’ and ‘temp2’ must now check whether the value of the assignment passes the message-check. This has no impact on our assignment to ‘temp1’, but it does allow us to assign to ‘temp2’ without doing a further check. We have defined invariants for the variables [5] temp1 and temp2 that guarantee that the assignment from temp1 to temp2 will succeed. Since we can do this invariant check at compile-time we now have a simple "compile-time message-check" capability for SmallJava.

Pessimistic inside: Optimistic outside

While adding all this support for pessimistic checking, we unfortunately have been cheating a bit. We left out the implementation of "Do the right thing". What is the right thing to do? Well, it is possible that a method can handle different types of ‘point’s and by finding out which type of point it has it will behave differently. For example, we could use a default ‘z’ value if we are given a 2d point when we expect a 3d point. This is useful behavior but is not the most common behavior.

The more common answer is that if a method doesn’t get what it expects it doesn’t know what to do. In this case we really can’t catch the error at all: we have to let it go to the caller. That makes us an optimistic method from the caller’s point of view. For example:

pointA.vectorFrom(pointB)

will throw a DoesNotUnderstandException if pointB is not able to respond to (#x,#y). So whether we are optimistic or pessimistic within our method, we are still optimistic as far as the caller is concerned.

Pessimistic in: Pessimistic out

How can we change this? We need more invariants within our #vectorFrom method so we don’t have any message-checks inside it. Our only message-check is to ‘point’ itself, so if we can make the caller guarantee that ‘point’ passes our message-check then we will guarantee we can execute the method without further verification. Notationally this is simple enough:

        vectorFrom((#x,#y) point) {            
            return new Point(x - point.x(), y - point.y());
        }

We are now forcing the caller to explicitly satisfy our requirement that ‘point’ understands (#x,#y) before they can even call our method instead of making the caller handle our DoesNotUnderstandException if ‘point’ does not understand (#x,#y). We just passed the verification requirement in a different and more-explicit manner than before [6].

Repassing the pessimistic requirement

SmallJava could previously handle the DoesNotUnderstandException through the usual exception handling mechanisms, so only the ultimate handler of the DoesNotUnderstandException (whoever that may be) would need to be involved [7]. Now we have to handle the pessimistic checking explicitly at all levels. If the caller to our #vectorFrom method looked like this under optimistic messaging:

        class Line {
            Line(pointA, pointB) {
                this.pointA = pointA;
                this.pointB = pointB;
            }
            
            vector() {
                return vectorFrom(pointA,pointB);
            }
            
            pointA, pointB;
        }

It would now have to look like this for pessimistic verification:

        class Line {
            Line((#x,#y) pointA, (#x,#y) pointB) {
                this.pointA = pointA;
                this.pointB = pointB;
            }
            
            vector() {
                return vectorFrom(pointA,pointB);
            }
            
            (#x,#y) pointA, pointB;
        }

Where does the buck stop?

We keep passing the buck for the pessimistic message-check verification, but some SmallJava expression must be a "buck consumer" and "verification producer". We have already seen one of them: the explicit message-check when used as an expression instead of an invariant. All the pessimistic-invariants allowed us to do one thing: remember an earlier message-check. If we have a method:

        (#x,#y) point;
        try {
            point = (#x,#y) newPoint;
        } catch (DoesNotUnderstandException e) {
            //Do the right thing
        }
        (new Line(point,point)).vector();

then we can take advantage of the single message-check of ‘newPoint’ to know all the other messages involved with creating a line and sending #vector to the line will succeed. We now have a pretty good memory caused by explicitly stating what we want to remember (require) about message-understanding throughout the flow of the program.

Having one message-check is better than many, but what if I want to get rid of that message-check too? Aren’t there any other "buck consumers"? There is one other case where the result of an expression is guaranteed to respond to certain messages: object construction. If we build a new Point we know what messages it responds to: the methods Point implements. The expression:

    new Point(x,y);

is equivalent to:

    (#x,#y,#r,#theta,#vectorFrom) new Point(x,y);

so we can finally get rid of our last message-check for our example program:

        (#x,#y) point = new Point(x,y);
        (new Line(point,point)).vector();

Now it is completely verifiable at compile time that all message sends will be successful (ignoring ‘x’ and ‘y’). We have turned off the need for Optimistic messaging for SmallJava for this particular example and now have a completely pessimistic and compile-time verified program.
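For comparison, the closest everyday analogue in Java is an interface type used as the invariant on parameters, fields and variables. The sketch below makes several assumptions: `HasXY` is a hypothetical interface standing in for the message set (#x,#y), Java's check is by declared interface rather than by message name, and the two-argument `vectorFrom` helper is assumed to compute the first point minus the second.

```java
/** Hypothetical interface standing in for the message set (#x,#y). */
interface HasXY {
    double x();
    double y();
}

final class Point implements HasXY {
    private final double x, y;
    Point(double x, double y) { this.x = x; this.y = y; }
    @Override public double x() { return x; }
    @Override public double y() { return y; }

    /** The parameter type plays the role of "(#x,#y) point": the caller must satisfy it. */
    Point vectorFrom(HasXY point) {
        return new Point(x - point.x(), y - point.y());
    }
}

final class Line {
    // The field types are the invariants: anything stored here is known to be a HasXY.
    private final HasXY pointA, pointB;

    Line(HasXY pointA, HasXY pointB) {
        this.pointA = pointA;
        this.pointB = pointB;
    }

    Point vector() {
        return vectorFrom(pointA, pointB);
    }

    /** Assumed meaning of the two-argument form: a minus b, needing only #x and #y. */
    private static Point vectorFrom(HasXY a, HasXY b) {
        return new Point(a.x() - b.x(), a.y() - b.y());
    }
}
```

A caller can then write `HasXY point = new Point(3, 4);` followed by `new Line(point, point).vector();` with no run-time check anywhere: the construction of the Point is the only place where the requirement is established.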

The Unmentionables

There were several aspects unmentioned in the above discussion of pessimistically verified message checking. What happens when we get a ‘null’? What are the return values for the methods? How do we know that the method implementing a message (by name) is semantically equivalent to what we expect the method to do? These I will address in future sections on ‘null’ explicitly and on ‘interface’ instead of ‘message’ based typing. For the moment I will leave them unmentioned.

Where’s the static typing?

We have shown how we can make SmallJava pessimistic, but how do we make it completely compile-time verified or "statically typed"? If all our pessimistic message-checks can be moved to the point of object construction then all of them can be verified at compile-time. This would be a statically message-checked program. Is this possible? Generally, no. At some point we will have to hope that a particular object understands more messages than we have been assured that it understands. For example, if we have a keyed collection object:

    class KeyedCollection {
        atKey_put(key, value) {...}
        atKey(key) {...; return value;}
    }

Then what can we be sure about the object returned from #atKey? We can’t be sure of anything. For example, in:

        (#x,#y) inPoint = ...;
        namesToPoints.atKey_put("test",inPoint);
        outPoint = namesToPoints.atKey("test");

We can’t be sure that ‘outPoint’ is able to respond to (#x,#y), so we will have to do a runtime message-check to verify it. We can still be pessimistic by checking ‘outPoint’ before sending a message to it, but we cannot do it statically.

Sure we can! We can define a new class:

    class KeyedCollectionOfPoints {
        atKey_put(key, (#x,#y) value) {...}
        (#x,#y) atKey(key) {...; return value;}
    }

Well, that solves our problem but now we have added even more information (or "noise") to our program. We had to create a whole new class to support being able to statically verify that a "collection of points" is really a ‘KeyedCollectionOfPoints’. Also note that a ‘KeyedCollectionOfPoints’ cannot be just a "wrapper" of a ‘KeyedCollection’: we cannot use a KeyedCollection to implement our KeyedCollectionOfPoints because we would still have to do a runtime check to convert the "atKey" to an "(#x,#y) atKey". We have to completely rewrite the KeyedCollectionOfPoints from scratch to have the new compile-time verifiable message-checks. So much for code reuse.
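The wrapper problem can be seen in a Java sketch, where a cast plays the part of the runtime message-check. The interface `HasXY` is a hypothetical stand-in for (#x,#y), and the choice to pass the inner collection to the constructor is an assumption made so the failure case is easy to show.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical interface standing in for the message set (#x,#y). */
interface HasXY {
    double x();
    double y();
}

final class Point implements HasXY {
    private final double x, y;
    Point(double x, double y) { this.x = x; this.y = y; }
    @Override public double x() { return x; }
    @Override public double y() { return y; }
}

/** The unconstrained collection: values are only known to be Objects. */
final class KeyedCollection {
    private final Map<Object, Object> entries = new HashMap<>();
    void atKey_put(Object key, Object value) { entries.put(key, value); }
    Object atKey(Object key) { return entries.get(key); }
}

/** A wrapper that promises HasXY values but has to re-check what the inner collection returns. */
final class KeyedCollectionOfPoints {
    private final KeyedCollection inner;

    KeyedCollectionOfPoints(KeyedCollection inner) { this.inner = inner; }

    void atKey_put(Object key, HasXY value) { inner.atKey_put(key, value); }

    HasXY atKey(Object key) {
        // The compiler only sees Object here, so this cast is a check made at run time.
        return (HasXY) inner.atKey(key);
    }
}
```

If anything else holds a reference to the inner KeyedCollection and stores a non-point in it, `atKey` fails with a ClassCastException when the value is retrieved, which is exactly the runtime check the wrapper was meant to avoid.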

A solution to all this extra effort is to have parameterized classes that "effectively" (or actually) code-generate classes that are compile-time verifiable for a given set of message constraints. We can develop a "template" class:

    class KeyedCollectionOf<valueMessages> {
        atKey_put(key, <valueMessages> value) {...}
        <valueMessages> atKey(key) {...; return value;}
    }

And simply use it like so:

        (#x,#y) inPoint = ...;
        namesToPoints = new KeyedCollectionOf<(#x,#y)>();
        namesToPoints.atKey_put("test",inPoint);
        outPoint = namesToPoints.atKey("test");

So now we at least don’t have to write a bunch of different classes for every variation we need: we can let the compiler do it for us. And then the compiler can statically verify the messages.
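Generics, which were added to Java after this text was written, give a reasonable approximation of such a template. The sketch below is an illustration under assumptions: `HasXY` is a hypothetical interface standing in for (#x,#y), the type parameter `V` takes the place of <valueMessages>, and Java implements generics by type erasure rather than by generating a class per instantiation.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical interface standing in for the message set (#x,#y). */
interface HasXY {
    double x();
    double y();
}

final class Point implements HasXY {
    private final double x, y;
    Point(double x, double y) { this.x = x; this.y = y; }
    @Override public double x() { return x; }
    @Override public double y() { return y; }
}

/** One definition serves every value constraint; V stands in for <valueMessages>. */
final class KeyedCollectionOf<V> {
    private final Map<Object, V> entries = new HashMap<>();

    /** Only values that satisfy V are accepted, checked at compile time. */
    void atKey_put(Object key, V value) { entries.put(key, value); }

    /** The result is statically known to satisfy V; a missing key yields null. */
    V atKey(Object key) { return entries.get(key); }
}
```

With `KeyedCollectionOf<HasXY> namesToPoints = new KeyedCollectionOf<>();` the expression `namesToPoints.atKey("test")` has the static type HasXY, so `x()` and `y()` can be sent to the result without a cast. The ‘null’ returned for a missing key is outside what this check covers.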

Some final remarks about static typing

One aspect to notice for the above classes is that they are not compatible. A KeyedCollectionOfPoints can not be used where you expect a KeyedCollection because the method #atKey_put(,(#x,#y)) is more restrictive than #atKey_put(,). A KeyedCollection can not be used where you expect a KeyedCollectionOfPoints because the method #atKey() is more lenient than #atKey()->(#x,#y). The classes are completely incompatible and effectively unrelated except for the design similarity.

The second remark is that Java, as of this writing, doesn’t support parameterized classes and interferes with developing your own specialized versions of classes because of weak interaction between types and polymorphism. For example, even if you have your own KeyedCollectionOfPoints, you cannot develop a subclass of Enumeration that will return a (#x,#y) point for #nextElement. Java does not support covariant return types, so even if you defined a ‘PointEnumeration’ it must either not inherit from ‘Enumeration’ or return the same "type" as Enumeration returns, which knows nothing about the (#x,#y) of a point. This will be discussed again in a later section possibly titled "EiffelJava", but for now we can say Java itself is incapable or poorly capable of making a program compile-time verifiable.

The final remark is that all this static typing ignores ‘null’ values, which would fail all the message-checks we have been applying and prevent the ability to statically type a program. To make that static typing work we will have to say what a ‘null’ means and how it interacts with the message-checking or type system. This will be discussed in the next posting.

Conclusion

Comparing Optimistic and Pessimistic messaging

What are the tradeoffs between optimistic and pessimistic messaging now that we have shown both for SmallJava?

Optimistic messaging requires much less noise to accomplish the same, if successful, result. It is also far easier to change: if we decide to send a new message to a ‘point’ we can just send the message and know (or hope) that the object will understand it. With pessimistic messaging we have to explicitly say what messages we require an object to understand. This caused us to put a lot more information into the program (much of it "obvious") and means we have to update all this information if we decide to send a new message to an object. (I will discuss alternative ways of declaring what new messages can be understood in the section on interfaces.) Overall, optimistic messaging is much less painful and correct programs are still correct programs.

Pessimistic messaging allows us to move message-checks earlier in a program’s execution and to consolidate multiple message-checks into a single check. This allows us to identify and respond to failed message-checks long before a program needs to rely on those message-checks. In many cases these message-checks can be moved all the way to the point of object construction, which allows them to be verified at compile-time. When that is possible we can be surer of what our program does before verifying it by execution. This is a good step forward if our programs frequently make the mistake of expecting an object to respond to a message that it doesn’t understand.

Unfortunately complete compile-time checking is rarely possible because of weak language support, ‘null’s, or program behavior more complex than the capabilities of even a good language. Compile-time type checking also significantly reduces the reusability of classes: it requires generating new classes with all the proper "types" to be used in a particular context.

All the extra information for pessimistic messaging provided us with an additional form of documentation. Besides having the name of a method, the name of its parameters, and the context of the method’s implementation (i.e. its class), pessimistic messaging allows us to express the expected methods on the parameters and the return value. Whether this is valuable documentation or not depends on the quality of expressiveness of the more core components: the method name, the parameter names and the context. If these are very descriptive and consistent through the whole application, then the pessimistic information may not be at all useful. The topic of documentation will be discussed in a later section.

Alternatives to explicit pessimistic messaging

There are other alternatives to explicitly declaring the pessimistic message-checks. We could have a program try to verify that the optimistic program will work correctly using either no extra information or much less information than the explicit pessimistic programs above. This would provide us with all the benefits of both optimistic and pessimistic messaging. It could also generate the additional documentation that a pessimistic program can provide. See [Brach+G 93][8] for a starting reference point to these types of languages. For this document I will ignore inference capabilities since they are not available in Java, Smalltalk, and other "mainstream" OO languages.

Deciding on Optimistic vs. Pessimistic messaging

SmallJava_0 supports optimistic messaging. You can send a message to any object and if the object understands the message (it has implemented a matching method) it will respond. If not, the object would throw a "DoesNotUnderstandException" that the caller can catch and respond to. This is very clean and simple, relying on the same exception handling abilities in the rest of SmallJava. Optimistic messaging’s problems are that a program can only be verified by running it and that there is less documentation of what is expected of a variable or parameter.

Should SmallJava_1 support pessimistic messaging? That is, should it support the ability to check whether an object understands a message before sending that message to it? The answer would seem to be an emphatic yes. The only cost is the addition of the syntax "(#message,#message2,...)" that does a message-check or that requires the user of a variable or parameter to do a message-check. This is useful in several ways:

It allows a program to move the location of a message-check to the point where it can better handle a failure.

It provides the possibility of compile-time verification.

It provides the possibility of extra documentation (that the program will actually use).

These all seem valuable enough to add them to SmallJava.

From Java and Smalltalk’s point of view this is uncontroversial: this capability is in both languages. Java provides the "cast" operation and typed variables that we will discuss in future sections. Smalltalk provides the ability to ask an object whether it #respondsTo: a message. This returns a boolean instead of throwing an exception, but the meaning is the same. Smalltalk does not support an invariant on a variable [actually, some do or at least document the invariant], but that seems a useful capability within the spirit of commercial Smalltalk. So pessimistic message verification is available in both languages and should be available in SmallJava.

Requiring Pessimistic Checking

Now for the big question: Should SmallJava_1 REQUIRE only pessimistic message checking and abandon the optimistic message checking? So far, the answer would have to be no. In many cases the pessimistic checking is not providing us with any gain because there is no advantage to moving the message-check point, the checks are not compile-time verifiable, and the invariants provide poor extra documentation. But without the gain, there is no point in the pain: every time we want an explicit message-check we have to add a lot of extra words to our program and make sure all these message-checks are in agreement with each other.

In future sections we will be discussing other language features that may make pessimistic message checking more useful and less painful. We will also be discussing aspects that make pessimistic message checking less useful (e.g. for ‘null’s). After dealing with these features and aspects we can revisit the question of whether pessimistic checking is useful enough to be required. For now, SmallJava_1 supports both.

Summary

We defined and analyzed optimistic and pessimistic message checking and found that they are both useful enough to include in SmallJava_1 and that neither is so useful as to warrant excluding the other. Our change to SmallJava was the addition of a message-check with syntax (#message1, #message2) and of a message-requirement invariant which uses the same syntax.

---------------------------------------------------------------------

[1] The terminology and behavior are similar to database transactions.

[2] I use "#foo" to label a message and "Class#foo" for a method in a particular class. The use of ‘#’ is similar enough between both Smalltalk (where it indicates a Symbol) and Java (where for Javadoc it indicates a method) to be the best choice.

[3] Or you could view it that Object defines all methods with a default behavior of: "throw new DoesNotUnderstandException();"

[4] The database transaction terminology would be a pessimistic "lock" on ‘point’.

[5] We might now want to call them "invariables".

[6] We have changed the contract with the caller; see [Meyer 97].

[7] DoesNotUnderstandException is a subclass of RuntimeException, which explains why it does not have to be in a Throws clause.

[8] If anyone has a good collection of references for type inferencing, I will add them to this.

 