ChiMu  
 

Objects, Messages, and Blocks in Smalltalk

Overview

This is a discussion about Blocks and their relationship to Objects and Messages in Smalltalk. It also attempts to formalize the difference between delivering a message and an actual message object (a Letter).

The following posting is the starting point of the thread that this short paper discusses:

Original Posting: Blocks as Objects

Marcel Weiher wrote:
> Dave Harris writes:
> >If you think of a block as a light-weight syntax for creating an anonymous
> >class with a single method (#value), they do, in fact, come down to
> >objects and methods.
>
> That's not how they look to me.

Then you really need to look at them again. A block is almost exactly syntactic sugar for creating an object of a (new) anonymous class with a single method. For example, the block:

   [:a :b | a < b ]
expands into:
   Object subclass: #Anonymous1
   ...
   Anonymous1>>valueWith: a with: b
      ^a < b

   ^Anonymous1 new

The above is a "clean" block, and all similar types of block would be very easy to transform. Even if you needed to pass objects in from local scope, you would simply do that through the 'new' constructor.

The ability of blocks to modify local scope (change the binding of variables) and to return out of their 'value' context "dirties" the above transformation, but these abilities are too useful to remove. Even these could (sometimes) be reproduced by messages that ask the block for the final value of the variables after execution, or for a 'shouldReturn' flag.
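
To make the contrast concrete, here is a minimal sketch of a block that rebinds a local variable, next to an object that keeps the same state inside itself and is asked for it afterwards. It is an illustration only and not part of the original posting: the class name SumAccumulator and the category 'Blocks-Sketch' are hypothetical, and the class-creation message and #compile: assume a Squeak/Pharo-style or GNU Smalltalk image (other dialects spell class creation differently).

```smalltalk
| total dirtyBlock accumulatorClass accumulator |

"Dirty version: the block rebinds 'total', a variable of the enclosing scope."
total := 0.
dirtyBlock := [:each | total := total + each].
#(1 2 3) do: dirtyBlock.

"Object version: the state lives inside the object and is asked for afterwards.
 SumAccumulator is a hypothetical name used only for this sketch."
accumulatorClass := Object
    subclass: #SumAccumulator
    instanceVariableNames: 'total'
    classVariableNames: ''
    poolDictionaries: ''
    category: 'Blocks-Sketch'.
accumulatorClass compile: 'value: each
    total := self total + each'.
accumulatorClass compile: 'total
    ^total ifNil: [0]'.

accumulator := accumulatorClass new.
#(1 2 3) do: accumulator.
accumulator total.
```

Both versions should end up with the same sum; the only difference is where the state lives and how the caller gets at it.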

Or to see the similarity from another direction... Just remove all the blocks from your code, create the classes like 'Anonymous1', and put instances of them where the inlined block would be. Your only problem will be with the various 'dirty' capabilities, but many blocks don't need these. The simplest example would be sorting-related blocks, where you could easily create a class:

   CompareAtoB
that was similar to the 'Anonymous1' class above. And then have:
   aCollection asSortedCollection: (CompareAtoB newLessThan)
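
A minimal sketch of what such a class could look like, under stated assumptions: the method bodies are my guess at what 'CompareAtoB' would contain, I use #value:value: (the message sorting code sends to its sort criterion), and the class-creation message and #compile: assume a Squeak/Pharo-style or GNU Smalltalk image. The category name is hypothetical.

```smalltalk
| comparatorClass lessThan viaBlock viaObject sorted |

"Hypothetical class standing in for the clean block [:a :b | a < b]."
comparatorClass := Object
    subclass: #CompareAtoB
    instanceVariableNames: ''
    classVariableNames: ''
    poolDictionaries: ''
    category: 'Blocks-Sketch'.

"Instance side: the two-argument Valueable message that sorting code sends."
comparatorClass compile: 'value: a value: b
    ^a < b'.

"Class side: the constructor named in the text."
comparatorClass class compile: 'newLessThan
    ^self new'.

lessThan := comparatorClass newLessThan.

"The block and the object answer the same thing to the same message."
viaBlock := [:a :b | a < b] value: 1 value: 2.
viaObject := lessThan value: 1 value: 2.

"Assumes the sorted collection only ever sends #value:value: to its criterion."
sorted := (#(3 1 2) asSortedCollection: lessThan) asArray.
```

Nothing in the sorting code needs to know whether it was handed a block or an instance of a named class.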

If you simply want the code within the block to have a name, you could do that (without naming the 'Valueable'[1] class itself) by using the idiom:

    [:a :b | self sortA: a comparedToB: b]

This separates concerns a bit, but has an unfortunate side effect: it dirties the block and makes it seem as if 'self' has to be involved in the evaluation of this other object (the Block object). Besides adding unneeded conceptual complexity, it will also penalize performance pretty significantly on most VMs [AFAIK].

-------

Now if you want a really BEAUTIFUL solution, you could support anonymous inlined classes like:

   ....
   aCollection sortBy: <| Object subclass |
      >>valueWith: a with: b
          ^self sortA: a comparedToB: b
          !

      >>sortA: a comparedToB: b
          ^a <= b
          !
    |>

This would be so BEAUTIFUL that it would make sure you created full-blown classes, never thought about the capability again, and avoided coffee for the next few months...

:->
--Mark
mark.fussell@chimu.com

[1] Where 'Valueable' stands for the externally visible 'value', 'valueWith', etc. protocol that Blocks and many other objects support.

Subsequent Discussions

Blocks, Objects, and Closures

Marcel Weiher wrote:
> But if you look a little more closely how blocks are used in
> creating control structures etc. it is absolutely obvious that
> blocks are reified *expressions*.  Nothing more, nothing less.

The term 'reified expression' is equivalent to the well-defined term 'closure'. If you want to look at all of Smalltalk in terms of closures, feel free. But then that is inconsistent with disliking Blocks. Patrick Logan's view would be the consistent position:

Patrick Logan wrote:
> Yep, this is how I view OOP anyway: objects are convenient syntax for
> managing related functions, closed over common data. This view makes
> anonymous blocks the "legitimate" syntax and all the class/method
> stuff just syntactic sugar to avoid using anonymous blocks for
> everything.

Blocks in Smalltalk terms

On the other hand, if you want to look at Smalltalk (or Self) in its own terms, the primary constructs are:

   (1) Objects
   (2) Messages

Where Objects have:

   (1a) Identity: the ability to tell two Objects apart no matter whether
        they appear similar in Behavior
   (1b) Behavior: the result of the current or subsequent Message sends
        to that Object and other objects in the system

and Messages simply:

   (2a) Pass an Object [the receiver] some parameters
   (2b) Return an Object [the result] back to the sender

In these simplest OO terms, clean Blocks ARE simply objects. They can be treated just like any other object within the system. The fact that they have a different syntax, and a class without an IDE name, is thoroughly irrelevant. These differences can't even be expressed in the basic OO terms. If you build a second or third level of scaffolding terminology over the basics of OO and it somehow does not cleanly support Blocks, then that is the scaffolding's fault (or 'restriction'), not a sign that Blocks are inconsistent with OO. And having to add more terms to the scaffolding just to describe Blocks is not reasonable if the basic conceptual OO terms can already describe them.

Control structures in and of themselves do not harm the 'blocks are objects' truth. The statement:

   something ifTrue: block1 ifFalse: block2
is (ignoring optimizations) actually sending 'value' to either 'block1' or 'block2'. The protocol is:
   Boolean>>ifTrue: <<Valueable>> ifFalse: <<Valueable>>

None of the Smalltalk control structures rely on Blocks having full closure capabilities; it is only Smalltalk programs that can (and do) cause the need for dirty Blocks.
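
A small workspace sketch of this point, with variable names of my own. When the two alternatives are held in variables rather than written as literal blocks, the compilers I am assuming here (Squeak/Pharo-style, GNU Smalltalk) do not inline the conditional, so #ifTrue:ifFalse: is an ordinary send to a Boolean, which in turn sends #value to one of its arguments.

```smalltalk
| block1 block2 viaControl viaDirectSend |

"Two clean blocks: no outer variables, no non-local return."
block1 := ['took the true branch'].
block2 := ['took the false branch'].

"The control structure is a message send to a Boolean receiver..."
viaControl := (3 > 2) ifTrue: block1 ifFalse: block2.

"...which ends up doing what we could do by hand: send #value to one block."
viaDirectSend := block1 value.
```

The Boolean never looks inside either argument; it only needs each of them to respond to #value.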

Dirty blocks

But it is true that some types of blocks frequently head toward the "dirty" side (Luke, stay away from the dirt) and need full closure capability. If you dislike this tendency, I would think you should argue against this alone as opposed to Blocks in general. For example, you might say:

   (A) [:a :b | a < b]  GOOD
   (B) [:each | anObject accept: each] OK, but not so good
   (C) [:each | each isHappy ifTrue: [anObject := each] ] BAD
   (D) [:each | each isHappy ifTrue: [^each] ] BAD
   etc.

I think arguing that (C) and (D) are BAD is not that hard. And frequently a solution to (C) and (D) is to simply compose/refactor your method better. The particular examples both work with a simple 'detect:' statement instead of the implied 'do:' statement. I think the shorter the method and the more the method has statements at only one level of abstraction, the less likely it would have the variations in (C) and (D). Getting rid of them completely would almost certainly require a more painful solution (like throwing Exceptions instead of the inner '^' return or having special expression control syntax), but avoiding them would lead to better code when possible.
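
As a rough sketch of that refactoring: #even stands in for the hypothetical #isHappy, and I added an isNil guard to the (C) style so that it keeps the first match and can be compared directly with detect:. The variable names are mine.

```smalltalk
| numbers found firstEven |
numbers := #(1 3 4 5 6).

"Style (C): the inner block rebinds 'found', a variable of the enclosing scope.
 The isNil guard keeps the first match, so the result is comparable with detect:."
found := nil.
numbers do: [:each | (found isNil and: [each even]) ifTrue: [found := each]].

"Refactored: the block is clean and detect:ifNone: does the searching."
firstEven := numbers detect: [:each | each even] ifNone: [nil].
```

The second form needs neither an assignment to an outer variable nor an inner '^'; the iterator carries the intent.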

The variation in (B) could be avoided with something like:

      (VisitWrapper newSubject: anObject)
but I don't think that is worth much unless it becomes a common idiom.

While-loop example

Marcel Weiher wrote:
> Consider:
>
>       while ( expr ) expr;
>       block whileTrue: block.
>
> This also explains why they do (and must) have the ability
> to modify variables in their surrounding scope: that's just
> what expressions do.

I don't think the example inherently explains that at all, and I would suggest trying to look at it in terms of Objects before you think 'expressions'. The statement:

       <<Valueable[Boolean]>> whileTrue: <<Valueable>>

does not require either Valueable object to modify local variables. For example:

    [self hasMoreInput] whileTrue: [self processInput]

or

    aBooleanHolder whileTrue: aCommandObject

Sometimes one of the <<Valueable>> objects might need to be a full closure (more for the inner '^' than modifying a local variable), but that is only because of a higher level application need, certainly not the result of an apparent syntax change[1].
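
A runnable stand-in for the first example, as a sketch only: a stream and a collection play the role of 'self', and the names are mine. Neither block assigns to a variable of the enclosing scope or returns non-locally; each only sends messages to objects it can see.

```smalltalk
| input processed |

"A stream and a collection stand in for the 'self' of the examples above."
input := ReadStream on: #(1 2 3).
processed := OrderedCollection new.

"The condition block answers a Boolean; the body block does the work.
 Neither block rebinds 'input' or 'processed'."
[input atEnd not] whileTrue: [processed add: input next * 10].
```

All the state that changes during the loop lives inside the stream and the collection, not in the method's variable bindings.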

> Of course, treating blocks like anonymous methods defined
> inside another method is an interesting option (though
> defining a whole new class seems complete overkill).

Actually, the opposite is true. Methods are not a part of the basic OO model[2], and certainly not methods within methods. Adding this would be overkill. But treating Blocks as Objects with different behavior from all other Objects (i.e. they have a different Class) is easily within the basic OO model. This is "underkill". Dirty blocks might as well be called full closures since there is plenty of understanding on that topic. Calling dirty blocks 'reified c-like expressions' is probably not going to help a 'C', Smalltalk, Eiffel, or Lisp programmer. Calling them Objects with full-closure capabilities may help some percentage of that group.

Summary

The amazing part of Smalltalk and Self is how far a pure Object approach really will take you without having any nastiness, and then how little nastiness is required to be truly useful. Only some types of Blocks are nasty [and powerful], the rest are just plain old Objects with a special construction syntax.

--Mark
mark.fussell@chimu.com

[1] You can write 'C'-like code in Smalltalk but it will rarely be good Smalltalk, so comparing 'C'-like expressions with their closest Smalltalk equivalent is probably not a good idea unless you are going to contrast them.

[2] Yes, I know you have to implement the Object's Behavior somehow and Classes with Methods is the common way, but I would call that second-level (inner-object detail) as opposed to core.

"Nasty" terminology

Eliot Miranda wrote:
> Mark Fussell wrote:
> > The amazing part of Smalltalk and Self is how far a pure Object approach
> > really will take you without having any nastiness, and then how little
> > nastiness is required to be truly useful.  Only some types of Blocks are
> > nasty [and powerful], the rest are just plain old Objects with a special
> > construction syntax.
>
> I really don't buy these poorly applied value judgments of "nasty".  Either
> you look at the entirety of the problem (what the solution would look like
> without closures with ability to assign-to their enclosing lexical
> environment, and/or without indefinite extent, and/or non-local return, vs
> what it would look like with it) or you're "being economical with the
> actualite".

I think Eliot was speaking generally, but MY definition of "nastiness" through the previous posting was simply:

    Going beyond a simple Object-Message paradigm
and was not meant to be a value judgement.

I should probably have consistently used the term "dirtiness" since it coincides with the block terminology and is slightly less pejorative. And I will thoroughly agree with Eliot's points with a poem:

    You may need to put little bits of dirt here and there
    to avoid making your application a pile of **** everywhere

:-)

Personally, I don't really think closure capability is particularly nasty dirt at all. The only confusing area is returning out of a context that is not executing anymore. That shows a block being used a bit beyond the programmer's understanding (or the unfortunate lack of a "return from block" syntax). The rest of the time, some code might be a bad use of closure capabilities (e.g. using 'do:' instead of a smarter iterator) but it isn't particularly confusing.

[[Rest removed as not on same topic]]

Messages are not Objects [in simplest OO terms]

Marcel Weiher wrote:
> Mark Fussell  writes:
> >and Messages simply:
> >   (2a) Pass an Object [the receiver] some parameters
> >   (2b) Return an Object [the result] back to the sender
>
> >In these simplest OO terms, clean Blocks ARE simply objects.
>
> So are messages, but that's just because everything in ST
> is an object.  However, messages (and blocks) are about
> being evaluated (/sent), whereas normal objects are not.

Messages ARE NOT objects in these simplest OO terms. Objects and messages are different (exclusive and complementary) concepts. To define "Object" you need "Message" and to define "Message" you need "Object".

Repeating the definition:

Objects have:
   (1a) Identity: the ability to tell two Objects apart no matter whether
        they appear similar in Behavior
   (1b) Behavior: the result of the current or subsequent Messages
        to that Object and other objects in the system

Messages simply:
   (2a) Pass an Object [the receiver] some parameter objects
   (2b) Return an Object [the result] back to the sender object

These are complementary and exclusive definitions. Messages are not objects nor vice-versa[1].

To give the primordial analogy: Objects are like Cells, with a membrane that separates their outsides (providing the behavior that others can see) from their insides (the implementation of the behavior). Messages are interactions with the cell membrane. Messages do not have a membrane, they don't have identity, and they don't have behavior. They are simply the act of:

   Passing a Cell [the receiver] some other cells
   Returning a Cell [the result] back to the sender of the message

Messages aren't cells or objects. I can rename Message to 'Permeation' temporarily if that helps:

The primary constructs are:
   (1) Cells
   (2) Permeations

Where Cells have:
   (1a) Identity: the ability to tell two Cells apart no matter whether
        they appear similar in Behavior
   (1b) Behavior: the result of the current or subsequent Permeations
        to that Cell and other cells in the system

and Permeations simply:
   (2a) Pass a Cell [the receiver] some parameter cells
   (2b) Return a Cell [the result] back to the sender cell

Permeations/messages are simply not objects in OO terms.

On the other hand, clean blocks ARE simply objects, with a membrane that responds to certain messages. Blocks "are about" receiving #value messages. Blocks aren't sent, blocks aren't evaluated, blocks are sent messages: usually #value and its related protocol. The statement:

    aBlock := [:a :b | a < b].
Creates an object and assigns it to 'aBlock'. I can then send it a message:
   ^aBlock value: 1 value: 2.

The sending of #value:value: could be called 'evaluating' but that is just a categorization of the kind of behavior commonly associated with '#value:value:' messages.
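
The two halves of the Object definition (Identity and Behavior) can be checked on blocks directly in a workspace. This is only a sketch with variable names of my own; the name of the block class differs between dialects, so I don't rely on it.

```smalltalk
| first second understands sameBehavior sameObject |
first := [:a :b | a < b].
second := [:a :b | a < b].

"Behavior: both respond to the same message in the same way."
understands := first respondsTo: #value:value:.
sameBehavior := (first value: 1 value: 2) = (second value: 1 value: 2).

"Identity: they are still two distinguishable objects,
 so this is expected to be false."
sameObject := first == second.

"Like any object, a block has a class; its name depends on the dialect."
first class.
```

Nothing here treats the block as anything other than a receiver of messages.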

Turning Messages into Objects

You can turn the concept of Message into an Object, but Messages are not objects in these simplest OO terms (the original terms Alan Kay was working with in building scalable [i.e. like biological] systems). As soon as you talk about MessageObjects, you have either moved up a layer of abstraction or you are completely changing the conceptual model (e.g. like switching to closure terminology). If you want to talk about Smalltalk in terms other than the simplest OO terms, that is fine, but you then can't claim that Blocks don't make sense as Objects in OO terms; they just don't make sense in your particular reference system.

> The issue is how *behaviour* is specified.  I'd say the
> central paradigm is message passing, not "anonymous function
> passing", so it would be nice if the higher order mechanisms
> were also built around message passing (messages that take
> messages as arguments and deal with them) instead of suddenly
> switching to "anonymous functions" or "expressions" or "closures"
> or whatever?

But I [and others] never switched... you are forcing a switch even when there is no problem. Clean blocks are simply objects. You send them a message, they do something inside themselves, and they return a value. I didn't switch to "anonymous function passing", so I don't know where that is coming from. Messages pass objects that "permeate the membrane" and they return objects that "return through the membrane". You have to have both Objects and these (non-Object) messages to understand the basics of OO. And blocks belong to the Object category.

Lazy evaluation

> Another example would be something like lazy evaluation.  Should
> it be
>      anObject computeResult.
>  ->  anObject lazy:[ :someObject | someObject computeResult].
> or
>      anObject      computeResult
>  ->  anObject lazy computeResult
>
> You can try the same with asynchronous messages sends, futures,
> etc.  Basically any type of higher-order mechanism.

If

   (anObject lazy)

creates an object that will return a Future to all message sends, then that is fine. Sending it #computeResult just returns a Future.

If you want to create MessageObjects as a higher level of abstraction, you could just do it with the basic Object capabilities (and standard protocols):

   anObject message: (Message newNamed: #computeResult)
   anObject lazy:    (Message newNamed: #computeResult with: anObject)
   anObject lazy:    (#computeResult asMessageWith: anObject)

But it is likely that the MessageObject protocol might as well use the <<Valueable>> protocol:

   >>value
   >>value: a
   >>value: a value: b

So the implementation of 'lazy:' would be:

   Object>>lazy: aMessage
      ...
      aMessage value: self
      ...

And any single-parameter #value: supporting object will work. Although there are lots of these (see the ValueModel classes), one of the most common is ("lo and behold") Block objects. The simplest way to implement:

    Message class>>newNamed: aSymbol
is
   ^[:receiver | receiver perform: aSymbol]
although many other implementations are fine/better. For example, having true MessageObjects might be good:
   Message>>value: receiver
      ^receiver perform: mySymbol
Further, you could extract the 'perform' overhead [if needed] with custom Behavior construction. But in any case the way to produce higher-order functionality in Smalltalk is to create smarter objects.
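
The two implementations above can be put side by side in a workspace. This is a sketch under stated assumptions: I call the class MessageObject so it does not collide with the system's own Message class, the setter #setSelector: and the category name are hypothetical, and the class-creation message and #compile: assume a Squeak/Pharo-style or GNU Smalltalk image.

```smalltalk
| messageClass asBlock asObject viaBlock viaObject sizes |

"Hypothetical MessageObject class supporting the one-argument Valueable protocol."
messageClass := Object
    subclass: #MessageObject
    instanceVariableNames: 'selector'
    classVariableNames: ''
    poolDictionaries: ''
    category: 'Blocks-Sketch'.
messageClass class compile: 'newNamed: aSymbol
    ^self new setSelector: aSymbol'.
messageClass compile: 'setSelector: aSymbol
    selector := aSymbol'.
messageClass compile: 'value: receiver
    ^receiver perform: selector'.

"The same behavior, once as a block and once as a MessageObject."
asBlock := [:receiver | receiver perform: #size].
asObject := messageClass newNamed: #size.

viaBlock := asBlock value: 'hello'.
viaObject := asObject value: 'hello'.

"Assumes collect: only sends #value: to its argument, so either one will do."
sizes := #('a' 'bb' 'ccc') collect: asObject.
```

Any code written against #value: cannot tell the two apart, which is the sense in which higher-order functionality comes from smarter objects.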

Changing syntax to support MessageObjects (Letters)

If you want to change syntax (syntactic sugar) so every message within a pseudo-Smalltalk is turned into sending a MessageObject, you would turn:
    anObject computeResult.
into:
    anObject message: (#computeResult asMessage).

Continuing down that path would take you to a "higher level" MessageObject-passing language, but it would still have to understand the core-level permeation/message concepts just to support the single '#message:' message. If you go far enough, you may have little reason to use Smalltalk as the base, or you will still have Smalltalk concepts the next level down.

Finally, just to repeat the Blocks are Objects argument, the transformation of my previous example:

   aBlock := [:a :b | a < b].
   ^aBlock value: 1 value: 2.
would mean:
   aBlock := [:a :b | a < b]. "Create an Object"
   aBlock message: (#value:value: asMessageWith: 1 with: 2)

The Block is still an object and is not affected by the transformation of message sends into full MessageObjects.

--Mark
mark.fussell@chimu.com

[1] Further, not everything in Smalltalk is an object. I think Eliot and others have posted on this many times, so you might want to search 'deja.com' for the examples.

Term Message -> [Delivery, Letter]

For some reason, I feel like trying again from a slightly new, completely aligned (among Marcel Weiher, myself, and others) perspective on Blocks and such. Although not crucial to the main discussion, I feel I need to define some terms in new unambiguous terminology. The main problem is the overloaded term 'message', so this posting simply removes that overloading.

I will rename both my use of 'message' and Marcel's use of 'message' so they don't conflict. I think it works pretty well and is very faithful to the authors' intents:

My 'message' will be renamed 'Delivery'.

This is the basic Smalltalk capability of permeating an Object/Cell's membrane: that is, enabling an object to respond to a message selector.

Marcel's 'message' will be renamed 'Letter'.

This is what I previously called a MessageObject. It requires the concept of Delivery, but the two are not equivalent. It is pretty much equivalent to the [not usually needed] Message class in Smalltalk, which the 'doesNotUnderstand:' delivery uses.

So my previous definition of OO in its most basic terms becomes:

The primary constructs of OO are:
   (1) Objects
   (2) Deliveries

Objects have:
   (1a) Identity: the ability to tell two Objects apart no matter whether
        they appear similar in Behavior
   (1b) Behavior: the result of the current or subsequent Deliveries
        to that Object and other objects in the system

Deliveries:
   (2a) Pass an Object [the receiver] some attached Objects
   (2b) Return an Object [the reply] back to the sender Object

Although the above is the simplest conception of OO, we can formalize/reify what is being passed to an Object at Delivery time by creating a concept of a Letter. A Letter is a kind of Object that has a 'name' and any number of attachments. By changing our conceptualization we get:

The primary constructs of Letter-based OO are:
   (1) Objects
   (2) Deliveries
   (3) Letters

Objects have:
   (1a) Identity: the ability to tell two Objects apart no matter whether
        they appear similar in Behavior
   (1b) Behavior: the result of the current or subsequent Deliveries
        to that Object and other objects in the system

Deliveries:
   (2a) Pass an Object [the receiver] a Letter
   (2b) Return an Object [the reply] back to the sender Object

A Letter is an Object that:
   (3a) Has a name (Object)
   (3b) Has a number (0..n) of attached Objects

I think this Letter-based OO integrates in Marcel's model/desires without losing a way to refer to the lower level 'delivery' process. Adding a bit of Smalltalk details to this second conception, we can take the following approach.

All objects respond to the 'receive:' delivery. This is kind of equivalent to the "doesNotUnderstand:" method except it 'abstractly' always occurs instead of just when other approaches fail. Within the 'receive:' method you would need to write the functionality that responds appropriately to the kinds of Letters you might receive. For the moment, we can ignore the internal details.

If you use a delivery name other than 'receive:' it is equivalent to creating a Letter that has the name of the delivery name and the arguments to the delivery are attached to the letter:

   anObject do: foo
becomes:
   anObject receive: (Letter name: #do: attachment1: foo)

Please ignore the infinite recursion in the above (assume Letter has a 'primitive' implementation to avoid this problem). All we care about is that anObject can receive a Letter that has some attachments associated with it.

This transformation can be done by the compiler or even the object itself... the only important aspect is that the transformation conceptually always occurs.
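
A minimal sketch of a Letter, to show that the idea needs nothing beyond ordinary objects and deliveries. Everything here is hypothetical: the constructor #name:attachments: (an array-based variant of the one used above), the setter, and #deliverTo:, which stands in for the universal 'receive:' so the sketch does not have to patch Object. The class-creation message and #compile: assume a Squeak/Pharo-style or GNU Smalltalk image.

```smalltalk
| letterClass letter direct viaLetter |

"A Letter is an Object with a name and 0..n attached Objects."
letterClass := Object
    subclass: #Letter
    instanceVariableNames: 'name attachments'
    classVariableNames: ''
    poolDictionaries: ''
    category: 'Blocks-Sketch'.
letterClass class compile: 'name: aSymbol attachments: anArray
    ^self new setName: aSymbol attachments: anArray'.
letterClass compile: 'setName: aSymbol attachments: anArray
    name := aSymbol.
    attachments := anArray'.

"Stand-in for receive: ; the 'primitive' underneath is an ordinary delivery."
letterClass compile: 'deliverTo: receiver
    ^receiver perform: name withArguments: attachments'.

"The plain delivery and the Letter-based one should give the same reply."
direct := 3 max: 7.
letter := letterClass name: #max: attachments: #(7).
viaLetter := letter deliverTo: 3.
```

Note that delivering the Letter still bottoms out in a plain delivery (#perform:withArguments:), which is the lower level that the Letter-based model sits on.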

Summary

With the above we can talk about either Letters explicitly or Deliveries explicitly, depending on which we need to focus on. Be careful when using the term 'message', because Deliveries and Letters are very different concepts, and if the reader picks a different one than the writer, miscommunication will ensue.

Now onto the [barely :-] more interesting issue: Smalltalk without Blocks.

--Mark
 