Objects, Messages, and Blocks in Smalltalk
Overview
This is a discussion about Blocks and their relationship
to Objects and Messages in Smalltalk. It also attempts to
formalize the difference between delivering a message and
an actual message object (a Letter).
The following posting is the starting point of the thread that this short paper covers:
Original Posting: Blocks as Objects
Marcel Weiher wrote:
> Dave Harris writes:
> >If you think of a block as a light-weight syntax for creating an anonymous
> >class with a single method (#value), they do, in fact, come down to
> >objects and methods.
>
> That's not how they look to me.
Then you really need to look at them again. A block is almost exactly
syntactic sugar for creating an object of a (new) anonymous class with a
single method. For example, the block:
[:a :b | a < b ]
expands into:
Object subclass: #Anonymous1
...
Anonymous1>>valueWith: a with: b
^a < b
^Anonymous1 new
The above is a "clean" block, and all similar types of block would be
very easy to transform. Even if you needed to pass objects in from
local scope, you would simply do that through the 'new' constructor.
The ability of blocks to modify local scope (change the binding of
variables) and to return out of their 'value' context "dirties" the
above transformation, but these abilities are too useful to remove.
Even they could (sometimes) be reproduced by messages that ask the block
the final value of the variables after execution, or a 'shouldReturn'
flag.
Or to see the similarity from another direction... Just remove all the
blocks from your code, create the classes like 'Anonymous1', and put
instances of them where the inlined block would be. Your only problem
will be with various 'dirty' capabilities, but many blocks don't need
these. The simplest example would be sorting-related blocks, where you
could easily create a class:
CompareAtoB
that was similar to the 'Anonymous1' class above. And then have:
aCollection asSortedCollection: (CompareAtoB newLessThan)
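As a rough sketch of that substitution, assuming a Pharo-style image (the class-definition message differs between dialects, and the package name 'BlocksAsObjects' is made up for illustration): a hand-written comparator class and a clean block answer the same two-argument message. The sketch uses #value:value:, the selector that blocks respond to in most images, rather than the #valueWith:with: used in the expansion above.

```smalltalk
| comparatorClass comparators answers |
"Hypothetical class; the name comes from the text, the definition syntax is Pharo-style."
comparatorClass := Object subclass: #CompareAtoB
    instanceVariableNames: ''
    classVariableNames: ''
    package: 'BlocksAsObjects'.

"The single instance method that a clean two-argument block would provide."
comparatorClass compile: 'value: a value: b
    ^ a < b'.

"Class-side constructor named in the text."
comparatorClass class compile: 'newLessThan
    ^ self new'.

"A block and a CompareAtoB instance are interchangeable for any sender
 that only relies on the #value:value: protocol."
comparators := Array
    with: [:a :b | a < b]
    with: comparatorClass newLessThan.
answers := comparators collect: [:each | each value: 1 value: 2].
```

Both elements of 'answers' come from the same message send; the sender cannot tell which receiver was written with block syntax.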
If you simply want the code within the block to have a name, you could
do that (without naming the 'Valuable'[1] class itself) by using the
idiom:
[:a :b | self sortA: a comparedToB: b]
This separates concerns a bit, but has an unfortunate side-effect: it
dirties the block and makes it seem like 'self' has to be involved in
the evaluation of this other object (the Block object). Besides adding
unneeded conceptual complexity, it will also penalize performance on
most VMs pretty significantly [AFAIK].
-------
Now if you want a really BEAUTIFUL solution, you could support anonymous
inlined classes like:
....
aCollection sortBy: <| Object subclass |
>>valueWith: a with: b
self sortA: a comparedToB: b
!
>>sortA: a comparedToB: b
^a <= b
!
|>
This would be so BEAUTIFUL that it would make sure you created
full-blown classes, never thought about the capability again, and
avoided coffee for the next few months...
:->
--Mark
mark.fussell@chimu.com
Subsequent Discussions
Blocks, Objects, and Closures
Marcel Weiher wrote:
> But if you look a little more closely how blocks are used in
> creating control structures etc. it is absolutely obvious that
> blocks are reified *expressions*. Nothing more, nothing less.
The term 'reified expression' is equivalent to the well-defined term
'closure'. If you want to look at all of Smalltalk in terms of
closures, feel free. But then that is inconsistent with disliking
Blocks. Patrick Logan's view would be the consistent impression:
Patrick Logan wrote:
> Yep, this is how I view OOP anyway: objects are convenient syntax for
> managing related functions, closed over common data. This view makes
> anonymous blocks the "legitimate" syntax and all the class/method
> stuff just syntactic sugar to avoid using anonymous blocks for
> everything.
Blocks in Smalltalk terms
On the other hand, if you want to look at Smalltalk (or Self) in its own
terms, the primary constructs are:
(1) Objects
(2) Messages
Where Objects have:
(1a) Identity: the ability to tell two Objects apart no matter whether
they appear similar in Behavior
(1b) Behavior: the result of the current or subsequent Message sends
to that Object and other objects in the system
and Messages simply:
(2a) Pass an Object [the receiver] some parameters
(2b) Return an Object [the result] back to the sender
In these simplest OO terms, clean Blocks ARE simply objects. They can
be treated just like any other object within the system. The fact that
they have different syntax and a class without an IDE name is thoroughly
irrelevant. These differences can't even be expressed in the basic OO
terms. If you build a second or third level of scaffolding terminology
over the basics of OO and they somehow do not cleanly support Blocks,
then that is the scaffolding's fault (or 'restriction') not anything
about Blocks not being consistent with OO. And having to add more terms
to the scaffolding just to describe Blocks is not reasonable if the
basic conceptual OO terms can describe Blocks.
Control structures in and of themselves do not harm the 'blocks are
objects' truth. The statement:
something ifTrue: block1 ifFalse: block2
is (ignoring optimizations) actually sending 'value' to either 'block1'
or 'block2'. The protocol is:
Boolean>>ifTrue: ifFalse:
None of the Smalltalk control structures rely on Blocks having full
closure capabilities; it is only Smalltalk programs that can/do cause
the need for dirty Blocks.
Dirty blocks
But it is true that some types of blocks frequently head toward the
"dirty" side (Luke stay away from the dirt) and need full closure
capability. If you dislike this tendency, I would think you should
argue against this alone as opposed to Blocks in general. For example,
you might say:
(A) [:a :b | a < b] GOOD
(B) [:each | anObject accept: each] OK, but not so good
(C) [:each | each isHappy ifTrue: [anObject := each] ] BAD
(D) [:each | each isHappy ifTrue: [^each] ] BAD
etc.
I think arguing that (C) and (D) are BAD is not that hard. And
frequently a solution to (C) and (D) is to simply compose/refactor your
method better. The particular examples both work with a simple
'detect:' statement instead of the implied 'do:' statement. I think the
shorter the method and the more the method has statements at only one
level of abstraction, the less likely it would have the variations in
(C) and (D). Getting rid of them completely would almost certainly
require a more painful solution (like throwing Exceptions instead of the
inner '^' return or having special expression control syntax), but
avoiding them would lead to better code when possible.
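A small sketch of that refactoring, using 'even' on integers as a stand-in for the hypothetical 'isHappy' test (the collection and variable names are illustrative only). The first form needs a dirty block that assigns to an enclosing variable; the second hands a clean block to 'detect:ifNone:'.

```smalltalk
| numbers found viaDetect |
numbers := #(1 3 4 7 8).

"(C) style: the block assigns to 'found', a variable of the enclosing scope.
 The isNil guard keeps the first match so both forms answer the same element."
found := nil.
numbers do: [:each |
    (found isNil and: [each even]) ifTrue: [found := each]].

"Refactored: the search block is clean and the iterator does the bookkeeping."
viaDetect := numbers detect: [:each | each even] ifNone: [nil].
```

The same 'detect:' form also covers case (D), since the early exit is handled inside the iterator instead of by a '^' in the block.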
The variation in (B) could be avoided with something like:
(VisitWrapper newSubject: anObject)
but I don't think that is worth much unless it becomes a common idiom.
While-loop example
Marcel Weiher wrote:
> Consider:
>
> while ( expr ) expr;
> block whileTrue: block.
>
> This also explains why they do (and must) have the ability
> to modify variables in their surrounding scope: that's just
> what expressions do.
I don't think the example inherently explains that at all, and I would
suggest trying to look at it in terms of Objects before you think
'expressions'. The statement:
<<Valueable[Boolean]>> whileTrue: <<Valueable>>
does not require either Valueable object to modify local variables. For
example:
[self hasMoreInput] whileTrue: [self processInput]
or
aBooleanHolder whileTrue: aCommandObject
Sometimes one of the <<Valueable>> objects might need to be a full
closure (more for the inner '^' than modifying a local variable), but
that is only because of a higher level application need, certainly not
the result of an apparent syntax change[1].
> Of course, treating blocks like anonymous methods defined
> inside another method is an interesting option (though
> defining a whole new class seems complete overkill).
Actually, the opposite is true. Methods are not a part of the basic OO
model[2], and certainly not methods within methods. Adding this would
be overkill. But treating Blocks as Objects with different behavior
from all other Objects (i.e. they have a different Class) is easily
within the basic OO model. This is "underkill". Dirty blocks might as
well be called full closures since there is plenty of understanding on
that topic. Calling dirty blocks 'reified c-like expressions' is
probably not going to help a 'C', Smalltalk, Eiffel, or Lisp
programmer. Calling them Objects with full-closure capabilities may
help some percentage of that group.
Summary
The amazing part of Smalltalk and Self is how far a pure Object approach
really will take you without having any nastiness, and then how little
nastiness is required to be truly useful. Only some types of Blocks are
nasty [and powerful], the rest are just plain old Objects with a special
construction syntax.
--Mark
mark.fussell@chimu.com
"Nasty" terminology
Eliot Miranda wrote:
> Mark Fussell wrote:
> > The amazing part of Smalltalk and Self is how far a pure Object approach
> > really will take you without having any nastiness, and then how little
> > nastiness is required to be truly useful. Only some types of Blocks are
> > nasty [and powerful], the rest are just plain old Objects with a special
> > construction syntax.
>
> I really don't buy these poorly applied value judgments of "nasty". Either
> you look at the entirety of the problem (what the solution would look like
> without closures with ability to assign-to their enclosing lexical
> environment, and/or without indefinite extent, and/or non-local return, vs
> what it would look like with it) or you're "being economical with the
> actualite".
I think Eliot was speaking generally, but MY definition of "nastiness"
through the previous posting was simply:
Going beyond a simple Object-Message paradigm
and was not meant to be a value judgement.
I should probably have consistently used the term "dirtiness" since it
coincides with the block terminology and is slightly less pejorative.
And I will thoroughly agree with Eliot's points with a poem:
You may need to put little bits of dirt here and there
to avoid making your application a pile of **** everywhere
:-)
Personally, I don't really think closure capability is particularly
nasty dirt at all. The only confusing area is returning out of a
context that is not executing anymore. That shows a block being used a
bit beyond the programmer's understanding (or the unfortunate lack of a
"return from block" syntax). The rest of the time, some code might be a
bad use of closure capabilities (e.g. using 'do' instead of a smarter
iterator) but it isn't particularly confusing.
[[Rest removed as not on same topic]]
Messages are not Objects [in simplest OO terms]
Marcel Weiher wrote:
> Mark Fussell writes:
> >and Messages simply:
> > (2a) Pass an Object [the receiver] some parameters
> > (2b) Return an Object [the result] back to the sender
>
> >In these simplest OO terms, clean Blocks ARE simply objects.
>
> So are messages, but that's just because everything in ST
> is an object. However, messages (and blocks) are about
> being evaluated (/sent), whereas normal objects are not.
Messages ARE NOT objects in these simplest OO terms. Objects and
messages are different (exclusive and complementary) concepts. To
define "Object" you need "Message" and to define "Message" you need
"Object".
Repeating the definition:
Objects have:
(1a) Identity: the ability to tell two Objects apart no matter whether
they appear similar in Behavior
(1b) Behavior: the result of the current or subsequent Messages
to that Object and other objects in the system
Messages simply:
(2a) Pass an Object [the receiver] some parameter objects
(2b) Return an Object [the result] back to the sender object
These are complementary and exclusive definitions. Messages are not
objects nor vice-versa[1].
To give the primordial analogy: Objects are like Cells, with a membrane
that separates their outsides (providing the behavior that others can
see) from their insides (the implementation of the behavior). Messages
are interactions with the cell membrane. Messages do not have a
membrane, they don't have identity, and they don't have behavior. They
are simply the act of:
Passing a Cell [the receiver] some other cells
Returning a Cell [the result] back to the sender of the message
Messages aren't cells or objects. I can rename Message to 'Permeation'
temporarily if that helps:
The primary constructs are:
(1) Cells
(2) Permeations
Where Cells have:
(1a) Identity: the ability to tell two Cells apart no matter whether
they appear similar in Behavior
(1b) Behavior: the result of the current or subsequent Permeation
to that Cell and other cells in the system
and Permeations simply:
(2a) Pass a Cell [the receiver] some parameter cells
(2b) Return a Cell [the result] back to the sender cell
Permeations/messages are simply not objects in OO terms.
On the other hand, clean blocks ARE simply objects, with a membrane that
responds to certain messages. Blocks "are about" receiving #value
messages. Blocks aren't sent, blocks aren't evaluated, blocks are sent
messages: usually #value and its related protocol. The statement:
b := [:a :b | a < b].
Creates an object and assigns it to 'b'. I can then send it a message:
^b value: 1 value: 2.
The sending of #value:value: could be called 'evaluating' but that is
just a categorization of the kind of behavior commonly associated with
'#value:value:' messages.
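A few workspace expressions illustrate the point, assuming a Squeak- or Pharo-style image (the variable name 'lessThan' is chosen here only to avoid reusing 'b'). The block is held in a variable, inspected, and sent messages like any other object.

```smalltalk
| lessThan viaSend viaPerform argumentCount |
lessThan := [:a :b | a < b].

"Ordinary message sends to the block object."
viaSend := lessThan value: 1 value: 2.
viaPerform := lessThan perform: #value:value: with: 1 with: 2.

"The block also answers messages that have nothing to do with evaluation."
argumentCount := lessThan numArgs.
lessThan class.
lessThan respondsTo: #value:value:.
```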
Turning Messages into Objects
You can turn the concept of Message into an Object but they are not
objects in these simplest OO terms (the original terms Alan Kay was
working with in building scalable [i.e. like biological] systems). As
soon as you talk about MessageObjects, you have either moved up a layer
of abstraction or you are completely changing the conceptual model (e.g.
like switching to closure terminology). If you want to talk about
Smalltalk in terms other than the simplest OO terms, that is fine, but
you then can't claim that Blocks don't make sense as Objects in OO
terms; they just don't make sense in your particular reference system.
> The issue is how *behaviour* is specified. I'd say the
> central paradigm is message passing, not "anonymous function
> passing", so it would be nice if the higher order mechanisms
> were also built around message passing (messages that take
> messages as arguments and deal with them) instead of suddenly
> switching to "anonymous functions" or "expressions" or "closures"
> or whatever?
But I [and others] never switched... you are forcing a switch even when
there is no problem. Clean blocks are simply objects. You send them a
message, they do something inside themselves, and they return a value.
I didn't switch to "anonymous function passing", so I don't know where
that is coming from. Messages pass objects that "permeate the
membrane" and they return objects that "return through the membrane".
You have to have both Objects and these (non-Object) messages to
understand the basics of OO. And blocks are among the Object category.
Lazy evaluation
> Another example would be something like lazy evaluation. Should
> it be
> anObject computeResult.
> -> anObject lazy:[ :someObject | someObject computeResult].
> or
> anObject computeResult
> -> anObject lazy computeResult
>
> You can try the same with asynchronous messages sends, futures,
> etc. Basically any type of higher-order mechanism.
If
(anObject lazy)
creates an object that will return a Future to all message sends, then
that is fine. Sending it #computeResult just returns a Future.
If you want to create MessageObjects as a higher level of abstraction,
you could just do it with the basic Object capabilities (and standard
protocols):
anObject message: (Message newNamed: #computeResult)
anObject lazy: (Message newNamed: #computeResult with: anObject)
anObject lazy: (#computeResult asMessageWith: anObject)
But it is likely that the MessageObject protocol might as well use the
<<Valueable>> protocol:
>>value
>>value: a
>>value: a value: b
So the implementation of 'lazy:' would be:
Object>>lazy: aMessage
...
aMessage value: self
...
And any single-parameter #value: supporting object will work. Although
there are lots of these (see the ValueModel classes), one of the most
common is ("lo and behold") Block objects. The simplest way to
implement:
Message class>>newNamed: aSymbol
is
^[:receiver | receiver perform: aSymbol]
although many other implementations are fine/better. For example,
having true MessageObjects might be good:
Message>>value: receiver
^receiver perform: mySymbol
Further, you could extract the 'perform' overhead [if needed] with
custom Behavior construction. But in any case the way to produce
higher-order functionality in Smalltalk is to create smarter objects.
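A minimal sketch of the block-based version, using the standard 'perform:' message. The name 'sizeMessage' and the choice of #size as the selector are illustrative stand-ins for the hypothetical #computeResult.

```smalltalk
| sizeMessage fromString fromArray |
"What 'Message newNamed: #size' could answer under the block-based implementation."
sizeMessage := [:receiver | receiver perform: #size].

"Any receiver that understands #size can be handed to the message object
 through the single-parameter #value: protocol."
fromString := sizeMessage value: 'hello'.
fromArray := sizeMessage value: #(1 2 3).
```

A dedicated MessageObject class would answer the same '#value:' send, so a method like 'lazy:' would not need to know which kind of object it was given.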
Changing syntax to support MessageObjects (Letters)
If you want to change syntax (syntactic sugar) so every message within a
pseudo-Smalltalk is turned into sending a MessageObject, you would turn:
anObject computeResult.
into:
anObject message: (#computeResult asMessage).
Continuing down that path would take you to a "higher level"
MessageObject-passing language, but it would still have to understand
the core-level permeation/message concepts just to support the single
'#message:' message. If you go far enough, you may have little reason
to use Smalltalk as the base, or you will still have Smalltalk concepts
the next level down.
Finally, just to repeat the Blocks are Objects argument, the
transformation of my previous example:
b := [:a :b | a < b].
^b value: 1 value: 2.
would mean:
b := [:a :b | a < b]. "Create an Object"
b message: (#value:value: asMessageWith: 1 with: 2)
The Block is still an object and is not affected by the transformation of
message sends into full MessageObjects.
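The hypothetical '#message:' delivery does not exist in a stock image, but 'perform:withArguments:' gives a close approximation: the selector and the arguments travel as ordinary objects, and the block on the receiving end is unchanged. A sketch, with illustrative variable names:

```smalltalk
| lessThan selector attachments direct reified |
lessThan := [:a :b | a < b].

"Direct send."
direct := lessThan value: 1 value: 2.

"The same send, with the selector and arguments held as objects first."
selector := #value:value:.
attachments := Array with: 1 with: 2.
reified := lessThan perform: selector withArguments: attachments.
```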
--Mark
mark.fussell@chimu.com
Term Message -> [Delivery, Letter]
For some reason, I feel like trying again from a slightly new,
completely aligned (among Marcel Weiher, myself, and others) perspective
on Blocks and such. Although not crucial to the main discussion, I feel
I need to define some terms in new unambiguous terminology. The main
problem is the overloaded term 'message', so this posting simply removes
that overloading.
I will rename both my use of 'message' and Marcel's use of 'message' so
they don't conflict. I think it works pretty well and is very faithful
to the authors' intents:
My 'message' will be renamed 'Delivery'.
This is the basic Smalltalk capability of permeating an
Object/Cell's membrane: that is, enabling an object to
respond to a message selector.
Marcel's 'message' will be renamed 'Letter'.
This is what I previously called a MessageObject,
and it requires the concept of Delivery but it is not
equivalent. It is pretty much equivalent to the
[not usually needed] Message class in Smalltalk, which
uses the 'doesNotUnderstand:' delivery.
So my previous definition of OO in its most basic terms becomes:
The primary constructs of OO are:
(1) Objects
(2) Deliveries
Objects have:
(1a) Identity: the ability to tell two Objects apart no matter whether
they appear similar in Behavior
(1b) Behavior: the result of the current or subsequent Deliveries
to that Object and other objects in the system
Deliveries:
(2a) Pass an Object [the receiver] some attached Objects
(2b) Return an Object [the reply] back to the sender Object
Although the above is the simplest conception of OO, we can
formalize/reify what is being passed to an Object at Delivery time by
creating a concept of a Letter. A Letter is a kind of Object that has a
'name' and any number of attachments. By changing our conceptualization
we get:
The primary constructs of Letter-based OO are:
(1) Objects
(2) Deliveries
(3) Letters
Objects have:
(1a) Identity: the ability to tell two Objects apart no matter whether
they appear similar in Behavior
(1b) Behavior: the result of the current or subsequent Deliveries
to that Object and other objects in the system
Deliveries:
(2a) Pass an Object [the receiver] a Letter
(2b) Return an Object [the reply] back to the sender Object
A Letter is an Object that:
(3a) Has a name (Object)
(3b) Has a number (0..n) of attached Objects
I think this Letter-based OO integrates in Marcel's model/desires
without losing a way to refer to the lower level 'delivery' process.
Adding a bit of Smalltalk details to this second conception, we can take
the following approach.
All objects respond to the 'receive:' delivery. This is kind of
equivalent to the "doesNotUnderstand:" method except it 'abstractly'
always occurs instead of just when other approaches fail. Within the
'receive:' method you would need to write the functionality that
responds appropriately to the kinds of Letters you might receive. For
the moment, we can ignore the internal details.
If you use a delivery name other than 'receive:' it is equivalent to
creating a Letter that has the name of the delivery name and the
arguments to the delivery are attached to the letter:
anObject do: foo
becomes:
anObject receive: (Letter name: #do: attachment1: foo)
Please ignore the infinite recursion in the above (assume Letter has a
'primitive' implementation to avoid this problem). All we care about is
that anObject can receive a Letter that has some attachments associated with
it.
This transformation can be done by the compiler or even the object
itself... the only important aspect is that the transformation
conceptually always occurs.
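As a hedged sketch of the same idea in a Squeak- or Pharo-style image: the standard Message class can play the role of a Letter, and 'perform:withArguments:' can stand in for the hypothetical 'receive:' delivery. The selector #max: and the variable names are illustrative; 'Letter' and 'receive:' themselves are not defined here.

```smalltalk
| letter directReply letterReply |
"Direct delivery."
directReply := 3 max: 7.

"Letter-based delivery: the name and the attachment are packaged as an object..."
letter := Message selector: #max: arguments: (Array with: 7).

"...and unpacked again when the Letter reaches the receiver."
letterReply := 3 perform: letter selector withArguments: letter arguments.
```

The receiver and the reply are the same in both forms; only the way the request is described has changed.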
Summary
With the above we can talk about either Letters explicitly or Deliveries
explicitly depending on which we need to focus on. Be careful using the
term 'message' because Deliveries and Letters are very different
concepts, and if the reader picks a different one than the writer,
miscommunication will ensue.
Now onto the [barely :-] more interesting issue: Smalltalk without
Blocks.
--Mark