Criteria for evaluating indentation styles
Overview
This is a discussion of some criteria to consider for indentation styles
in multiple programming languages. Indentation style is a very contentious topic,
but there is some hope that certain criteria could be used to objectively evaluate
different styles. This is an attempt to discuss some of those criteria. The original
context was Java so the examples are mostly in Java (but the style discussions are
heavily influenced by other languages).
This short paper grew out of the following thread:
Original Posting: Where do people put their '{'
If anyone is interested in studying different indentation styles and
their effects on readability, I would suggest focusing on three things:
- How does the indentation style look without the characters (i.e.
"greeked")?
- How well does the style work with different language syntaxes
(for similar semantics)?
- How well does the style support common "atomic" semantic changes
to the source code?
"Greek" it
Considering first:
(1) How does the indentation style look without the characters (i.e.
"greeked")?
For an indentation style to be good, the structure of a program should
be visible and consistent at different levels of detail. Greeking helps
show some of these different levels of detail by taking away the noise
of the actual code. Ideally the picture you get from "a mile high"
(fully greeked) should be more abstract but in agreement with the detail
down at ground level (the full code).
I have included several example indentation styles, (A) through (E),
with the content greeked into '#'. In the first variation I turned every
character (interior spaces included) into '#'. In the second variation,
below it, I partially greeked the source but kept the braces, shown as
'<' and '>'. The third variation is the source itself.
The full greeking shows the outermost abstraction, the partial greeking
shows a little more detail, and so on down to the full source [2].
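As a rough illustration of the two greeking variations, here is a minimal Java sketch. It is not the script used to produce the examples; the class name `Greeker` and its method names are made up for this illustration, and it assumes that leading indentation is left untouched while everything after it is greeked.

```java
// Hypothetical helper, for illustration only: greeks one line of source at a time.
final class Greeker {

    /** Full greeking: leading whitespace is kept, every later character becomes '#'. */
    static String greekFully(String line) {
        int indent = indentWidth(line);
        StringBuilder out = new StringBuilder(line.substring(0, indent));
        for (int i = indent; i < line.length(); i++) {
            out.append('#');
        }
        return out.toString();
    }

    /** Partial greeking: '{' becomes '<', '}' becomes '>', whitespace next to a brace is kept. */
    static String greekPartially(String line) {
        int indent = indentWidth(line);
        StringBuilder out = new StringBuilder(line.substring(0, indent));
        for (int i = indent; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '{') {
                out.append('<');
            } else if (c == '}') {
                out.append('>');
            } else if (Character.isWhitespace(c) && touchesBrace(line, i)) {
                out.append(c);
            } else {
                out.append('#');
            }
        }
        return out.toString();
    }

    /** Number of leading whitespace characters. */
    private static int indentWidth(String line) {
        int i = 0;
        while (i < line.length() && Character.isWhitespace(line.charAt(i))) {
            i++;
        }
        return i;
    }

    /** True when the whitespace run containing position i is directly next to a brace. */
    private static boolean touchesBrace(String line, int i) {
        int right = i;
        while (right < line.length() && Character.isWhitespace(line.charAt(right))) {
            right++;
        }
        if (right < line.length() && isBrace(line.charAt(right))) {
            return true;
        }
        int left = i;
        while (left >= 0 && Character.isWhitespace(line.charAt(left))) {
            left--;
        }
        return left >= 0 && isBrace(line.charAt(left));
    }

    private static boolean isBrace(char c) {
        return c == '{' || c == '}';
    }

    public static void main(String[] args) {
        System.out.println(greekFully("    if (x) {"));     // prints "    ########"
        System.out.println(greekPartially("    if (x) {")); // prints "    ###### <"
    }
}
```

Working line by line keeps the sketch small; it knows nothing about strings or comments, so a brace inside a string literal would be treated like any other brace.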
[If you want to go to the next topic, skip through all the examples]
============================================
===== Fully greeked:
============================================
------------------ (A) ----------------------
###################################
################################
##################################
#######################################
######################################################
########################
#######################################
#############################################
#############
#############
##
##
##
------------------ (B) ----------------------
#################################
#
################################
################################
#
#######################################
######################################################
######################
#
#######################################
#
#########################################
#
#############
#############
##
##
##
------------------ (C) ----------------------
###################################
################################
##################################
#######################################
######################################################
########################
#######################################
#############################################
#############
#############
##
##
##
------------------ (D) ----------------------
#################################
#
################################
################################
#
#######################################
######################################################
######################
#
#######################################
#
#########################################
#
#############
#############
##
##
##
------------------ (E) ----------------------
#################################
####################################
################################
###########################################
######################################################
######################
###########################################
#############################################
#################
#############
##
##
##
============================================
===== Partially greeked:
============================================
------------------ (A) ----------------------
################################# <
################################
################################ <
#######################################
######################################################
###################### <
#######################################
> ######################################### <
#############
#############
>#
>#
>#
------------------ (B) ----------------------
#################################
<
################################
################################
<
#######################################
######################################################
######################
<
#######################################
>
#########################################
<
#############
#############
>#
>#
>#
------------------ (C) ----------------------
################################# <
################################
################################ <
#######################################
######################################################
###################### <
#######################################
> ######################################### <
#############
#############
>#
>#
>#
------------------ (D) ----------------------
#################################
<
################################
################################
<
#######################################
######################################################
######################
<
#######################################
>
#########################################
<
#############
#############
>#
>#
>#
------------------ (E) ----------------------
#################################
< ################################
################################
< #######################################
######################################################
######################
< #######################################
> #########################################
< #############
#############
>#
>#
>#
============================================
===== Original source:
============================================
------------------ (A) ----------------------
for (int i = 0; i < maxRead; i++) {
SlotPi eachSlot = slotAtIndex(i);
if (eachSlot.isOptimisticLock()) {
Object slotValue = slotValues.atIndex(i);
Object rowSlotValue = eachSlot.newSlotValueFromRow(row);
if (slotValue == null) {
if (rowSlotValue != null) return false;
} else if (!slotValue.equals(rowSlotValue)) {
//---spacer--
return false;
};
};
};
------------------ (B) ----------------------
for (int i = 0; i < maxRead; i++)
{
SlotPi eachSlot = slotAtIndex(i);
if (eachSlot.isOptimisticLock())
{
Object slotValue = slotValues.atIndex(i);
Object rowSlotValue = eachSlot.newSlotValueFromRow(row);
if (slotValue == null)
{
if (rowSlotValue != null) return false;
}
else if (!slotValue.equals(rowSlotValue))
{
//---spacer--
return false;
};
};
};
------------------ (C) ----------------------
for (int i = 0; i < maxRead; i++) {
SlotPi eachSlot = slotAtIndex(i);
if (eachSlot.isOptimisticLock()) {
Object slotValue = slotValues.atIndex(i);
Object rowSlotValue = eachSlot.newSlotValueFromRow(row);
if (slotValue == null) {
if (rowSlotValue != null) return false;
} else if (!slotValue.equals(rowSlotValue)) {
//---spacer--
return false;
};
};
};
------------------ (D) ----------------------
for (int i = 0; i < maxRead; i++)
{
SlotPi eachSlot = slotAtIndex(i);
if (eachSlot.isOptimisticLock())
{
Object slotValue = slotValues.atIndex(i);
Object rowSlotValue = eachSlot.newSlotValueFromRow(row);
if (slotValue == null)
{
if (rowSlotValue != null) return false;
}
else if (!slotValue.equals(rowSlotValue))
{
//---spacer--
return false;
};
};
};
------------------ (E) ----------------------
for (int i = 0; i < maxRead; i++)
{ SlotPi eachSlot = slotAtIndex(i);
if (eachSlot.isOptimisticLock())
{ Object slotValue = slotValues.atIndex(i);
Object rowSlotValue = eachSlot.newSlotValueFromRow(row);
if (slotValue == null)
{ if (rowSlotValue != null) return false;
}
else if (!slotValue.equals(rowSlotValue))
{ //---spacer--
return false;
};
};
};
============================================
I certainly didn't include every option, but the above are some of the
ones discussed recently.
Of the above, (E) does not greek at all well: the structure of the fully
greeked code and the structure of the final source do not match visually.
So (E) has serious visual peculiarities.
Variation (D) seems inferior to (B) in terms of visual clues but it
is hard to say: both produce interesting patterns. You may have noticed
one interesting feature of the above: (A) and (C) are actually the same
style with different spacing rules. This makes (A) have a similar
appearance to (B), and some people may like a less tight flow
(fractional spacing would help too).
Overall (A)/(C) tend to better describe the structure without extra
visual non-information, but I could see an argument for the "cute
patterns" in (B) and (D).
Working with different language syntaxes
Onto the next topic:
(2) How well does the style work with different language syntaxes
(for similar semantics)?
Now some people may not care about working in multiple languages, but
for those of us who do, it would be nice if similar things looked as
similar as possible (without losing the spirit of each language). This
matters even if you only use one language at a time but may move to a
new one at some point.
So to pick a couple of interesting alternatives to Java: consider Eiffel
and Python. Eiffel uses keyword delimiters instead of braces to
structure blocks and Python uses indentation itself to structure
blocks. These are significantly different from Java in syntax but are
semantically equivalent.
Transforming to different languages can be a bit peculiar, but we are
just interested in syntax changes that affect indentation styles so I
will produce some hybrid versions of the source above:
Python-Java
A Python-Java version of the code snippet might be:
for (int i = 0; i < maxRead; i++):
    SlotPi eachSlot = slotAtIndex(i)
    if eachSlot.isOptimisticLock():
        Object slotValue = slotValues.atIndex(i)
        Object rowSlotValue = eachSlot.newSlotValueFromRow(row)
        if slotValue == null:
            if rowSlotValue != null return false
        else if !slotValue.equals(rowSlotValue):
            //---spacer--
            return false
The greeked Python-Java version would look like:
#################################
    ###############################
    ##############################
        ######################################
        #####################################################
        ######################
            ######################################
        ############################################
            ############
            ############
This is basically variation (A) without requiring any delimiters, and
most noticeably the closing delimiters. And note that the argument
about indentation style is moot with Python: you can't change the style
without changing the actual meaning.
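To make the "variation (A) without delimiters" point concrete, here is a small Java sketch that turns style (A) lines into an indentation-only form. The class name `BraceStripper` and its method are hypothetical, and the sketch assumes the input is already in style (A), with each opening brace at the end of its line and each closing brace at the start of its line. Unlike the hybrid above, it leaves semicolons and the parentheses around conditions alone.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, for illustration only.
final class BraceStripper {

    /** Drops the braces from style (A) lines; indentation alone then carries the structure. */
    static List<String> toIndentOnly(List<String> lines) {
        List<String> out = new ArrayList<>();
        for (String line : lines) {
            String body = line.trim();
            // A line holding nothing but a closing brace simply disappears.
            if (body.equals("}") || body.equals("};")) {
                continue;
            }
            String indent = line.substring(0, line.indexOf(body));
            // "} else ..." continues the previous construct: drop the leading brace.
            if (body.startsWith("}")) {
                body = body.substring(1).trim();
            }
            // A trailing "{" opens a block; the Python-like form marks that with ":".
            if (body.endsWith("{")) {
                body = body.substring(0, body.length() - 1).trim() + ":";
            }
            out.add(indent + body);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> styleA = Arrays.asList(
            "if (slotValue == null) {",
            "    return false;",
            "} else if (ready) {",
            "    return true;",
            "}");
        for (String line : toIndentOnly(styleA)) {
            System.out.println(line);
        }
    }
}
```

Every surviving line keeps its original indentation, which is the sense in which (A) already looks like the Python form. Styles that put a brace on a line of its own, or share a line between a brace and the first statement, would need extra rules before the same mapping worked.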
Eiffel-Java
An Eiffel-Java version of the code snippet might be:
for (int i = 0; i < maxRead; i++) do
    SlotPi eachSlot = slotAtIndex(i)
    if eachSlot.isOptimisticLock() then
        Object slotValue = slotValues.atIndex(i);
        Object rowSlotValue = eachSlot.newSlotValueFromRow(row)
        if slotValue == null then
            if rowSlotValue != null return false
        elseif !slotValue.equals(rowSlotValue) then
            //---spacer--
            return false
        end;
    end;
end; -- for
and the greeked version would be:
#####################################
    ################################
    ###################################
        #######################################
        ######################################################
        #########################
            #######################################
        #############################################
            #############
            #############
        ####
    ####
#########
Basically changing to Eiffel simply enlarges the closing braces to the
full word 'end' and the opening braces to 'then', 'do', etc. Eiffel has
an official style which is very similar to (A) but allows a few other
variations with similar visual properties. For example, if the
predicate to the 'if' is large you could change style to:
if
    slotValue == null
then
    if rowSlotValue != null return false
elseif
    !slotValue.equals(rowSlotValue)
then
    //---spacer--
    return false
end;
Reviewing all our example styles, the only style that works at all (and
would match closely) in these different languages is (A) [and (C) as a
modified (A)]. So by this criterion, (A) is the only acceptable style
and all the other styles are unacceptable.
How "atomic" is the style
Finally onto:
(3) How well does the style support common "atomic" semantic changes
to the source code?
Consider changes like
(3.1) Add a line in a block
(3.2) Remove a line from a block
(3.3) Move a block
and so on. Which styles treat these as atomic changes?
As some have mentioned in previous postings, the problem with (E) or
anything similar (braces tied to the first or last line of block) is
that adding/removing a line would cause you to modify an existing line
(or worse, cause you to make a mistake). (A)-(D) don't have this
problem.
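One crude way to put a number on "atomic" is to count how many lines an edit touches. The Java sketch below does that for change (3.1), adding a line at the start of a block, in a style like (A) and a style like (E). The class name `EditAtomicity`, the sample lines, and the measure itself are assumptions made for this illustration: it compares the two versions as bags of lines, which is simpler than a real diff and ignores line order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, for illustration only.
final class EditAtomicity {

    /** Lines present only in 'before' plus lines present only in 'after'. */
    static int touchedLines(List<String> before, List<String> after) {
        List<String> onlyInAfter = new ArrayList<>(after);
        int onlyInBefore = 0;
        for (String line : before) {
            // remove(Object) deletes one matching occurrence, so repeated lines are counted properly.
            if (!onlyInAfter.remove(line)) {
                onlyInBefore++;
            }
        }
        return onlyInBefore + onlyInAfter.size();
    }

    public static void main(String[] args) {
        // Style (A): the new statement is one new line; nothing else changes.
        List<String> beforeA = Arrays.asList("if (ready) {", "    b();", "}");
        List<String> afterA  = Arrays.asList("if (ready) {", "    a();", "    b();", "}");

        // Style (E): the brace shares a line with the first statement, so that line is rewritten.
        List<String> beforeE = Arrays.asList("if (ready)", "{   b();", "}");
        List<String> afterE  = Arrays.asList("if (ready)", "{   a();", "    b();", "}");

        System.out.println(touchedLines(beforeA, afterA)); // prints 1
        System.out.println(touchedLines(beforeE, afterE)); // prints 3
    }
}
```

In the (E) case the count is 3 because the old first line disappears and two new lines replace it, even though the semantic change is still just "add one statement".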
When moving a block, (B)-(D) require selecting more text (or rows) but
otherwise all of (A)-(D) are similar if you move the block together with
the control structure itself. (B) and (D) allow you to move the block
without the control structure, but that is unlikely to be valuable for
any real change (and (A) would always have the block information with
the target control structure anyway).
Going the other direction, we can consider the atomicity of the source
versus common semantic changes: (B) and (D) have lines that could
physically be deleted or forgotten, yet almost no reasonable change
would want to delete them. This is certainly a flaw.
So again (A) is generally better than the other variations for this
criterion, although only (E) has serious problems.
Summary
So the different styles have real advantages and disadvantages in
certain measurable criteria[1].
A final analysis might simply be to compare official language standards
like: [Eiffel], [Java], and [Python]. One style pervades all three of
them (and many other languages as well): the "comb-like" style (A). So
I strongly disagree that this style is just driven by book editors: it
is actually the style that most closely presents the meaning of the
code, and does this in a common way among currently popular language
syntaxes.
I do agree that it is more important for a project to have some standard
(and/or a good code reformatter) than to build consensus among
developers-in-general on this topic. But it would be nice to have both
:-)
--Mark
mark.fussell@chimu.com
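#!/usr/bin/perl
# Greek a java source file: reads the source on STDIN, writes to STDOUT
use strict;
use warnings;

while ( my $line = <STDIN> ) {
    chomp($line);

    # Pass leading whitespace through untouched so the indentation survives
    if ($line =~ m/^(\s*)/) {
        print $1;
        $line = $';
    }

    ##Other variations
    # Fully greeked (every remaining character, spaces included):
    #$line =~ s/./\#/g;
    # Greek everything except braces and whitespace:
    #$line =~ s/[^\{\}\s]/\#/g;
    #print $line;

    # Partially greeked: keep each brace (shown as '<' or '>') together
    # with the whitespace around it, and greek the text in between
    while ($line =~ m/(\s*[\{\}]\s*)/) {
        my $before = $`;
        my $after  = $';
        my $match  = $1;
        $match =~ tr/\{\}/\<\>/;
        print "#" x (length($before));
        print $match;
        $line = $after;
    }

    # Greek whatever follows the last brace
    print "#" x (length($line));
    print "\n";
}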