\meta{( "io-basics" "the io programming language" ("programming") 1448061082)}
The Io language is a small, very cool object-oriented language.

There's a broad consensus among the programming-language
aficionados I talk to: object-oriented programming is not nearly
as useful as it was once considered to be.
It's easy to come to that conclusion
after seeing the OOP-mania of the '90s give way to bloated,
complicated codebases, the infamous piles of
\tt{AbstractStrategyFactory} subclasses and dependency injection
frameworks needed to build flexible Java software,
the signed-in-triplicate verbosity of Objective-C code, or
the impossible-to-optimize amounts of indirection in Ruby or
Python.

Certainly, \em{some} problem domains map well to
object-oriented programming: the traditional example is GUI
programming, in which the little heterogeneous chunks of
state that are objects map quite well to the widgets
in a traditional WIMP interface. But in other areas, the
object-oriented approach falls flat: high-performance video
games, for example, have realized that traditional
object-oriented modeling techniques result in \em{abysmal}
cache performance, and end up using object-oriented languages
to produce
\link{http://gamesfromwithin.com/data-oriented-design|very much \em{non}-object-oriented designs}.

It's common to see fans of object-orientation object, "Ah, well,
that's because C++ isn't \em{really} what we mean," which does
sound a little bit weaselly\ref{scot},
\sidenote{\link{https://en.wikipedia.org/wiki/No_true_Scotsman|No \em{true} object puts sugar on its porridge!}}
but even Alan Kay, originator of
the phrase "object-oriented", once said\ref{quot}
\sidenote
{ In particular, in his \link{https://www.youtube.com/watch?v=oKg1hTOQXoY|awesome talk at OOPSLA in 1997},
 which you should \em{certainly} watch at some point.}

\blockquote
{
I made up the term 'object-oriented', and I can tell you I didn't
have C++ in mind!
}

Well, what \em{did} Kay have in mind? From
\link{http://userpage.fu-berlin.de/~ram/pub/pub_jf47ht81Ht/doc_kay_oop_en|a classic response by Kay himself}:

\blockquote
{
OOP to me means only messaging, local retention and protection and
hiding of state-process, and extreme late-binding of all things.
It can be done in Smalltalk and in LISP. There are possibly other
systems in which this is possible, but I'm not aware of them.
}

Every feature Kay describes is something that \em{forces} the
programmer to write abstract programs: hiding of local state
means that features \em{must} rely on external interfaces;
messaging means that interfaces \em{must} be abstract and
generally specified; late-binding means the programmer \em{must} write code
that works with alternate indistinguishable implementations,
rather than specifying a concrete implementation up-front. Clearly,
C++ does not fit the bill: it lacks messaging as a language feature,
has only optional protection of state-process, and heavily discourages
late-binding when it allows it at all! So: what does a Kay-style
object-oriented language look like?

Kay wrote the above passage in 2003, and since then, at least a few other languages
have come around that are in the original spirit of Kay's vision.
One of them is
\link{http://iolanguage.org/|Steve Dekorte's Io language},
which I'd like to provide a whirlwind tour of here.

The wonderful thing about Io is how \em{small} it is: it chooses
a few simple pieces, adds some simple sugar, and builds everything
else out of those pieces. It very much feels like the
distilled essence of a language paradigm.

\code
{\ttstr{"Hello, world!"} println
}

Syntactically, it's very sparse—the above snippet, which contains
just two tokens, invokes a method\ref{msg} called \tt{println} on a string
literal.
\sidenote{
More properly, we're \em{sending a message}. The difference seems pedantic
at first, but Io—in contrast to Java or C++—lets us consider the message
as a \em{thing}, examining it as a value or sending it elsewhere by delegating
or duplicating it.
}
We \em{could} include the trailing parentheses—as
the \tt{println} method here takes an empty argument list—but Io allows us
to omit them.

\code
{\ttstr{"Hello, world!"} println()
}

Unlike most members of the Smalltalk family, Io doesn't have
unusual split-apart method names.\ref{meth}
\sidenote
{
  In Smalltalk, method names have "holes" for arguments, indicated
  by a colon. A method name will look like
  \tt{withFoo:andBar:} and invoking it will look like
  \tt{myObj withFoo: y andBar: z}. This is a compelling design
  choice in that it often \em{forces} a programmer to document an interface,
  because a method will look like
  \tt{rectWithWidth: x andHeight: y},
  giving you omnipresent documentation about the order of arguments.
  It is, however, an unusual design choice.
}
A method name is a single identifier, and it takes trailing
arguments in parens, unless there are no arguments.
There's some special syntax for operators, to make
precedence work right, but operators are themselves sugar for calling
methods named things like \tt{+} or \tt{*}:

\code
{\ttcom{# we can write this}
(2 + 3 * 4) println
\ttcom{# or, equivalently, this}
2 +(3 *(4)) println()
}

It's an imperative language, so we can create and assign variables:

\code
{\ttcom{Io>} x := 2
\ttcom{Io>} x println
2
\ttcom{Io>} x = x + 1
\ttcom{Io>} x println
3
}

But assignment is also syntactic sugar—underneath, it's still method calls!
The code above can be desugared to explicit calls to
assignment-related methods:

\code
{\ttcom{Io>} setSlot(\ttstr{"x"}, 2)
\ttcom{Io>} getSlot(\ttstr{"x"}) println()
2
\ttcom{Io>} updateSlot(\ttstr{"x"}, x +(1))
\ttcom{Io>} getSlot(\ttstr{"x"}) println()
3
}

If you look closely you'll notice that those methods aren't being
called on any object in particular: when we don't supply an explicit
object to call a method on, it will get called on some ambient
\tt{self} object,
sort of like \tt{this} in other OO languages. When we're sitting at
the interactive language prompt, that ambient object is called
the \tt{Lobby}. The above code is therefore \em{also} equivalent
to\ref{lob}
\sidenote
{
Although notice that \tt{Lobby} is itself a variable, so it
could itself be taken as sugar for \tt{getSlot(\ttstr{"Lobby"})} called on
the \tt{Lobby} object. This starts to hint at the stack of turtles
underneath the Io language: it really \em{is} objects all the way
down, in several ways.
}

\code
{\ttcom{Io>} Lobby setSlot(\ttstr{"x"}, 2)
\ttcom{Io>} Lobby getSlot(\ttstr{"x"}) println()
2
\ttcom{Io>} Lobby updateSlot(\ttstr{"x"}, x +(1))
\ttcom{Io>} Lobby getSlot(\ttstr{"x"}) println()
3
}

So in Io, all actions are method calls, even "primitive" actions like
assignment or variable access.
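
To make that desugaring concrete, here's a rough Python model of the three slot operations. This is a sketch of the idea, not how Io is implemented: the \tt{SlotObject} class and its method names are hypothetical stand-ins for \tt{setSlot}, \tt{getSlot}, and \tt{updateSlot}, and I'm assuming that updating a slot which was never created is an error.

```python
class SlotObject:
    """A bag of named slots, loosely modeling an Io object."""

    def __init__(self):
        self.slots = dict()

    def set_slot(self, name, value):
        # Models `x := value`: create the slot, or overwrite it.
        self.slots[name] = value
        return value

    def get_slot(self, name):
        # Models a bare `x`: fetch the slot's current value.
        return self.slots[name]

    def update_slot(self, name, value):
        # Models `x = value`: only legal if the slot already exists.
        if name not in self.slots:
            raise KeyError("slot not found: " + name)
        self.slots[name] = value
        return value


lobby = SlotObject()
lobby.set_slot("x", 2)                            # x := 2
lobby.update_slot("x", lobby.get_slot("x") + 1)   # x = x + 1
print(lobby.get_slot("x"))                        # prints 3
```

The point of the model is only that "assignment" and "variable access" are ordinary operations on an ordinary object, here the hypothetical \tt{lobby}.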

Unlike most common object-oriented languages today, Io does not use
class declarations: it's a \em{prototype-based} language. The other
well-known language that uses prototypal OO is JavaScript, and
my opinion is that it does so very badly: consequently, many people
have strongly negative ideas about prototype OO. Io's model is
not quite so hairy or complicated. In Io, to create a new object,
we clone an old one. We do so with the \tt{clone} method, which
we can use on the generic built-in \tt{Object}.

\code{\ttcom{Io>} myPoint := Object \ttkw{clone}}

Once we have a new object, we can use \tt{:=} to create and set
its \em{slots}, or local values.

\code
{\ttcom{Io>} myPoint x := 2
\ttcom{Io>} myPoint y := 8
\ttcom{Io>} (myPoint x + myPoint y) println
10
}

If we clone an object, the new object remembers which object it was
cloned from—its \em{prototype}—and every time we look up a slot in
that object, it will first check whether \em{it} has that slot;
otherwise, it'll check to see if its prototype has the slot, and so
on back up the chain. That means we can clone \tt{myPoint} into a
new object, but the new object will still have access to everything
we defined on \tt{myPoint}:

\code
{\ttcom{Io>} newPoint := myPoint \ttkw{clone}
\ttcom{Io>} newPoint x println
2
}

We can override values on \tt{newPoint} without changing them on
its parent object:

\code
{\ttcom{Io>} newPoint x := 7
\ttcom{Io>} newPoint x println \ttcom{# The child has the new value}
7
\ttcom{Io>} myPoint x println \ttcom{# The parent still has the old one}
2
}

On the other hand, changes to the parent object will be reflected
in \em{non-overridden} values on the child object:

\code
{\ttcom{Io>} myPoint y = 3 \ttcom{# we can change the parent value}
\ttcom{Io>} newPoint y println \ttcom{# and the child can see it}
3
}
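
The lookup rule behind the last three examples fits in a few lines. Here's a minimal Python sketch of it, with the caveat that it's a model and not Io's actual machinery: \tt{ProtoObject} and its methods are names I made up, and it assumes each object has exactly one prototype.

```python
class ProtoObject:
    """An object with local slots and an optional prototype."""

    def __init__(self, proto=None):
        self.proto = proto
        self.slots = dict()

    def clone(self):
        # A clone starts with no slots of its own; it only remembers its parent.
        return ProtoObject(proto=self)

    def set_slot(self, name, value):
        # Always writes to this object, never to a prototype.
        self.slots[name] = value

    def get_slot(self, name):
        # Check this object first, then walk up the prototype chain.
        obj = self
        while obj is not None:
            if name in obj.slots:
                return obj.slots[name]
            obj = obj.proto
        raise KeyError("slot not found: " + name)


my_point = ProtoObject()
my_point.set_slot("x", 2)
my_point.set_slot("y", 8)

new_point = my_point.clone()
new_point.set_slot("x", 7)        # override on the child only
my_point.set_slot("y", 3)         # change the parent

print(new_point.get_slot("x"))    # prints 7 (child's own slot)
print(my_point.get_slot("x"))     # prints 2 (parent untouched)
print(new_point.get_slot("y"))    # prints 3 (found on the parent)
```

Notice that the child never copies \tt{y}: it finds the parent's value at lookup time, which is why later changes to the parent show through.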

We can create methods on objects as well, using slot assignment
and the \tt{method} constructor. Within the code of a \tt{method},
the \em{ambient object} points to the object in which the method is held,
so all variable accesses will be looked up inside the object that
holds the method:

\code
{\ttcom{Io>} myPoint isOrigin := \ttkw{method}(x == 0 and y == 0)
\ttcom{Io>} myPoint isOrigin println
false
}

This means that if we copy that method to another object, the \tt{x} and
\tt{y} variables referenced in the method will now refer to that new object.
Determining what \tt{self} means for a method like this
is simple: it refers to the object through which we invoke the method.

\code
{\ttcom{Io>} otherPoint := Object \ttkw{clone}
\ttcom{Io>} otherPoint isOrigin := myPoint getSlot(\ttstr{"isOrigin"})
\ttcom{Io>} otherPoint x := 0
\ttcom{Io>} otherPoint y := 0
\ttcom{Io>} otherPoint isOrigin println
true
}
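
Here's the same behaviour as a small Python sketch, in case the rule is easier to see outside of Io. Everything in it is an assumption of the model rather than Io's real internals: \tt{ProtoObject}, \tt{lookup}, and \tt{send} are hypothetical names, and a "method" is just a Python function that gets handed its receiver.

```python
class ProtoObject:
    """An object with local slots and an optional prototype."""

    def __init__(self, proto=None):
        self.proto = proto
        self.slots = dict()

    def lookup(self, name):
        # Walk this object and then its prototypes until the slot is found.
        obj = self
        while obj is not None:
            if name in obj.slots:
                return obj.slots[name]
            obj = obj.proto
        raise KeyError("slot not found: " + name)

    def send(self, name, *args):
        # Find the slot, then run it with *this* object as the receiver,
        # regardless of which object the function was first stored on.
        value = self.lookup(name)
        if callable(value):
            return value(self, *args)
        return value


def is_origin(receiver):
    # `x` and `y` are resolved through whoever received the message.
    return receiver.send("x") == 0 and receiver.send("y") == 0


my_point = ProtoObject()
my_point.slots.update(x=2, y=8, isOrigin=is_origin)

other_point = ProtoObject()
# Copy the very same function object into a different receiver.
other_point.slots.update(x=0, y=0, isOrigin=my_point.lookup("isOrigin"))

print(my_point.send("isOrigin"))      # prints False
print(other_point.send("isOrigin"))   # prints True
```

The function carries no reference to \tt{my_point}; the receiver is supplied fresh on every send, so one function gives different answers on different objects.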

Methods can of course take arguments:

\code
{\ttcom{Io>} myPoint eq := \ttkw{method}(other,
      x == other x and y == other y)
\ttcom{Io>} myPoint eq(myPoint) println
true
\ttcom{Io>} myPoint eq(otherPoint) println
false
}

Because we can clone any object, any object can serve as prototype for
another object. I probably would, in practice, build up a proper \em{Point}
abstraction a little bit differently:

\code
{Point := Object \ttkw{clone}
Point new := \ttkw{method}(nx, ny,
  p := Point clone;
  p x := nx;
  p y := ny;
  p)
Point isOrigin := \ttkw{method}(x == 0 and y == 0)
Point add := \ttkw{method}(other,
  new (x + other x, y + other y))
Point sub := \ttkw{method}(other,
  new (x - other x, y - other y))
}

Here I fill in all the relevant methods on a \tt{Point}, and when I want to
create an "instance", I clone the object and fill in the \tt{x} and \tt{y} values.
Cloning doesn't just serve the same purpose as \em{instantiation} in other OO languages,
though; it's also how we'd implement \em{subclassing}. To create a new "subclass"
of \tt{Point}, I clone the \tt{Point} object and start filling in new methods
instead of instantiating variables:

\code
{MutablePoint := Point \ttkw{clone}
MutablePoint setX := \ttkw{method}(nx, x = nx; self)
MutablePoint setY := \ttkw{method}(ny, y = ny; self)
}

For that matter, there's no reason we have to distinguish between
\em{instantiating} and \em{subclassing}: that's just me explaining things
in the traditional terms of class-based OO languages.
We could simultaneously create a new "instance" and add extra methods,
which corresponds neither strictly to subclassing nor
to instantiation. If that object turns out to be useful, we can create
new copies by cloning and modifying those as needed, allowing that
"instance" to form a new "class". It's really quite flexible, and
\link{http://steve-yegge.blogspot.com/2008/10/universal-design-pattern.html|extensive
resources have been written about how to use prototype-based modelling effectively}.

So we know that \tt{method}s look up their locals in the object where
they're stored. But consider the classic Scheme \em{counter}, a function
which returns one number higher every time you call it:

\code
{(\ttkw{define} (mk-counter)
  (\ttkw{let} ((n 0))
    (\ttkw{lambda} ()
      (set! n (+ n 1))
      n)))
}

If we try to translate this to Io using a \tt{method}, we'll run into a
problem:

\code
{mkCounter := \ttkw{method}(
  n := 0
  \ttkw{method}( n = n + 1 )
)}

When we run this, we get an error:

\code
{\ttcom{Io>} c := mkCounter
\ttcom{Io>} c
Exception: Object does not respond to 'n'
}

I said before: a \tt{method} looks up any variables mentioned inside the
context where it's stored. The inner \tt{method} we create and return is
stored in the \tt{Lobby}, because we're calling this at the prompt. Therefore,
it is looking for \tt{n} in the \tt{Lobby}, and not in the enclosing
lexical scope! If we add an \tt{n} to the \tt{Lobby}, then our code will start
working:

\code
{\ttcom{Io>} n := 0
\ttcom{Io>} c println
1
\ttcom{Io>} c println
2
}

But that's not what we wanted! We want the variable to be hidden inside a
closure, so we have private, exclusive access to it. So in this case, instead
of using a \tt{method}, we can use a \tt{block}, which is like a \tt{method}
except that variables are looked up \em{in the enclosing lexical scope}
instead.

\code
{mkCounter := \ttkw{method}(
  n := 0
  \ttkw{block}( n = n + 1 )
)}

Unlike \tt{method}s, \tt{block}s have to be invoked with a \tt{call} method:

\code
{\ttcom{Io>} c := mkCounter
\ttcom{Io>} c call
1
\ttcom{Io>} c call
2}

For both \tt{method}s and \tt{block}s, the place where local variables are
stored is just an object: each call gets a fresh object to hold new local variables,
and the prototype of that object is either the object holding the
\tt{method} or the enclosing static scope around the \tt{block}.
Looking up variables in those scopes uses the same \tt{getSlot(...)} operation
to look up the prototype chain, and assigning to a slot uses the same
\tt{setSlot(...)} or \tt{updateSlot(...)} operations.
It really \em{is} objects (and messages) all the way down.
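
The difference between the two is small enough to model in Python. Treat this as a deliberately simplified sketch under my own assumptions, not a description of Io's internals: \tt{Scope}, \tt{make_method}, and \tt{make_block} are hypothetical names, and the sketch assumes that updating a variable writes to whichever object in the chain already holds it.

```python
class Scope:
    """An object used as a scope: local slots plus a prototype."""

    def __init__(self, proto=None):
        self.proto = proto
        self.slots = dict()

    def find_owner(self, name):
        # Walk up the chain to the object that actually holds the slot.
        obj = self
        while obj is not None:
            if name in obj.slots:
                return obj
            obj = obj.proto
        raise KeyError("slot not found: " + name)

    def get(self, name):
        return self.find_owner(name).slots[name]

    def update(self, name, value):
        # Models `=` (simplified): change the slot wherever it already lives.
        self.find_owner(name).slots[name] = value
        return value


def increment(local_scope):
    # Body shared by both variants: n = n + 1
    return local_scope.update("n", local_scope.get("n") + 1)


def make_method(body):
    # A method's fresh locals inherit from the receiver.
    def activate(receiver):
        return body(Scope(proto=receiver))
    return activate


def make_block(body, defining_scope):
    # A block's fresh locals inherit from the scope that created it.
    def call():
        return body(Scope(proto=defining_scope))
    return call


lobby = Scope()

# mkCounter with a block: `n` lives in mkCounter's own locals.
counter_locals = Scope(proto=lobby)
counter_locals.slots["n"] = 0
counter = make_block(increment, counter_locals)
print(counter())   # prints 1
print(counter())   # prints 2

# mkCounter with a method: the inner method never sees those locals.
broken = make_method(increment)
try:
    broken(lobby)
except KeyError as err:
    print("lookup failed:", err)
```

The only thing that differs between \tt{make_method} and \tt{make_block} is which object becomes the prototype of the fresh locals, and that one choice decides whether \tt{n} is visible.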

At this point, I've explained almost all the core
features of the Io language. There are some more dynamic features:
an object can, for example, resend or forward messages to other
objects, and Io has frankly \em{staggering} amounts of introspection.
Methods and blocks (which are, themselves, objects) even have their own
\tt{code} method, which gives us the source code of the method in a
manipulable form at runtime, so we can
\link{http://viewsourcecode.org/why/hackety.org/2008/01/05/ioHasAVeryCleanMirror.html|introspect on (and modify) the AST itself}.
Additionally, Io has a well-designed standard library, a nice
concurrency model (both actors and futures implemented via
coroutines) and a clean, well-defined C interface, making
it incredibly easy to embed into a larger project as a scripting language.
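
To give a flavour of what treating a message as a value buys you, here's a small Python sketch of forwarding. It's an illustration under my own assumptions, not Io's API: \tt{Message}, \tt{Receiver}, and the \tt{log} list are hypothetical, though the idea of a \tt{forward} hook for unhandled messages is borrowed from Io.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    """A message as a plain value: a name plus its arguments."""
    name: str
    args: tuple = ()


class Receiver:
    def __init__(self, delegate=None):
        self.handlers = dict()
        self.delegate = delegate
        self.log = []

    def send(self, message):
        # Handle the message locally if we can, otherwise forward it.
        handler = self.handlers.get(message.name)
        if handler is not None:
            return handler(*message.args)
        return self.forward(message)

    def forward(self, message):
        # Unhandled messages can be inspected, recorded, and passed along.
        self.log.append(message.name)
        if self.delegate is None:
            raise AttributeError("does not respond to " + message.name)
        return self.delegate.send(message)


adder = Receiver()
adder.handlers["add"] = lambda a, b: a + b

proxy = Receiver(delegate=adder)
msg = Message("add", (2, 3))

print(proxy.send(msg))   # prints 5
print(proxy.log)         # prints ['add']
```

Because the message is an ordinary value, the proxy can record it and hand it on unchanged without knowing anything about what \tt{add} means.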

But a major reason I like Io is that it builds \em{so much of itself}
on top of so few straightforward features: a
\link{http://iolanguage.org/scm/io/docs/IoGuide.html#Appendix-Grammar|barebones grammar},
combined with cloning objects and dispatching messages, gives us a
lot of expressive power. This is much closer, language-wise, to what
Kay had in mind: late-bound, message-passing-based interfaces that
hide internal state behind public APIs, and very little else.