Added IO language draft
Getty Ritter
9 years ago
| 1 | \meta{( "io-basics" "the io programming language" ("programming"))} | |
| 2 | The Io language is a small, very cool object-oriented language. | |
| 3 | ||
| 4 | There's a broad consensus among the programming-language | |
| 5 | aficionados I talk to: object-oriented programming is not nearly | |
| 6 | as useful as it was once considered to be. | |
| 7 | It's easy to come to that conclusion | |
| 8 | after seeing the OOP-mania of the '90s give way to bloated, | |
| 9 | complicated codebases, the infamous piles of | |
| 10 | \tt{AbstractStrategyFactory} subclasses and dependency injection | |
| 11 | frameworks needed to build flexible Java software, | |
| 12 | the signed-in-triplicate verbosity of Objective-C code, or | |
| 13 | the impossible-to-optimize amounts of indirection in Ruby or | |
| 14 | Python. | |
| 15 | ||
| 16 | Certainly, \em{some} problem domains map well to | |
| 17 | object-oriented programming: the traditional example is GUI | |
| 18 | programming, in which the little heterogeneous chunks of | |
| 19 | state that are objects map quite well to the widgets | |
| 20 | in a traditional WIMP interface. But in other areas, the | |
| 21 | object-oriented approach falls flat: high-performance video | |
| 22 | games, for example, have realized that traditional | |
| 23 | object-oriented modeling techniques result in \em{abysmal} | |
| 24 | cache performance, and end up using object-oriented languages | |
| 25 | to produce | |
| 26 | \link{http://gamesfromwithin.com/data-oriented-design|very much \em{non}-object-oriented designs}. | |
| 27 | ||
| 28 | It's common to see fans of object-orientation object, "Ah, well, | |
| 29 | that's because C++ isn't \em{really} what we mean," which does | |
| 30 | sound a little bit weaselly\ref{scot}, | |
| 31 | \sidenote{\link{https://en.wikipedia.org/wiki/No_true_Scotsman|No \em{true} object puts sugar on its porridge!}} | |
| 32 | but even Alan Kay, originator of | |
| 33 | the phrase "object-oriented", once said\ref{quot} | |
| 34 | \sidenote{In particular, in his talk at OOPSLA in 1997.} | |
| 35 | ||
| 36 | \blockquote | |
| 37 | { | |
| 38 | I made up the term 'object-oriented', and I can tell you I didn't | |
| 39 | have C++ in mind! | |
| 40 | } | |
| 41 | ||
| 42 | Well, what \em{did} Kay have in mind? From | |
| 43 | \link{http://userpage.fu-berlin.de/~ram/pub/pub_jf47ht81Ht/doc_kay_oop_en|a classic response by Kay himself}: | |
| 44 | ||
| 45 | \blockquote | |
| 46 | { | |
| 47 | OOP to me means only messaging, local retention and protection and | |
| 48 | hiding of state-process, and extreme late-binding of all things. | |
| 49 | It can be done in Smalltalk and in LISP. There are possibly other | |
| 50 | systems in which this is possible, but I'm not aware of them. | |
| 51 | } | |
| 52 | ||
| 53 | Every feature Kay describes is something that \em{forces} the | |
| 54 | programmer to write abstract programs: hiding of local state | |
| 55 | means that features \em{must} rely on external interfaces; | |
| 56 | messaging means that interfaces \em{must} be abstract and | |
| 57 | generally specified; late-binding means the programmer \em{must} write code | |
| 58 | that works with alternate indistinguishable implementations, | |
| 59 | rather than specifying a concrete implementation up-front. Clearly, | |
| 60 | C++ does not fit the bill: it lacks messaging as a language feature, | |
| 61 | has only optional protection of state-process, and heavily discourages | |
| 62 | late-binding when it allows it at all! So: what does a Kay-style | |
| 63 | object-oriented language look like? | |
| 64 | ||
| 65 | Kay wrote the above passage in 2003, and since then, at least a few other languages | |
| 66 | have come around that are in the original spirit of Kay's vision. | |
| 67 | One of them is | |
| 68 | \link{http://iolanguage.org/|Steve Dekorte's Io language}, | |
| 69 | which I'd like to provide a whirlwind tour of here. | |
| 70 | ||
| 71 | The wonderful thing about Io is how \em{small} it is: it chooses | |
| 72 | a few simple pieces, adds some simple sugar, and builds everything | |
| 73 | else out of those pieces. It very much feels like the | |
| 74 | distilled essence of a language paradigm. | |
| 75 | ||
| 76 | \code | |
| 77 | {\ttstr{"Hello, world!"} println | |
| 78 | } | |
| 79 | ||
| 80 | Syntactically, it's very sparse—the above snippet, which contains | |
| 81 | just two tokens, invokes a method\ref{msg} called \tt{println} on a string | |
| 82 | literal. | |
| 83 | \sidenote{ | |
| 84 | More properly, we're \em{sending a message}. The difference seems pedantic | |
| 85 | at first, but Io—in contrast to Java or C++—lets us consider the message | |
| 86 | as a \em{thing}, examining it as a value or sending it elsewhere by delegating | |
| 87 | or duplicating it. | |
| 88 | } | |
| 89 | We \em{could} include the trailing parentheses—as | |
| 90 | the \tt{println} method here takes an empty argument list—but Io allows us | |
| 91 | to omit them. | |
| 92 | ||
| 93 | \code | |
| 94 | {\ttstr{"Hello, world!"} println() | |
| 95 | } | |
| 96 | ||
| 97 | Unlike most members of the Smalltalk family, Io doesn't have | |
| 98 | unusual split-apart method names.\ref{meth} | |
| 99 | \sidenote | |
| 100 | { | |
| 101 | In Smalltalk, method names have "holes" for arguments, indicated | |
| 102 | by a colon. A method name will look like | |
| 103 | \tt{withFoo:andBar:} and invoking it will look like | |
| 104 | \tt{myObj withFoo: y andBar: z}. This is a compelling design | |
| 105 | choice in that it often \em{forces} a programmer to document an interface, | |
| 106 | because a method will look like | |
| 107 | \tt{rectWithWidth: x andHeight: y}, | |
| 108 | giving you omnipresent documentation about the order of arguments. | |
| 109 | It is, however, an unusual design choice. | |
| 110 | } | |
| 111 | A method name is a single identifier, and it takes trailing | |
| 112 | arguments in parens, unless there are no arguments. | |
| 113 | There's some special syntax for operators, to make | |
| 114 | precedence work right, but operators are themselves sugar for calling | |
| 115 | methods named things like \tt{+} or \tt{*}: | |
| 116 | ||
| 117 | \code | |
| 118 | {\ttcom{# we can write this} | |
| 119 | (2 + 3 * 4) println | |
| 120 | \ttcom{# or, equivalently, this} | |
| 121 | 2 +(3 *(4)) println() | |
| 122 | } | |
| 123 | ||
| 124 | It's an imperative language, so we can create and assign variables: | |
| 125 | ||
| 126 | \code | |
| 127 | {\ttcom{Io>} x := 2 | |
| 128 | \ttcom{Io>} x println | |
| 129 | 2 | |
| 130 | \ttcom{Io>} x = x + 1 | |
| 131 | \ttcom{Io>} x println | |
| 132 | 3 | |
| 133 | } | |
| 134 | ||
| 135 | But assignment is also syntactic sugar—underneath, it's still method calls! | |
| 136 | The code above can be desugared to explicit calls to | |
| 137 | assignment-related methods: | |
| 138 | ||
| 139 | \code | |
| 140 | {\ttcom{Io>} setSlot(\ttstr{"x"}, 2) | |
| 141 | \ttcom{Io>} getSlot(\ttstr{"x"}) println() | |
| 142 | 2 | |
| 143 | \ttcom{Io>} updateSlot(\ttstr{"x"}, x +(1)) | |
| 144 | \ttcom{Io>} getSlot(\ttstr{"x"}) println() | |
| 145 | 3 | |
| 146 | } | |
| 147 | ||
| 148 | If you look closely you'll notice that those methods aren't being | |
| 149 | called on any object in particular: when we don't supply an explicit | |
| 150 | object to call a method on, it will get called on some ambient | |
| 151 | \tt{self} object, | |
| 152 | sort of like \tt{this} in other OO languages. When we're sitting at | |
| 153 | the interactive language prompt, that ambient object is called | |
| 154 | the \tt{Lobby}. The above code is therefore \em{also} equivalent | |
| 155 | to\ref{lob} | |
| 156 | \sidenote | |
| 157 | { | |
| 158 | Although notice that \tt{Lobby} is itself a variable, so it | |
| 159 | could itself be taken as sugar for \tt{getSlot(\ttstr{"Lobby"})} called on | |
| 160 | the \tt{Lobby} object. This starts to hint at the stack of turtles | |
| 161 | underneath the Io language: it really \em{is} objects all the way | |
| 162 | down, in several ways. | |
| 163 | } | |
| 164 | ||
| 165 | \code | |
| 166 | {\ttcom{Io>} Lobby setSlot(\ttstr{"x"}, 2) | |
| 167 | \ttcom{Io>} Lobby getSlot(\ttstr{"x"}) println() | |
| 168 | 2 | |
| 169 | \ttcom{Io>} Lobby updateSlot(\ttstr{"x"}, x +(1)) | |
| 170 | \ttcom{Io>} Lobby getSlot(\ttstr{"x"}) println() | |
| 171 | 3 | |
| 172 | } | |
| 173 | ||
| 174 | So in Io, all actions are method calls, even "primitive" actions like | |
| 175 | assignment or variable access. | |
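To make the desugaring concrete, here is a rough Python model of the three slot operations. It's a sketch rather than Io's implementation: the \tt{ToyLobby} class and its method names are hypothetical, it ignores prototypes entirely, and the only behaviour it assumes is that updating a slot which was never created is an error.

```python
class ToyLobby:
    """Hypothetical stand-in for the ambient object: just a table of named slots."""

    def __init__(self):
        self.slots = {}

    def set_slot(self, name, value):
        # Models `name := value`: create the slot, or overwrite it if present.
        self.slots[name] = value
        return value

    def get_slot(self, name):
        # Models a bare variable reference.
        return self.slots[name]

    def update_slot(self, name, value):
        # Models `name = value`: only legal once the slot exists.
        if name not in self.slots:
            raise KeyError(f"slot {name!r} must be created before it is updated")
        self.slots[name] = value
        return value


lobby = ToyLobby()
lobby.set_slot("x", 2)                           # x := 2
before = lobby.get_slot("x")                     # x
lobby.update_slot("x", lobby.get_slot("x") + 1)  # x = x + 1
after = lobby.get_slot("x")
print(before, after)
```

The point of the model is only that "creating a variable", "reading a variable", and "assigning to a variable" are three ordinary operations on one ordinary object.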
| 176 | ||
| 177 | Unlike most common object-oriented languages today, Io does not use | |
| 178 | class declarations: it's a \em{prototype-based} language. The other | |
| 179 | well-known language that uses prototypal OO is JavaScript, and | |
| 180 | my opinion is that it does so very badly: consequently, many people | |
| 181 | have strongly negative ideas about prototype OO. Io's model is | |
| 182 | not quite so hairy or complicated. In Io, to create a new object, | |
| 183 | we clone an old one. We do so with the \tt{clone} method, which | |
| 184 | we can use on the generic built-in \tt{Object}. | |
| 185 | ||
| 186 | \code{\ttcom{Io>} myPoint := Object \ttkw{clone}} | |
| 187 | ||
| 188 | Once we have a new object, we can use \tt{:=} to change the values | |
| 189 | of its \em{slots}, or local values. | |
| 190 | ||
| 191 | \code | |
| 192 | {\ttcom{Io>} myPoint x := 2 | |
| 193 | \ttcom{Io>} myPoint y := 8 | |
| 194 | \ttcom{Io>} (myPoint x + myPoint y) println | |
| 195 | 10 | |
| 196 | } | |
| 197 | ||
| 198 | If we clone an object, the new object remembers which object it was | |
| 199 | cloned from—its \em{prototype}—and every time we look up a slot in | |
| 200 | that object, it will first check whether \em{it} has that slot; | |
| 201 | otherwise, it'll check to see if its prototype has the slot, and so | |
| 202 | on back up the chain. That means we can clone \tt{myPoint} into a | |
| 203 | new object, but the new object will still have access to everything | |
| 204 | we defined on \tt{myPoint}: | |
| 205 | ||
| 206 | \code | |
| 207 | {\ttcom{Io>} newPoint := myPoint \ttkw{clone} | |
| 208 | \ttcom{Io>} newPoint x println | |
| 209 | 2 | |
| 210 | } | |
| 211 | ||
| 212 | We can override values on \tt{newPoint} without changing them on | |
| 213 | its parent object: | |
| 214 | ||
| 215 | \code | |
| 216 | {\ttcom{Io>} newPoint x := 7 | |
| 217 | \ttcom{Io>} newPoint x println \ttcom{# The child has the new value} | |
| 218 | 7 | |
| 219 | \ttcom{Io>} myPoint x println \ttcom{# The parent still has the old one} | |
| 220 | 2 | |
| 221 | } | |
| 222 | ||
| 223 | On the other hand, changes to the parent object will be reflected | |
| 224 | in \em{non-overridden} values on the child object: | |
| 225 | ||
| 226 | \code | |
| 227 | {\ttcom{Io>} myPoint y = 3 \ttcom{# we can change the parent value} | |
| 228 | \ttcom{Io>} newPoint y println \ttcom{# and the child can see it} | |
| 229 | 3 | |
| 230 | } | |
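The lookup rule behind the last three examples fits in a few lines. What follows is a toy Python model, not Io's actual machinery: \tt{ProtoObject}, \tt{set_slot} and \tt{get_slot} are names I've made up for the illustration, and the model keeps a single parent per object for simplicity.

```python
class ProtoObject:
    """Toy object: its own slots plus a link to the object it was cloned from."""

    def __init__(self, proto=None):
        self.slots = {}
        self.proto = proto

    def clone(self):
        # A clone starts with no slots of its own and delegates to its parent.
        return ProtoObject(proto=self)

    def set_slot(self, name, value):
        # Always writes to this object, never to a prototype.
        self.slots[name] = value

    def get_slot(self, name):
        # Check this object first, then walk up the prototype chain.
        obj = self
        while obj is not None:
            if name in obj.slots:
                return obj.slots[name]
            obj = obj.proto
        raise KeyError(f"object does not respond to {name!r}")


my_point = ProtoObject()
my_point.set_slot("x", 2)
my_point.set_slot("y", 8)

new_point = my_point.clone()
new_point.set_slot("x", 7)  # override on the child only
my_point.set_slot("y", 3)   # change the parent

print(new_point.get_slot("x"), my_point.get_slot("x"), new_point.get_slot("y"))
```

Because the child never copies its parent's slots, the override of \tt{x} lives only on the child, while the parent's new \tt{y} is visible through the child immediately.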
| 231 | ||
| 232 | We can create methods on objects as well, using slot assignment | |
| 233 | and the \tt{method} constructor. Within the code of a \tt{method}, | |
| 234 | the \em{ambient object} points to the object in which the method is held, | |
| 235 | so all variable accesses will be looked up inside the object that | |
| 236 | holds the method: | |
| 237 | ||
| 238 | \code | |
| 239 | {\ttcom{Io>} myPoint isOrigin := \ttkw{method}(x == 0 and y == 0) | |
| 240 | \ttcom{Io>} myPoint isOrigin println | |
| 241 | false | |
| 242 | } | |
| 243 | ||
| 244 | This means that if we copy that method to another object, the \tt{x} and | |
| 245 | \tt{y} variables referenced in the method will now refer to that new object. | |
| 246 | Determining what \tt{self} means for a method like this | |
| 247 | is simple: it refers to the object through which we invoke the method. | |
| 248 | ||
| 249 | \code | |
| 250 | {\ttcom{Io>} otherPoint := Object \ttkw{clone} | |
| 251 | \ttcom{Io>} otherPoint isOrigin := myPoint getSlot(\ttstr{"isOrigin"}) | |
| 252 | \ttcom{Io>} otherPoint x := 0 | |
| 253 | \ttcom{Io>} otherPoint y := 0 | |
| 254 | \ttcom{Io>} otherPoint isOrigin println | |
| 255 | true | |
| 256 | } | |
| 257 | ||
| 258 | Methods can of course take arguments: | |
| 259 | ||
| 260 | \code | |
| 261 | {\ttcom{Io>} myPoint eq := \ttkw{method}(other, | |
| 262 | x == other x and y == other y) | |
| 263 | \ttcom{Io>} myPoint eq(myPoint) println | |
| 264 | true | |
| 265 | \ttcom{Io>} myPoint eq(otherPoint) println | |
| 266 | false | |
| 267 | } | |
| 268 | ||
| 269 | Because we can clone any object, any object can serve as prototype for | |
| 270 | another object. I probably would, in practice, build up a proper \em{Point} | |
| 271 | abstraction a little bit differently: | |
| 272 | ||
| 273 | \code | |
| 274 | {Point := Object \ttkw{clone} | |
| 275 | Point new := \ttkw{method}(nx, ny, | |
| 276 | p := Point clone; | |
| 277 | p x := nx; | |
| 278 | p y := ny; | |
| 279 | p) | |
| 280 | Point isOrigin := \ttkw{method}(x == 0 and y == 0) | |
| 281 | Point add := \ttkw{method}(other, | |
| 282 | new (x + other x, y + other y)) | |
| 283 | Point sub := \ttkw{method}(other, | |
| 284 | new (x - other x, y - other y)) | |
| 285 | } | |
| 286 | ||
| 287 | Here I fill in all the relevant methods on a \tt{Point}, and when I want to | |
| 288 | create an "instance", I clone the object and fill in the \tt{x} and \tt{y} values. | |
| 289 | Cloning doesn't just serve the same purpose as \em{instantiation} in other OO languages, | |
| 290 | though; it's also how we'd implement \em{subclassing}. To create a new "subclass" | |
| 291 | of \tt{Point}, I clone the \tt{Point} object and start filling in new methods | |
| 292 | instead of instantiating variables: | |
| 293 | ||
| 294 | \code | |
| 295 | {MutablePoint := Point \ttkw{clone} | |
| 296 | MutablePoint setX := \ttkw{method}(nx, x = nx; self) | |
| 297 | MutablePoint setY := \ttkw{method}(ny, y = ny; self) | |
| 298 | } | |
| 299 | ||
| 300 | For that matter, there's no reason we have to distinguish between | |
| 301 | \em{instantiating} and \em{subclassing}: that's just me explaining things | |
| 302 | in the traditional terms of class-based OO languages. | |
| 303 | We could simultaneously create a new "instance" and add extra methods, | |
| 304 | which corresponds neither strictly to subclassing nor | |
| 305 | to instantiation. If that object turns out to be useful, we can create | |
| 306 | new copies by cloning and modifying those as needed, allowing that | |
| 307 | "instance" to form a new "class". It's really quite flexible, and | |
| 308 | \link{http://steve-yegge.blogspot.com/2008/10/universal-design-pattern.html|extensive | |
| 309 | resources have been written about how to use prototype-based modelling effectively.} | |
| 310 | ||
| 311 | So we know that \tt{method}s look up their locals in the object where | |
| 312 | they're stored. But consider the classic Scheme \em{counter}, a function | |
| 313 | which returns one number higher every time you call it: | |
| 314 | ||
| 315 | \code | |
| 316 | {(\ttkw{define} (mk-counter) | |
| 317 | (\ttkw{let} ((n 0)) | |
| 318 | (\ttkw{lambda} () | |
| 319 | (set! n (+ n 1)) | |
| 320 | n))) | |
| 321 | } | |
| 322 | ||
| 323 | If we try to translate this to Io using a \tt{method}, we'll run into a | |
| 324 | problem: | |
| 325 | ||
| 326 | \code | |
| 327 | {mkCounter := \ttkw{method}( | |
| 328 | n := 0 | |
| 329 | \ttkw{method}( n = n + 1 ) | |
| 330 | )} | |
| 331 | ||
| 332 | When we run this, we get an error: | |
| 333 | ||
| 334 | \code | |
| 335 | {\ttcom{Io>} c := mkCounter | |
| 336 | \ttcom{Io>} c | |
| 337 | Exception: Object does not respond to 'n' | |
| 338 | } | |
| 339 | ||
| 340 | As I said before, a \tt{method} looks up any variables it mentions in the | |
| 341 | context where it's stored. The inner \tt{method} we create and return is | |
| 342 | stored in the \tt{Lobby}, because we're calling this at the prompt. Therefore, | |
| 343 | it is looking for \tt{n} in the \tt{Lobby}, and not in the enclosing | |
| 344 | lexical scope! If we add an \tt{n} to the \tt{Lobby}, then our code will start | |
| 345 | working: | |
| 346 | ||
| 347 | \code | |
| 348 | {\ttcom{Io>} n := 0 | |
| 349 | \ttcom{Io>} c println | |
| 350 | 1 | |
| 351 | \ttcom{Io>} c println | |
| 352 | 2 | |
| 353 | } | |
| 354 | ||
| 355 | But that's not what we wanted! We want the variable to be hidden inside a | |
| 356 | closure, so we have private, exclusive access to it. So in this case, instead | |
| 357 | of using a \tt{method}, we can use a \tt{block}, which is like a \tt{method} | |
| 358 | except that variables are looked up \em{in the enclosing lexical scope} | |
| 359 | instead. | |
| 360 | ||
| 361 | \code | |
| 362 | {mkCounter := \ttkw{method}( | |
| 363 | n := 0 | |
| 364 | \ttkw{block}( n = n + 1 ) | |
| 365 | )} | |
| 366 | ||
| 367 | Unlike \tt{method}s, \tt{block}s have to be invoked with a \tt{call} method: | |
| 368 | ||
| 369 | \code | |
| 370 | {\ttcom{Io>} c := mkCounter | |
| 371 | \ttcom{Io>} c call | |
| 372 | 1 | |
| 373 | \ttcom{Io>} c call | |
| 374 | 2} | |
| 375 | ||
| 376 | For both \tt{method}s and \tt{block}s, the place where local variables are | |
| 377 | stored is just an object: each call gets a fresh object to hold new local variables, | |
| 378 | but the prototype of that object corresponds to the | |
| 379 | location of the \tt{method} or the enclosing static scope around the \tt{block}. | |
| 380 | Looking up variables in those scopes uses the same \tt{getSlot(...)} operation | |
| 381 | to look up the prototype chain, and assigning to a slot uses the same | |
| 382 | \tt{setSlot(...)} or \tt{updateSlot(...)} operations. | |
| 383 | It really \em{is} objects (and messages) all the way down. | |
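Here is that description as a small Python model. It's a simplification under stated assumptions, not a transcription of how Io works internally: \tt{Scope}, \tt{make_counter_block} and \tt{make_counter_method} are hypothetical names, and I'm assuming that an update writes to whichever object in the chain already holds the slot.

```python
class Scope:
    """Toy locals object: its own slots plus a parent to fall back on."""

    def __init__(self, proto=None):
        self.slots = {}
        self.proto = proto

    def holder(self, name):
        # Find the nearest object in the chain that has the slot.
        scope = self
        while scope is not None:
            if name in scope.slots:
                return scope
            scope = scope.proto
        raise KeyError(name)

    def get(self, name):
        return self.holder(name).slots[name]

    def update(self, name, value):
        # Assumption of this model: write where the slot was found.
        self.holder(name).slots[name] = value


def make_counter_block(defining_scope):
    # Block: fresh locals whose parent is the scope where the block was created.
    def call():
        local = Scope(proto=defining_scope)
        local.update("n", local.get("n") + 1)
        return local.get("n")
    return call


def make_counter_method():
    # Method: fresh locals whose parent is whichever object receives the message.
    def invoke(receiver):
        local = Scope(proto=receiver)
        local.update("n", local.get("n") + 1)
        return local.get("n")
    return invoke


lobby = Scope()
activation = Scope(proto=lobby)  # the locals of one mkCounter call
activation.slots["n"] = 0        # n := 0

counter_block = make_counter_block(activation)
first, second = counter_block(), counter_block()

counter_method = make_counter_method()
try:
    counter_method(lobby)        # looks for n in the lobby, not in `activation`
    method_error = None
except KeyError as err:
    method_error = err

print(first, second, repr(method_error))
```

The two factories differ only in which object becomes the parent of the fresh locals, and that one choice is the whole difference between the failing counter and the working one.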
| 384 | ||
| 385 | At this point, I've explained almost all the core | |
| 386 | features of the Io language. There are some more dynamic features: | |
| 387 | an object can, for example, resend or forward messages to other | |
| 388 | objects, and Io has frankly \em{staggering} amounts of introspection. | |
| 389 | Methods and blocks (which are, themselves, objects) even have their own | |
| 390 | \tt{code} method, which gives us the source code of the method in a | |
| 391 | manipulable form at runtime, so we can | |
| 392 | \link{http://viewsourcecode.org/why/hackety.org/2008/01/05/ioHasAVeryCleanMirror.html|introspect on (and modify) the AST itself}. | |
| 393 | Additionally, Io has a well-designed standard library, a nice | |
| 394 | concurrency model (both actors and futures implemented via | |
| 395 | coroutines) and a clean, well-defined C interface, making | |
| 396 | it incredibly easy to embed into a larger project as a scripting language. | |
| 397 | ||
| 398 | But a major reason I like Io is that it builds \em{so much of itself} | |
| 399 | on top of so few, straightforward features: a | |
| 400 | \link{http://iolanguage.org/scm/io/docs/IoGuide.html#Appendix-Grammar|barebones grammar}, | |
| 401 | combined with cloning objects and dispatching messages, gives us a | |
| 402 | lot of expressive power. This is much closer, language-wise, to what | |
| 403 | Kay had in mind: late-bound, message-passing-based interfaces that | |
| 404 | hide internal state behind public APIs, and very little else. |