Added IO language draft
Getty Ritter
9 years ago
\meta{( "io-basics" "the io programming language" ("programming"))}
The Io language is a small, very cool object-oriented language.

There's a broad consensus among the programming-language
aficionados I talk to: object-oriented programming is not nearly
as useful as it was once considered to be.
It's easy to come to that conclusion
after seeing the OOP-mania of the 90's give way to bloated,
complicated codebases, the infamous piles of
\tt{AbstractStrategyFactory} subclasses and dependency injection
frameworks needed to build flexible Java software,
the signed-in-triplicate verbosity of Objective-C code, or
the impossible-to-optimize amounts of indirection in Ruby or
Python.

Certainly, \em{some} problem domains map well to
object-oriented programming: the traditional example is GUI
programming, in which the little heterogeneous chunks of
state that are objects map quite well to the widgets
in a traditional WIMP interface. But in other areas, the
object-oriented approach falls flat: high-performance video
games, for example, have realized that traditional
object-oriented modeling techniques result in \em{abysmal}
cache performance, and end up using object-oriented languages
to produce
\link{http://gamesfromwithin.com/data-oriented-design|very much \em{non}-object-oriented designs}.

It's common to see fans of object-orientation object, "Ah, well,
that's because C++ isn't \em{really} what we mean," which does
sound a little bit weaselly\ref{scot},
\sidenote{\link{https://en.wikipedia.org/wiki/No_true_Scotsman|No \em{true} object puts sugar on its porridge!}}
but even Alan Kay, originator of
the phrase "object-oriented", once said\ref{quot}
\sidenote{In particular, in his talk at OOPSLA in 1997.}

\blockquote
{
I made up the term 'object-oriented', and I can tell you I didn't
have C++ in mind!
}

Well, what \em{did} Kay have in mind? From
\link{http://userpage.fu-berlin.de/~ram/pub/pub_jf47ht81Ht/doc_kay_oop_en|a classic response by Kay himself}:

\blockquote
{
OOP to me means only messaging, local retention and protection and
hiding of state-process, and extreme late-binding of all things.
It can be done in Smalltalk and in LISP. There are possibly other
systems in which this is possible, but I'm not aware of them.
}

Every feature Kay describes is something that \em{forces} the
programmer to write abstract programs: hiding of local state
means that features \em{must} rely on external interfaces;
messaging means that interfaces \em{must} be abstract and
generally specified; late-binding means the programmer \em{must} write code
that works with alternate indistinguishable implementations,
rather than specifying a concrete implementation up-front. Clearly,
C++ does not fit the bill: it lacks messaging as a language feature,
has only optional protection of state-process, and heavily discourages
late-binding when it allows it at all! So: what does a Kay-style
object-oriented language look like?

Kay wrote the above passage in 2003, and since then, at least a few other languages
have come around that are in the original spirit of Kay's vision.
One of them is
\link{http://iolanguage.org/|Steve Dekorte's Io language},
which I'd like to provide a whirlwind tour of here.

The wonderful thing about Io is how \em{small} it is: it chooses
a few simple pieces, adds some simple sugar, and builds everything
else out of those pieces. It very much feels like the
distilled essence of a language paradigm.

\code
{\ttstr{"Hello, world!"} println
}

Syntactically, it's very sparse: the above snippet, which contains
just two tokens, invokes a method\ref{msg} called \tt{println} on a string
literal.
\sidenote{
More properly, we're \em{sending a message}. The difference seems pedantic
at first, but Io, in contrast to Java or C++, lets us consider the message
as a \em{thing}, examining it as a value or sending it elsewhere by delegating
or duplicating it.
}
We \em{could} include the trailing parentheses, since
the \tt{println} method here takes an empty argument list, but Io allows us
to omit them.

\code
{\ttstr{"Hello, world!"} println()
}
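
To make the sidenote's point about messages being values a little more concrete, here is a minimal sketch. It relies on \tt{message(...)} and \tt{doMessage}, two built-in \tt{Object} methods that are not otherwise covered in this tour, so treat it as an illustration to try at your own prompt rather than a definitive reference.

\code
{\ttcom{# message(...) builds a message value without sending it; name inspects it}
message(println) name println
\ttcom{# doMessage sends a message value to the receiver of our choice}
\ttstr{"Hello, world!"} doMessage(message(println))
}

The first line should print \tt{println}, the name of the message itself; the second should behave like the two snippets above.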

Unlike most members of the Smalltalk family, Io doesn't have
unusual split-apart method names.\ref{meth}
\sidenote
{
In Smalltalk, method names have "holes" for arguments, indicated
by a colon. A method name will look like
\tt{withFoo:andBar:} and invoking it will look like
\tt{myObj withFoo: y andBar: z}. This is a compelling design
choice in that it often \em{forces} a programmer to document an interface,
because a method will look like
\tt{rectWithWidth: x andHeight: y},
giving you omnipresent documentation about the order of arguments.
It is, however, an unusual design choice.
}
A method name is a single identifier, and it takes trailing
arguments in parens, unless there are no arguments.
There's some special syntax for operators, to make
precedence work right, but operators are themselves sugar for calling
methods named things like \tt{+} or \tt{*}:

\code
{\ttcom{# the outer parentheses keep println from binding to the 4 alone}
\ttcom{# we can write this}
(2 + 3 * 4) println
\ttcom{# or, equivalently, this}
(2 +(3 *(4))) println()
}

It's an imperative language, so we can create and assign variables:

\code
{\ttcom{Io>} x := 2
\ttcom{Io>} x println
2
\ttcom{Io>} x = x + 1
\ttcom{Io>} x println
3
}

But assignment is also syntactic sugar: underneath, it's still method calls!
The code above can be desugared to explicit calls to
assignment-related methods:

\code
{\ttcom{Io>} setSlot(\ttstr{"x"}, 2)
\ttcom{Io>} getSlot(\ttstr{"x"}) println()
2
\ttcom{Io>} updateSlot(\ttstr{"x"}, x +(1))
\ttcom{Io>} getSlot(\ttstr{"x"}) println()
3
}

If you look closely you'll notice that those methods aren't being
called on any object in particular: when we don't supply an explicit
object to call a method on, it will get called on some ambient
\tt{self} object,
sort of like \tt{this} in other OO languages. When we're sitting at
the interactive language prompt, that ambient object is called
the \tt{Lobby}. The above code is therefore \em{also} equivalent
to\ref{lob}
\sidenote
{
Although notice that \tt{Lobby} is itself a variable, so it
could itself be taken as sugar for \tt{getSlot(\ttstr{"Lobby"})} called on
the \tt{Lobby} object. This starts to hint at the stack of turtles
underneath the Io language: it really \em{is} objects all the way
down, in several ways.
}

\code
{\ttcom{Io>} Lobby setSlot(\ttstr{"x"}, 2)
\ttcom{Io>} Lobby getSlot(\ttstr{"x"}) println()
2
\ttcom{Io>} Lobby updateSlot(\ttstr{"x"}, x +(1))
\ttcom{Io>} Lobby getSlot(\ttstr{"x"}) println()
3
}

So in Io, all actions are method calls, even "primitive" actions like
assignment or variable access.

Unlike most common object-oriented languages today, Io does not use
class declarations: it's a \em{prototype-based} language. The other
well-known language that uses prototypal OO is JavaScript, and
my opinion is that it does so very badly: consequently, many people
have strongly negative ideas about prototype OO. Io's model is
not quite so hairy or complicated. In Io, to create a new object,
we clone an old one. We do so with the \tt{clone} method, which
we can use on the generic built-in \tt{Object}.

\code{\ttcom{Io>} myPoint := Object \ttkw{clone}}

Once we have a new object, we can use \tt{:=} to change the values
of its \em{slots}, or local values.

\code
{\ttcom{Io>} myPoint x := 2
\ttcom{Io>} myPoint y := 8
\ttcom{Io>} (myPoint x + myPoint y) println
10
}

If we clone an object, the new object remembers which object it was
cloned from, its \em{prototype}, and every time we look up a slot in
that object, it will first check whether \em{it} has that slot;
otherwise, it'll check to see if its prototype has the slot, and so
on back up the chain. That means we can clone \tt{myPoint} into a
new object, but the new object will still have access to everything
we defined on \tt{myPoint}:

\code
{\ttcom{Io>} newPoint := myPoint \ttkw{clone}
\ttcom{Io>} newPoint x println
2
}
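
To see the lookup chain directly, we can ask an object what its prototype is and which slots it holds itself. This is a small sketch using the built-in \tt{proto} and \tt{hasLocalSlot} methods, which this post does not otherwise cover; \tt{basePoint} and \tt{childPoint} are hypothetical names, chosen so the example does not disturb the objects defined above.

\code
{\ttcom{Io>} basePoint := Object \ttkw{clone}
\ttcom{Io>} basePoint x := 2
\ttcom{Io>} childPoint := basePoint \ttkw{clone}
\ttcom{Io>} (childPoint proto == basePoint) println
true
\ttcom{Io>} childPoint hasLocalSlot(\ttstr{"x"}) println \ttcom{# x lives on the prototype}
false
\ttcom{Io>} childPoint x println \ttcom{# but lookup still finds it}
2
}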

We can override values on \tt{newPoint} without changing them on
its parent object:

\code
{\ttcom{Io>} newPoint x := 7
\ttcom{Io>} newPoint x println \ttcom{# The child has the new value}
7
\ttcom{Io>} myPoint x println \ttcom{# The parent still has the old one}
2
}

On the other hand, changes to the parent object will be reflected
in \em{non-overridden} values on the child object:

\code
{\ttcom{Io>} myPoint y = 3 \ttcom{# we can change the parent value}
\ttcom{Io>} newPoint y println \ttcom{# and the child can see it}
3
}

We can create methods on objects as well, using slot assignment
and the \tt{method} constructor. Within the code of a \tt{method},
the \em{ambient object} points to the object in which the method is held,
so all variable accesses will be looked up inside the object that
holds the method:

\code
{\ttcom{Io>} myPoint isOrigin := \ttkw{method}(x == 0 and y == 0)
\ttcom{Io>} myPoint isOrigin println
false
}

This means that if we copy that method to another object, the \tt{x} and
\tt{y} variables referenced in the method will now refer to that new object.
Determining what \tt{self} means for a method like this
is simple: it refers to the object through which we invoke the method.

\code
{\ttcom{Io>} otherPoint := Object \ttkw{clone}
\ttcom{Io>} otherPoint isOrigin := myPoint getSlot(\ttstr{"isOrigin"})
\ttcom{Io>} otherPoint x := 0
\ttcom{Io>} otherPoint y := 0
\ttcom{Io>} otherPoint isOrigin println
true
}

Methods can of course take arguments:

\code
{\ttcom{Io>} myPoint eq := \ttkw{method}(other,
  x == other x and y == other y)
\ttcom{Io>} myPoint eq(myPoint) println
true
\ttcom{Io>} myPoint eq(otherPoint) println
false
}

Because we can clone any object, any object can serve as prototype for
another object. I probably would, in practice, build up a proper \em{Point}
abstraction a little bit differently:

\code
{Point := Object \ttkw{clone}
Point new := \ttkw{method}(nx, ny,
  p := Point clone;
  p x := nx;
  p y := ny;
  p)
Point isOrigin := \ttkw{method}(x == 0 and y == 0)
Point add := \ttkw{method}(other,
  new(x + other x, y + other y))
Point sub := \ttkw{method}(other,
  new(x - other x, y - other y))
}

Here I fill in all the relevant methods on a \tt{Point}, and when I want to
create an "instance", I clone the object and fill in the \tt{x} and \tt{y} values.
Cloning doesn't just serve the same purpose as \em{instantiation} in other OO languages,
though; it's also how we'd implement \em{subclassing}. To create a new "subclass"
of \tt{Point}, I clone the \tt{Point} object and start filling in new methods
instead of instantiating variables:

\code
{MutablePoint := Point \ttkw{clone}
MutablePoint setX := \ttkw{method}(nx, x = nx; self)
MutablePoint setY := \ttkw{method}(ny, y = ny; self)
}

For that matter, there's no reason we have to distinguish between
\em{instantiating} and \em{subclassing}: that's just me explaining things
in the traditional terms of class-based OO languages.
We could simultaneously create a new "instance" and add extra methods,
which corresponds neither strictly to subclassing nor
to instantiation. If that object turns out to be useful, we can create
new copies by cloning and modifying those as needed, allowing that
"instance" to form a new "class". It's really quite flexible, and
\link{http://steve-yegge.blogspot.com/2008/10/universal-design-pattern.html|extensive
resources have been written about how to use prototype-based modelling effectively.}
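
As a small sketch of that last idea, here is an object that starts life as an "instance" of \tt{Point}, picks up a method of its own, and then serves as the prototype for a further object. It assumes the \tt{Point} definition given earlier; \tt{origin}, \tt{describe}, and \tt{shifted} are hypothetical names chosen for illustration.

\code
{\ttcom{# an ordinary "instance" of Point}
origin := Point new(0, 0)
\ttcom{# ...which also gets a method that Point itself lacks}
origin describe := \ttkw{method}(x asString .. \ttstr{", "} .. y asString)
\ttcom{# cloning it treats the "instance" as a new "class"}
shifted := origin \ttkw{clone}
shifted x := 5
shifted describe println
}

Here \tt{shifted} overrides \tt{x} but inherits \tt{y} and \tt{describe} from \tt{origin}, so the last line should print \tt{5, 0}, while \tt{Point} itself never learns about \tt{describe}.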

So we know that \tt{method}s look up their locals in the object where
they're stored. But consider the classic Scheme \em{counter}, a function
which returns one number higher every time you call it:

\code
{(\ttkw{define} (mk-counter)
  (\ttkw{let} ((n 0))
    (\ttkw{lambda} ()
      (set! n (+ n 1))
      n)))
}

If we try to translate this to Io using a \tt{method}, we'll run into a
problem:

\code
{mkCounter := \ttkw{method}(
  n := 0
  \ttkw{method}( n = n + 1 )
)}

When we run this, we get an error:

\code
{\ttcom{Io>} c := mkCounter
\ttcom{Io>} c
Exception: Object does not respond to 'n'
}

I said before: a \tt{method} looks up any variables mentioned inside the
context where it's stored. The inner \tt{method} we create and return is
stored in the \tt{Lobby}, because we're calling this at the prompt. Therefore,
it is looking for \tt{n} in the \tt{Lobby}, and not in the enclosing
lexical scope! If we add an \tt{n} to the \tt{Lobby}, then our code will start
working:

\code
{\ttcom{Io>} n := 0
\ttcom{Io>} c println
1
\ttcom{Io>} c println
2
}

But that's not what we wanted! We want the variable to be hidden inside a
closure, so we have private, exclusive access to it. So in this case, instead
of using a \tt{method}, we can use a \tt{block}, which is like a \tt{method}
except that variables are looked up \em{in the enclosing lexical scope}
instead.

\code
{mkCounter := \ttkw{method}(
  n := 0
  \ttkw{block}( n = n + 1 )
)}

Unlike \tt{method}s, \tt{block}s have to be invoked with a \tt{call} method:

\code
{\ttcom{Io>} c := mkCounter
\ttcom{Io>} c call
1
\ttcom{Io>} c call
2}

For both \tt{method}s and \tt{block}s, the place where local variables are
stored is just an object: they have a fresh object to store new local variables,
but the prototype of that object corresponds to the
location of the \tt{method} or the enclosing static scope around the \tt{block}.
Looking up variables in those scopes uses the same \tt{getSlot(...)} operation
to look up the prototype chain, and assigning to a slot uses the same
\tt{setSlot(...)} or \tt{updateSlot(...)} operations.
It really \em{is} objects (and messages) all the way down.

At this point, I've explained almost all the core
features of the Io language. There are some more dynamic features:
an object can, for example, resend or forward messages to other
objects, and Io has frankly \em{staggering} amounts of introspection.
Methods and blocks (which are, themselves, objects) even have their own
\tt{code} method, which gives us the source code of the method in a
manipulable form at runtime, so we can
\link{http://viewsourcecode.org/why/hackety.org/2008/01/05/ioHasAVeryCleanMirror.html|introspect on (and modify) the AST itself}.
Additionally, Io has a well-designed standard library, a nice
concurrency model (both actors and futures implemented via
coroutines) and a clean, well-defined C interface, making
it incredibly easy to embed into a larger project as a scripting language.
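
Here is a brief sketch of two of those features: message forwarding and the \tt{code} method. The \tt{forward} slot and \tt{call message} are part of Io itself; \tt{Echo} and \tt{frobnicate} are made-up names, and the exact text that \tt{code} returns may vary between Io versions, so no output is shown for it.

\code
{\ttcom{# forward is invoked when slot lookup fails on the receiver}
Echo := Object \ttkw{clone}
Echo forward := \ttkw{method}(
  (\ttstr{"no slot named "} .. call message name) println
)
\ttcom{# Echo has no frobnicate slot, so this ends up in forward}
Echo frobnicate
\ttcom{# ask a method object for its own source}
Echo getSlot(\ttstr{"forward"}) code println
}

The \tt{Echo frobnicate} line should print \tt{no slot named frobnicate}. Note that \tt{getSlot} is used on the last line to fetch the method without invoking it.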

But a major reason I like Io is that it builds \em{so much of itself}
on top of so few straightforward features: a
\link{http://iolanguage.org/scm/io/docs/IoGuide.html#Appendix-Grammar|barebones grammar},
combined with cloning objects and dispatching messages, gives us a
lot of expressive power. This is much closer, language-wise, to what
Kay had in mind: late-bound, message-passing-based interfaces that
hide internal state behind public APIs, and very little else.