| 1 |
Something I've felt for a long time is that structural types are underused and underappreciated in modern programming languages. They're not _unheard_ of: plenty of programming languages feature them in prominent or less-than-prominent ways! But I'm always surprised that they don't show up more prominently.
|
| 2 |
|
| 3 |
In particular, I feel that structural types combine about 95% of what I want out of dynamic duck typing with about 95% of what I want out of static type-checking: and honestly, that's a good amount of both! This post is intended to be a quick example of what structural types looks like and why I sometimes point to them as the static answer to duck typing.
|
| 4 |
|
| 5 |
# Structural Types vs. Nominal Types
|
| 6 |
|
| 7 |
If you're familiar with Java, you know what nominal types look like. Consider the following (contrived) example in Java:
|
| 8 |
|
| 9 |
```java
|
| 10 |
«java/nominal»
|
| 11 |
```
|
| 12 |
|
| 13 |
Both `Cat` and `Dog` are classes that implement the same interface: both of them have a zero-argument function `speak` that says something. However, as written, I can't easily write a function that accepts either a `Cat` or a `Dog` and statically rejects other objects which don't have a `speak` method. Java knows that these are different types. Why? Because they're named different things! I could manually tell Java that they implement a shared interface, but there are no features that can work over objects of the same "shape" without explicitly indicating that they share that shape.
|
| 14 |
|
| 15 |
The OCaml language has a different approach to objects. I can define analogous classes in OCaml like this:
|
| 16 |
|
| 17 |
```ocaml
|
| 18 |
«ocaml/classes»
|
| 19 |
```
|
| 20 |
|
| 21 |
However, OCaml has a crucial difference in how it treats the types of objects: OCaml cares more about the _interface_ than the name. Consider the function below:
|
| 22 |
|
| 23 |
```ocaml
|
| 24 |
«ocaml/uses»
|
| 25 |
```
|
| 26 |
|
| 27 |
If you're used to curly-brace languages, the syntax might be unfamiliar: the `#` operator works like the `.` operator in Java and other object-oriented languages, so the expression `obj#speak` is the OCaml equivalent of `obj.speak()` in most traditional object-oriented languages. If we load this file into an OCaml repl, we can observe the type that OCaml has inferred for this function:
|
| 28 |
|
| 29 |
```ocaml
|
| 30 |
# hear_what_it_has_to_say;;
|
| 31 |
- : < speak : unit; .. > -> unit = <fun>
|
| 32 |
```
|
| 33 |
|
| 34 |
This takes a bit of unpacking, so the way to read this type is as follows: `hear_what_it_has_to_say` is a function which takes a value of an object type (that's the stuff in angle brackets) which has a method called `speak` which returns `unit` (more or less like `void` in Java or C++). The object _may_ have other methods, which is what the `..` means. Finally, the function itself returns `unit`.
|
| 35 |
|
| 36 |
In short, this function takes an argument that must be an object that has at least a `speak` method which doesn't return anything. This means I can call it with both a `cat` and a `dog`: after all, they both fit that description!
|
| 37 |
|
| 38 |
```ocaml
|
| 39 |
«ocaml/calling»
|
| 40 |
```
|
| 41 |
Notice that at no point in the above code did I indicate that `cat` or `dog` shared an interface: in fact, I didn't define any interfaces at all! I simply created data that had a particular shape, and wrote code that assumed a particular shape, and put them together.
|
| 42 |
|
| 43 |
# Row Polymorphism
|
| 44 |
|
| 45 |
Sometimes when I talk about structural typing, I talk specifically about _row types_ or _row polymorphism_. This is a particular implementation of structural typing which happens to be convenient and easy to reason about, although others exist.[^1]
|
| 46 |
|
| 47 |
[^1] The most notable other approach to structural typing in this way is _structural subtyping_, which I'm not going to go over here, but which also exists in OCaml: the [Real World OCaml](http://shop.oreilly.com/product/0636920024743.do) book has a section on [Subtyping Versus Row Polymorphism](https://realworldocaml.org/v1/en/html/objects.html#subtyping-vs.-row-polymorphism) which explains it in a bit more detail.
|
| 48 |
|
| 49 |
You've already seen an example of row polymorphism: OCaml's object types! The `..` in the above type signature is what we would call a "row variable", a stand-in for "the rest of the object". In the above instance, both `dog` and `cat` had the same interface, but we could define a new object that features a different, larger interface:
|
| 50 |
|
| 51 |
```ocaml
|
| 52 |
«ocaml/bigger-object»
|
| 53 |
```
|
| 54 |
|
| 55 |
A `cow` now has a method that neither `cat` nor `dog` bothered to implement. However, we can still call our `hear_what_it_has_to_say` method on it without trouble, _even though its type is strictly larger than the types of both `cat` and `dog`_:
|
| 56 |
|
| 57 |
```ocaml
|
| 58 |
«ocaml/call-cow»
|
| 59 |
```
|
| 60 |
|
| 61 |
A powerful feature of row types is that we can give intermediate names to structures or parts of structures and use them accordingly. For example, I can write a function like the one above that calls a method and then returns the object it received:
|
| 62 |
|
| 63 |
```ocaml
|
| 64 |
«ocaml/speaker»
|
| 65 |
```
|
| 66 |
|
| 67 |
Here's the type that OCaml infers for this:
|
| 68 |
|
| 69 |
```ocaml
|
| 70 |
# ecce_orator;
|
| 71 |
- : (< speak : unit; .. > as 'a) -> 'a = <fun>
|
| 72 |
```
|
| 73 |
This types looks _mostly_ like the one before, except that we give the object type a temporary alias (here it is `'a`) which allows us to express that the return value of the function is exactly the same as what we got in. This is important, and is one of the things that separates systems like row typing from other approaches to structural types like structural subtyping.
|
| 74 |
|
| 75 |
# Why Does It Matter?
|
| 76 |
|
| 77 |
I said near the beginning that structural subtyping gives you 90% of what you want from duck typing and 90% of what you want from static typing. For a long time, I suspected that people who were fans of dynamic languages would start to find structurally-typed systems and incorporate them into languages which would try to take advantage of static types while retaining the flexibility of dynamic systems that permit duck-typed interfaces. I recently found a newer language which is a _perfect_ example of exactly this approach: the [Crystal](https://crystal-lang.org/) language. To demonstrate, here's the above OCaml snippets look like when written in Crystal:
|
| 78 |
|
| 79 |
```ruby
|
| 80 |
«crystal/program»
|
| 81 |
```
|
| 82 |
|
| 83 |
If you know Ruby, this program will look _very_ familiar: it's valid Ruby source! In particular, like the OCaml above, it's a program that can call `hear_what_it_has_to_say` on any object with a `speak` method through the magic of duck typing! Amazingly, it's also valid Crystal, and produces exactly the same output. There's an important difference, though: if I were to ammend this program with a line lik `hear_what_it_has_to_say(5)`, then the Crystal compiler gives me the following _compile-time_ error:
|
| 84 |
|
| 85 |
```
|
| 86 |
Error in src/main.cr:19: instantiating 'hear_what_it_has_to_say(Int32)'
|
| 87 |
|
| 88 |
hear_what_it_has_to_say(5)
|
| 89 |
^~~~~~~~~~~~~~~~~~~~~~~
|
| 90 |
|
| 91 |
in src/main.cr:12: undefined method 'speak' for Int32
|
| 92 |
|
| 93 |
obj.speak
|
| 94 |
^~~~~
|
| 95 |
|
| 96 |
Rerun with --error-trace to show a complete error trace.
|
| 97 |
```
|
| 98 |
|
| 99 |
This is a bit Crystal-specific, but what it's telling us is that the literal `5` (which Crystal takes as having the type `Int32`) doesn't have a method called `speak`, and therefore doesn't type-check. Crystal is doing something very much like what OCaml does here, but it's also doing it while presenting you with an interface that looks a lot like Ruby's: it's specifically designed to enable the use of duck typing while still preventing cases that would end up failing at runtime!
|
| 100 |
|
| 101 |
# But You Said 95% Up At The Top
|
| 102 |
|
| 103 |
Okay, there are a _few_ drawbacks to a system like this. One small cost relative to a more traditional nominally-typed system is performance: it's difficult to implement this kind of type system without _some_ kind of indirection, which is a small but nonetheless present cost. When a method is given an object which has a `speak` method, it needs to know where the code for that method lives, which means I'll need some kind of function pointer or vtable to tell it, and won't be able to statically call the method I know about, or I'll have to replicate the method's code several times to accomodate every data layout used in practice. This sort of indirection wouldn't necessarily be required in a system with more rigid types!
|
| 104 |
|
| 105 |
A slightly bigger cost on the type-system front is that structural systems like this have slightly weaker static guarantees. In a Java-like setting, if I accidentally try to call `myCat.speka()` instead of `myCat.speak()`, then the compiler can immediately spot a problem in the function definition: `Cat` objects don't have a `speka` method! In the analogous OCaml function, however, I might not get the problem so easily: if I mistyped `hear_what_it_has_to_say` above with a `speka` method, then the function itself would have been fine: it just means that it takes an object that has a `speka` method instead! Our program _as a whole_ still wouldn't compile, but the error wouldn't arise until later, when we try to pass a `cat` object to the method. In this case, we're probably safe, but when you start to look at programs across module or compilation unit boundaries, you can start seeing that it's possible for this sort of error to slip in unnoticed until later compilation units are presented with a nonsensical interface.
|
| 106 |
|
| 107 |
Finally, there's the cost relative to traditional dynamic type systems: these structurally-typed systems are often less expressive than pure duck typing! OCaml's approach, for example, doesn't let you branch on whether or not an object has a given method, like Python's `hasattr` or Ruby's `respond_to?`: you either use the interface you're given, or you don't. Crystal _does_ let you do this, but the type system becomes more complex and sporadically harder to reason about, and it will regularly (although it didn't come up in my example above) simply give up inference and require the you to fill in the types that you want.
|
| 108 |
|
| 109 |
Of course, in some of these cases, I'm also setting up a bit of an artificial divide: there's nothing wrong with having a system that has features of structural _and_ nominal typing! OCaml does, and this can be very powerful: we can have some parts of the program that work over types whose structure is inferred, and others that work over types whose structures is declared up-front, and both of these can happen in concert. We also can build gradually typed systems that allow full dynamic types and gradually use more static knowledge to move towards a system like this, for a full spectrum of flexibility and safety.
|
| 110 |
|
| 111 |
But even with the drawbacks as described, I contend that systems that use structural types as a modeling tool are a powerful middle step between dynamic and static languages, and they can definitely enable new powerful tools that allow the flexible experimentation of dynamically-typed languages while retaining some of the safety properties provided by statically-typed languages.
|