|  | 1 | \meta{("subjects-and-entities" "subjects and entities" ("programming"))} | 
|  | 2 |  | 
|  | 3 | Object-oriented programming is very bad for high-performance programs | 
|  | 4 | like video games. Or, at least, traditional object-oriented design is. | 
|  | 5 |  | 
|  | 6 | The main culprit is the cache. If I have some \em{thing} in a game—say, | 
|  | 7 | an enemy character—then it needs a lot of data associated with it: its | 
|  | 8 | 3D meshes and textures, its physics information, its current position, its | 
|  | 9 | inventory, its AI state… But not all that information is necessarily | 
|  | 10 | being used at any given time. For example, when I am running the | 
|  | 11 | AI processing step, or calculating physical forces acting on | 
|  | 12 | characters, or drawing things in the game, then I'm only using a small | 
|  | 13 | part of all the information associated with a thing. | 
|  | 14 |  | 
|  | 15 | In a proper object-oriented design, I would take all the information relevant | 
|  | 16 | to a thing and package it up in an object: for example, in a \tt{Game_Unit} | 
|  | 17 | object. That means, however, that in order | 
|  | 18 | to operate on \em{just a subset} of that information, I need to pull | 
|  | 19 | those fields into the cache, and because memory is cached a whole line at a time, | 
|  | 20 | their unrelated neighbors in the object come along too. The cache fills up with extraneous data, | 
|  | 21 | and consequently cache misses become much more common, making the | 
|  | 22 | program slower overall. | 
|  | 23 |  | 
|  | 24 | One way of avoiding this problem is using so-called | 
|  | 25 | \link{http://gamesfromwithin.com/data-oriented-design|Data-Oriented Design}, | 
|  | 26 | which involves pulling apart the data stored in objects and storing it | 
|  | 27 | all separately. This has the advantage of being much more cache-friendly, | 
|  | 28 | but it generally doesn't have language support, and so has to be done | 
|  | 29 | manually. The advantages of typical object-oriented design—like encapsulation | 
|  | 30 | and data hiding—are much more difficult to retain, and the language you | 
|  | 31 | use might actively fight against you in these cases. | 
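
To make that pulling-apart concrete, here is a minimal Python sketch of the two layouts. The \tt{GameUnit} class and its fields are hypothetical stand-ins for the \tt{Game_Unit} object above, and Python gives no real guarantees about how objects sit in memory, so treat this as an illustration of the shape of each design rather than a benchmark.

```python
from array import array
from dataclasses import dataclass


# Object-oriented layout: every unit carries all of its data together.
@dataclass
class GameUnit:
    x: float
    y: float
    health: int
    ai_state: str


units = [GameUnit(0.0, 0.0, 10, "idle"), GameUnit(3.0, 4.0, 7, "chase")]

# Data-oriented layout: one packed array per field, indexed by unit number.
xs = array("d", (u.x for u in units))
ys = array("d", (u.y for u in units))
healths = array("i", (u.health for u in units))


def move_all(xs, ys, dx, dy):
    """Shift every position; only the two position arrays are touched."""
    for i in range(len(xs)):
        xs[i] += dx
        ys[i] += dy


move_all(xs, ys, 1.0, 0.0)
```

The \tt{array} module stores its elements compactly in one buffer, which is the property a data-oriented layout is after: \tt{move_all} walks two packed arrays and never looks at health or AI data.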
|  | 32 |  | 
|  | 33 | Following a similar path, some programmers have started using and advocating | 
|  | 34 | \link{http://www.gamedev.net/page/resources/_/technical/game-programming/understanding-component-entity-systems-r3013| | 
|  | 35 | Component-Entity Systems}, | 
|  | 36 | which go a few steps beyond Data-Oriented Design in roughly the same direction, | 
|  | 37 | but are also a radically different way of structuring programs. At a high level, | 
|  | 38 | Component-Entity Systems work like this: | 
|  | 39 |  | 
|  | 40 | \ul{ | 
|  | 41 | \li{An \em{entity} is an abstract identifier. It has no | 
|  | 42 | other data \em{directly} associated with it.} | 
|  | 43 | \li{A \em{component} is a table which maps entities to a collection | 
|  | 44 | of data, ideally to pieces of simple scalar data. Not | 
|  | 45 | every entity needs to appear in a given component.} | 
|  | 46 | \li{An entity can be associated with several components. In | 
|  | 47 | each of them, the entity serves as the index into the component, | 
|  | 48 | allowing the programmer to retrieve or modify the associated | 
|  | 49 | data stored in the component.} | 
|  | 50 | \li{Operations are written in terms of one or more components. | 
|  | 51 | These are sometimes called \em{systems}.} | 
|  | 52 | } | 
|  | 53 |  | 
|  | 54 | This all sounds very abstract, so what does this look like in practice? Let's | 
|  | 55 | describe a basic game. For the sake of simplicity, let's assume | 
|  | 56 | we have four salient pieces of information for an in-game unit: | 
|  | 57 | its position, its current health, its appearance, and the state of | 
|  | 58 | its AI (including current goals and actions): | 
|  | 59 |  | 
|  | 60 | \code | 
|  | 61 | {\ttkw{component} Position(int x, int y); | 
|  | 62 | \ttkw{component} Health(int hp); | 
|  | 63 | \ttkw{component} Appearance(Image img); | 
|  | 64 | \ttkw{component} AI_State(State st); | 
|  | 65 | } | 
|  | 66 |  | 
|  | 67 | Every entity in the game will be associated with one or more | 
|  | 68 | of these components: scenery will have \tt{Position} and | 
|  | 69 | \tt{Appearance} data; invulnerable characters might have \tt{Position}, | 
|  | 70 | \tt{Appearance}, and \tt{AI_State}, but no \tt{Health}; a player | 
|  | 71 | character will have \tt{Position}, \tt{Health}, and \tt{Appearance}, | 
|  | 72 | but no \tt{AI_State}; and so forth. | 
|  | 73 | Well: I've been saying that entities \em{have} this data, but it's more | 
|  | 74 | appropriate to say they're \em{associated with} that data. All | 
|  | 75 | the relevant data is stored in the components, and the entity | 
|  | 76 | merely serves as the index used to access that data. | 
|  | 77 | \ref{db} | 
|  | 78 | \sidenote{If this is hard to visualize, think of components as database | 
|  | 79 | tables, and your entity as the primary key used in all your tables, | 
|  | 80 | which you can use to access or update that data. | 
|  | 81 | You of course wouldn't want to \em{actually} implement a game like | 
|  | 82 | that, but it's similar in spirit.} | 
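
Since the \tt{\ttkw{component}} syntax is made up, here is roughly the same setup as a Python sketch, with each component modelled as a dictionary keyed by entity. The helper \tt{new_entity}, the entity names, and the use of strings in place of \tt{Image} and \tt{State} are all assumptions made for the example.

```python
from itertools import count

# Entities are nothing but fresh integers.
_next_id = count()


def new_entity():
    return next(_next_id)


# Each component is a table from entity to that component's data.
position = {}    # entity -> (x, y)
health = {}      # entity -> hit points
appearance = {}  # entity -> image name (a string stands in for Image)
ai_state = {}    # entity -> state name (a string stands in for State)

rock = new_entity()      # scenery: Position and Appearance only
position[rock] = (5, 5)
appearance[rock] = "rock.png"

ghost = new_entity()     # invulnerable character: no Health
position[ghost] = (1, 2)
appearance[ghost] = "ghost.png"
ai_state[ghost] = "wander"

player = new_entity()    # player character: no AI_State
position[player] = (0, 0)
health[player] = 10
appearance[player] = "hero.png"
```

Nothing is stored on \tt{rock}, \tt{ghost}, or \tt{player} themselves; they are plain integers, and "having" a component just means appearing as a key in its table.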
|  | 83 |  | 
|  | 84 | Now we can write the salient \em{operations} of our game | 
|  | 85 | in terms of one or more components: when we draw | 
|  | 86 | our game, we write the draw operation in terms of \tt{Appearance}, | 
|  | 87 | which allows us to loop over everything that has image data | 
|  | 88 | associated with it. On the other hand, when we want to move units | 
|  | 89 | around, we'll write an operation in terms of both the \tt{AI_State} | 
|  | 90 | \em{and} \tt{Position} components, because we need to know what | 
|  | 91 | the unit plans to do in order to update its position. | 
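
A sketch of those two operations in Python might look like the following. The table contents, the \tt{"move_right"} state, and the function names \tt{draw} and \tt{move} are invented for the example, and the drawing system returns strings rather than rendering anything.

```python
# Components as plain tables keyed by entity id (hypothetical data).
position = {1: (0, 0), 2: (5, 5), 3: (9, 9)}
appearance = {1: "hero.png", 2: "ghost.png", 3: "rock.png"}
ai_state = {2: "move_right"}  # only entity 2 has an AI


def draw(appearance):
    """Drawing system: needs nothing but Appearance."""
    return [f"draw {img}" for img in appearance.values()]


def move(ai_state, position):
    """Movement system: runs over entities present in both tables."""
    for entity in ai_state.keys() & position.keys():
        x, y = position[entity]
        if ai_state[entity] == "move_right":
            position[entity] = (x + 1, y)


frame = draw(appearance)
move(ai_state, position)
```

The intersection of the two key sets is what "written in terms of both components" amounts to here: entities missing from either table are simply never visited.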
|  | 92 |  | 
|  | 93 | The advantages of Data-Oriented Design that I described earlier are | 
|  | 94 | still in effect, because all the data associated with a component | 
|  | 95 | can be stored packed together: looping over the \tt{Appearance} | 
|  | 96 | component won't bring any non-\tt{Appearance} data into the cache. | 
|  | 97 | But Component-Entity Systems have an extra advantage | 
|  | 98 | over pure Data-Oriented Design: you gain \em{compositionality} in | 
|  | 99 | a way that's not necessarily present in other designs.\ref{bom} | 
|  | 100 | \sidenote{ | 
|  | 101 | \link{http://t-machine.org/index.php/2013/05/30/designing-bomberman-with-an-entity-system-which-components/| | 
|  | 102 | This blog post about applying component-entity design to Bomberman} | 
|  | 103 | has a section called | 
|  | 104 | \em{Consider the possibilities of your new Components}, in which | 
|  | 105 | the author explores the compositions of components as interesting avenues | 
|  | 106 | of discovering new gameplay. It also goes into a lot more detail | 
|  | 107 | about what a component-entity approach would look like. | 
|  | 108 | } | 
|  | 109 |  | 
|  | 110 | For example: | 
|  | 111 | you might design a game with a \tt{Health_Pickup} component for | 
|  | 112 | items that restore health when a player interacts with them. An | 
|  | 113 | entity that is associated with both the \tt{Health_Pickup} and | 
|  | 114 | \tt{AI_State} components will act as a mobile health powerup | 
|  | 115 | that can choose how to move around using some kind of AI routine. | 
|  | 116 | On the other hand, an entity associated with both | 
|  | 117 | the \tt{Health_Pickup} and \tt{Health} components is a health pickup | 
|  | 118 | which can be destroyed, perhaps so that it cannot be used by an | 
|  | 119 | opposing player. In both those cases, no extra implementation work | 
|  | 120 | would need to be done for these conjunctions of components: the new, | 
|  | 121 | interesting functionality falls out naturally from implementing | 
|  | 122 | each feature in isolation. | 
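
Here is a small Python sketch of that claim, under the assumption that components are dictionaries keyed by entity id. The three systems, the data, and the rule that a pickup triggers when it shares a position with the player are all made up for the example.

```python
# Hypothetical component tables keyed by entity id.
position = {0: (0, 0), 1: (0, 0), 2: (3, 0), 3: (1, 0)}
health = {0: 5, 2: 1}                 # player 0; pickup 2 is destructible
health_pickup = {1: 3, 2: 3, 3: 3}    # entity -> hit points restored
ai_state = {3: "move_left"}           # pickup 3 wanders on its own


def pickup_system(player):
    """Heal the player from any pickup sharing the player's position."""
    for item in list(health_pickup):
        if item != player and position.get(item) == position[player]:
            # Popping the entry means the item stops being a pickup.
            health[player] += health_pickup.pop(item)


def ai_system():
    """Move anything with an AI_State; it knows nothing about pickups."""
    for entity, state in ai_state.items():
        if state == "move_left":
            x, y = position[entity]
            position[entity] = (x - 1, y)


def damage_system(target, amount):
    """Hurt anything with Health; at zero, remove it from every table."""
    health[target] -= amount
    if health[target] <= 0:
        for table in (position, health, health_pickup, ai_state):
            table.pop(target, None)


pickup_system(player=0)            # plain pickup 1 is collected
ai_system()                        # mobile pickup 3 walks onto the player
pickup_system(player=0)            # and is collected as well
damage_system(target=2, amount=1)  # destructible pickup 2 is destroyed
```

None of the three systems mentions the others, yet the mobile pickup and the destructible pickup both behave as described, purely because of which tables their ids appear in.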
|  | 123 |  | 
|  | 124 | While Component-Entity Systems are interesting, | 
|  | 125 | there are no \em{languages} that are inherently component-entity-oriented.\ref{lng} | 
|  | 126 | \sidenote{At least, if there are, I don't know about them.} | 
|  | 127 | Component-Entity Systems are usually implemented using existing | 
|  | 128 | object-oriented languages. | 
|  | 129 | So my pseudocode examples above which used the \tt{\ttkw{component}} keyword | 
|  | 130 | were all pure fiction. But what \em{would} a component-entity language | 
|  | 131 | look like? | 
|  | 132 |  | 
|  | 133 | I'm going to change pace a bit and discuss an old and sadly | 
|  | 134 | mostly-forgotten paper: | 
|  | 135 | \link{http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.131.4805&rep=rep1&type=pdf| | 
|  | 136 | Harrison and Ossher's 1993 OOPSLA paper, | 
|  | 137 | \em{Subject-Oriented Programming: A Critique of Pure Objects}}. | 
|  | 138 | The primary motivation behind the paper is addressing | 
|  | 139 | what they see as a flaw in object-oriented design: most | 
|  | 140 | object-oriented languages include inheritance, and therefore involve | 
|  | 141 | a tree of \em{is-a} relationships. However, objects don't exist in | 
|  | 142 | just a single place in a single ontology: a set of objects can be seen | 
|  | 143 | as occupying multiple places in multiple ontologies. | 
|  | 144 | As a concrete but slightly fanciful example: do we choose a culinary | 
|  | 145 | ontology for our program and write \tt{Tomato extends Vegetable}, | 
|  | 146 | or do we choose a biological ontology and write \tt{Tomato extends Fruit}? | 
|  | 147 | What if one part of the program needs one and the other needs | 
|  | 148 | the other? | 
|  | 149 |  | 
|  | 150 | More concisely, as Jorge Luis Borges\ref{borg} said: | 
|  | 151 | \sidenote{From the Jorge Luis Borges essay | 
|  | 152 | \link{http://www.crockford.com/wrrrld/wilkins.html|\em{The Analytical Language of John Wilkins}}.} | 
|  | 153 |  | 
|  | 154 | \blockquote | 
|  | 155 | {It is clear that there is no classification of the Universe that is | 
|  | 156 | not arbitrary and full of conjectures. The reason for this is very | 
|  | 157 | simple: we do not know what kind of thing the universe is.} | 
|  | 158 |  | 
|  | 159 | Harrison and Ossher propose that, instead of dealing with objects | 
|  | 160 | that are instances of classes, we deal with \em{subjects}, which | 
|  | 161 | \em{can be seen as} instances of classes. Any time an object is | 
|  | 162 | interacted with, it is interacted with in a subjective context, | 
|  | 163 | in which the data and operations associated with it can be different | 
|  | 164 | depending on the subjective context. The only thing shared between | 
|  | 165 | contexts is the identity of the objects: | 
|  | 166 |  | 
|  | 167 | \blockquote | 
|  | 168 | {The essential characteristic of subject-oriented programming is | 
|  | 169 | that different subjects can separately define and operate upon | 
|  | 170 | shared objects, without any subject needing to know the details | 
|  | 171 | associated with those objects by other subjects. Only object | 
|  | 172 | identity is necessarily shared. | 
|  | 173 | } | 
|  | 174 |  | 
|  | 175 | Harrison and Ossher's proposed system is in some respects very | 
|  | 176 | similar to component-entity systems, but is in other respects quite | 
|  | 177 | different. There is certainly commonality: what a component-entity | 
|  | 178 | system calls an \em{entity}, a subject-oriented language calls an | 
|  | 179 | \em{object-identifier} or an \em{oid}, and both consider entities | 
|  | 180 | or oids to be abstract identifiers with no directly associated | 
|  | 181 | information. | 
|  | 182 |  | 
|  | 183 | In a subject-oriented language, an operation must exist within a | 
|  | 184 | given \em{subject activation}, which exposes pieces of information | 
|  | 185 | associated with the entity and a set of operations. An entity can, | 
|  | 186 | within a given subject activation, have fields and methods, and | 
|  | 187 | those fields and methods can be entirely distinct from the fields | 
|  | 188 | and methods exposed by a different subject activation. Every | 
|  | 189 | method invocation, therefore, has to exist within a subject | 
|  | 190 | activation, so that we know the actions and fields available | 
|  | 191 | within that subjective frame. | 
|  | 192 |  | 
|  | 193 | A salient \em{difference} is the way that subject-oriented programming allows | 
|  | 194 | certain pieces of information, or certain operations, to be shared | 
|  | 195 | among different subject activations. Harrison and Ossher's repeated | 
|  | 196 | example involves a \tt{Tree} object shared by, among others, | 
|  | 197 | a \tt{Woodcutter} subject and a \tt{Bird} subject. A \tt{Woodcutter}'s | 
|  | 198 | view of the tree has an estimated value, which the woodcutter | 
|  | 199 | might use to determine whether the tree is worth cutting down. On | 
|  | 200 | the other hand, a \tt{Bird}'s view of the tree involves its | 
|  | 201 | suitability for building a nest in. In both cases, though, they might | 
|  | 202 | care about a piece of information like the tree's height. | 
|  | 203 |  | 
|  | 204 | However, Harrison and Ossher's approach to this issue seems awkward: | 
|  | 205 | they suggest that, | 
|  | 206 | rather than straightforwardly sharing the height between the subjects, | 
|  | 207 | the \tt{Bird} subject and the \tt{Woodcutter} | 
|  | 208 | subject should both have \em{their own copy} of a field representing | 
|  | 209 | the tree's height, and | 
|  | 210 | that the two must be \em{made to} agree: they must return | 
|  | 211 | the same value, or some compatible value (for example, by returning | 
|  | 212 | some value which is commutative). If they \em{fail} to agree, the | 
|  | 213 | program throws an exception. This is almost \em{certain} to be a | 
|  | 214 | source of frustration in practice, or at least a major source of | 
|  | 215 | gotchas. | 
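
To show why I expect this to hurt, here is the agreement rule as I have described it, modelled in Python. This is only my reading, with invented names (\tt{SubjectDisagreement}, \tt{composed_height}) and invented data, and it covers only the "same value" case, not the "compatible value" one.

```python
class SubjectDisagreement(Exception):
    """Raised when two subjects report incompatible values."""


# Each subject keeps its own private state per oid (hypothetical data).
woodcutter = {"height": {42: 30.0}, "estimated_value": {42: 120}}
bird = {"height": {42: 30.0}, "nest_suitability": {42: 0.8}}


def composed_height(oid, subjects):
    """Ask every subject for the height and insist that they agree."""
    answers = {subject["height"][oid] for subject in subjects}
    if len(answers) != 1:
        raise SubjectDisagreement(
            f"oid {oid}: conflicting heights {sorted(answers)}"
        )
    return answers.pop()


agreed = composed_height(42, [woodcutter, bird])

bird["height"][42] = 31.5  # the bird's copy drifts out of sync
try:
    composed_height(42, [woodcutter, bird])
    failed = False
except SubjectDisagreement:
    failed = True
```

Every write to one copy is a chance to forget the other, and the mistake only surfaces later, when some unrelated caller asks for the height.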
|  | 216 |  | 
|  | 217 | The Harrison and Ossher approach also describes how to mediate | 
|  | 218 | two distinct object hierarchies, so that different subject | 
|  | 219 | activations can use inheritance over the same set of classes | 
|  | 220 | in very different ways. (The \tt{Cook} subject, for example, | 
|  | 221 | could use \tt{Tomato extends Vegetable}, while the \tt{Botanist} | 
|  | 222 | subject could use \tt{Tomato extends Fruit}, so a given | 
|  | 223 | oid can be seen by both as being situated within different | 
|  | 224 | hierarchies.) They then go on to describe how one subject's | 
|  | 225 | class hierarchy might be incomplete with respect to another | 
|  | 226 | subject's hierarchy, and describe how to match those hierarchies | 
|  | 227 | together, or infer class hierarchies based on interfaces or other | 
|  | 228 | mechanisms. | 
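
The basic idea of one oid sitting in two hierarchies can be sketched in Python like so. This is a toy model rather than the paper's mechanism: each subject gets its own is-a table, and the names \tt{Food}, \tt{PlantOrgan}, and \tt{is_a}, along with the oid \tt{7}, are invented for the example.

```python
# Each subject has its own is-a table: class name -> parent class name.
cook_hierarchy = {"Tomato": "Vegetable", "Vegetable": "Food"}
botanist_hierarchy = {"Tomato": "Fruit", "Fruit": "PlantOrgan"}

# Each subject also has its own classification of the shared oids.
cook_class_of = {7: "Tomato"}
botanist_class_of = {7: "Tomato"}


def is_a(oid, ancestor, class_of, hierarchy):
    """Walk one subject's hierarchy upwards from the oid's class."""
    current = class_of.get(oid)
    while current is not None:
        if current == ancestor:
            return True
        current = hierarchy.get(current)
    return False


cook_says_vegetable = is_a(7, "Vegetable", cook_class_of, cook_hierarchy)
botanist_says_vegetable = is_a(7, "Vegetable", botanist_class_of, botanist_hierarchy)
```

The same oid is a vegetable to the \tt{Cook} and not to the \tt{Botanist}, and neither subject needs to see the other's table.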
|  | 229 |  | 
|  | 230 | I would argue that the best thing to do is to combine the high-level | 
|  | 231 | details of Harrison and Ossher's subject-oriented language design | 
|  | 232 | with the specific mechanisms used in component-entity systems. | 
|  | 233 | They clearly have a common starting | 
|  | 234 | point and a similar approach to modeling the world, in which abstract | 
|  | 235 | entities \em{can be viewed in some context} as having associated operations | 
|  | 236 | and information. The Harrison and Ossher approach unfortunately gets caught | 
|  | 237 | in a quagmire of hierarchies and modeling, but much of that complexity can | 
|  | 238 | be alleviated if we treat subject activations like | 
|  | 239 | sets of components: suddenly, the \tt{Bird} and the \tt{Woodcutter} | 
|  | 240 | subjects/systems can simply \em{share} the \tt{TreeHeight} component, | 
|  | 241 | without having to resort to awkward and complicated agreement | 
|  | 242 | strategies on the hierarchies or results involved. | 
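
As a rough Python sketch of what I mean, a subject can be modelled as nothing more than the set of component tables it is allowed to see. The component names other than \tt{TreeHeight}, the \tt{view} helper, and the data are assumptions made for the example.

```python
# Component tables keyed by entity (oid).
tree_height = {42: 30.0}
estimated_value = {42: 120}
nest_suitability = {42: 0.8}

# A subject is modelled as the set of components it is allowed to see.
woodcutter = {"TreeHeight": tree_height, "EstimatedValue": estimated_value}
bird = {"TreeHeight": tree_height, "NestSuitability": nest_suitability}


def view(subject, entity):
    """Return the fields of `entity` as seen from within `subject`."""
    return {
        name: table[entity]
        for name, table in subject.items()
        if entity in table
    }


tree_height[42] = 31.5  # one write, visible to both subjects
```

There is only one copy of the height, so there is nothing to reconcile: both views read the same table, while \tt{EstimatedValue} and \tt{NestSuitability} stay private to their own subjects.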
|  | 243 |  | 
|  | 244 | As for the specifics of what a component-entity language might | 
|  | 245 | look like, I leave that as a creative exercise for the reader. | 
|  | 246 | \ref{exc}\sidenote{I \em{do} have ideas. Someday I will implement them.} |