gdritter repos when-computer / 7d4b2b6
Added subjects-and-entities post Getty Ritter 8 years ago
2 changed file(s) with 247 addition(s) and 0 deletion(s).
1 \meta{("subjects-and-entities" "subjects and entities" ("programming"))}
2
3 Object-oriented programming is very bad for high-performance programs
4 like video games. Or, at least, traditional object-oriented design is.
5
6 The main culprit is the cache. If I have some \em{thing} in a game—say,
7 an enemy character—then it needs a lot of data associated with it: its
8 3D meshes and textures, its physics information, its current position, its
9 inventory, its AI state… But not all that information is necessarily
10 being used at any given time. For example, when I am running the
11 AI processing step, or calculating physical forces acting on
12 characters, or drawing things in the game, then I'm only using a small
13 part of all the information associated with a thing.
14
15 In a proper object-oriented design, I would take all the information relevant
16 to a thing and package it up in an object: for example, in a \tt{Game_Unit}
17 object. That means, however, that in order
18 to operate on \em{just a subset} of that information, I need to pull
19 the object into the cache, which puts \em{all} the object's information
20 in the cache. This means the cache gets full of extraneous data,
21 and consequently cache misses become much more common, making the
22 program slower overall.
23
24 One way of avoiding this problem is using so-called
25 \link{http://gamesfromwithin.com/data-oriented-design|Data-Oriented Design},
26 which involves pulling apart the data stored in objects and storing it
27 all separately. This has the advantage of being much more cache-friendly,
28 but it generally doesn't have language support, and so has to be done
29 manually. The advantages of typical object-oriented design—like encapsulation
30 and data hiding—are much more difficult to retain, and the language you
31 use might actively fight against you in these cases.
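
To make the layout difference concrete, here is a minimal Python sketch. The \tt{GameUnit} class, its fields, and \tt{apply_velocity} are names I made up for illustration, and Python itself will not show the cache behaviour. The point is only the shape of the data: one packed array per field instead of one object per unit.

```python
from array import array
from dataclasses import dataclass


# Object-style layout: every unit carries all of its data together.
@dataclass
class GameUnit:
    x: float
    y: float
    health: int
    ai_state: str


units = [GameUnit(0.0, 0.0, 100, "idle"), GameUnit(3.0, 4.0, 80, "patrol")]

# Data-oriented layout: one typed array per field, indexed by unit number.
xs = array("d", (u.x for u in units))
ys = array("d", (u.y for u in units))
healths = array("i", (u.health for u in units))


def apply_velocity(xs, ys, dx, dy):
    """Move every unit; only the two position arrays are touched."""
    for i in range(len(xs)):
        xs[i] += dx
        ys[i] += dy


apply_velocity(xs, ys, 1.0, -1.0)
```

Note that nothing ties \tt{xs}, \tt{ys}, and \tt{healths} together except the convention that index \tt{i} means the same unit in each, which is the manual bookkeeping the paragraph above complains about.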
32
33 Following a similar path, some programmers have started using and advocating
34 \link{http://www.gamedev.net/page/resources/_/technical/game-programming/understanding-component-entity-systems-r3013|
35 Component-Entity Systems},
36 which are a few steps beyond Data-Oriented Design in roughly the same direction,
37 but also a radically different way of structuring programs. At a high level,
38 Component-Entity Systems work like this:
39
40 \ul{
41 \li{An \em{entity} is an abstract identifier. It has no
42 other data \em{directly} associated with it.}
43 \li{A \em{component} is a table which maps entities to a collection
44 of data, ideally to pieces of simple scalar data. Not
45 every entity needs to appear in a given component.}
46 \li{An entity can be associated with several components. That
47 means the entity can be used as an index into each of those components,
48 allowing the programmer to retrieve or modify the associated
49 data stored there.}
50 \li{Operations are written in terms of one or more components.
51 These are sometimes called \em{systems}.}
52 }
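
The four points above can be sketched in a few lines of Python. This is an assumption-laden toy rather than how any real engine does it: plain dictionaries stand in for component tables, and \tt{new_entity}, \tt{position}, \tt{health}, and \tt{damage_system} are hypothetical names.

```python
import itertools

_next_id = itertools.count()


def new_entity():
    """An entity is just a fresh identifier; it carries no data itself."""
    return next(_next_id)


# A component is a table from entity to data. Not every entity appears in it.
position = {}  # entity -> (x, y)
health = {}    # entity -> hit points

rock = new_entity()
goblin = new_entity()
position[rock] = (5, 5)
position[goblin] = (1, 2)
health[goblin] = 30  # the rock has no Health entry at all


def damage_system(health, amount):
    """A system: an operation written against a component, not an object."""
    for entity in health:
        health[entity] -= amount


damage_system(health, 10)
```

The rock is never visited by \tt{damage_system}, simply because it does not appear in the \tt{health} table.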
53
54 This all sounds very abstract, so what does this look like in practice? Let's
55 describe a basic game. For the sake of simplicity, let's assume
56 we have four salient pieces of information for an in-game unit:
57 its position, its current health, its appearance, and the state of
58 its AI (including current goals and actions):
59
60 \code
61 {\ttkw{component} Position(int x, int y);
62 \ttkw{component} Health(int hp);
63 \ttkw{component} Appearance(Image img);
64 \ttkw{component} AI_State(State st);
65 }
66
67 Every entity in the game will be associated with one or more
68 of these components: scenery will have \tt{Position} and
69 \tt{Appearance} data; invulnerable characters might have \tt{Position},
70 \tt{Appearance}, and \tt{AI_State}, but no \tt{Health}; a player
71 character will have \tt{Position}, \tt{Health}, and \tt{Appearance},
72 but no \tt{AI_State}; and so forth.
73 Well: I've been saying that entities \em{have} this data, but it's more
74 appropriate to say they're \em{associated with} that data. All
75 the relevant data is stored in the components, and the entity
76 serves as the index used to access that data.
77 \ref{db}
78 \sidenote{If this is hard to visualize, think of components as database
79 tables, and your entity as the primary key used in all your tables,
80 which you can use to access or update that data.
81 You of course wouldn't want to \em{actually} implement a game like
82 that, but it's similar in spirit.}
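
Taking the database analogy literally for a moment (again, not as a way to build a real game), here is a small sketch using Python's built-in \tt{sqlite3} module. The table and column names are my own invention, chosen to mirror the components above.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE position   (entity INTEGER PRIMARY KEY, x INTEGER, y INTEGER);
    CREATE TABLE health     (entity INTEGER PRIMARY KEY, hp INTEGER);
    CREATE TABLE appearance (entity INTEGER PRIMARY KEY, img TEXT);
""")

# Entity 1 is scenery; entity 2 is a player character.
db.executemany("INSERT INTO position VALUES (?, ?, ?)", [(1, 0, 0), (2, 4, 7)])
db.executemany("INSERT INTO appearance VALUES (?, ?)",
               [(1, "tree.png"), (2, "hero.png")])
db.execute("INSERT INTO health VALUES (?, ?)", (2, 100))

# Entities with both a position and health: a join on the shared key.
rows = db.execute(
    "SELECT p.entity, p.x, p.y, h.hp "
    "FROM position AS p JOIN health AS h ON p.entity = h.entity"
).fetchall()
```

The scenery entity drops out of the join because it has no row in \tt{health}, which is the same "not every entity appears in every component" rule in relational clothing.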
83
84 Now, to write the salient \em{operations} of our game, we can
85 write them in terms of one or more components: when we draw
86 our game, we write the draw operation in terms of \tt{Appearance},
87 which allows us to loop over everything that has image data
88 associated with it. On the other hand, when we want to move units
89 around, we'll write an operation in terms of both the \tt{AI_State}
90 \em{and} \tt{Position} components, because we need to know what
91 the unit plans to do in order to update its position.
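
As a rough sketch of those two operations, again with dictionaries standing in for components: \tt{draw_system}, \tt{movement_system}, the \tt{STEP} table, and the sample data are all hypothetical, and "drawing" here just produces strings.

```python
# Component tables (made-up data), keyed by entity id.
position = {1: (0, 0), 2: (4, 7), 3: (9, 9)}
appearance = {1: "tree.png", 2: "hero.png", 3: "orc.png"}
ai_state = {3: "move_left"}  # only entity 3 is AI-controlled

# How each AI goal translates into a change of position.
STEP = {"move_left": (-1, 0), "move_right": (1, 0), "idle": (0, 0)}


def draw_system(appearance):
    """Written against Appearance alone: visits everything drawable."""
    return [f"draw {img}" for img in appearance.values()]


def movement_system(ai_state, position):
    """Written against AI_State and Position: visits entities in both."""
    for entity in ai_state.keys() & position.keys():
        dx, dy = STEP[ai_state[entity]]
        x, y = position[entity]
        position[entity] = (x + dx, y + dy)


frames = draw_system(appearance)
movement_system(ai_state, position)
```

Only entity 3 moves, because it is the only entity present in both \tt{ai_state} and \tt{position}.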
92
93 The advantages of Data-Oriented Design that I described earlier are
94 still in effect, because all the data associated with a component
95 can be stored packed together: looping over the \tt{Appearance}
96 component won't bring any non-\tt{Appearance} data into the cache.
97 But Component-Entity Systems have an extra advantage
98 over pure Data-Oriented Design: you gain \em{compositionality} in
99 a way that's not necessarily present in other designs.\ref{bom}
100 \sidenote{
101 \link{http://t-machine.org/index.php/2013/05/30/designing-bomberman-with-an-entity-system-which-components/|
102 This blog post about applying component-entity design to Bomberman}
103 has a section called
104 \em{Consider the possibilities of your new Components}, in which
105 the author explores the compositions of components as interesting avenues
106 of discovering new gameplay. It also goes into a lot more detail
107 about what a component-entity approach would look like.
108 }
109
110 For example:
111 you might design a game with a \tt{Health_Pickup} component for
112 items that restore health when a player interacts with them. An
113 entity that is associated with both the \tt{Health_Pickup} and
114 \tt{AI_State} components will act as a mobile health powerup
115 that can choose how to move around using some kind of AI routine.
116 On the other hand, an entity associated with both
117 the \tt{Health_Pickup} and \tt{Health} components is a health pickup
118 which can be destroyed, perhaps so that it cannot be used by an
119 opposing player. In both those cases, no extra implementation work
120 would need to be done for these conjunctions of components: the new,
121 interesting functionality falls out naturally from implementing
122 each feature in isolation.
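
Here is a toy Python version of that example, under the same dictionary-as-component assumption. All the names (\tt{pickup_system}, \tt{damage_system}, the entity constants) are hypothetical. Neither system mentions the other's component, yet the destroyable pickup works.

```python
position = {}       # entity -> (x, y)
health = {}         # entity -> hit points
health_pickup = {}  # entity -> amount restored
ai_state = {}       # entity -> current goal

PLAYER, STATIC_KIT, WANDERING_KIT, FRAGILE_KIT = range(4)

position.update({PLAYER: (0, 0), STATIC_KIT: (0, 0),
                 WANDERING_KIT: (2, 0), FRAGILE_KIT: (5, 5)})
health[PLAYER] = 50
health_pickup.update({STATIC_KIT: 25, WANDERING_KIT: 25, FRAGILE_KIT: 25})
ai_state[WANDERING_KIT] = "wander"  # a pickup that can move itself
health[FRAGILE_KIT] = 1             # a pickup that can be destroyed


def pickup_system(player):
    """Heal the player from any pickup sharing the player's position."""
    for entity in list(health_pickup):
        if position.get(entity) == position[player]:
            health[player] += health_pickup.pop(entity)


def damage_system(target, amount):
    """Damage anything with Health; at zero, drop it from every component."""
    health[target] -= amount
    if health[target] <= 0:
        for table in (position, health, health_pickup, ai_state):
            table.pop(target, None)


pickup_system(PLAYER)          # the player collects STATIC_KIT
damage_system(FRAGILE_KIT, 1)  # someone destroys FRAGILE_KIT before use
```

A movement system written against \tt{AI_State} and \tt{Position} would likewise move \tt{WANDERING_KIT} without knowing it is a pickup.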
123
124 While Component-Entity Systems are interesting,
125 there are no \em{languages} that are inherently component-entity-oriented.\ref{lng}
126 \sidenote{At least, if there are, I don't know about them.}
127 Component-Entity Systems are usually implemented using existing
128 object-oriented languages.
129 So my pseudocode examples above which used the \tt{\ttkw{component}} keyword
130 were all pure fiction. But what \em{would} a component-entity language
131 look like?
132
133 I'm going to change pace a bit and discuss an old and sadly
134 mostly-forgotten paper:
135 \link{http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.131.4805&rep=rep1&type=pdf|
136 Harrison and Ossher's 1993 OOPSLA paper,
137 \em{Subject-Oriented Programming: A Critique of Pure Objects}}.
138 The primary motivation behind the paper is addressing
139 what they see as a flaw in object-oriented design: most
140 object-oriented languages include inheritance, and therefore involve
141 a tree of \em{is-a} relationships. However, objects don't exist in
142 just a single place in a single ontology: a set of objects can be seen
143 as occupying multiple places in multiple ontologies.
144 As a concrete but slightly fanciful example: do we choose a culinary
145 ontology for our program and write \tt{Tomato extends Vegetable},
146 or do we choose a biological ontology and write \tt{Tomato extends Fruit}?
147 What if one part of the program needs one and the other needs
148 the other?
149
150 More concisely, as Jorge Luis Borges\ref{borg} said:
151 \sidenote{From the Jorge Luis Borges essay
152 \link{http://www.crockford.com/wrrrld/wilkins.html|\em{The Analytical Language of John Wilkins}}.}
153
154 \blockquote
155 {It is clear that there is no classification of the Universe that is
156 not arbitrary and full of conjectures. The reason for this is very
157 simple: we do not know what kind of thing the universe is.}
158
159 Harrison and Ossher propose that, instead of dealing with objects
160 that are instances of classes, we deal with \em{subjects}, which
161 \em{can be seen as} instances of classes. Any time an object is
162 interacted with, it is interacted with in a subjective context,
163 and the data and operations associated with it can differ
164 from one context to another. The only thing shared between
165 contexts is the identity of the objects:
166
167 \blockquote
168 {The essential characteristic of subject-oriented programming is
169 that different subjects can separately define and operate upon
170 shared objects, without any subject needing to know the details
171 associated with those objects by other subjects. Only object
172 identity is necessarily shared.
173 }
174
175 Harrison and Ossher's proposed system is in some respects very
176 similar to component-entity systems, but is in other respects quite
177 different. There is certainly commonality: what a component-entity
178 system calls an \em{entity}, a subject-oriented language calls an
179 \em{object-identifier} or an \em{oid}, and both consider entities
180 or oids to be abstract identifiers with no directly associated
181 information.
182
183 In a subject-oriented language, an operation must exist within a
184 given \em{subject activation}, which exposes pieces of information
185 associated with the entity and a set of operations. An entity can,
186 within a given subject activation, have fields and methods, and
187 those fields and methods can be entirely distinct from the fields
188 and methods exposed by a different subject activation. Every
189 method invocation, therefore, has to exist within a subject
190 activation, so that we know the actions and fields available
191 within that subjective frame.
192
193 A salient \em{difference} is the way that subject-oriented programming allows
194 certain pieces of information, or certain operations, to be shared
195 among different subject activations. Harrison and Ossher's repeated
196 example involves a \tt{Tree} object shared by, among others,
197 a \tt{Woodcutter} subject and a \tt{Bird} subject. A \tt{Woodcutter}'s
198 view of the tree has an estimated value, which the woodcutter
199 might use to determine whether the tree is worth cutting down. On
200 the other hand, a \tt{Bird}'s view of the tree involves its
201 suitability for building a nest in. In both cases, though, they might
202 care about a piece of information like the tree's height.
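
One loose way to picture a subject activation is as a view that exposes only the fields one subject cares about, keyed by the shared oid. The sketch below is my own reading, not code from the paper: \tt{SubjectActivation}, its \tt{get} method, and the sample values are all hypothetical.

```python
class SubjectActivation:
    """A hypothetical 'view': the fields one subject exposes for shared oids."""

    def __init__(self, name, **fields):
        self.name = name
        self._fields = fields  # field name -> {oid: value}

    def get(self, oid, field):
        if field not in self._fields:
            raise AttributeError(f"{self.name} has no field {field!r}")
        return self._fields[field][oid]


TREE = 1  # the only thing the two subjects share is this identifier

woodcutter = SubjectActivation("Woodcutter", estimated_value={TREE: 150})
bird = SubjectActivation("Bird", nest_suitability={TREE: 0.4})

value = woodcutter.get(TREE, "estimated_value")
suitability = bird.get(TREE, "nest_suitability")
```

Asking the \tt{Bird} view for \tt{estimated_value} raises an error: within that subjective frame, the field simply does not exist.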
203
204 However, Harrison and Ossher's approach to this issue seems awkward:
205 they suggest that,
206 rather than straightforwardly sharing the height between the subjects,
207 the \tt{Bird} subject and the \tt{Woodcutter}
208 subject should both have \em{their own copy} of a field representing
209 the tree's height, and
210 that the two must be \em{made to} agree: they must return
211 the same value, or some compatible value (for example, by returning
212 some value which is commutative). If they \em{fail} to agree, the
213 program throws an exception. This is almost \em{certain} to be a
214 source of frustration in practice, or at least a major source of
215 gotchas.
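
To show why this feels awkward, here is a deliberately simplified Python sketch of the "each subject keeps its own copy and they must agree" idea as I have described it. It models only the strictest case (the values must be equal), and \tt{DisagreementError}, \tt{agreed_height}, and the data are hypothetical rather than anything specified in the paper.

```python
class DisagreementError(Exception):
    """Raised when two subjects report incompatible values."""


# Each subject keeps its own copy of every tree's height, keyed by oid.
woodcutter_height = {"tree-1": 12.0, "tree-2": 30.0}
bird_height = {"tree-1": 12.0, "tree-2": 28.5}


def agreed_height(oid):
    """Return the height only if both subjects report the same value."""
    values = {woodcutter_height[oid], bird_height[oid]}
    if len(values) != 1:
        raise DisagreementError(f"subjects disagree about height of {oid}")
    return values.pop()


height = agreed_height("tree-1")
```

Calling \tt{agreed_height("tree-2")} raises \tt{DisagreementError}, and nothing in the structure of the program prevents the two copies from drifting apart like that.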
216
217 The Harrison and Ossher approach also describes how to mediate
218 two distinct object hierarchies, so that different subject
219 activations can use inheritance over the same set of classes
220 in very different ways. (The \tt{Cook} subject, for example,
221 could use \tt{Tomato extends Vegetable}, while the \tt{Botanist}
222 subject could use \tt{Tomato extends Fruit}, so a given
223 oid can be seen by both as being situated within different
224 hierarchies.) They then go on to describe how one subject's
225 class hierarchy might be incomplete with respect to another
226 subject's hierarchy, and describe how to match those hierarchies
227 together, or infer class hierarchies based on interfaces or other
228 mechanisms.
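
The two-hierarchies idea can be sketched with one parent table per subject. This is only an illustration of the tomato example, not the paper's matching machinery. The \tt{is_a} helper, the oid string, and the extra classes \tt{Food} and \tt{PlantOrgan} are assumptions of mine.

```python
TOMATO = "oid-42"  # a hypothetical object identifier

# Both subjects agree on the object's class name but not on its ancestry.
class_of = {TOMATO: "Tomato"}
cook_parent = {"Tomato": "Vegetable", "Vegetable": "Food"}
botanist_parent = {"Tomato": "Fruit", "Fruit": "PlantOrgan"}


def is_a(hierarchy, cls, ancestor):
    """Walk one subject's parent links from cls, looking for ancestor."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = hierarchy.get(cls)
    return False


cook_says_vegetable = is_a(cook_parent, class_of[TOMATO], "Vegetable")
botanist_says_vegetable = is_a(botanist_parent, class_of[TOMATO], "Vegetable")
botanist_says_fruit = is_a(botanist_parent, class_of[TOMATO], "Fruit")
```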
229
230 I would argue that the best thing to do is to combine the high-level
231 details of Harrison and Ossher's subject-oriented language design
232 with the specific mechanisms used in component-entity systems.
233 They clearly have a common starting
234 point and a similar approach to modeling the world, in which abstract
235 entities \em{can be viewed in some context} as having associated operations
236 and information. The Harrison and Ossher approach unfortunately gets caught
237 in a quagmire of hierarchies and modeling, but much of that complexity can
238 be alleviated if we treat subject activations like
239 sets of components: suddenly, the \tt{Bird} and the \tt{Woodcutter}
240 subjects/systems can simply \em{share} the \tt{TreeHeight} component,
241 without having to resort to awkward and complicated agreement
242 strategies on the hierarchies or results involved.
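
A minimal sketch of that combination, in the same dictionary-as-component style: the two systems each have a private component and simply read the shared \tt{tree_height} table. The function names, thresholds, and data are all hypothetical.

```python
tree_height = {"tree-1": 12.0, "tree-2": 30.0}      # shared component
estimated_value = {"tree-1": 40, "tree-2": 150}     # used by Woodcutter only
nest_suitability = {"tree-1": 0.9, "tree-2": 0.4}   # used by Bird only


def woodcutter_system(tree_height, estimated_value, min_value_per_metre):
    """Trees worth cutting: enough value for each metre to be felled."""
    return sorted(
        oid for oid in tree_height.keys() & estimated_value.keys()
        if estimated_value[oid] / tree_height[oid] >= min_value_per_metre
    )


def bird_system(tree_height, nest_suitability, min_height, min_suitability):
    """Trees worth nesting in: tall enough and suitable enough."""
    return sorted(
        oid for oid in tree_height.keys() & nest_suitability.keys()
        if tree_height[oid] >= min_height
        and nest_suitability[oid] >= min_suitability
    )


to_cut = woodcutter_system(tree_height, estimated_value, 4.0)
to_nest = bird_system(tree_height, nest_suitability, 10.0, 0.5)
```

There is a single copy of each height, so there is nothing to reconcile and no agreement exception to throw.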
243
244 As for the specifics of what a component-entity language might
245 look like, I leave that as a creative exercise for the reader.
246 \ref{exc}\sidenote{I \em{do} have ideas. Someday I will implement them.}
1 ../drafts/subjects-and-entities.telml