\meta{("subjects-and-entities" "subjects and entities" ("programming"))}

Object-oriented programming is very bad for high-performance programs
like video games. Or, at least, traditional object-oriented design is.

The main culprit is the cache. If I have some \em{thing} in a game (say,
an enemy character), then it needs a lot of data associated with it: its
3D meshes and textures, its physics information, its current position, its
inventory, its AI state… But not all that information is necessarily
being used at any given time. For example, when I am running the
AI processing step, or calculating physical forces acting on
characters, or drawing things in the game, then I'm only using a small
part of all the information associated with a thing.

In a proper object-oriented design, I would take all the information relevant
to a thing and package it up in an object: for example, in a \tt{Game_Unit}
object. That means, however, that in order
to operate on \em{just a subset} of that information, I need to pull
the object into the cache, and because the cache loads whole lines of
memory rather than individual fields, the object's \em{unrelated} information
comes along with it. This means the cache gets full of extraneous data,
and consequently cache misses become much more common, making the
program slower overall.

One way of avoiding this problem is using so-called
\link{http://gamesfromwithin.com/data-oriented-design|Data-Oriented Design},
which involves pulling apart the data stored in objects and storing each
kind of data separately. This has the advantage of being much more cache-friendly,
but it generally doesn't have language support, and so has to be done
manually. The advantages of typical object-oriented design, like encapsulation
and data hiding, are much more difficult to retain, and the language you
use might actively fight against you in these cases.

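To make the contrast concrete, here is a minimal sketch of the two layouts in Python. Everything in it is my own illustration: the \tt{GameUnit} fields, the \tt{UnitData} class, and the function names are all hypothetical. Python won't let you control memory layout the way C or C++ would, so treat this as showing the \em{shape} of the two designs rather than their performance; the one real guarantee is that the standard \tt{array} module stores its values packed together.

```python
from array import array
from dataclasses import dataclass, field


@dataclass
class GameUnit:
    """Object-oriented layout: every field of a unit lives in one object."""
    x: float
    y: float
    health: int
    inventory: list = field(default_factory=list)
    ai_state: str = "idle"


def move_units_oo(units, dx):
    # Touches whole GameUnit objects even though only `x` is needed.
    for unit in units:
        unit.x += dx


class UnitData:
    """Data-oriented layout: one packed array per kind of data."""

    def __init__(self):
        self.xs = array("d")       # all x positions, stored contiguously
        self.ys = array("d")       # all y positions
        self.healths = array("i")  # all health values

    def add(self, x, y, health):
        self.xs.append(x)
        self.ys.append(y)
        self.healths.append(health)
        return len(self.xs) - 1    # index identifying the new unit


def move_units_dod(data, dx):
    # Walks a single packed array; health and the rest are never touched.
    for i in range(len(data.xs)):
        data.xs[i] += dx


units = [GameUnit(0.0, 0.0, 10), GameUnit(5.0, 1.0, 7)]
move_units_oo(units, 1.5)

data = UnitData()
data.add(0.0, 0.0, 10)
data.add(5.0, 1.0, 7)
move_units_dod(data, 1.5)
```

Notice that \tt{UnitData} has to keep its parallel arrays in sync by hand, which is exactly the kind of manual bookkeeping that the language gives you no help with.
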
Following a similar path, some programmers have started using and advocating
\link{http://www.gamedev.net/page/resources/_/technical/game-programming/understanding-component-entity-systems-r3013|
Component-Entity Systems},
which go a few steps beyond Data-Oriented Design in roughly the same direction,
but which are also a radically different way of structuring programs. At a high level,
Component-Entity Systems work like this:

\ul{
\li{An \em{entity} is an abstract identifier. It has no
other data \em{directly} associated with it.}
\li{A \em{component} is a table which maps entities to a collection
of data, ideally to pieces of simple scalar data. Not
every entity needs to appear in a given component.}
\li{An entity can be associated with several components. That
means the entity can be used as an index into each of those components,
allowing the programmer to retrieve or modify the associated
data stored in the component.}
\li{Operations are written in terms of one or more components.
These operations are sometimes called \em{systems}.}
}

This all sounds very abstract, so what does this look like in practice? Let's
describe a basic game. For the sake of simplicity, let's assume
we have four salient pieces of information for an in-game unit:
its position, its current health, its appearance, and the state of
its AI (including current goals and actions):

\code
{\ttkw{component} Position(int x, int y);
\ttkw{component} Health(int x);
\ttkw{component} Appearance(Image img);
\ttkw{component} AI_State(State st);
}

Every entity in the game will be associated with one or more
of these components: scenery will have \tt{Position} and
\tt{Appearance} data; invulnerable characters might have \tt{Position},
\tt{Appearance}, and \tt{AI_State}, but no \tt{Health}; a player
character will have \tt{Position}, \tt{Health}, and \tt{Appearance},
but no \tt{AI_State}; and so forth.
Well: I've been saying that entities \em{have} this data, but it's more
appropriate to say they're \em{associated with} that data. All
the relevant data is stored in the components, and the entity
is merely the index used to access that data.
\ref{db}
\sidenote{If this is hard to visualize, think of components as database
tables, and your entity as the primary key used in all your tables,
which you can use to access or update that data.
You of course wouldn't want to \em{actually} implement a game like
that, but it's similar in spirit.}

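Since the \tt{\ttkw{component}} syntax above is pseudocode, here is a rough Python sketch of the same idea, with each component as a table keyed by entity. The names (\tt{new_entity}, \tt{components_of}, the image file names) are made up for illustration, and I'm using plain dictionaries to keep it short; a real implementation would want packed storage to get the cache benefits.

```python
import itertools

_next_id = itertools.count()


def new_entity():
    """An entity is only an identifier; here, a fresh integer."""
    return next(_next_id)


# Each component is a table mapping entity -> data (hypothetical layout).
position = {}    # entity -> (x, y)
health = {}      # entity -> int
appearance = {}  # entity -> image name (stand-in for real image data)
ai_state = {}    # entity -> state name

rock = new_entity()          # scenery: Position and Appearance only
position[rock] = (3, 4)
appearance[rock] = "rock.png"

player = new_entity()        # player character: no AI_State
position[player] = (0, 0)
health[player] = 100
appearance[player] = "hero.png"

ghost = new_entity()         # invulnerable character: no Health
position[ghost] = (9, 9)
appearance[ghost] = "ghost.png"
ai_state[ghost] = "wander"


def components_of(entity):
    """List the names of the components an entity appears in."""
    tables = {"Position": position, "Health": health,
              "Appearance": appearance, "AI_State": ai_state}
    return [name for name, table in tables.items() if entity in table]
```

Nothing is stored \em{on} \tt{rock}, \tt{player}, or \tt{ghost}: they are just integers, and asking what an entity "has" means checking which tables it appears in.
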
Now we can write the salient \em{operations} of our game
in terms of one or more components: when we draw
our game, we write the draw operation in terms of \tt{Appearance},
which allows us to loop over everything that has image data
associated with it. On the other hand, when we want to move units
around, we'll write an operation in terms of both the \tt{AI_State}
\em{and} \tt{Position} components, because we need to know what
the unit plans to do in order to update its position.

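As a sketch of what those two operations might look like, assuming the dictionary-per-component layout (which is my assumption, not a standard API): \tt{draw_system}, \tt{movement_system}, and the \tt{STEPS} table are hypothetical names, and "drawing" here just returns the list of things that would be drawn.

```python
# Hypothetical component tables: entity id -> data.
position = {0: (3, 4), 1: (0, 0), 2: (9, 9)}
appearance = {0: "rock.png", 1: "hero.png", 2: "ghost.png"}
ai_state = {2: "move_left"}   # only entity 2 has an AI

# Hypothetical AI states and the step each one implies.
STEPS = {"move_left": (-1, 0), "move_right": (1, 0), "idle": (0, 0)}


def draw_system(appearance):
    """Loop over every entity with image data; return what would be drawn."""
    return [(entity, image) for entity, image in sorted(appearance.items())]


def movement_system(ai_state, position):
    """Update Position for entities that appear in both AI_State and Position."""
    for entity, state in ai_state.items():
        if entity in position:
            x, y = position[entity]
            dx, dy = STEPS[state]
            position[entity] = (x + dx, y + dy)


drawn = draw_system(appearance)
movement_system(ai_state, position)
```

The movement operation never looks at \tt{appearance}, and the draw operation never looks at \tt{ai_state}; each one only sees the tables it was written against.
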
The advantages of Data-Oriented Design that I described earlier are
still in effect, because all the data associated with a component
can be stored packed together: looping over the \tt{Appearance}
component won't bring any non-\tt{Appearance} data into the cache.
But Component-Entity Systems have an extra advantage
over pure Data-Oriented Design: you gain \em{compositionality} in
a way that's not necessarily present in other designs.\ref{bom}
\sidenote{
\link{http://t-machine.org/index.php/2013/05/30/designing-bomberman-with-an-entity-system-which-components/|
This blog post about applying component-entity design to Bomberman}
has a section called
\em{Consider the possibilities of your new Components}, in which
the author explores the compositions of components as interesting avenues
of discovering new gameplay. It also goes into a lot more detail
about what a component-entity approach would look like.
}

For example, you might design a game with a \tt{Health_Pickup} component for
items that restore health when a player interacts with them. An
entity that is associated with both the \tt{Health_Pickup} and
\tt{AI_State} components will act as a mobile health powerup
that can choose how to move around using some kind of AI routine.
On the other hand, an entity associated with both
the \tt{Health_Pickup} and \tt{Health} components is a health pickup
which can be destroyed, perhaps so that it cannot be used by an
opposing player. In both those cases, no extra implementation work
would need to be done for these conjunctions of components: the new,
interesting functionality falls out naturally from implementing
each feature in isolation.

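Here is a small sketch of that claim, again with hypothetical names throughout (the entity constants, \tt{destroy}, and the three system functions are all mine). Each system is written against its own components and knows nothing about the others, yet the walking pickup and the destructible pickup both work without any code written specifically for them.

```python
# Hypothetical component tables: entity id -> data.
position = {}
health = {}
health_pickup = {}  # entity -> amount of health restored
ai_state = {}

PLAYER, MEDKIT, WALKING_MEDKIT, FRAGILE_MEDKIT = range(4)

position[PLAYER] = (0, 0)
health[PLAYER] = 50

position[MEDKIT] = (0, 0)
health_pickup[MEDKIT] = 25

position[WALKING_MEDKIT] = (5, 5)        # Health_Pickup + AI_State
health_pickup[WALKING_MEDKIT] = 25
ai_state[WALKING_MEDKIT] = "move_left"

position[FRAGILE_MEDKIT] = (7, 7)        # Health_Pickup + Health
health_pickup[FRAGILE_MEDKIT] = 25
health[FRAGILE_MEDKIT] = 1


def destroy(entity):
    """Remove the entity from every component table."""
    for table in (position, health, health_pickup, ai_state):
        table.pop(entity, None)


def movement_system():
    """AI_State + Position: move anything that has both."""
    for entity, state in ai_state.items():
        if state == "move_left" and entity in position:
            x, y = position[entity]
            position[entity] = (x - 1, y)


def damage_system(entity, amount):
    """Health: anything with health can be damaged and destroyed."""
    if entity not in health:
        return
    health[entity] -= amount
    if health[entity] <= 0:
        destroy(entity)


def pickup_system(player):
    """Health_Pickup + Position: heal a player standing on a pickup."""
    for entity, amount in list(health_pickup.items()):
        if entity != player and position.get(entity) == position[player]:
            health[player] += amount
            destroy(entity)


movement_system()                  # the walking medkit wanders off
damage_system(FRAGILE_MEDKIT, 1)   # the fragile medkit gets destroyed
pickup_system(PLAYER)              # the player collects the plain medkit
```
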
While Component-Entity Systems are interesting,
there are no \em{languages} that are inherently component-entity-oriented.\ref{lng}
\sidenote{At least, if there are, I don't know about them.}
Component-Entity Systems are usually implemented using existing
object-oriented languages.
So my pseudocode examples above which used the \tt{\ttkw{component}} keyword
were all pure fiction. But what \em{would} a component-entity language
look like?

I'm going to change pace a bit and discuss an old and sadly
mostly-forgotten paper:
\link{http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.131.4805&rep=rep1&type=pdf|
Harrison and Ossher's 1993 OOPSLA paper,
\em{Subject-Oriented Programming: A Critique of Pure Objects}}.
The primary motivation behind the paper is addressing
what they see as a flaw in object-oriented design: most
object-oriented languages include inheritance, and therefore involve
a tree of \em{is-a} relationships. However, objects don't exist in
just a single place in a single ontology: a set of objects can be seen
as occupying multiple places in multiple ontologies.
As a concrete but slightly fanciful example: do we choose a culinary
ontology for our program and write \tt{Tomato extends Vegetable},
or do we choose a biological ontology and write \tt{Tomato extends Fruit}?
What if one part of the program needs one and another part needs
the other?

More concisely, as Jorge Luis Borges\ref{borg} said:
\sidenote{From the Jorge Luis Borges essay
\link{http://www.crockford.com/wrrrld/wilkins.html|\em{The Analytical Language of John Wilkins}}.}

\blockquote
{It is clear that there is no classification of the Universe that is
not arbitrary and full of conjectures. The reason for this is very
simple: we do not know what kind of thing the universe is.}

Harrison and Ossher propose that, instead of dealing with objects
that are instances of classes, we deal with \em{subjects}, which
\em{can be seen as} instances of classes. Any interaction with an
object happens within some subjective context,
and the data and operations associated with the object can differ
from one context to another. The only thing shared between
contexts is the identity of the objects:

\blockquote
{The essential characteristic of subject-oriented programming is
that different subjects can separately define and operate upon
shared objects, without any subject needing to know the details
associated with those objects by other subjects. Only object
identity is necessarily shared.
}

Harrison and Ossher's proposed system is in some respects very
similar to component-entity systems, but is in other respects quite
different. There is certainly commonality: what a component-entity
system calls an \em{entity}, a subject-oriented language calls an
\em{object-identifier} or an \em{oid}, and both consider entities
or oids to be abstract identifiers with no directly associated
information.

In a subject-oriented language, an operation must exist within a
given \em{subject activation}, which exposes pieces of information
associated with the entity and a set of operations. An entity can,
within a given subject activation, have fields and methods, and
those fields and methods can be entirely distinct from the fields
and methods exposed by a different subject activation. Every
method invocation, therefore, has to exist within a subject
activation, so that we know the actions and fields available
within that subjective frame.

A salient \em{difference} is the way that subject-oriented programming allows
certain pieces of information, or certain operations, to be shared
among different subject activations. Harrison and Ossher's repeated
example involves a \tt{Tree} object shared by, among others,
a \tt{Woodcutter} subject and a \tt{Bird} subject. A \tt{Woodcutter}'s
view of the tree has an estimated value, which the woodcutter
might use to determine whether the tree is worth cutting down. On
the other hand, a \tt{Bird}'s view of the tree involves its
suitability for building a nest in. In both cases, though, they might
care about a piece of information like the tree's height.

However, Harrison and Ossher's approach to this issue seems awkward:
they suggest that,
rather than straightforwardly sharing the height between the subjects,
the \tt{Bird} subject and the \tt{Woodcutter}
subject should both have \em{their own copy} of a field representing
the tree's height, and
that the two must be \em{made to} agree: they must return
the same value, or some compatible value (for example, by returning
some value which is commutative). If they \em{fail} to agree, the
program throws an exception. This is almost \em{certain} to be a
source of frustration in practice, or at least a major source of
gotchas.

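To show why I find that awkward, here is a toy model of the copy-and-agree scheme as I've described it. To be clear, this is my own illustration and not the paper's actual mechanism or notation: the per-subject tables, \tt{shared_height}, and \tt{DisagreementError} are all hypothetical.

```python
class DisagreementError(Exception):
    """Raised when two subjects hold incompatible values for shared data."""


# Each subject keeps its own table of facts about each object identifier.
woodcutter = {"height": {}, "estimated_value": {}}
bird = {"height": {}, "nest_suitability": {}}

OAK = 1  # an oid: just an identifier

woodcutter["height"][OAK] = 20
woodcutter["estimated_value"][OAK] = 300
bird["height"][OAK] = 20
bird["nest_suitability"][OAK] = "good"


def shared_height(oid):
    """Return the height only if both subjects' copies agree."""
    a = woodcutter["height"][oid]
    b = bird["height"][oid]
    if a != b:
        raise DisagreementError(f"height of {oid}: {a} != {b}")
    return a


agreed = shared_height(OAK)

bird["height"][OAK] = 21  # one subject updates its copy, the other doesn't
try:
    shared_height(OAK)
    failed = False
except DisagreementError:
    failed = True
```

The second call fails only because somebody forgot to update the other copy, which is the sort of gotcha I mean.
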
The Harrison and Ossher approach also describes how to mediate
two distinct object hierarchies, so that different subject
activations can use inheritance over the same set of classes
in very different ways. (The \tt{Cook} subject, for example,
could use \tt{Tomato extends Vegetable}, while the \tt{Botanist}
subject could use \tt{Tomato extends Fruit}, so a given
oid can be seen by both as being situated within different
hierarchies.) They then go on to describe how one subject's
class hierarchy might be incomplete with respect to another
subject's hierarchy, and describe how to match those hierarchies
together, or infer class hierarchies based on interfaces or other
mechanisms.

I would argue that the best thing to do is to combine the high-level
details of Harrison and Ossher's subject-oriented language design
with the specific mechanisms used in component-entity systems.
They clearly have a common starting
point and a similar approach to modeling the world, in which abstract
entities \em{can be viewed in some context} as having associated operations
and information. The Harrison and Ossher approach unfortunately gets caught
in a quagmire of hierarchies and modeling, but much of that complexity can
be alleviated if we treat subject activations like
sets of components: suddenly, the \tt{Bird} and the \tt{Woodcutter}
subjects/systems can simply \em{share} the \tt{TreeHeight} component,
without having to resort to awkward and complicated agreement
strategies on the hierarchies or results involved.

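As a rough sketch of what I mean, here are the \tt{Bird} and \tt{Woodcutter} written as systems over components, with one shared \tt{TreeHeight} table. The table names, thresholds, and function names are hypothetical, and this is only one way the combination could look.

```python
# Components: tables from entity (oid) to data. Names are hypothetical.
tree_height = {}        # shared by both subjects
estimated_value = {}    # only the Woodcutter cares about this
nest_suitability = {}   # only the Bird cares about this

OAK, SAPLING = 1, 2
tree_height[OAK] = 20
tree_height[SAPLING] = 2
estimated_value[OAK] = 300
estimated_value[SAPLING] = 5
nest_suitability[OAK] = 0.9
nest_suitability[SAPLING] = 0.4


def woodcutter_targets(min_height, min_value):
    """Woodcutter 'subject': uses TreeHeight and EstimatedValue."""
    return sorted(e for e in estimated_value
                  if e in tree_height
                  and tree_height[e] >= min_height
                  and estimated_value[e] >= min_value)


def bird_nest_sites(min_height, min_suitability):
    """Bird 'subject': uses TreeHeight and NestSuitability."""
    return sorted(e for e in nest_suitability
                  if e in tree_height
                  and tree_height[e] >= min_height
                  and nest_suitability[e] >= min_suitability)


cut_before = woodcutter_targets(10, 100)
nests_before = bird_nest_sites(5, 0.5)

tree_height[SAPLING] = 12   # one update, seen by both subjects

cut_after = woodcutter_targets(10, 1)
nests_after = bird_nest_sites(10, 0.3)
```

There is only one height per tree, so there is nothing to keep in agreement: the sapling's growth is a single write that both views pick up.
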
As for the specifics of what a component-entity language might
look like, I leave that as a creative exercise for the reader.
\ref{exc}\sidenote{I \em{do} have ideas. Someday I will implement them.}