Commit 7d4b2b6960a535c99451e60202107598f9734107 - when-computer

Added subjects-and-entities post Getty Ritter 9 years ago

2 changed file(s) with 247 addition(s) and 0 deletion(s).

+246

-0

drafts/subjects-and-entities.telml

	1	\meta{("subjects-and-entities" "subjects and entities" ("programming"))}
	2
	3	Object-oriented programming is very bad for high-performance programs
	4	like video games. Or, at least, traditional object-oriented design is.
	5
	6	The main culprit is the cache. If I have some \em{thing} in a game—say,
	7	an enemy character—then it needs a lot of data associated with it: its
	8	3D meshes and textures, its physics information, its current position, its
	9	inventory, its AI state… But not all that information is necessarily
	10	being used at any given time. For example, when I am running the
	11	AI processing step, or calculating physical forces acting on
	12	characters, or drawing things in the game, then I'm only using a small
	13	part of all the information associated with a thing.
	14
	15	In a proper object-oriented design, I would take all the information relevant
	16	to a thing and package it up in an object: for example, in a \tt{Game_Unit}
	17	object. That means, however, that in order
	18	to operate on \em{just a subset} of that information, I need to pull
	19	the object into the cache, which puts \em{all} the object's information
	20	in the cache. This means the cache gets full of extraneous data,
	21	and consequently cache misses become much more common, making the
	22	program slower overall.
	23
	24	One way of avoiding this problem is using so-called
	25	\link{http://gamesfromwithin.com/data-oriented-design\|Data-Oriented Design},
	26	which involves pulling apart the data stored in objects and storing it
	27	all separately. This has the advantage of being much more cache-friendly,
	28	but it generally doesn't have language support, and so has to be done
	29	manually. The advantages of typical object-oriented design—like encapsulation
	30	and data hiding—are much more difficult to retain, and the language you
	31	use might actively fight against you in these cases.
	32
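To make the contrast concrete, here is a minimal sketch in Python. The names \tt{GameUnit} and \tt{UnitStore} are hypothetical, and Python gives far less control over memory layout than C or C++ would, so the \tt{array} module only stands in for the packed per-field storage that Data-Oriented Design aims for.

```python
from array import array
from dataclasses import dataclass


# Object-oriented layout: every unit carries all of its fields together.
@dataclass
class GameUnit:  # hypothetical name, echoing the Game_Unit example above
    x: float
    y: float
    health: int


# Data-oriented layout: one packed array per field, indexed by unit number.
class UnitStore:  # hypothetical name
    def __init__(self) -> None:
        self.xs = array("d")       # all x positions, stored contiguously
        self.ys = array("d")       # all y positions, stored contiguously
        self.healths = array("i")  # all health values, stored contiguously

    def add(self, x: float, y: float, health: int) -> int:
        self.xs.append(x)
        self.ys.append(y)
        self.healths.append(health)
        return len(self.xs) - 1    # index of the unit just added

    def apply_gravity(self, dy: float) -> None:
        # Reads and writes only the y array; x and health are never touched.
        for i in range(len(self.ys)):
            self.ys[i] += dy


store = UnitStore()
first = store.add(0.0, 10.0, 100)
second = store.add(5.0, 20.0, 50)
store.apply_gravity(-1.5)
```

The bookkeeping in \tt{add} is exactly the manual work described above: nothing in the language keeps the three arrays in step with one another.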
	33	Following a similar path, some programmers have started using and advocating
	34	\link{http://www.gamedev.net/page/resources/_/technical/game-programming/understanding-component-entity-systems-r3013\|
	35	Component-Entity Systems},
	36	which are a few steps beyond Data-Oriented Design in roughly the same direction,
	37	but also a radically different way of structuring programs. At a high level,
	38	Component-Entity Systems work like this:
	39
	40	\ul{
	41	\li{An \em{entity} is an abstract identifier. It has no
	42	other data \em{directly} associated with it.}
	43	\li{A \em{component} is a table which maps entities to a collection
	44	of data, ideally to pieces of simple scalar data. Not
	45	every entity needs to appear in a given component.}
	46	\li{An entity can be associated with several components. That
	47	means the entity can be used as an index into each of those components,
	48	allowing the programmer to retrieve or modify the data a component
	49	stores for it.}
	50	\li{Operations are written in terms of one or more components.
	51	These are sometimes called \em{systems}.}
	52	}
	53
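The four points above can be sketched in a few lines of Python. This is only an illustration of the data model, under the assumption that a plain \tt{dict} is an acceptable stand-in for a component table (it is not cache-friendly); \tt{new_entity} and \tt{damage_all} are hypothetical names.

```python
from itertools import count
from typing import Dict, Tuple

Entity = int  # an entity is nothing more than an opaque identifier

_next_id = count()


def new_entity() -> Entity:
    """Hand out a fresh identifier; no data is attached to it."""
    return next(_next_id)


# Components: tables mapping entities to simple data.
position: Dict[Entity, Tuple[int, int]] = {}
health: Dict[Entity, int] = {}

rock = new_entity()
player = new_entity()

position[rock] = (3, 4)
position[player] = (0, 0)
health[player] = 10  # the rock never appears in the health table


def damage_all(health_table: Dict[Entity, int], amount: int) -> None:
    """A system: an operation written in terms of one component."""
    for entity in health_table:
        health_table[entity] = max(0, health_table[entity] - amount)


damage_all(health, 3)
```

After \tt{damage_all} runs, only the player has changed: the rock is untouched because it was never in the \tt{health} table in the first place.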
	54	This all sounds very abstract, so what does this look like in practice? Let's
	55	describe a basic game. For the sake of simplicity, let's assume
	56	we have four salient pieces of information for an in-game unit:
	57	its position, its current health, its appearance, and the state of
	58	its AI (including current goals and actions):
	59
	60	\code
	61	{\ttkw{component} Position(int x, int y);
	62	\ttkw{component} Health(int hp);
	63	\ttkw{component} Appearance(Image img);
	64	\ttkw{component} AI_State(State st);
	65	}
	66
	67	Every entity in the game will be associated with one or more
	68	of these components: scenery will have \tt{Position} and
	69	\tt{Appearance} data; invulnerable characters might have \tt{Position},
	70	\tt{Appearance}, and \tt{AI_State}, but no \tt{Health}; a player
	71	character will have \tt{Position}, \tt{Health}, and \tt{Appearance},
	72	but no \tt{AI_State}; and so forth.
	73	Well: I've been saying that entities \em{have} this data, but it's more
	74	appropriate to say they're \em{associated with} that data. All
	75	the relevant data is stored in the components, and the entity
	76	serves as the index used to access that data.
	77	\ref{db}
	78	\sidenote{If this is hard to visualize, think of components as database
	79	tables, and your entity as the primary key used in all your tables,
	80	which you can use to access or update that data.
	81	You of course wouldn't want to \em{actually} implement a game like
	82	that, but it's similar in spirit.}
	83
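The database analogy can be taken literally for a moment. The sketch below uses Python's standard \tt{sqlite3} module with an in-memory database; the table and column names are assumptions made for illustration, and this is a way to picture the idea rather than a way to build a game.

```python
import sqlite3

db = sqlite3.connect(":memory:")

# One table per component, keyed by the entity identifier.
db.executescript("""
    CREATE TABLE position (entity INTEGER PRIMARY KEY, x INTEGER, y INTEGER);
    CREATE TABLE health   (entity INTEGER PRIMARY KEY, hp INTEGER);
""")

# Entity 1 is scenery (position only); entity 2 also has health.
db.executemany("INSERT INTO position VALUES (?, ?, ?)", [(1, 0, 0), (2, 5, 5)])
db.execute("INSERT INTO health VALUES (?, ?)", (2, 10))

# Entities associated with both Position and Health.
rows = db.execute("""
    SELECT p.entity, p.x, p.y, h.hp
    FROM position AS p
    JOIN health AS h ON h.entity = p.entity
    ORDER BY p.entity
""").fetchall()
```

The join returns only entity 2, since entity 1 has no row in the \tt{health} table.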
	84	Now, to write the salient \em{operations} of our game, we can
	85	write them in terms of one or more components: when we draw
	86	our game, we write the draw operation in terms of \tt{Appearance},
	87	which allows us to loop over everything that has image data
	88	associated with it. On the other hand, when we want to move units
	89	around, we'll write an operation in terms of both the \tt{AI_State}
	90	\em{and} \tt{Position} components, because we need to know what
	91	the unit plans to do in order to update its position.
	92
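A minimal Python sketch of those two operations follows. It assumes, purely for illustration, that an \tt{AI_State} is just a planned step \tt{(dx, dy)} and that an \tt{Appearance} is an image file name; \tt{draw} collects draw calls instead of rendering anything.

```python
from typing import Dict, List, Tuple

Entity = int

position: Dict[Entity, Tuple[int, int]] = {1: (0, 0), 2: (5, 5), 3: (9, 9)}
appearance: Dict[Entity, str] = {1: "tree.png", 2: "orc.png", 3: "hero.png"}
ai_state: Dict[Entity, Tuple[int, int]] = {2: (1, 0)}  # planned step (dx, dy)


def draw(appearance: Dict[Entity, str]) -> List[Tuple[Entity, str]]:
    """Written in terms of Appearance only: visits everything with an image."""
    return sorted(appearance.items())


def move(ai_state: Dict[Entity, Tuple[int, int]],
         position: Dict[Entity, Tuple[int, int]]) -> None:
    """Written in terms of AI_State and Position: visits entities in both."""
    for entity in ai_state.keys() & position.keys():
        dx, dy = ai_state[entity]
        x, y = position[entity]
        position[entity] = (x + dx, y + dy)


drawn = draw(appearance)
move(ai_state, position)
```

Only entity 2 moves, because it is the only one present in both \tt{ai_state} and \tt{position}.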
	93	The advantages of Data-Oriented Design that I described earlier are
	94	still in effect, because all the data associated with a component
	95	can be stored packed together: looping over the \tt{Appearance}
	96	component won't bring any non-\tt{Appearance} data into the cache.
	97	But Component-Entity Systems have an extra advantage
	98	over pure Data-Oriented Design: you gain \em{compositionality} in
	99	a way that's not necessarily present in other designs.\ref{bom}
	100	\sidenote{
	101	\link{http://t-machine.org/index.php/2013/05/30/designing-bomberman-with-an-entity-system-which-components/\|
	102	This blog post about applying component-entity design to Bomberman}
	103	has a section called
	104	\em{Consider the possibilities of your new Components}, in which
	105	the author explores the compositions of components as interesting avenues
	106	of discovering new gameplay. It also goes into a lot more detail
	107	about what a component-entity approach would look like.
	108	}
	109
	110	For example:
	111	you might design a game with a \tt{Health_Pickup} component for
	112	items that restore health when a player interacts with them. An
	113	entity that is associated with both the \tt{Health_Pickup} and
	114	\tt{AI_State} components will act as a mobile health powerup
	115	that can choose how to move around using some kind of AI routine.
	116	On the other hand, an entity associated with both
	117	the \tt{Health_Pickup} and \tt{Health} components is a health pickup
	118	which can be destroyed, perhaps so that it cannot be used by an
	119	opposing player. In both those cases, no extra implementation work
	120	would need to be done for these conjunctions of components: the new,
	121	interesting functionality falls out naturally from implementing
	122	each feature in isolation.
	123
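Here is a small Python sketch of that claim. All of the names and the tiny rules (an AI state is a planned step, a pickup stores the amount it restores) are assumptions for illustration. The point to notice is that none of the three systems mentions pickups at all.

```python
Entity = int

position = {1: (0, 0), 2: (4, 4), 3: (7, 7)}
health = {1: 5, 3: 2}          # the player (1) and a destructible pickup (3)
health_pickup = {2: 3, 3: 3}   # entity -> amount of health restored
ai_state = {2: (1, 0)}         # pickup 2 wanders: planned step (dx, dy)


def move(ai_state, position):
    """Move every entity that has both an AI state and a position."""
    for entity in ai_state.keys() & position.keys():
        dx, dy = ai_state[entity]
        x, y = position[entity]
        position[entity] = (x + dx, y + dy)


def damage(health, target, amount):
    """Damage applies only to entities that appear in the health table."""
    if target in health:
        health[target] = max(0, health[target] - amount)


def remove_dead(health, *other_tables):
    """Drop entities whose health reached zero from every table given."""
    dead = [e for e, hp in health.items() if hp <= 0]
    for e in dead:
        del health[e]
        for table in other_tables:
            table.pop(e, None)
    return dead


move(ai_state, position)   # the mobile pickup moves like any AI-driven unit
damage(health, 3, 2)       # the destructible pickup takes damage
damage(health, 2, 2)       # pickup 2 has no Health, so nothing happens
removed = remove_dead(health, position, health_pickup, ai_state)
```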
	124	While Component-Entity Systems are interesting,
	125	there are no \em{languages} that are inherently component-entity-oriented.\ref{lng}
	126	\sidenote{At least, if there are, I don't know about them.}
	127	Component-Entity Systems are usually implemented using existing
	128	object-oriented languages.
	129	So my pseudocode examples above which used the \tt{\ttkw{component}} keyword
	130	were all pure fiction. But what \em{would} a component-entity language
	131	look like?
	132
	133	I'm going to change pace a bit and discuss an old and sadly
	134	mostly-forgotten paper:
	135	\link{http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.131.4805&rep=rep1&type=pdf\|
	136	Harrison and Ossher's 1993 OOPSLA paper,
	137	\em{Subject-Oriented Programming: A Critique of Pure Objects}}.
	138	The primary motivation behind the paper is addressing
	139	what they see as a flaw in object-oriented design: most
	140	object-oriented languages include inheritance, and therefore involve
	141	a tree of \em{is-a} relationships. However, objects don't exist in
	142	just a single place in a single ontology: a set of objects can be seen
	143	as occupying multiple places in multiple ontologies.
	144	As a concrete but slightly fanciful example: do we choose a culinary
	145	ontology for our program and write \tt{Tomato extends Vegetable},
	146	or do we choose a biological ontology and write \tt{Tomato extends Fruit}?
	147	What if one part of the program needs one and the other needs
	148	the other?
	149
	150	More concisely, as Jorge Luis Borges\ref{borg} said:
	151	\sidenote{From the Jorge Luis Borges essay
	152	\link{http://www.crockford.com/wrrrld/wilkins.html\|\em{The Analytical Language of John Wilkins}}.}
	153
	154	\blockquote
	155	{It is clear that there is no classification of the Universe that is
	156	not arbitrary and full of conjectures. The reason for this is very
	157	simple: we do not know what kind of thing the universe is.}
	158
	159	Harrison and Ossher propose that, instead of dealing with objects
	160	that are instances of classes, we deal with \em{subjects}, which
	161	\em{can be seen as} instances of classes. Any time an object is
	162	interacted with, it is interacted with in a subjective context,
	163	in which the data and operations associated with it can be different
	164	depending on the subjective context. The only thing shared between
	165	the contexts is the identity of the objects:
	166
	167	\blockquote
	168	{The essential characteristic of subject-oriented programming is
	169	that different subjects can separately define and operate upon
	170	shared objects, without any subject needing to know the details
	171	associated with those objects by other subjects. Only object
	172	identity is necessarily shared.
	173	}
	174
	175	Harrison and Ossher's proposed system is in some respects very
	176	similar to component-entity systems, but is in other respects quite
	177	different. There is certainly commonality: what a component-entity
	178	system calls an \em{entity}, a subject-oriented language calls an
	179	\em{object-identifier} or an \em{oid}, and both consider entities
	180	or oids to be abstract identifiers with no directly associated
	181	information.
	182
	183	In a subject-oriented language, an operation must exist within a
	184	given \em{subject activation}, which exposes pieces of information
	185	associated with the entity and a set of operations. An entity can,
	186	within a given subject activation, have fields and methods, and
	187	those fields and methods can be entirely distinct from the fields
	188	and methods exposed by a different subject activation. Every
	189	method invocation, therefore, has to exist within a subject
	190	activation, so that we know the actions and fields available
	191	within that subjective frame.
	192
	193	A salient \em{difference} is the way that subject-oriented programming allows
	194	certain pieces of information, or certain operations, to be shared
	195	among different subject activations. Harrison and Ossher's repeated
	196	example involves a \tt{Tree} object shared by, among others,
	197	a \tt{Woodcutter} subject and a \tt{Bird} subject. A \tt{Woodcutter}'s
	198	view of the tree has an estimated value, which the woodcutter
	199	might use to determine whether the tree is worth cutting down. On
	200	the other hand, a \tt{Bird}'s view of the tree involves its
	201	suitability for building a nest in. In both cases, though, they might
	202	care about a piece of information like the tree's height.
	203
	204	However, Harrison and Ossher's approach to this issue seems awkward:
	205	they suggest that,
	206	rather than straightforwardly sharing the height between the subjects,
	207	the \tt{Bird} subject and the \tt{Woodcutter}
	208	subject should both have \em{their own copy} of a field representing
	209	the tree's height, and
	210	that the two must be \em{made to} agree: they must return
	211	the same value, or some compatible value (for example, by returning
	212	some value which is commutative). If they \em{fail} to agree, the
	213	program throws an exception. This is almost \em{certain} to be a
	214	source of frustration in practice, or at least a major source of
	215	gotchas.
	216
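To show why this is awkward, here is a rough Python sketch of the agreement requirement as described above. It is based only on that description and is not the mechanism from the paper; \tt{Subject}, \tt{DisagreementError}, and \tt{shared_height} are hypothetical names, and only the simplest rule (all copies must be equal) is modelled.

```python
class DisagreementError(Exception):
    """Raised when subjects hold incompatible values for the same object."""


class Subject:
    def __init__(self, name):
        self.name = name
        self.height = {}  # oid -> this subject's own copy of the height


def shared_height(oid, subjects):
    """Return the height of oid if every subject that stores it agrees."""
    values = {s.name: s.height[oid] for s in subjects if oid in s.height}
    if not values:
        raise KeyError(oid)
    if len(set(values.values())) > 1:
        raise DisagreementError(f"oid {oid}: {values}")
    return next(iter(values.values()))


woodcutter = Subject("Woodcutter")
bird = Subject("Bird")
tree = 42  # an arbitrary object identifier

woodcutter.height[tree] = 30
bird.height[tree] = 30
agreed = shared_height(tree, [woodcutter, bird])

# One subject updates its copy and the other does not.
bird.height[tree] = 28
try:
    shared_height(tree, [woodcutter, bird])
    conflict = False
except DisagreementError:
    conflict = True
```

Every update to one copy has to be mirrored in the other, or the next read fails at run time.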
	217	The Harrison and Ossher approach also describes how to mediate
	218	two distinct object hierarchies, so that different subject
	219	activations can use inheritance over the same set of classes
	220	in very different ways. (The \tt{Cook} subject, for example,
	221	could use \tt{Tomato extends Vegetable}, while the \tt{Botanist}
	222	subject could use \tt{Tomato extends Fruit}, so a given
	223	oid can be seen by both as being situated within different
	224	hierarchies.) They then go on to describe how one subject's
	225	class hierarchy might be incomplete with respect to another
	226	subject's hierarchy, and describe how to match those hierarchies
	227	together, or infer class hierarchies based on interfaces or other
	228	mechanisms.
	229
	230	I would argue that the best thing to do is to combine the high-level
	231	details of Harrison and Ossher's subject-oriented language design
	232	with the specific mechanisms used in component-entity systems.
	233	They clearly have a common starting
	234	point and a similar approach to modeling the world, in which abstract
	235	entities \em{can be viewed in some context} as having associated operations
	236	and information. The Harrison and Ossher approach unfortunately gets caught
	237	in a quagmire of hierarchies and modeling, but much of that complexity can
	238	be alleviated if we treat subject activations like
	239	sets of components: suddenly, the \tt{Bird} and the \tt{Woodcutter}
	240	subjects/systems can simply \em{share} the \tt{TreeHeight} component,
	241	without having to resort to awkward and complicated agreement
	242	strategies on the hierarchies or results involved.
	243
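As a sketch of that last point, here the two subjects are written in Python as systems over component tables. The table names other than \tt{TreeHeight}, the thresholds, and the selection rules are all assumptions made for illustration.

```python
tree_height = {1: 30, 2: 8}            # the shared TreeHeight component
estimated_value = {1: 120, 2: 15}      # used only by the Woodcutter
nest_suitability = {1: 0.9, 2: 0.2}    # used only by the Bird


def woodcutter_targets(tree_height, estimated_value, min_height, min_value):
    """Trees the woodcutter considers worth cutting down."""
    return sorted(
        e for e in tree_height.keys() & estimated_value.keys()
        if tree_height[e] >= min_height and estimated_value[e] >= min_value
    )


def bird_nesting_sites(tree_height, nest_suitability, min_height):
    """Trees the bird considers suitable for a nest."""
    return sorted(
        e for e in tree_height.keys() & nest_suitability.keys()
        if tree_height[e] >= min_height and nest_suitability[e] > 0.5
    )


before = (
    woodcutter_targets(tree_height, estimated_value, 10, 100),
    bird_nesting_sites(tree_height, nest_suitability, 10),
)

tree_height[1] = 5  # the tree is topped; there is one copy to update

after = (
    woodcutter_targets(tree_height, estimated_value, 10, 100),
    bird_nesting_sites(tree_height, nest_suitability, 10),
)
```

Both systems see the new height immediately, and there is no second copy that could fall out of agreement.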
	244	As for the specifics of what a component-entity language might
	245	look like, I leave that as a creative exercise for the reader.
	246	\ref{exc}\sidenote{I \em{do} have ideas. Someday I will implement them.}

-0

posts/subjects-and-entities.telml

../drafts/subjects-and-entities.telml⏎