Commit 8741fb2d2196d696831704e90d3cb872d4537949 - gcct

Initial commit of writeup Getty Ritter 11 years ago

1 changed file(s) with 425 addition(s) and 0 deletion(s).

gcct.tex

	1	\documentclass{scrartcl}
	2	\usepackage[letterpaper]{geometry}
	3	\usepackage{fullpage}
	4	\usepackage{cmbright}
	5	\usepackage[usenames,dvipsnames]{color}
	6	% \newcommand{\both}[2]{\begin{tabular}{ p{4in} p{4in} } #1 && #2 \\ \end{tabular}}
	7	\newcommand{\pos}[1]{\textcolor{BrickRed}{#1}}
	8	\newcommand{\postt}[1]{\textcolor{BrickRed}{\texttt{#1}}}
	9	\newcommand{\dual}[1]{\textcolor{NavyBlue}{#1}}
	10	\newcommand{\dualtt}[1]{\textcolor{NavyBlue}{\texttt{#1}}}
	11	\newenvironment{both}
	12	{ \begin{tabular}{ p{3in} p{3in} } }
	13	{ \\ \end{tabular} }
	14	\begin{document}
	15	\section{\pos{Data} and \dual{Codata}}
	16
	17	\begin{both}
	18	\pos{Data} is defined by its \pos{introduction} rules.
	19	& \dual{Codata} is defined by its \dual{elimination} rules.
	20	\\
	21	We can define a \pos{data} type with
	22	& We can define a \dual{codata} type with
	23	\\
	24	\color{BrickRed}
	25	\begin{verbatim}
	26	data Either a b
	27	= Left a
	28	| Right b\end{verbatim}
	29	&
	30	\color{NavyBlue}
	31	\begin{verbatim}
	32	codata Both a b
	33	= first a
	34	& second b\end{verbatim}
	35	\\
	36	This definition introduces two functions:
	37	& This definition introduces two functions:
	38	\\
	39	\color{BrickRed}
	40	\begin{verbatim}
	41	Left : a -> Either a b
	42	Right : b -> Either a b\end{verbatim}
	43	&
	44	\color{NavyBlue}
	45	\begin{verbatim}
	46	first : Both a b -> a
	47	second : Both a b -> b\end{verbatim}
	48	\\
	49	In order to \pos{destruct data} we have to use
	50	\pos{patterns} which let you match on \pos{constructors}.
	51	& In order to \dual{construct codata} we have to use
	52	\dual{copatterns} which let you match on \dual{destructors}.
	53	\\
	54	\color{BrickRed}
	55	\begin{verbatim}
	56	(case e : Either a b of
	57	Left (x : a) -> (e1 : c)
	58	Right (y : b) -> (e2 : c)) : c\end{verbatim}
	59	&
	60	\color{NavyBlue}
	61	\begin{verbatim}
	62	(merge x : Both a b from
	63	first x <- e1 : a
	64	second x <- e2 : b) : Both a b\end{verbatim}
	65	\\
	66	Here, \postt{e} represents the \pos{value being
	67	destructed}, and each branch represents a \pos{constructor
	68	with which it might have been constructed}. We are effectively
	69	\pos{dispatching on possible pasts} of
	70	\postt{e}.
	71	& Here, \dualtt{x} represents the \dual{value being
	72	constructed}, and each branch represents a \dual{destructor
	73	with which it might eventually be destructed}. We are effectively
	74	\dual{dispatching on possible futures} of
	75	\dualtt{x}.
	76	\\
	77	\end{both}
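
As a rough point of comparison, the two columns can be approximated in Haskell. This is only a sketch under an assumption: Haskell has no \texttt{codata} or \texttt{merge}, so a lazy record stands in for codata here, and the names \texttt{describe} and \texttt{mkBoth} are illustrative rather than part of the language described above.

```haskell
-- Data: defined by its constructors, consumed by pattern matching.
-- The Prelude's Either plays the role of the data type in the text.
describe :: Either Int Bool -> String
describe e = case e of
  Left n  -> "an Int: " ++ show n    -- e was built with Left
  Right b -> "a Bool: " ++ show b    -- e was built with Right

-- Codata, approximated by a record: defined by its projections.
data Both a b = Both { first :: a, second :: b }

-- Build the value by stating what each projection will return.
-- Fields are lazy, so a projection that is never used is never computed.
mkBoth :: Int -> Both Int Bool
mkBoth n = Both { first = n, second = n > 0 }
```

The \texttt{case} in \texttt{describe} has one arm per constructor, while the record expression in \texttt{mkBoth} has one field per destructor, which mirrors the arms of \texttt{merge}.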
	78
	79	\section{Codata, Records, and Copatterns}
	80	\raggedright
	81
	82	In the same way that named sums are a natural way to represent data,
	83	records are a natural way to represent codata. In lieu of the above
	84	syntax, one often sees codata represented as something more like
	85
	86	{\color{NavyBlue}
	87	\begin{verbatim}
	88	record Both a b = { .first : a, .second : b }
	89
	90	x : Both Int Bool
	91	x = { .first = 2, .second = true }
	92
	93	x.first == 2, x.second == true
	94	\end{verbatim}
	95	}
	96
	97	The \dualtt{merge} syntax is used here for conceptual
	98	symmetry with \postt{case}. Copatterns are also dual to
	99	patterns in the extra expressivity they offer.
	100	For example, we can use nested patterns with
	101	constructors of various types, as in this function which
	102	processes a list of \postt{Either Int Bool} values
	103	by summing the integers in the list until it reaches a
	104	\postt{false} value or the end of the list:
	105
	106	{\color{BrickRed}
	107	\begin{verbatim}
	108	f : List (Either Int Bool) -> Int
	109	f lst = case lst of
	110	Cons (Left i) xs -> i + f xs
	111	Cons (Right b) xs -> if b then f xs else 0
	112	Nil -> 0
	113	\end{verbatim}
	114	}
	115
	116	Similarly, we can define an infinite stream of pairs using
	117	nested copatterns like so:
	118
	119	{\color{NavyBlue}
	120	\begin{verbatim}
	121	s : Int -> Stream (Both Int Bool)
	122	s n = merge x from
	123	first (head x) <- n
	124	second (head x) <- n > 0
	125	tail x <- x
	126	\end{verbatim}
	127	}
	128
	129	Copatterns are also practically expressive, as in this concise
	130	and readable definition of the Fibonacci numbers in terms of
	131	the \dualtt{merge} expression:
	132
	133	{\color{NavyBlue}
	134	\begin{verbatim}
	135	fibs : Stream Int
	136	fibs = merge x from
	137	head x <- 0
	138	head (tail x) <- 1
	139	tail (tail x) <- zipWith (+) x (tail x)
	140	\end{verbatim}
	141	}
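
The same definition can be sketched in Haskell, assuming a lazy record for the stream. The names \texttt{hd}, \texttt{tl}, \texttt{zipWithS}, and \texttt{takeS} are chosen for this sketch and do not appear in the text above. Each nested copattern becomes one field of a nested record.

```haskell
-- A stream is codata with two projections; laziness makes it infinite.
data Stream a = Stream { hd :: a, tl :: Stream a }

-- Pointwise combination of two streams.
zipWithS :: (a -> b -> c) -> Stream a -> Stream b -> Stream c
zipWithS f xs ys = Stream
  { hd = f (hd xs) (hd ys)
  , tl = zipWithS f (tl xs) (tl ys)
  }

-- One field per copattern of the merge expression:
--   head x        <- 0
--   head (tail x) <- 1
--   tail (tail x) <- zipWith (+) x (tail x)
fibs :: Stream Int
fibs = Stream
  { hd = 0
  , tl = Stream
      { hd = 1
      , tl = zipWithS (+) fibs (tl fibs)
      }
  }

-- Observe a finite prefix of a stream.
takeS :: Int -> Stream a -> [a]
takeS n s
  | n <= 0    = []
  | otherwise = hd s : takeS (n - 1) (tl s)
```

For instance, \texttt{takeS 10 fibs} evaluates to \texttt{[0,1,1,2,3,5,8,13,21,34]}.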
	142
	143	\section{Generalizing \dual{Co}data}
	144
	145	\begin{both}
	146	A \pos{Generalized Algebraic Data Type} adds extra
	147	expressivity to \pos{data} types by allowing extra
	148	restrictions on the value being \pos{constructed}.
	149	& A \dual{Generalized Coalgebraic Codata Type} adds extra
	150	expressivity to \dual{codata} types by allowing extra
	151	restrictions on the value being \dual{destructed}.
	152	\\
	153	\color{BrickRed}
	154	\begin{verbatim}
	155	data Join a where
	156	Unit : Int -> Join Int
	157	Join : (Join a, Join a) ->
	158	Join (a, a)\end{verbatim}
	159	&
	160	\color{NavyBlue}
	161	\begin{verbatim}
	162	codata Split a where
	163	unit : Split Int -> Int
	164	split : Split (a, a) ->
	165	(Split a, Split a)\end{verbatim}
	166	\\
	167	This definition precludes the use of any value of the
	168	type \pos{Join(Int, Bool)} because there is no way
	169	to \pos{construct} a value of this type---we can only
	170	\pos{construct} values with pairs of the same type,
	171	or containing an \texttt{Int}.
	172	&
	173	This definition precludes the use of any value of the
	174	type \dual{Split(Int, Bool)} because there is no way
	175	to \dual{destruct} a value of this type---we can only
	176	\dual{destruct} values with pairs of the same type,
	177	or containing an \texttt{Int}.
	178	\\
	179	This gives us extra type information to omit nonsensical
	180	branches in \postt{case}, as well. The following code
	181	matches over a value of type \postt{Join Int}, and as
	182	such only has an arm for the \pos{constructor}
	183	\postt{Unit}---because it is the only \pos{constructor}
	184	which can \pos{create} a value of that type!
	185	&
	186	This gives us extra type information to omit nonsensical
	187	branches in \dualtt{merge}, as well. The following code
	188	matches over a value of type \dualtt{Split Int}, and as
	189	such only has an arm for the \dual{destructor}
	190	\dualtt{unit}---because it is the only \dual{destructor}
	191	which can \dual{destruct} a value of that type!
	192	\\
	193	\color{BrickRed}
	194	\begin{verbatim}
	195	f : Join Int -> Int
	196	f x = case x of
	197	Unit n -> n\end{verbatim}
	198	&
	199	\color{NavyBlue}
	200	\begin{verbatim}
	201	g : Int -> Split Int
	202	g n = merge x from
	203	unit x <- n\end{verbatim}
	204	\\
	205	We can \pos{construct} a value of type \postt{Join Int}
	206	in conjunction with using \postt{f} like so:
	207	& We can \dual{destruct} a value of type \dualtt{Split Int}
	208	in conjunction with using \dualtt{g} like so:
	209	\\
	210	\color{BrickRed}
	211	\begin{verbatim}
	212	let y : Join Int
	213	y = Unit 5
	214	in
	215	f y\end{verbatim}
	216	&
	217	\color{NavyBlue}
	218	\begin{verbatim}
	219	let y : Split Int
	220	y = g 5
	221	in
	222	unit y\end{verbatim}
	223	\\
	224	\end{both}
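
The left column is ordinary Haskell once the \texttt{GADTs} extension is on. The right column has no direct Haskell counterpart, so the sketch below uses one possible encoding: a codata value is a function that answers every observation valid at its index, and the observations themselves form a GADT. The names \texttt{SplitObs}, \texttt{ObsUnit}, \texttt{ObsSplit}, and \texttt{observe} are assumptions of this sketch.

```haskell
{-# LANGUAGE GADTs, RankNTypes #-}

-- The data side: a GADT whose constructors restrict the index.
data Join a where
  Unit :: Int -> Join Int
  Join :: (Join a, Join a) -> Join (a, a)

f :: Join Int -> Int
f x = case x of
  Unit n -> n   -- no Join arm: Join (a, a) can never be Join Int

-- The codata side: one observation per destructor, indexed by the
-- type being destructed (first) and the type of the result (second).
data SplitObs a r where
  ObsUnit  :: SplitObs Int Int
  ObsSplit :: SplitObs (a, a) (Split a, Split a)

-- A Split value answers every observation that is valid at its index.
newtype Split a = Split { observe :: forall r. SplitObs a r -> r }

unit :: Split Int -> Int
unit s = observe s ObsUnit

split :: Split (a, a) -> (Split a, Split a)
split s = observe s ObsSplit

g :: Int -> Split Int
g n = Split answer
  where
    answer :: SplitObs Int r -> r
    answer ObsUnit = n   -- no ObsSplit arm: (a, a) can never be Int
```

Under this encoding \texttt{f (Unit 5)} and \texttt{unit (g 5)} both evaluate to \texttt{5}, and in both cases the omitted arm is ruled out by the type index.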
	225
	226	\section{Focus on GCCTs}
	227
	228	To dispense with the dual presentation: a simple, high-level, hand-wavy
	229	description of GCCTs is that they are \textit{records whose available
	230	projections may be restricted by
	231	the type of the record}. Like GADTs, they can introduce and use extra
	232	typing information. For example, consider the following GCCT and its
	233	instantiations:
	234
	235	{\color{NavyBlue}
	236	\begin{verbatim}
	237	codata Wrapper a where
	238	contents : Wrapper a -> a
	239	isZero : Wrapper Int -> Bool
	240
	241	mkStringW : String -> Wrapper String
	242	mkStringW x = merge r from
	243	contents r <- x
	244
	245	mkIntW : Int -> Wrapper Int
	246	mkIntW x = merge r from
	247	contents r <- x
	248	isZero r <- x == 0
	249	\end{verbatim}
	250	}
	251
	252	A \dualtt{Wrapper} value is a wrapper over some other value, which can always be
	253	extracted using the projection \dualtt{contents}. However, if the value
	254	contained in the ``record'' is an \texttt{Int}, we have a second projection
	255	available to us: we can also use \dualtt{isZero} to project out a boolean.
	256
	257	\subsection{Finite State Automata}
	258
	259	We can use these in a few interesting ways: for example, consider the situation
	260	in which we have a finite state automaton for recognizing a language. For
	261	simplicity's sake, we'll restrict the states to \texttt{A}, \texttt{B}, and
	262	\texttt{C}, and allow the following valid transitions:
	263
	264	\begin{verbatim}
	265	A -> B,C
	266	B -> C
	267	C -> A
	268	\end{verbatim}
	269
	270	We want a data structure which represents the transitions available. We can do
	271	this, even in a strongly normalizing setting, by representing the FSA using
	272	a piece of infinite codata with projections for each state transition:
	273
	274	{\color{NavyBlue}
	275	\begin{verbatim}
	276	data State = A | B | C
	277	codata NFA (a : State) where
	278	aToB : NFA A -> NFA B
	279	aToC : NFA A -> NFA C
	280	bToC : NFA B -> NFA C
	281	cToA : NFA C -> NFA A
	282	\end{verbatim}
	283	}
	284
	285	The following expressions (which use \texttt{.} to mean
	286	function composition) type-check, because they represent valid
	287	state transitions:
	288
	289	\begin{verbatim}
	290	aState : NFA A
	291	aToB aState : NFA B
	292	(bToC . aToB) aState : NFA C
	293	(cToA . bToC . aToB) aState : NFA A
	294	\end{verbatim}
	295
	296	whereas the following do not:
	297
	298	\begin{verbatim}
	299	bToC aState -- cannot unify `NFA A` with `NFA B`
	300	(aToC . aToB) aState -- cannot unify `NFA B` with `NFA A`
	301	\end{verbatim}
	302
	303	We can construct an initial \dualtt{aState} value using the following
	304	\dualtt{merge} expression, using an extra \texttt{let}-binding to improve
	305	the sharing in this structure:
	306
	307	{\color{NavyBlue}
	308	\begin{verbatim}
	309	aState : NFA A
	310	aState =
	311	merge as from
	312	let cState = (merge cs from cToA cs <- as) in
	313	aToC as <- cState
	314	aToB as <- (merge bs from bToC bs <- cState)
	315	\end{verbatim}
	316	}
	317
	318	or much more concisely by using nested copatterns:
	319
	320	{\color{NavyBlue}
	321	\begin{verbatim}
	322	aState : NFA A
	323	aState = merge as from
	324	cToA (aToC as) <- as
	325	cToA (bToC (aToB as)) <- as
	326	\end{verbatim}
	327	}
	328
	329	We can make this more interesting (and more useful) by introducing an extra
	330	operation: suppose we're using this to recognize a language equivalent to the
	331	regular expression \texttt{/a(b?ca)*/}, and we want to be able to extract
	332	the string we've matched, but \textit{only} in the accept state
	333	\texttt{A}. We can change the codata definition to have such an accessor:
	334
	335	{\color{NavyBlue}
	336	\begin{verbatim}
	337	codata NFA (a : State) where
	338	aToB : NFA A -> NFA B
	339	aToC : NFA A -> NFA C
	340	bToC : NFA B -> NFA C
	341	cToA : NFA C -> NFA A
	342	matchedString : NFA A -> String
	343	\end{verbatim}
	344	}
	345
	346	and change the construction of the NFA to this:
	347
	348	{\color{NavyBlue}
	349	\begin{verbatim}
	350	aState : NFA A
	351	aState =
	352	let f str =
	353	merge as from
	354	matchedString as <- str
	355	cToA (aToC as) <- f (str ++ "ca")
	356	cToA (bToC (aToB as)) <- f (str ++ "bca")
	357	in f "a"
	358	\end{verbatim}
	359	}
	360
	361	Consequently the following holds true:
	362
	363	\begin{verbatim}
	364	(matchedString . cToA . aToC . cToA . bToC . aToB) aState == "abcaca"
	365	\end{verbatim}
	366
	367	whereas the following does not type-check:
	368
	369	\begin{verbatim}
	370	(matchedString . bToC . aToB) aState
	371	\end{verbatim}
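
One way to try this automaton out in Haskell is to give each state its own record through a data family, so that a projection simply does not exist at the wrong index. This is a sketch under assumptions: the constructor names \texttt{InA}, \texttt{InB}, \texttt{InC} and the helper \texttt{go} are invented here, and Haskell's laziness stands in for the infinite codata.

```haskell
{-# LANGUAGE DataKinds, KindSignatures, TypeFamilies #-}

data State = A | B | C

-- One record per index: each state exposes only its own transitions.
data family NFA (s :: State)
data instance NFA 'A = InA
  { aToB          :: NFA 'B
  , aToC          :: NFA 'C
  , matchedString :: String   -- only available in the accept state
  }
newtype instance NFA 'B = InB { bToC :: NFA 'C }
newtype instance NFA 'C = InC { cToA :: NFA 'A }

-- Mirrors the nested copatterns: each path back to A extends the string.
aState :: NFA 'A
aState = go "a"
  where
    go :: String -> NFA 'A
    go str = InA
      { matchedString = str
      , aToC = InC { cToA = go (str ++ "ca") }
      , aToB = InB { bToC = InC { cToA = go (str ++ "bca") } }
      }

-- Accepted by the type checker, and evaluates to "abcaca".
accepted :: String
accepted = (matchedString . cToA . aToC . cToA . bToC . aToB) aState

-- Rejected by the type checker if uncommented, since bToC yields NFA 'C:
-- rejected = (matchedString . bToC . aToB) aState
```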
	372
	373	\subsection{A Forth-Like Embedded Language}
	374
	375	Consider the following codata type:
	376
	377	{\color{NavyBlue}
	378	\begin{verbatim}
	379	codata Forth a where
	380	push : Forth a -> b -> Forth (b, a)
	381	get : Forth (a, b) -> a
	382	drop : Forth (a, b) -> Forth b
	383	swap : Forth (a, (b, c)) -> Forth (b, (a, c))
	384	add : Forth (Int, (Int, c)) -> Forth (Int, c)
	385	\end{verbatim}
	386	}
	387
	388	This corresponds to a small embedded Forth-like sublanguage, in which
	389	a value of type \dualtt{Forth} corresponds to the state of a stack whose
	390	type is given by the type parameter. This means that not only is it an
	391	embedded Forth-like language, it is a well-typed language that can take
	392	arbitrary values in the host language and manipulate them on the stack.
	393	Assuming the operator \texttt{x \# f = f x}, and with \texttt{push} taking the
	394	pushed value first (that is, \texttt{flip push}), we can write a well-typed subprogram
	395
	396	{\color{NavyBlue}
	397	\begin{verbatim}
	398	initialState # push 5
	399	# push 2
	400	# push 3
	401	# swap
	402	# drop
	403	# add
	404	# get
	405	\end{verbatim}
	406	}
	407
	408	and also reject this ill-typed subprogram, where \texttt{add} requires
	409	two integers but is given a stack with a boolean on top
	410
	411	{\color{NavyBlue}
	412	\begin{verbatim}
	413	initialState # push 5 # push true # add # get
	414	\end{verbatim}
	415	}
	416
	417	as well as this one, in which \texttt{add} finds only a single value on the stack
	418
	419	{\color{NavyBlue}
	420	\begin{verbatim}
	421	initialState # push 5 # add # get
	422	\end{verbatim}
	423	}
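
The stack discipline itself can be checked in Haskell with a plain wrapper around nested pairs. This is a sketch under stated assumptions rather than the codata definition above: the stack is stored eagerly in a \texttt{newtype}, \texttt{initialState} is taken to be the empty stack of type \texttt{Forth ()}, and \texttt{push} takes the pushed value first so that it composes with \texttt{\#}.

```haskell
import Prelude hiding (drop)

-- A stack whose shape is tracked in the type parameter.
-- Assumption: the stack is stored directly as nested pairs.
newtype Forth s = Forth s

-- Assumption: the initial state is the empty stack.
initialState :: Forth ()
initialState = Forth ()

-- Assumption: the pushed value comes first, so `s # push v` type-checks.
push :: b -> Forth a -> Forth (b, a)
push v (Forth s) = Forth (v, s)

get :: Forth (a, b) -> a
get (Forth (x, _)) = x

drop :: Forth (a, b) -> Forth b
drop (Forth (_, s)) = Forth s

swap :: Forth (a, (b, c)) -> Forth (b, (a, c))
swap (Forth (x, (y, s))) = Forth (y, (x, s))

add :: Forth (Int, (Int, c)) -> Forth (Int, c)
add (Forth (x, (y, s))) = Forth (x + y, s)

-- Reverse application, as in the text.
infixl 1 #
(#) :: a -> (a -> b) -> b
x # f = f x

-- The well-typed subprogram: the stack ends as 3 + 5 on top.
example :: Int
example = initialState # push 5 # push 2 # push 3 # swap # drop # add # get

-- Both of these are rejected by the type checker if uncommented:
-- bad1 = initialState # push 5 # push True # add # get  -- Bool is not Int
-- bad2 = initialState # push 5 # add # get              -- () is not a pair
```

Here \texttt{example} evaluates to \texttt{8}: after \texttt{swap} the \texttt{2} is on top and is dropped, leaving \texttt{3} and \texttt{5} for \texttt{add}.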
	424
	425	\end{document}