gdritter repos gcct / 8741fb2
Initial commit of writeup Getty Ritter 10 years ago
1 changed file(s) with 425 addition(s) and 0 deletion(s). Collapse all Expand all
1 \documentclass{scrartcl}
2 \usepackage[letterpaper]{geometry}
3 \usepackage{fullpage}
4 \usepackage{cmbright}
5 \usepackage[usenames,dvipsnames]{color}
6 % \newcommand{\both}[2]{\begin{tabular}{ p{4in} p{4in} } #1 && #2 \\ \end{tabular}}
7 \newcommand{\pos}[1]{\textcolor{BrickRed}{#1}}
8 \newcommand{\postt}[1]{\textcolor{BrickRed}{\texttt{#1}}}
9 \newcommand{\dual}[1]{\textcolor{NavyBlue}{#1}}
10 \newcommand{\dualtt}[1]{\textcolor{NavyBlue}{\texttt{#1}}}
11 \newenvironment{both}
12 { \begin{tabular}{ p{3in} p{3in} } }
13 { \\ \end{tabular} }
14 \begin{document}
15 \section{\pos{Data} and \dual{Codata}}
16
17 \begin{both}
18 \pos{Data} is defined by its \pos{introduction} rules.
19 & \dual{Codata} is defined by its \dual{elimination} rules.
20 \\
21 We can define a \pos{data} type with
22 & We can define a \dual{codata} type with
23 \\
24 \color{BrickRed}
25 \begin{verbatim}
26 data Either a b
27 = Left a
28 | Right b\end{verbatim}
29 &
30 \color{NavyBlue}
31 \begin{verbatim}
32 codata Both a b
33 = first a
34 & second b\end{verbatim}
35 \\
36 This definition introduces two functions:
37 & This definition introduces two functions:
38 \\
39 \color{BrickRed}
40 \begin{verbatim}
41 Left : a -> Either a b
42 Right : b -> Either a b\end{verbatim}
43 &
44 \color{NavyBlue}
45 \begin{verbatim}
46 first : Both a b -> a
47 second : Both a b -> b\end{verbatim}
48 \\
49 In order to \pos{destruct data} we have to use
50 \pos{patterns} which let you match on \pos{constructors}.
51 & In order to \dual{construct codata} we have to use
52 \dual{copatterns} which let you match on \dual{destructors}.
53 \\
54 \color{BrickRed}
55 \begin{verbatim}
56 (case e : Either a b of
57 Left (x : a) -> (e1 : c)
58 Right (y : b) -> (e2 : c)) : c\end{verbatim}
59 &
60 \color{NavyBlue}
61 \begin{verbatim}
62 (merge x : Both a b from
63 first x <- e1 : a
64 second x <- e2 : b) : Both a b\end{verbatim}
65 \\
66 Here, \postt{e} represents the \pos{value being
67 destructed}, and each branch represents a \pos{constructor
68 with which it might have been constructed}. We are effectively
69 \pos{dispatching on possible pasts} of
70 \postt{e}.
71 & Here, \dual{\texttt{x}} represents the \dual{value being
72 constructed}, and each branch represents a \dual{destructor
73 with which it might eventually be destructed}. We are effectively
74 \dual{dispatching on possible futures} of
75 \dualtt{x}.
76 \\
77 \end{both}
78
79 \section{Codata, Records, and Copatterns}
80 \raggedright
81
82 In the same way that named sums are a natural way to represent data,
83 records are a natural way to represent codata. In lieu of the above
84 syntax, one often sees codata represented as something more like
85
86 {\color{NavyBlue}
87 \begin{verbatim}
88 record Both a b = { .first : a, .second : b }
89
90 x : Both Int Bool
91 x = { .first = 2, .second = true }
92
93 x.first == 2, x.second == true
94 \end{verbatim}
95 }
96
97 The \dualtt{merge} syntax is used here for conceptual
98 symmetry with \postt{case}. Additionally, the use of
99 copatterns is nicely dual with the extra expressivity that
100 patterns offer. For example, we can use nested patterns with
101 constructors of various types, as in this function which
102 processes a list of \postt{Either Int Bool} values
103 by summing the integers in the list until it reaches a
104 \postt{false} value or the end of the list:
105
106 {\color{BrickRed}
107 \begin{verbatim}
108 f : List (Either Int Bool) -> Int
109 f lst = case lst of
110 Cons (Left i) xs -> i + f xs
111 Cons (Right b) xs -> if b then f xs else 0
112 Nil -> 0
113 \end{verbatim}
114 }
115
116 Similarly, we can define an infinite stream of pairs using
117 nested copatterns as so:
118
119 {\color{NavyBlue}
120 \begin{verbatim}
121 s : Int -> Stream (Both Int Bool)
122 s n = merge x from
123 first (head x) <- n
124 second (head x) <- n > 0
125 tail x <- x
126 \end{verbatim}
127 }
128
129 Copatterns are also practically expressive, as in this concise
130 and readable definition of the fibonacci numbers in terms of
131 the \dual{\texttt{merge}} expression:
132
133 {\color{NavyBlue}
134 \begin{verbatim}
135 fibs : Stream Int
136 fibs = merge x from
137 head x <- 0
138 head (tail x) <- 1
139 tail (tail x) <- zipWith (+) x (tail x)
140 \end{verbatim}
141 }
142
143 \section{Generalizing \dual{Co}data}
144
145 \begin{both}
146 A \pos{Generalized Algebraic Data Type} adds extra
147 expressivity to \pos{data} types by allowing extra
148 restrictions on the value being \pos{constructed}.
149 & A \dual{Generalized Coalgebraic Codata Type} adds extra
150 expressivity to \dual{codata} types by allowing extra
151 restrictions on the value being \dual{destructed}.
152 \\
153 \color{BrickRed}
154 \begin{verbatim}
155 data Join a where
156 Unit : Int -> Join Int
157 Join : (Join a, Join a) ->
158 Join (a, a)\end{verbatim}
159 &
160 \color{NavyBlue}
161 \begin{verbatim}
162 codata Split a where
163 unit : Split Int -> Int
164 split : Split (a, a) ->
165 (Split a, Split a)\end{verbatim}
166 \\
167 This definition precludes the use of any value of the
168 type \pos{Join(Int, Bool)} because there is no way
169 to \pos{construct} a value of this type---we can only
170 \pos{construct} values with pairs of the same type,
171 or containing an \texttt{Int}.
172 &
173 This definition precludes the use of any value of the
174 type \dual{Split(Int, Bool)} because there is no way
175 to \dual{destruct} a value of this type---we can only
176 \dual{destruct} values with pairs of the same type,
177 or containing an \texttt{Int}.
178 \\
179 This gives us extra type information to omit nonsensical
180 branches in \postt{case}, as well. The following code
181 matches over a value of type \postt{Join Int}, and as
182 such only has an arm for the \pos{constructor}
183 \postt{Unit}---because it is the only \pos{constructor}
184 which can \pos{create} a value of that type!
185 &
186 This gives us extra type information to omit nonsensical
187 branches in \dualtt{merge}, as well. The following code
188 matches over a value of type \dualtt{Split Int}, and as
189 such only has an arm for the \dual{destructor}
190 \dualtt{unit}---because it is the only \dual{destructor}
191 which can \dual{destruct} a value of that type!
192 \\
193 \color{BrickRed}
194 \begin{verbatim}
195 f : Join Int -> Int
196 f x = case x of
197 Unit n -> n\end{verbatim}
198 &
199 \color{NavyBlue}
200 \begin{verbatim}
201 g : Int -> Split Int
202 g n = merge x from
203 unit x <- n\end{verbatim}
204 \\
205 We can \pos{construct} a value of type \postt{Join Int}
206 in conjunction with using \postt{f} like so:
207 & We can \dual{destruct} a value of type \dualtt{Split Int}
208 in conjunction with using \dualtt{g} like so:
209 \\
210 \color{BrickRed}
211 \begin{verbatim}
212 let y : Join Int
213 y = Unit 5
214 in
215 f y\end{verbatim}
216 &
217 \color{NavyBlue}
218 \begin{verbatim}
219 let y : Split Int
220 y = g 5
221 in
222 unit y\end{verbatim}
223 \\
224 \end{both}
225
226 \section{Focus on GCCTs}
227
228 To dispense with the dual presentation: a simple, high-level, handwavey
229 description of GCCTs is that they are \textit{records whose available
230 projections may be restricted by
231 the type of the record}. Like GADTs, they can introduce and use extra
232 typing information. For example, consider the following GCCT and its
233 instantiations:
234
235 {\color{NavyBlue}
236 \begin{verbatim}
237 codata Wrapper a where
238 contents : Wrapper a -> a
239 isZero : Wrapper Int -> Bool
240
241 mkStringW : String -> Wrapper String
242 mkStringW x = merge r from
243 contents r <- x
244
245 mkIntW : Int -> Wrapper Int
246 mkIntW x = merge r from
247 contents r <- x
248 isZero r <- x == 0
249 \end{verbatim}
250 }
251
252 A \dualtt{Wrapper} value is a wrapper over some other value, which can always be
253 extracted using the projection \dualtt{contents}. However, if the value
254 contained in the ``record'' is an \texttt{Int}, we have a second projection
255 available to us: we can also use \dualtt{isZero} to project out a boolean.
256
257 \subsection{Finite State Automata}
258
259 We can use these in a few interesting ways: for example, consider the situation
260 in which we have a finite state automaton for recognizing a language. For
261 simplicity's sake, we'll restrict the states to \texttt{A}, \texttt{B}, and
262 \texttt{C}, and allow the following valid transitions:
263
264 \begin{verbatim}
265 A -> B,C
266 B -> C
267 C -> A
268 \end{verbatim}
269
270 We want a data structure which represents the transitions available. We can do
271 this, even in a strongly normalizing setting, by representing the FSA using
272 a piece of infinite codata with projections for each state transition:
273
274 {\color{NavyBlue}
275 \begin{verbatim}
276 data State = A | B | C
277 codata NFA (a : State) where
278 aToB : NFA A -> NFA B
279 aToC : NFA A -> NFA C
280 bToC : NFA B -> NFA C
281 cToA : NFA C -> NFA A
282 \end{verbatim}
283 }
284
285 This means that the following expressions (which use \texttt{.} to mean
286 function composition) type-check as they represent valid
287 state transitions
288
289 \begin{verbatim}
290 aState : NFA A
291 aToB aState : NFA B
292 (bToC . aToB) aState : NFA C
293 (cToA . bToC . aToB) aState : NFA A
294 \end{verbatim}
295
296 whereas the following do not
297
298 \begin{verbatim}
299 bToC aState -- cannot unify `NFA A` with `NFA B`
300 (aToC . aToB) aState -- cannot unify `NFA B` with `NFA A`
301 \end{verbatim}
302
303 We can construct an initial \dualtt{aState} value using the following
304 \dualtt{merge} expression, using an extra \texttt{let}-binding to improve
305 the sharing in this structure:
306
307 {\color{NavyBlue}
308 \begin{verbatim}
309 aState : NFA A
310 aState =
311 merge as from
312 let cState = (merge cs from aToB cs <- as) in
313 aToC as <- cState
314 aToB as <- (merge bs from bToC bs <- cState)
315 \end{verbatim}
316 }
317
318 or much more concisely by using nested copatterns:
319
320 {\color{NavyBlue}
321 \begin{verbatim}
322 aState : NFA A
323 aState = merge as from
324 cToA (aToC as) <- as
325 cToA (bToC (aToB as)) <- as
326 \end{verbatim}
327 }
328
329 We can make this more interesting (and more useful) by introducing an extra
330 operation: suppose we're using this to recognize a language equivalent to the
331 regular expression \texttt{/a(b?ca)*/}, and we want to be able to extract
332 the string we've matched, but \textit{only} in the accept state
333 \texttt{A}. We can change the codata definition to have such an accessor:
334
335 {\color{NavyBlue}
336 \begin{verbatim}
337 codata NFA (a : State) where
338 aToB : NFA A -> NFA B
339 aToC : NFA A -> NFA C
340 bToC : NFA B -> NFA C
341 cToA : NFA C -> NFA A
342 matchedString : NFA A -> String
343 \end{verbatim}
344 }
345
346 and change the construction of the NFA to this:
347
348 {\color{NavyBlue}
349 \begin{verbatim}
350 aState : NFA A
351 aState =
352 let f str =
353 merge as from
354 matchedString as <- str
355 cToA (aToC as) <- f (str ++ "ca")
356 cToA (bToC (aToB as)) <- f (str ++ "bca")
357 in f "a"
358 \end{verbatim}
359 }
360
361 Consequently the following holds true:
362
363 \begin{verbatim}
364 (matchedString . cToA . atoC . cToA . bToC . aToB) aState == "abcaca"
365 \end{verbatim}
366
367 whereas the following does not type-check:
368
369 \begin{verbatim}
370 (matchedString . bToC . aToB) aState
371 \end{verbatim}
372
373 \subsection{A Forth-Like Embedded Language}
374
375 Consider the following codata type:
376
377 {\color{NavyBlue}
378 \begin{verbatim}
379 codata Forth a where
380 push : Forth a -> b -> Forth (b, a)
381 get : Forth (a, b) -> a
382 drop : Forth (a, b) -> Forth b
383 swap : Forth (a, (b, c)) -> Forth (b, (a, c))
384 add : Forth (Int, (Int, c)) -> Forth (Int, c)
385 \end{verbatim}
386 }
387
388 This corresponds to a small embedded Forth-like sublanguage, in which
389 a value of type \dualtt{Forth} corresponds to the state of a stack whose
390 type is given by the type parameter. This means that not only is it an
391 embedded Forth-like language, it is a well-typed language that can take
392 arbitrary values in the host language and manipulate them on the stack.
393 Assuming the use of the operator \texttt{x \# f = f x}, we can write a
394 well-typed subprogram
395
396 {\color{NavyBlue}
397 \begin{verbatim}
398 initialState # push 5
399 # push 2
400 # push 3
401 # swap
402 # drop
403 # add
404 # get
405 \end{verbatim}
406 }
407
408 and also reject this ill-typed subprogram, where \texttt{add} requires
409 two integers but is given a stack with a boolean on top
410
411 {\color{NavyBlue}
412 \begin{verbatim}
413 initialState # push 5 # push true # add # get
414 \end{verbatim}
415 }
416
417 but also this one, in which \texttt{add} is only given a single argument
418
419 {\color{NavyBlue}
420 \begin{verbatim}
421 initialState # push 5 # add # get
422 \end{verbatim}
423 }
424
425 \end{document}