Initial commit of writeup
Getty Ritter
10 years ago
1 | \documentclass{scrartcl} | |
2 | \usepackage[letterpaper]{geometry} | |
3 | \usepackage{fullpage} | |
4 | \usepackage{cmbright} | |
5 | \usepackage[usenames,dvipsnames]{color} | |
6 | % \newcommand{\both}[2]{\begin{tabular}{ p{4in} p{4in} } #1 && #2 \\ \end{tabular}} | |
7 | \newcommand{\pos}[1]{\textcolor{BrickRed}{#1}} | |
8 | \newcommand{\postt}[1]{\textcolor{BrickRed}{\texttt{#1}}} | |
9 | \newcommand{\dual}[1]{\textcolor{NavyBlue}{#1}} | |
10 | \newcommand{\dualtt}[1]{\textcolor{NavyBlue}{\texttt{#1}}} | |
11 | \newenvironment{both} | |
12 | { \begin{tabular}{ p{3in} p{3in} } } | |
13 | { \\ \end{tabular} } | |
14 | \begin{document} | |
15 | \section{\pos{Data} and \dual{Codata}} | |
16 | ||
17 | \begin{both} | |
18 | \pos{Data} is defined by its \pos{introduction} rules. | |
19 | & \dual{Codata} is defined by its \dual{elimination} rules. | |
20 | \\ | |
21 | We can define a \pos{data} type with | |
22 | & We can define a \dual{codata} type with | |
23 | \\ | |
24 | \color{BrickRed} | |
25 | \begin{verbatim} | |
26 | data Either a b | |
27 | = Left a | |
28 | | Right b\end{verbatim} | |
29 | & | |
30 | \color{NavyBlue} | |
31 | \begin{verbatim} | |
32 | codata Both a b | |
33 | = first a | |
34 | & second b\end{verbatim} | |
35 | \\ | |
36 | This definition introduces two functions: | |
37 | & This definition introduces two functions: | |
38 | \\ | |
39 | \color{BrickRed} | |
40 | \begin{verbatim} | |
41 | Left : a -> Either a b | |
42 | Right : b -> Either a b\end{verbatim} | |
43 | & | |
44 | \color{NavyBlue} | |
45 | \begin{verbatim} | |
46 | first : Both a b -> a | |
47 | second : Both a b -> b\end{verbatim} | |
48 | \\ | |
49 | In order to \pos{destruct data} we have to use | |
50 | \pos{patterns} which let you match on \pos{constructors}. | |
51 | & In order to \dual{construct codata} we have to use | |
52 | \dual{copatterns} which let you match on \dual{destructors}. | |
53 | \\ | |
54 | \color{BrickRed} | |
55 | \begin{verbatim} | |
56 | (case e : Either a b of | |
57 | Left (x : a) -> (e1 : c) | |
58 | Right (y : b) -> (e2 : c)) : c\end{verbatim} | |
59 | & | |
60 | \color{NavyBlue} | |
61 | \begin{verbatim} | |
62 | (merge x : Both a b from | |
63 | first x <- e1 : a | |
64 | second x <- e2 : b) : Both a b\end{verbatim} | |
65 | \\ | |
66 | Here, \postt{e} represents the \pos{value being | |
67 | destructed}, and each branch represents a \pos{constructor | |
68 | with which it might have been constructed}. We are effectively | |
69 | \pos{dispatching on possible pasts} of | |
70 | \postt{e}. | |
71 | & Here, \dual{\texttt{x}} represents the \dual{value being | |
72 | constructed}, and each branch represents a \dual{destructor | |
73 | with which it might eventually be destructed}. We are effectively | |
74 | \dual{dispatching on possible futures} of | |
75 | \dualtt{x}. | |
76 | \\ | |
77 | \end{both} | |
78 | ||
79 | \section{Codata, Records, and Copatterns} | |
80 | \raggedright | |
81 | ||
82 | In the same way that named sums are a natural way to represent data, | |
83 | records are a natural way to represent codata. In lieu of the above | |
84 | syntax, one often sees codata represented as something more like | |
85 | ||
86 | {\color{NavyBlue} | |
87 | \begin{verbatim} | |
88 | record Both a b = { .first : a, .second : b } | |
89 | ||
90 | x : Both Int Bool | |
91 | x = { .first = 2, .second = true } | |
92 | ||
93 | x.first == 2, x.second == true | |
94 | \end{verbatim} | |
95 | } | |
96 | ||
97 | The \dualtt{merge} syntax is used here for conceptual | |
98 | symmetry with \postt{case}. Additionally, the use of | |
99 | copatterns is nicely dual with the extra expressivity that | |
100 | patterns offer. For example, we can use nested patterns with | |
101 | constructors of various types, as in this function which | |
102 | processes a list of \postt{Either Int Bool} values | |
103 | by summing the integers in the list until it reaches a | |
104 | \postt{false} value or the end of the list: | |
105 | ||
106 | {\color{BrickRed} | |
107 | \begin{verbatim} | |
108 | f : List (Either Int Bool) -> Int | |
109 | f lst = case lst of | |
110 | Cons (Left i) xs -> i + f xs | |
111 | Cons (Right b) xs -> if b then f xs else 0 | |
112 | Nil -> 0 | |
113 | \end{verbatim} | |
114 | } | |
115 | ||
116 | Similarly, we can define an infinite stream of pairs using | |
117 | nested copatterns as so: | |
118 | ||
119 | {\color{NavyBlue} | |
120 | \begin{verbatim} | |
121 | s : Int -> Stream (Both Int Bool) | |
122 | s n = merge x from | |
123 | first (head x) <- n | |
124 | second (head x) <- n > 0 | |
125 | tail x <- x | |
126 | \end{verbatim} | |
127 | } | |
128 | ||
129 | Copatterns are also practically expressive, as in this concise | |
130 | and readable definition of the fibonacci numbers in terms of | |
131 | the \dual{\texttt{merge}} expression: | |
132 | ||
133 | {\color{NavyBlue} | |
134 | \begin{verbatim} | |
135 | fibs : Stream Int | |
136 | fibs = merge x from | |
137 | head x <- 0 | |
138 | head (tail x) <- 1 | |
139 | tail (tail x) <- zipWith (+) x (tail x) | |
140 | \end{verbatim} | |
141 | } | |
142 | ||
143 | \section{Generalizing \dual{Co}data} | |
144 | ||
145 | \begin{both} | |
146 | A \pos{Generalized Algebraic Data Type} adds extra | |
147 | expressivity to \pos{data} types by allowing extra | |
148 | restrictions on the value being \pos{constructed}. | |
149 | & A \dual{Generalized Coalgebraic Codata Type} adds extra | |
150 | expressivity to \dual{codata} types by allowing extra | |
151 | restrictions on the value being \dual{destructed}. | |
152 | \\ | |
153 | \color{BrickRed} | |
154 | \begin{verbatim} | |
155 | data Join a where | |
156 | Unit : Int -> Join Int | |
157 | Join : (Join a, Join a) -> | |
158 | Join (a, a)\end{verbatim} | |
159 | & | |
160 | \color{NavyBlue} | |
161 | \begin{verbatim} | |
162 | codata Split a where | |
163 | unit : Split Int -> Int | |
164 | split : Split (a, a) -> | |
165 | (Split a, Split a)\end{verbatim} | |
166 | \\ | |
167 | This definition precludes the use of any value of the | |
168 | type \pos{Join(Int, Bool)} because there is no way | |
169 | to \pos{construct} a value of this type---we can only | |
170 | \pos{construct} values with pairs of the same type, | |
171 | or containing an \texttt{Int}. | |
172 | & | |
173 | This definition precludes the use of any value of the | |
174 | type \dual{Split(Int, Bool)} because there is no way | |
175 | to \dual{destruct} a value of this type---we can only | |
176 | \dual{destruct} values with pairs of the same type, | |
177 | or containing an \texttt{Int}. | |
178 | \\ | |
179 | This gives us extra type information to omit nonsensical | |
180 | branches in \postt{case}, as well. The following code | |
181 | matches over a value of type \postt{Join Int}, and as | |
182 | such only has an arm for the \pos{constructor} | |
183 | \postt{Unit}---because it is the only \pos{constructor} | |
184 | which can \pos{create} a value of that type! | |
185 | & | |
186 | This gives us extra type information to omit nonsensical | |
187 | branches in \dualtt{merge}, as well. The following code | |
188 | matches over a value of type \dualtt{Split Int}, and as | |
189 | such only has an arm for the \dual{destructor} | |
190 | \dualtt{unit}---because it is the only \dual{destructor} | |
191 | which can \dual{destruct} a value of that type! | |
192 | \\ | |
193 | \color{BrickRed} | |
194 | \begin{verbatim} | |
195 | f : Join Int -> Int | |
196 | f x = case x of | |
197 | Unit n -> n\end{verbatim} | |
198 | & | |
199 | \color{NavyBlue} | |
200 | \begin{verbatim} | |
201 | g : Int -> Split Int | |
202 | g n = merge x from | |
203 | unit x <- n\end{verbatim} | |
204 | \\ | |
205 | We can \pos{construct} a value of type \postt{Join Int} | |
206 | in conjunction with using \postt{f} like so: | |
207 | & We can \dual{destruct} a value of type \dualtt{Split Int} | |
208 | in conjunction with using \dualtt{g} like so: | |
209 | \\ | |
210 | \color{BrickRed} | |
211 | \begin{verbatim} | |
212 | let y : Join Int | |
213 | y = Unit 5 | |
214 | in | |
215 | f y\end{verbatim} | |
216 | & | |
217 | \color{NavyBlue} | |
218 | \begin{verbatim} | |
219 | let y : Split Int | |
220 | y = g 5 | |
221 | in | |
222 | unit y\end{verbatim} | |
223 | \\ | |
224 | \end{both} | |
225 | ||
226 | \section{Focus on GCCTs} | |
227 | ||
228 | To dispense with the dual presentation: a simple, high-level, handwavey | |
229 | description of GCCTs is that they are \textit{records whose available | |
230 | projections may be restricted by | |
231 | the type of the record}. Like GADTs, they can introduce and use extra | |
232 | typing information. For example, consider the following GCCT and its | |
233 | instantiations: | |
234 | ||
235 | {\color{NavyBlue} | |
236 | \begin{verbatim} | |
237 | codata Wrapper a where | |
238 | contents : Wrapper a -> a | |
239 | isZero : Wrapper Int -> Bool | |
240 | ||
241 | mkStringW : String -> Wrapper String | |
242 | mkStringW x = merge r from | |
243 | contents r <- x | |
244 | ||
245 | mkIntW : Int -> Wrapper Int | |
246 | mkIntW x = merge r from | |
247 | contents r <- x | |
248 | isZero r <- x == 0 | |
249 | \end{verbatim} | |
250 | } | |
251 | ||
252 | A \dualtt{Wrapper} value is a wrapper over some other value, which can always be | |
253 | extracted using the projection \dualtt{contents}. However, if the value | |
254 | contained in the ``record'' is an \texttt{Int}, we have a second projection | |
255 | available to us: we can also use \dualtt{isZero} to project out a boolean. | |
256 | ||
257 | \subsection{Finite State Automata} | |
258 | ||
259 | We can use these in a few interesting ways: for example, consider the situation | |
260 | in which we have a finite state automaton for recognizing a language. For | |
261 | simplicity's sake, we'll restrict the states to \texttt{A}, \texttt{B}, and | |
262 | \texttt{C}, and allow the following valid transitions: | |
263 | ||
264 | \begin{verbatim} | |
265 | A -> B,C | |
266 | B -> C | |
267 | C -> A | |
268 | \end{verbatim} | |
269 | ||
270 | We want a data structure which represents the transitions available. We can do | |
271 | this, even in a strongly normalizing setting, by representing the FSA using | |
272 | a piece of infinite codata with projections for each state transition: | |
273 | ||
274 | {\color{NavyBlue} | |
275 | \begin{verbatim} | |
276 | data State = A | B | C | |
277 | codata NFA (a : State) where | |
278 | aToB : NFA A -> NFA B | |
279 | aToC : NFA A -> NFA C | |
280 | bToC : NFA B -> NFA C | |
281 | cToA : NFA C -> NFA A | |
282 | \end{verbatim} | |
283 | } | |
284 | ||
285 | This means that the following expressions (which use \texttt{.} to mean | |
286 | function composition) type-check as they represent valid | |
287 | state transitions | |
288 | ||
289 | \begin{verbatim} | |
290 | aState : NFA A | |
291 | aToB aState : NFA B | |
292 | (bToC . aToB) aState : NFA C | |
293 | (cToA . bToC . aToB) aState : NFA A | |
294 | \end{verbatim} | |
295 | ||
296 | whereas the following do not | |
297 | ||
298 | \begin{verbatim} | |
299 | bToC aState -- cannot unify `NFA A` with `NFA B` | |
300 | (aToC . aToB) aState -- cannot unify `NFA B` with `NFA A` | |
301 | \end{verbatim} | |
302 | ||
303 | We can construct an initial \dualtt{aState} value using the following | |
304 | \dualtt{merge} expression, using an extra \texttt{let}-binding to improve | |
305 | the sharing in this structure: | |
306 | ||
307 | {\color{NavyBlue} | |
308 | \begin{verbatim} | |
309 | aState : NFA A | |
310 | aState = | |
311 | merge as from | |
312 | let cState = (merge cs from aToB cs <- as) in | |
313 | aToC as <- cState | |
314 | aToB as <- (merge bs from bToC bs <- cState) | |
315 | \end{verbatim} | |
316 | } | |
317 | ||
318 | or much more concisely by using nested copatterns: | |
319 | ||
320 | {\color{NavyBlue} | |
321 | \begin{verbatim} | |
322 | aState : NFA A | |
323 | aState = merge as from | |
324 | cToA (aToC as) <- as | |
325 | cToA (bToC (aToB as)) <- as | |
326 | \end{verbatim} | |
327 | } | |
328 | ||
329 | We can make this more interesting (and more useful) by introducing an extra | |
330 | operation: suppose we're using this to recognize a language equivalent to the | |
331 | regular expression \texttt{/a(b?ca)*/}, and we want to be able to extract | |
332 | the string we've matched, but \textit{only} in the accept state | |
333 | \texttt{A}. We can change the codata definition to have such an accessor: | |
334 | ||
335 | {\color{NavyBlue} | |
336 | \begin{verbatim} | |
337 | codata NFA (a : State) where | |
338 | aToB : NFA A -> NFA B | |
339 | aToC : NFA A -> NFA C | |
340 | bToC : NFA B -> NFA C | |
341 | cToA : NFA C -> NFA A | |
342 | matchedString : NFA A -> String | |
343 | \end{verbatim} | |
344 | } | |
345 | ||
346 | and change the construction of the NFA to this: | |
347 | ||
348 | {\color{NavyBlue} | |
349 | \begin{verbatim} | |
350 | aState : NFA A | |
351 | aState = | |
352 | let f str = | |
353 | merge as from | |
354 | matchedString as <- str | |
355 | cToA (aToC as) <- f (str ++ "ca") | |
356 | cToA (bToC (aToB as)) <- f (str ++ "bca") | |
357 | in f "a" | |
358 | \end{verbatim} | |
359 | } | |
360 | ||
361 | Consequently the following holds true: | |
362 | ||
363 | \begin{verbatim} | |
364 | (matchedString . cToA . atoC . cToA . bToC . aToB) aState == "abcaca" | |
365 | \end{verbatim} | |
366 | ||
367 | whereas the following does not type-check: | |
368 | ||
369 | \begin{verbatim} | |
370 | (matchedString . bToC . aToB) aState | |
371 | \end{verbatim} | |
372 | ||
373 | \subsection{A Forth-Like Embedded Language} | |
374 | ||
375 | Consider the following codata type: | |
376 | ||
377 | {\color{NavyBlue} | |
378 | \begin{verbatim} | |
379 | codata Forth a where | |
380 | push : Forth a -> b -> Forth (b, a) | |
381 | get : Forth (a, b) -> a | |
382 | drop : Forth (a, b) -> Forth b | |
383 | swap : Forth (a, (b, c)) -> Forth (b, (a, c)) | |
384 | add : Forth (Int, (Int, c)) -> Forth (Int, c) | |
385 | \end{verbatim} | |
386 | } | |
387 | ||
388 | This corresponds to a small embedded Forth-like sublanguage, in which | |
389 | a value of type \dualtt{Forth} corresponds to the state of a stack whose | |
390 | type is given by the type parameter. This means that not only is it an | |
391 | embedded Forth-like language, it is a well-typed language that can take | |
392 | arbitrary values in the host language and manipulate them on the stack. | |
393 | Assuming the use of the operator \texttt{x \# f = f x}, we can write a | |
394 | well-typed subprogram | |
395 | ||
396 | {\color{NavyBlue} | |
397 | \begin{verbatim} | |
398 | initialState # push 5 | |
399 | # push 2 | |
400 | # push 3 | |
401 | # swap | |
402 | # drop | |
403 | # add | |
404 | # get | |
405 | \end{verbatim} | |
406 | } | |
407 | ||
408 | and also reject this ill-typed subprogram, where \texttt{add} requires | |
409 | two integers but is given a stack with a boolean on top | |
410 | ||
411 | {\color{NavyBlue} | |
412 | \begin{verbatim} | |
413 | initialState # push 5 # push true # add # get | |
414 | \end{verbatim} | |
415 | } | |
416 | ||
417 | but also this one, in which \texttt{add} is only given a single argument | |
418 | ||
419 | {\color{NavyBlue} | |
420 | \begin{verbatim} | |
421 | initialState # push 5 # add # get | |
422 | \end{verbatim} | |
423 | } | |
424 | ||
425 | \end{document} |