Initial commit of writeup
Getty Ritter
11 years ago
| 1 | \documentclass{scrartcl} | |
| 2 | \usepackage[letterpaper]{geometry} | |
| 3 | \usepackage{fullpage} | |
| 4 | \usepackage{cmbright} | |
| 5 | \usepackage[usenames,dvipsnames]{color} | |
| 6 | % \newcommand{\both}[2]{\begin{tabular}{ p{4in} p{4in} } #1 && #2 \\ \end{tabular}} | |
| 7 | \newcommand{\pos}[1]{\textcolor{BrickRed}{#1}} | |
| 8 | \newcommand{\postt}[1]{\textcolor{BrickRed}{\texttt{#1}}} | |
| 9 | \newcommand{\dual}[1]{\textcolor{NavyBlue}{#1}} | |
| 10 | \newcommand{\dualtt}[1]{\textcolor{NavyBlue}{\texttt{#1}}} | |
| 11 | \newenvironment{both} | |
| 12 | { \begin{tabular}{ p{3in} p{3in} } } | |
| 13 | { \\ \end{tabular} } | |
| 14 | \begin{document} | |
| 15 | \section{\pos{Data} and \dual{Codata}} | |
| 16 | ||
| 17 | \begin{both} | |
| 18 | \pos{Data} is defined by its \pos{introduction} rules. | |
| 19 | & \dual{Codata} is defined by its \dual{elimination} rules. | |
| 20 | \\ | |
| 21 | We can define a \pos{data} type with | |
| 22 | & We can define a \dual{codata} type with | |
| 23 | \\ | |
| 24 | \color{BrickRed} | |
| 25 | \begin{verbatim} | |
| 26 | data Either a b | |
| 27 | = Left a | |
| 28 | | Right b\end{verbatim} | |
| 29 | & | |
| 30 | \color{NavyBlue} | |
| 31 | \begin{verbatim} | |
| 32 | codata Both a b | |
| 33 | = first a | |
| 34 | & second b\end{verbatim} | |
| 35 | \\ | |
| 36 | This definition introduces two functions: | |
| 37 | & This definition introduces two functions: | |
| 38 | \\ | |
| 39 | \color{BrickRed} | |
| 40 | \begin{verbatim} | |
| 41 | Left : a -> Either a b | |
| 42 | Right : b -> Either a b\end{verbatim} | |
| 43 | & | |
| 44 | \color{NavyBlue} | |
| 45 | \begin{verbatim} | |
| 46 | first : Both a b -> a | |
| 47 | second : Both a b -> b\end{verbatim} | |
| 48 | \\ | |
| 49 | In order to \pos{destruct data} we have to use | |
| 50 | \pos{patterns} which let you match on \pos{constructors}. | |
| 51 | & In order to \dual{construct codata} we have to use | |
| 52 | \dual{copatterns} which let you match on \dual{destructors}. | |
| 53 | \\ | |
| 54 | \color{BrickRed} | |
| 55 | \begin{verbatim} | |
| 56 | (case e : Either a b of | |
| 57 | Left (x : a) -> (e1 : c) | |
| 58 | Right (y : b) -> (e2 : c)) : c\end{verbatim} | |
| 59 | & | |
| 60 | \color{NavyBlue} | |
| 61 | \begin{verbatim} | |
| 62 | (merge x : Both a b from | |
| 63 | first x <- e1 : a | |
| 64 | second x <- e2 : b) : Both a b\end{verbatim} | |
| 65 | \\ | |
| 66 | Here, \postt{e} represents the \pos{value being | |
| 67 | destructed}, and each branch represents a \pos{constructor | |
| 68 | with which it might have been constructed}. We are effectively | |
| 69 | \pos{dispatching on possible pasts} of | |
| 70 | \postt{e}. | |
| 71 | & Here, \dual{\texttt{x}} represents the \dual{value being | |
| 72 | constructed}, and each branch represents a \dual{destructor | |
| 73 | with which it might eventually be destructed}. We are effectively | |
| 74 | \dual{dispatching on possible futures} of | |
| 75 | \dualtt{x}. | |
| 76 | \\ | |
| 77 | \end{both} | |
| 78 | ||
| 79 | \section{Codata, Records, and Copatterns} | |
| 80 | \raggedright | |
| 81 | ||
| 82 | In the same way that named sums are a natural way to represent data, | |
| 83 | records are a natural way to represent codata. In lieu of the above | |
| 84 | syntax, one often sees codata represented as something more like | |
| 85 | ||
| 86 | {\color{NavyBlue} | |
| 87 | \begin{verbatim} | |
| 88 | record Both a b = { .first : a, .second : b } | |
| 89 | ||
| 90 | x : Both Int Bool | |
| 91 | x = { .first = 2, .second = true } | |
| 92 | ||
| 93 | x.first == 2, x.second == true | |
| 94 | \end{verbatim} | |
| 95 | } | |
| 96 | ||
| 97 | The \dualtt{merge} syntax is used here for conceptual | |
| 98 | symmetry with \postt{case}. Additionally, the use of | |
| 99 | copatterns is nicely dual with the extra expressivity that | |
| 100 | patterns offer. For example, we can use nested patterns with | |
| 101 | constructors of various types, as in this function which | |
| 102 | processes a list of \postt{Either Int Bool} values | |
| 103 | by summing the integers in the list until it reaches a | |
| 104 | \postt{false} value or the end of the list: | |
| 105 | ||
| 106 | {\color{BrickRed} | |
| 107 | \begin{verbatim} | |
| 108 | f : List (Either Int Bool) -> Int | |
| 109 | f lst = case lst of | |
| 110 | Cons (Left i) xs -> i + f xs | |
| 111 | Cons (Right b) xs -> if b then f xs else 0 | |
| 112 | Nil -> 0 | |
| 113 | \end{verbatim} | |
| 114 | } | |
| 115 | ||
| 116 | Similarly, we can define an infinite stream of pairs using | |
| 117 | nested copatterns as so: | |
| 118 | ||
| 119 | {\color{NavyBlue} | |
| 120 | \begin{verbatim} | |
| 121 | s : Int -> Stream (Both Int Bool) | |
| 122 | s n = merge x from | |
| 123 | first (head x) <- n | |
| 124 | second (head x) <- n > 0 | |
| 125 | tail x <- x | |
| 126 | \end{verbatim} | |
| 127 | } | |
| 128 | ||
| 129 | Copatterns are also practically expressive, as in this concise | |
| 130 | and readable definition of the fibonacci numbers in terms of | |
| 131 | the \dual{\texttt{merge}} expression: | |
| 132 | ||
| 133 | {\color{NavyBlue} | |
| 134 | \begin{verbatim} | |
| 135 | fibs : Stream Int | |
| 136 | fibs = merge x from | |
| 137 | head x <- 0 | |
| 138 | head (tail x) <- 1 | |
| 139 | tail (tail x) <- zipWith (+) x (tail x) | |
| 140 | \end{verbatim} | |
| 141 | } | |
| 142 | ||
| 143 | \section{Generalizing \dual{Co}data} | |
| 144 | ||
| 145 | \begin{both} | |
| 146 | A \pos{Generalized Algebraic Data Type} adds extra | |
| 147 | expressivity to \pos{data} types by allowing extra | |
| 148 | restrictions on the value being \pos{constructed}. | |
| 149 | & A \dual{Generalized Coalgebraic Codata Type} adds extra | |
| 150 | expressivity to \dual{codata} types by allowing extra | |
| 151 | restrictions on the value being \dual{destructed}. | |
| 152 | \\ | |
| 153 | \color{BrickRed} | |
| 154 | \begin{verbatim} | |
| 155 | data Join a where | |
| 156 | Unit : Int -> Join Int | |
| 157 | Join : (Join a, Join a) -> | |
| 158 | Join (a, a)\end{verbatim} | |
| 159 | & | |
| 160 | \color{NavyBlue} | |
| 161 | \begin{verbatim} | |
| 162 | codata Split a where | |
| 163 | unit : Split Int -> Int | |
| 164 | split : Split (a, a) -> | |
| 165 | (Split a, Split a)\end{verbatim} | |
| 166 | \\ | |
| 167 | This definition precludes the use of any value of the | |
| 168 | type \pos{Join(Int, Bool)} because there is no way | |
| 169 | to \pos{construct} a value of this type---we can only | |
| 170 | \pos{construct} values with pairs of the same type, | |
| 171 | or containing an \texttt{Int}. | |
| 172 | & | |
| 173 | This definition precludes the use of any value of the | |
| 174 | type \dual{Split(Int, Bool)} because there is no way | |
| 175 | to \dual{destruct} a value of this type---we can only | |
| 176 | \dual{destruct} values with pairs of the same type, | |
| 177 | or containing an \texttt{Int}. | |
| 178 | \\ | |
| 179 | This gives us extra type information to omit nonsensical | |
| 180 | branches in \postt{case}, as well. The following code | |
| 181 | matches over a value of type \postt{Join Int}, and as | |
| 182 | such only has an arm for the \pos{constructor} | |
| 183 | \postt{Unit}---because it is the only \pos{constructor} | |
| 184 | which can \pos{create} a value of that type! | |
| 185 | & | |
| 186 | This gives us extra type information to omit nonsensical | |
| 187 | branches in \dualtt{merge}, as well. The following code | |
| 188 | matches over a value of type \dualtt{Split Int}, and as | |
| 189 | such only has an arm for the \dual{destructor} | |
| 190 | \dualtt{unit}---because it is the only \dual{destructor} | |
| 191 | which can \dual{destruct} a value of that type! | |
| 192 | \\ | |
| 193 | \color{BrickRed} | |
| 194 | \begin{verbatim} | |
| 195 | f : Join Int -> Int | |
| 196 | f x = case x of | |
| 197 | Unit n -> n\end{verbatim} | |
| 198 | & | |
| 199 | \color{NavyBlue} | |
| 200 | \begin{verbatim} | |
| 201 | g : Int -> Split Int | |
| 202 | g n = merge x from | |
| 203 | unit x <- n\end{verbatim} | |
| 204 | \\ | |
| 205 | We can \pos{construct} a value of type \postt{Join Int} | |
| 206 | in conjunction with using \postt{f} like so: | |
| 207 | & We can \dual{destruct} a value of type \dualtt{Split Int} | |
| 208 | in conjunction with using \dualtt{g} like so: | |
| 209 | \\ | |
| 210 | \color{BrickRed} | |
| 211 | \begin{verbatim} | |
| 212 | let y : Join Int | |
| 213 | y = Unit 5 | |
| 214 | in | |
| 215 | f y\end{verbatim} | |
| 216 | & | |
| 217 | \color{NavyBlue} | |
| 218 | \begin{verbatim} | |
| 219 | let y : Split Int | |
| 220 | y = g 5 | |
| 221 | in | |
| 222 | unit y\end{verbatim} | |
| 223 | \\ | |
| 224 | \end{both} | |
| 225 | ||
| 226 | \section{Focus on GCCTs} | |
| 227 | ||
| 228 | To dispense with the dual presentation: a simple, high-level, handwavey | |
| 229 | description of GCCTs is that they are \textit{records whose available | |
| 230 | projections may be restricted by | |
| 231 | the type of the record}. Like GADTs, they can introduce and use extra | |
| 232 | typing information. For example, consider the following GCCT and its | |
| 233 | instantiations: | |
| 234 | ||
| 235 | {\color{NavyBlue} | |
| 236 | \begin{verbatim} | |
| 237 | codata Wrapper a where | |
| 238 | contents : Wrapper a -> a | |
| 239 | isZero : Wrapper Int -> Bool | |
| 240 | ||
| 241 | mkStringW : String -> Wrapper String | |
| 242 | mkStringW x = merge r from | |
| 243 | contents r <- x | |
| 244 | ||
| 245 | mkIntW : Int -> Wrapper Int | |
| 246 | mkIntW x = merge r from | |
| 247 | contents r <- x | |
| 248 | isZero r <- x == 0 | |
| 249 | \end{verbatim} | |
| 250 | } | |
| 251 | ||
| 252 | A \dualtt{Wrapper} value is a wrapper over some other value, which can always be | |
| 253 | extracted using the projection \dualtt{contents}. However, if the value | |
| 254 | contained in the ``record'' is an \texttt{Int}, we have a second projection | |
| 255 | available to us: we can also use \dualtt{isZero} to project out a boolean. | |
| 256 | ||
| 257 | \subsection{Finite State Automata} | |
| 258 | ||
| 259 | We can use these in a few interesting ways: for example, consider the situation | |
| 260 | in which we have a finite state automaton for recognizing a language. For | |
| 261 | simplicity's sake, we'll restrict the states to \texttt{A}, \texttt{B}, and | |
| 262 | \texttt{C}, and allow the following valid transitions: | |
| 263 | ||
| 264 | \begin{verbatim} | |
| 265 | A -> B,C | |
| 266 | B -> C | |
| 267 | C -> A | |
| 268 | \end{verbatim} | |
| 269 | ||
| 270 | We want a data structure which represents the transitions available. We can do | |
| 271 | this, even in a strongly normalizing setting, by representing the FSA using | |
| 272 | a piece of infinite codata with projections for each state transition: | |
| 273 | ||
| 274 | {\color{NavyBlue} | |
| 275 | \begin{verbatim} | |
| 276 | data State = A | B | C | |
| 277 | codata NFA (a : State) where | |
| 278 | aToB : NFA A -> NFA B | |
| 279 | aToC : NFA A -> NFA C | |
| 280 | bToC : NFA B -> NFA C | |
| 281 | cToA : NFA C -> NFA A | |
| 282 | \end{verbatim} | |
| 283 | } | |
| 284 | ||
| 285 | This means that the following expressions (which use \texttt{.} to mean | |
| 286 | function composition) type-check as they represent valid | |
| 287 | state transitions | |
| 288 | ||
| 289 | \begin{verbatim} | |
| 290 | aState : NFA A | |
| 291 | aToB aState : NFA B | |
| 292 | (bToC . aToB) aState : NFA C | |
| 293 | (cToA . bToC . aToB) aState : NFA A | |
| 294 | \end{verbatim} | |
| 295 | ||
| 296 | whereas the following do not | |
| 297 | ||
| 298 | \begin{verbatim} | |
| 299 | bToC aState -- cannot unify `NFA A` with `NFA B` | |
| 300 | (aToC . aToB) aState -- cannot unify `NFA B` with `NFA A` | |
| 301 | \end{verbatim} | |
| 302 | ||
| 303 | We can construct an initial \dualtt{aState} value using the following | |
| 304 | \dualtt{merge} expression, using an extra \texttt{let}-binding to improve | |
| 305 | the sharing in this structure: | |
| 306 | ||
| 307 | {\color{NavyBlue} | |
| 308 | \begin{verbatim} | |
| 309 | aState : NFA A | |
| 310 | aState = | |
| 311 | merge as from | |
| 312 | let cState = (merge cs from aToB cs <- as) in | |
| 313 | aToC as <- cState | |
| 314 | aToB as <- (merge bs from bToC bs <- cState) | |
| 315 | \end{verbatim} | |
| 316 | } | |
| 317 | ||
| 318 | or much more concisely by using nested copatterns: | |
| 319 | ||
| 320 | {\color{NavyBlue} | |
| 321 | \begin{verbatim} | |
| 322 | aState : NFA A | |
| 323 | aState = merge as from | |
| 324 | cToA (aToC as) <- as | |
| 325 | cToA (bToC (aToB as)) <- as | |
| 326 | \end{verbatim} | |
| 327 | } | |
| 328 | ||
| 329 | We can make this more interesting (and more useful) by introducing an extra | |
| 330 | operation: suppose we're using this to recognize a language equivalent to the | |
| 331 | regular expression \texttt{/a(b?ca)*/}, and we want to be able to extract | |
| 332 | the string we've matched, but \textit{only} in the accept state | |
| 333 | \texttt{A}. We can change the codata definition to have such an accessor: | |
| 334 | ||
| 335 | {\color{NavyBlue} | |
| 336 | \begin{verbatim} | |
| 337 | codata NFA (a : State) where | |
| 338 | aToB : NFA A -> NFA B | |
| 339 | aToC : NFA A -> NFA C | |
| 340 | bToC : NFA B -> NFA C | |
| 341 | cToA : NFA C -> NFA A | |
| 342 | matchedString : NFA A -> String | |
| 343 | \end{verbatim} | |
| 344 | } | |
| 345 | ||
| 346 | and change the construction of the NFA to this: | |
| 347 | ||
| 348 | {\color{NavyBlue} | |
| 349 | \begin{verbatim} | |
| 350 | aState : NFA A | |
| 351 | aState = | |
| 352 | let f str = | |
| 353 | merge as from | |
| 354 | matchedString as <- str | |
| 355 | cToA (aToC as) <- f (str ++ "ca") | |
| 356 | cToA (bToC (aToB as)) <- f (str ++ "bca") | |
| 357 | in f "a" | |
| 358 | \end{verbatim} | |
| 359 | } | |
| 360 | ||
| 361 | Consequently the following holds true: | |
| 362 | ||
| 363 | \begin{verbatim} | |
| 364 | (matchedString . cToA . atoC . cToA . bToC . aToB) aState == "abcaca" | |
| 365 | \end{verbatim} | |
| 366 | ||
| 367 | whereas the following does not type-check: | |
| 368 | ||
| 369 | \begin{verbatim} | |
| 370 | (matchedString . bToC . aToB) aState | |
| 371 | \end{verbatim} | |
| 372 | ||
| 373 | \subsection{A Forth-Like Embedded Language} | |
| 374 | ||
| 375 | Consider the following codata type: | |
| 376 | ||
| 377 | {\color{NavyBlue} | |
| 378 | \begin{verbatim} | |
| 379 | codata Forth a where | |
| 380 | push : Forth a -> b -> Forth (b, a) | |
| 381 | get : Forth (a, b) -> a | |
| 382 | drop : Forth (a, b) -> Forth b | |
| 383 | swap : Forth (a, (b, c)) -> Forth (b, (a, c)) | |
| 384 | add : Forth (Int, (Int, c)) -> Forth (Int, c) | |
| 385 | \end{verbatim} | |
| 386 | } | |
| 387 | ||
| 388 | This corresponds to a small embedded Forth-like sublanguage, in which | |
| 389 | a value of type \dualtt{Forth} corresponds to the state of a stack whose | |
| 390 | type is given by the type parameter. This means that not only is it an | |
| 391 | embedded Forth-like language, it is a well-typed language that can take | |
| 392 | arbitrary values in the host language and manipulate them on the stack. | |
| 393 | Assuming the use of the operator \texttt{x \# f = f x}, we can write a | |
| 394 | well-typed subprogram | |
| 395 | ||
| 396 | {\color{NavyBlue} | |
| 397 | \begin{verbatim} | |
| 398 | initialState # push 5 | |
| 399 | # push 2 | |
| 400 | # push 3 | |
| 401 | # swap | |
| 402 | # drop | |
| 403 | # add | |
| 404 | # get | |
| 405 | \end{verbatim} | |
| 406 | } | |
| 407 | ||
| 408 | and also reject this ill-typed subprogram, where \texttt{add} requires | |
| 409 | two integers but is given a stack with a boolean on top | |
| 410 | ||
| 411 | {\color{NavyBlue} | |
| 412 | \begin{verbatim} | |
| 413 | initialState # push 5 # push true # add # get | |
| 414 | \end{verbatim} | |
| 415 | } | |
| 416 | ||
| 417 | but also this one, in which \texttt{add} is only given a single argument | |
| 418 | ||
| 419 | {\color{NavyBlue} | |
| 420 | \begin{verbatim} | |
| 421 | initialState # push 5 # add # get | |
| 422 | \end{verbatim} | |
| 423 | } | |
| 424 | ||
| 425 | \end{document} |