overview/gcct.tex - gcct (master)

Tree @master (Download .tar.gz)

gcct.tex @master — raw · history · blame

\documentclass{scrartcl}
\usepackage[letterpaper]{geometry}
\usepackage{fullpage}
\usepackage{cmbright}
\usepackage[usenames,dvipsnames]{color}
% \newcommand{\both}[2]{\begin{tabular}{ p{4in} p{4in} } #1 && #2 \\ \end{tabular}}
\newcommand{\pos}[1]{\textcolor{BrickRed}{#1}}
\newcommand{\postt}[1]{\textcolor{BrickRed}{\texttt{#1}}}
\newcommand{\dual}[1]{\textcolor{NavyBlue}{#1}}
\newcommand{\dualtt}[1]{\textcolor{NavyBlue}{\texttt{#1}}}
\newenvironment{both}
  { \begin{tabular}{ p{3in}  p{3in} } }
  { \\ \end{tabular} }
\begin{document}
\section{\pos{Data} and \dual{Codata}}

\begin{both}
    \pos{Data} is defined by its \pos{introduction} rules.
  & \dual{Codata} is defined by its \dual{elimination} rules.
\\
    We can define a \pos{data} type with
  & We can define a \dual{codata} type with
\\
  \color{BrickRed}
  \begin{verbatim}
  data Either a b
    = Left  a
    | Right b\end{verbatim}
  &
  \color{NavyBlue}
  \begin{verbatim}
  codata Both a b
    = first  a
    & second b\end{verbatim}
\\
    This definition introduces two functions:
  & This definition introduces two functions:
\\
  \color{BrickRed}
  \begin{verbatim}
  Left  : a -> Either a b
  Right : b -> Either a b\end{verbatim}
&
  \color{NavyBlue}
  \begin{verbatim}
  first  : Both a b -> a
  second : Both a b -> b\end{verbatim}
\\
    In order to \pos{destruct data} we have to use
    \pos{patterns} which let you match on \pos{constructors}.
  & In order to \dual{construct codata} we have to use
    \dual{copatterns} which let you match on \dual{destructors}.
\\
  \color{BrickRed}
  \begin{verbatim}
  (case e : Either a b of
     Left  (x : a) -> (e1 : c)
     Right (y : b) -> (e2 : c)) : c\end{verbatim}
&
  \color{NavyBlue}
  \begin{verbatim}
  (merge x : Both a b from
     first  x <- e1 : a
     second x <- e2 : b) : Both a b\end{verbatim}
\\
    Here, \postt{e} represents the \pos{value being
    destructed}, and each branch represents a \pos{constructor
    with which it might have been constructed}. We are effectively
    \pos{dispatching on possible pasts} of
    \postt{e}.
  & Here, \dual{\texttt{x}} represents the \dual{value being
    constructed}, and each branch represents a \dual{destructor
    with which it might eventually be destructed}. We are effectively
    \dual{dispatching on possible futures} of
    \dualtt{x}.
\\
\end{both}

\section{Codata, Records, and Copatterns}
\raggedright

In the same way that named sums are a natural way to represent data,
records are a natural way to represent codata. In lieu of the above
syntax, one often sees codata represented as something more like

{\color{NavyBlue}
\begin{verbatim}
record Both a b = { .first : a, .second : b }

x : Both Int Bool
x = { .first = 2, .second = true }

x.first == 2, x.second == true
\end{verbatim}
}

The \dualtt{merge} syntax is used here for conceptual
symmetry with \postt{case}. Additionally, the use of
copatterns is nicely dual with the extra expressivity that
patterns offer. For example, we can use nested patterns with
constructors of various types, as in this function which
processes a list of \postt{Either Int Bool} values
by summing the integers in the list until it reaches a
\postt{false} value or the end of the list:

{\color{BrickRed}
\begin{verbatim}
f : List (Either Int Bool) -> Int
f lst = case lst of
  Cons (Left i) xs  -> i + f xs
  Cons (Right b) xs -> if b then f xs else 0
  Nil               -> 0
\end{verbatim}
}

Similarly, we can define an infinite stream of pairs using
nested copatterns as so:

{\color{NavyBlue}
\begin{verbatim}
s : Int -> Stream (Both Int Bool)
s n = merge x from
  first (head x)  <- n
  second (head x) <- n > 0
  tail x          <- x
\end{verbatim}
}

Copatterns are also practically expressive, as in this concise
and readable definition of the fibonacci numbers in terms of
the \dualtt{merge} expression:

{\color{NavyBlue}
\begin{verbatim}
codata Stream a
  = head a
  & tail (Stream a)

zipWith : (a -> b -> c) -> Stream a -> Stream b -> Stream c
zipWith f s1 s2 = merge x from
  head x <- f (head s1) (head s2)
  tail x <- zipWith f (tail s1) (tail s2)

fibs : Stream Int
fibs = merge x from
  head x        <- 0
  head (tail x) <- 1
  tail (tail x) <- zipWith (+) x (tail x)
\end{verbatim}
}

\section{Generalizing \dual{Co}data}

\begin{both}
    A \pos{Generalized Algebraic Data Type} adds extra
    expressivity to \pos{data} types by allowing extra
    restrictions on the value being \pos{constructed}.
  & A \dual{Generalized Coalgebraic Codata Type} adds extra
    expressivity to \dual{codata} types by allowing extra
    restrictions on the value being \dual{destructed}.
\\
  \color{BrickRed}
  \begin{verbatim}
  data Join a where
    Unit : Int -> Join Int
    Join : (Join a, Join a) ->
             Join (a, a)\end{verbatim}
  &
  \color{NavyBlue}
  \begin{verbatim}
  codata Split a where
    unit  : Split Int -> Int
    split : Split (a, a) ->
              (Split a, Split a)\end{verbatim}
\\
    This definition precludes the use of any value of the
    type \pos{Join(Int, Bool)} because there is no way
    to \pos{construct} a value of this type---we can only
    \pos{construct} values with pairs of the same type,
    or containing an \texttt{Int}.
  &
    This definition precludes the use of any value of the
    type \dual{Split(Int, Bool)} because there is no way
    to \dual{destruct} a value of this type---we can only
    \dual{destruct} values with pairs of the same type,
    or containing an \texttt{Int}.
\\
    This gives us extra type information to omit nonsensical
    branches in \postt{case}, as well. The following code
    matches over a value of type \postt{Join Int}, and as
    such only has an arm for the \pos{constructor}
    \postt{Unit}---because it is the only \pos{constructor}
    which can \pos{create} a value of that type!
  &
    This gives us extra type information to omit nonsensical
    branches in \dualtt{merge}, as well. The following code
    matches over a value of type \dualtt{Split Int}, and as
    such only has an arm for the \dual{destructor}
    \dualtt{unit}---because it is the only \dual{destructor}
    which can \dual{destruct} a value of that type!
\\
  \color{BrickRed}
  \begin{verbatim}
  f : Join Int -> Int
  f x = case x of
    Unit n -> n\end{verbatim}
  &
  \color{NavyBlue}
  \begin{verbatim}
  g : Int -> Split Int
  g n = merge x from
    unit x <- n\end{verbatim}
\\
    We can \pos{construct} a value of type \postt{Join Int}
    in conjunction with using \postt{f} like so:
  & We can \dual{destruct} a value of type \dualtt{Split Int}
    in conjunction with using \dualtt{g} like so:
\\
  \color{BrickRed}
  \begin{verbatim}
  let y : Join Int
      y = Unit 5
  in
      f y\end{verbatim}
  &
  \color{NavyBlue}
  \begin{verbatim}
  let y : Split Int
      y = g 5
  in
      unit y\end{verbatim}
\\
\end{both}

\section{Focus on GCCTs}

To dispense with the dual presentation: a simple, high-level, handwavey
description of GCCTs is that they are \textit{records whose available
projections may be restricted by
the type of the record}. Like GADTs, they can introduce and use extra
typing information. For example, consider the following GCCT and its
instantiations:

{\color{NavyBlue}
\begin{verbatim}
codata Wrapper a where
  contents : Wrapper a   -> a
  isZero   : Wrapper Int -> Bool

mkStringW : String -> Wrapper String
mkStringW x = merge r from
  contents r <- x

mkIntW : Int -> Wrapper Int
mkIntW x = merge r from
  contents r <- x
  isZero   r <- x == 0
\end{verbatim}
}

A \dualtt{Wrapper} value is a wrapper over some other value, which can always be
extracted using the projection \dualtt{contents}. However, if the value
contained in the ``record'' is an \texttt{Int}, we have a second projection
available to us: we can also use \dualtt{isZero} to project out a boolean.

In fact, because of the type machinery afforded us by GCCTs, we can
actually collapse both \dualtt{mkStringW} and \dualtt{mkIntW} into a
single function, \dualtt{mkWrapper}, like so:

{\color{NavyBlue}
\begin{verbatim}
mkWrapper : a -> Wrapper a
mkWrapper x = merge r from
  contents r <- x
  isZero   r <- x == 0
\end{verbatim}
}

The reason this is valid is that, like with GADTs, copattern matching on
\dualtt{isZero} gives us extra type information about \dualtt{x}, and so
will only be a valid branch when \dualtt{mkWrapper} is called with a
value of type \dualtt{Int}; otherwise it is merely a redundant branch
and is ignored.

\subsection{Finite State Automata}

We can use these in a few interesting ways: for example, consider the situation
in which we have a finite state automaton for recognizing a language. For
simplicity's sake, we'll restrict the states to \texttt{A}, \texttt{B}, and
\texttt{C}, and allow the following valid transitions:

\begin{verbatim}
  A -> B,C
  B -> C
  C -> A
\end{verbatim}

We want a data structure which represents the transitions available. We can do
this, even in a strongly normalizing setting, by representing the FSA using
a piece of infinite codata with projections for each state transition:

{\color{NavyBlue}
\begin{verbatim}
data State = A | B | C
codata NFA (a : State) where
  aToB : NFA A -> NFA B
  aToC : NFA A -> NFA C
  bToC : NFA B -> NFA C
  cToA : NFA C -> NFA A
\end{verbatim}
}

This means that the following expressions (which use \texttt{.} to mean
function composition) type-check as they represent valid
state transitions

\begin{verbatim}
                     aState : NFA A
aToB                 aState : NFA B
(bToC . aToB)        aState : NFA C
(cToA . bToC . aToB) aState : NFA A
\end{verbatim}

whereas the following do not

\begin{verbatim}
bToC aState          -- cannot unify `NFA A` with `NFA B`
(aToC . aToB) aState -- cannot unify `NFA B` with `NFA A`
\end{verbatim}

We can construct an initial \dualtt{aState} value using the following
\dualtt{merge} expression, using an extra \texttt{let}-binding to improve
the sharing in this structure:

{\color{NavyBlue}
\begin{verbatim}
aState : NFA A
aState =
  merge as from
    let cState = (merge cs from cToA cs <- as) in
      aToC as <- cState
      aToB as <- (merge bs from bToC bs <- cState)
\end{verbatim}
}

or much more concisely by using nested copatterns:

{\color{NavyBlue}
\begin{verbatim}
aState : NFA A
aState = merge as from
  cToA (aToC as)        <- as
  cToA (bToC (aToB as)) <- as
\end{verbatim}
}

We can make this more interesting (and more useful) by introducing an extra
operation: suppose we're using this to recognize a language equivalent to the
regular expression \texttt{/a(b?ca)*/}, and we want to be able to extract
the string we've matched, but \textit{only} in the accept state
\texttt{A}. We can change the codata definition to have such an accessor:

{\color{NavyBlue}
\begin{verbatim}
codata NFA (a : State) where
  aToB : NFA A -> NFA B
  aToC : NFA A -> NFA C
  bToC : NFA B -> NFA C
  cToA : NFA C -> NFA A
  matchedString : NFA A -> String
\end{verbatim}
}

and change the construction of the NFA to this:

{\color{NavyBlue}
\begin{verbatim}
aState : NFA A
aState =
  let f str =
    merge
      matchedString _      <- str
      cToA (aToC _)        <- f (str ++ "ca")
      cToA (bToC (aToB _)) <- f (str ++ "bca")
  in f "a"
\end{verbatim}
}

Consequently the following holds true:

\begin{verbatim}
  (matchedString . cToA . atoC . cToA . bToC . aToB) aState == "abcaca"
\end{verbatim}

whereas the following does not type-check:

\begin{verbatim}
  (matchedString . bToC . aToB) aState
\end{verbatim}

\subsection{A Forth-Like Embedded Language}

Consider the following codata type:

{\color{NavyBlue}
\begin{verbatim}
codata Forth a where
  push : Forth a -> b -> Forth (b, a)
  get  : Forth (a, b) -> a
  drop : Forth (a, b) -> Forth b
  swap : Forth (a, (b, c)) -> Forth (b, (a, c))
  app  : Forth (a, (b, c)) -> (a -> b -> d) -> Forth (d, c)
\end{verbatim}
}

This corresponds to a small embedded Forth-like sublanguage, in which
a value of type \dualtt{Forth} corresponds to the state of a stack whose
type is given by the type parameter. This means that not only is it an
embedded Forth-like language, it is a well-typed language that can take
arbitrary values in the host language and manipulate them on the stack.
Assuming the use of the operator \texttt{x \# f = f x}, we can write a
well-typed subprogram

{\color{NavyBlue}
\begin{verbatim}
initialState # push 5
             # push 2
             # push 3
             # swap
             # drop
             # add
             # get
\end{verbatim}
}

and also reject this ill-typed subprogram, where \texttt{add} requires
two integers but is given a stack with a boolean on top

{\color{NavyBlue}
\begin{verbatim}
initialState # push 5 # push true # add # get
\end{verbatim}
}

but also this one, in which \texttt{add} is only given a single argument

{\color{NavyBlue}
\begin{verbatim}
initialState # push 5 # add # get
\end{verbatim}
}

We have to create a helper function in order to create a value of this
type, which is the helper function \dualtt{push'}:

{\color{NavyBlue}
\begin{verbatim}
doPush : Forth a -> b -> Forth (b, a)
doPush s x = merge s' from
  push s' <- doPush s'
  get  s' <- x
  drop s' <- s
  swap s' <- push (get s) (push x (drop s))
  app  s' <-
    let doApp f = push (f x (get s)) (drop s)
    in doApp
\end{verbatim}
}

And then we can create an actual value of type \dualtt{Forth ()}:

{\color{NavyBlue}
\begin{verbatim}
initialState : Forth ()
initialState = merge s'
  push s' <- doPush s'
\end{verbatim}
}

\subsection{The Co\"operational Comonad}

Two convenient minimal definitions of a \texttt{Monad} are in terms
of \texttt{return} and either \texttt{join} and \texttt{fmap} or
\texttt{bind}:

\begin{verbatim}
class Monad m where
  return : a -> m a
  join   : m (m a) -> m a
  map    : (a -> b) -> m a -> m b

bind : Monad m => (a -> m b) -> m a -> m b
bind f x = join (map f x)
\end{verbatim}

\begin{verbatim}
class Monad m where
  return : a -> m a
  bind   : (a -> m b) -> m a -> m b

join : Monad m => m (m a) -> m a
join = map return

map : (a -> b) -> m a -> m b
map f x = bind (return . f) x
\end{verbatim}

This implies two ways of constructing a data structure that is
trivially a monad. The first is the traditional construction of
a free monad, which has two constructors which correspond to
\texttt{return} and \texttt{Join}:

\begin{verbatim}
data Free : (* -> *) -> * -> *
  = Pure a
  | Join (f (Free f a))
\end{verbatim}

In order to be a \texttt{Monad}, then the first parameter given
to \texttt{Free} must be a \texttt{Functor}, so that we get the
ability to \texttt{map} as well:

\begin{verbatim}
instance Functor f => Monad (Free f) where
  return = Pure

  map f (Pure x)  = Pure (f x)
  map f (Join xs) = Join (fmap f xs)

  join (Pure x)   = x
  join (Join xs)  = Join (fmap join xs)
\end{verbatim}

However, we could also write a data structure which is trivially
a monad whose type corresponds to \texttt{return} and \texttt{bind},
in which case we needn't have a functor constraint on the type
parameter. For this, we need GADTs:

\begin{verbatim}
data Oper : (* -> *) -> * -> * where
  Return : a -> Oper f a
  Bind   : f a -> (a -> Oper f b) -> Oper f b

instance Monad (Oper f) where
  return = Return

  bind g (Return x) = g x
  bind g (Bind x f) = Bind x (f >=> g)
\end{verbatim}

We can use the operational monad to combine various instructions
together: for example, we can construct a stack-like language
with the following instructions:

\begin{verbatim}
data StackInstr a where
  Pop  : StackInstr Int
  Push : Int -> StackInstr ()

evaluate : Oper StackInstr a -> List Int -> a
evaluate (Return x) _         = x
evaluate (Bind Pop f) (s:ss)  = evaluate (f s) ss
evaluate (Bind (Push s) f) ss = evaluate (f ()) (s:ss)

push : Oper StackInstr ()
push = return . Push

pop : Oper StackInstr Int
pop = return Pop

add : Oper StackInstr ()
add = bind pop (\ x ->
      bind pop (\ y ->
      push (x + y)))

sampleProgram : Int
sampleProgram = evalute program []
  where program = 
\end{verbatim}

\end{document}