gdritter repos documents / master posts / cherenkov.md
master

Tree @master (Download .tar.gz)

cherenkov.md @masterview markup · raw · history · blame

Cherenkov: Basics

A Cherenkov program evalutes to a JSON value, and in fact should include a standalone program which can be used to evaluate a Cherenkov program to its JSON equivalent. Cherenkov also includes a little bit of YAML-like sugar but does not include the full generality of YAML. (For example, references are not valid Cherenkov: it has something better!)

A simple configuration file might look like

# the first host
- hostname: "foo"
  forward-agent: true
  user: "u1"
# the second host
- hostname: "bar"
  local-command: "/bin/setup-shell -q"
  user: "u2"

This evaluates to the JSON document

[ { "hostname": "foo"
  , "forward-agent": true
  , "user": "u1"
  }
, { "hostname": "bar"
  , "local-command": "/bin/setup-shell -q"
  , "user": "u2"
  }
]

Note that the comments are removed in the final output, as comments are not valid JSON.

Cherenkov: Expressions

Expressions can be introduced using the single-quote character. These will be evaluated and their result spliced into the final output. If the expression is a string, a number, or a bool, it will result in a value of the corresponding JSON type:

- 1
- '(0.5 + 1.5)
- '("foo" ++ "bar")

This evaluates to

[1, 2.0, "foobar"]

If the result is a constructor, then it will be represented as an object with a single key that corresponds to the name of the constructor:

- '(Left 5)

evaluates to

[{"Left": 5}]

If the result is a record, then it will be represented as a JSON object:

- '{x: 2, y: 3}

evalutes to

[{"x": 2, "y": 3}]

The expression language includes several functional features, such as let-bindings, anonymous functions with a Haskell-like syntax, and function calls:

foo: '(let lst = [1,2,3] in map (\x.x+1) lst)`

evaluates to

{"foo": [2,3,4]}

Cherenkov: Declarations

Global declarations can be written by prefacing a declaration with the keyword with. These declarations do not appear in the output:

with incrAll lst = map (\x.x+1) lst

'(incrAll [1,2,3])

evaluates to

[2,3,4]

Functions can be defined by cases, and can have type signatures as well. The cases are given in an indented block:

with intersperse : List String -> String -> String
  (x::xs) i = x ++ i ++ intersperse xs i
  []      i = ""

Values and types can also be defined in with-blocks:

with type Server =
  { hostname String
  , username String
  , portNum  Int
  }

with defaultServer : Server =
  { hostname: "foo.bar"
  , username: "guest"
  , portNum: 9999
  }

servers:
  - 'defaultServer
  - '(defaultServer { hostname: "blah" })
  - '(defaultServer { port: 9876 })

evaluates to

{"servers":
  [ { "hostname": "foo.bar"
    , "username": "guest"
    , "postNum": 9999
    }
  , { "hostname": "blah"
    , "username": "guest"
    , "postNum": 9999
    }
  , { "hostname": "foo.bar"
    , "username": "guest"
    , "postNum": 9876
    }
  ]

Cherenkov: Type System

If you've used Haskell, OCaml, or SML, the type system in Cherenkov should look vaguely similar. We have records and sums, the former denoted by curly braces and the latter by square brackets:

# defining a sum type
with type MyVal =
  [ MyInt  Int
  , MyBool Bool
  ]

# using a sum type
with f : MyVal -> Int
  MyInt  n = n
  MyBool b = if | b    -> 1
                | else -> 0

# defining a product type
with type MyRec =
  { myString String
  , myFloat  Float
  }

# using a product type
with g : MyRec -> String
  r = r.myString ++ " and " ++ (if | r.myFloat > 0 -> "positive"
                                   | else          -> "negative")

You can also define polymorphic types:

# defining a polymorphic sum type
with type Either a b = [ Left a, Right b ]

# defining a polymorphic product type
with type Both a b = { first a, second b }

Cherenkov has a much looser type system than some other strongly-typed langauges. It technically has no type declarations---it has only type aliases. You can use as many or as few type declarations as you'd like, as types are inferred if you don't use them. I can use the above functions

# using a sum type without type annotations
with f
  MyInt  n = n
  MyBool b = if | b    -> 1
                | else -> 0

# using a product type
with g r =
  r.myString ++ " and " ++ (if | r.myFloat > 0 -> "positive"
                               | else          -> "negative")

And they will have the inferred types

f : [ MyInt Int, MyBool Bool ] -> Int
g : { myString String, myFloat Float, rest } -> String

That second type deserves a little bit more attention. Cherenkov has row polymorphism, which means that the second version of g will accept as argument not just a record with myString and myFloat fields, but a record with any number of additional fields, which is what the rest in the type represents. This can also appear in the return position: for example, a function like

with f r = r { z: r.x + r.y }

has an inferred type

f : { x: Int, y: Int, rest } -> { x: Int, y: Int, z: Int, rest }

as the rest of the fields are passed through unchanged. The rest variable also occurs when we underspecify a constructor: consider the function g

with g
  Left x = Left (x == 0)
  rs     = rs

which has the inferred type

g : [ Left: Int, rest ] -> [ Left: Bool, rest ]

i.e. this can be called on any constructor and it will pass it through unchanged. If this is too lax for us—if we'd prefer to limit the constructors to a particular set—then we can list them explicitly:

with g'
  Left x  = Left (x == 0)
  Right y = Right (y : Bool)

which has the inferred type

g' : [ Left: Int, Right: Bool ] -> [ Left: Bool, Right: Bool ]

Because all types are aliases, it means that recursive types and polymorphic types are also aliases for explicit type fixpoints and quantification, so the type alias

type MyList a =
  [ MyCons (a, MyList a)
  , MyNil
  ]

will desugar to the type \a. b is [ MyCons (a, b), MyNil ], where \ var . typ represents quantification as var is typ represents a fixpoint. It is possible but ill-advised to use these in actual config files—in practice, these exist becuase they provide a nice underlying theory to a terse and implicit system that preserves type-safety while requiring a minimum of explicit direction from the programmer.

Cherenkov: Totality

Cherenkov is strongly-typed and total. This means that certain invalid operations are not allowed in Cherenkov programs, and that Cherenkov programs will never loop forever. Certain restrictions are therefore placed on Cherenkov programs:

  • Cherenkov values may not be recursive, and therefore must be finite in size
  • Cherenkov functions may only be recursive if they recurse on some subpart of their input, or (as a special case) if they recurse with some numeric parameter that trivially moves towards a terminal condition.
  • Cherenkov sum types may be be recursive, but Cherenkov product types must not be. (The can be underneath another sum, so e.g. a type like rec is { a Int, b rec } is disallowed as it would involve an infinite value, whereas rec is { a Int, b [ Nothing, Just rec ] } is fine. This only works if the sum type in question has a non-recursive constructor: we cannot use the type rec is { a Int, b [ Left rec, Right rec ] }.

For example, in the following code, foo is well-typed and will also pass the totality checker, and bar is well-typed but would be rejected by the totality checker as it is not trivially recursing towards a terminal condition:

with foo : List Int -> Int
  (x::xs) = x + foo xs
  []      = 0

with bar : List Int -> Int
  (x::xs) = bar (x+1::xs)
  []      = bar []

This means that some functions which may be well-founded are not expressible, because it is not trivial to see that they terminate:

with collatz : Int -> Bool
  n = if | n == 1       -> true
         | n < 1        -> false
         | mod n 2 == 0 -> collatz (div n 2)
         | else         -> collatz (n * 3 + 1)

All sum and product types are row-polymorphic, and therefore do not need to be declared, as those types can be inferred from use. You can, however, declare aliases for those, if you'd like (especially as recursive types can get quite complicated without aliases!)