Updated README to reflect library changes
Getty Ritter
10 years ago
| 13 | 13 | s-expressions or to extend it in various ways to accomodate new |
| 14 | 14 | flavors. |
| 15 | 15 | |
| 16 | ## What Are S-Expressions? | |
| 17 | ||
| 18 | S-expressions were originally the data representation format in | |
| 19 | Lisp implementations, but have found broad uses outside of that as | |
| 20 | a data representation and storage format. S-expressions are often | |
| 21 | understood as a representation for binary trees with optional values | |
| 22 | in the leaf nodes: an empty leaf is represented with empty | |
| 23 | parens `()`, a non-empty leaf is represented as the scalar value | |
| 24 | it contains (often tokens like `x` or other programming language | |
| 25 | literals), and an internal node is represented as `(x . y)` where | |
| 26 | `x` and `y` are standing in for other s-expressions. In Lisp | |
| 27 | parlance, an internal node is called a _cons cell_, and the first | |
| 28 | and second elements inside it are called the _car_ and the _cdr_, | |
| 29 | for historical reasons. Non-empty lef nodes are referred to | |
| 30 | in the s-cargot library as _atoms_. | |
| 31 | ||
| 32 | Often, s-expressions are used to represent lists, in which case | |
| 33 | the list is treated as a right-branching tree with an empty leaf as | |
| 34 | the far right child of the tree. S-expression languages have a | |
| 35 | shorthand way of representing these lists: instead of writing successsively | |
| 36 | nested pairs, as in `(1 . (2 . (3 . ()))`, they allow the sugar | |
| 37 | `(1 2 3)`. This is the most common way of writing s-expressions, | |
| 38 | even in languages that allow raw cons cells (or "dotted pairs") to | |
| 39 | be written. | |
| 40 | ||
| 41 | The s-cargot library refers to expressions where every right-branching | |
| 42 | sequence ends in an empty leaf as _well-formed s-expressions_. Note that | |
| 43 | any s-expression which can be written without using a dotted pair is | |
| 44 | necessarily well-formed. | |
| 45 | ||
| 46 | Unfortunately, while in common use, s-expressions do not have a single | |
| 47 | formal standard. They are often defined in an ad-hoc way, which means | |
| 48 | that s-expressions used in different contexts will, despite sharing a common | |
| 49 | parentheses-delimited structure, differ in various respects. Additionally, | |
| 50 | because s-expressions are used as the concrete syntax for languages of | |
| 51 | the Lisp family, they often have conveniences (such as comment syntaxes) | |
| 52 | and other bits of syntactic sugar (such as _reader macros_, which are | |
| 53 | described more fully later) that make parsing them much more complicated. | |
| 54 | Even ignoring those features, the _atoms_ recognized by a given | |
| 55 | s-expression variation can differ widely. | |
| 56 | ||
| 57 | The s-cargot library was designed to accomodate several different kinds | |
| 58 | of s-expression formats, so that an s-expression format can be easily | |
| 59 | expressed as a combination of existing features. It includes a few basic | |
| 60 | variations on s-expression languages as well as the tools for parsing | |
| 61 | and emitting more elaborate s-expressions variations without having to | |
| 62 | reimplement the basic plumbing yourself. | |
| 63 | ||
| 64 | ## Using the Library | |
| 65 | ||
| 16 | 66 | The central way of interacting with the S-Cargot library is by creating |
| 17 | and modifying a _spec_, which is a value that represents a given | |
| 18 | family of S-expressions. A _spec_, which is of type `SExprSpec`, | |
| 19 | contains the information necessary to implement reader macros, arbitrary | |
| 20 | kinds of comments, and various processing steps. A `SExprSpec` has two | |
| 21 | type parameters: | |
| 22 | ||
| 23 | ~~~~ | |
| 24 | +------ the type that represents a SExpr atom | |
| 25 | | | |
| 26 | | +- the Haskell representation of the SExpr value | |
| 27 | | | | |
| 28 | someSpec :: SExprSpec atom carrier | |
| 67 | and modifying datatypes which represent specifications for parsing and | |
| 68 | printing s-expressions. Each of those types has two type parameters, which | |
| 69 | are often called @atom@ and @carrier@: | |
| 70 | ||
| 71 | ~~~~ | |
| 72 | +------ the type that represents an atom or value | |
| 73 | | | |
| 74 | | +- the Haskell representation of the SExpr itself | |
| 75 | | | | |
| 76 | parser :: SExprParser atom carrier | |
| 77 | printer :: SExprPrinter atom carrier | |
| 29 | 78 | ~~~~ |
| 30 | 79 | |
| 31 | 80 | Various functions will be provided that modify the carrier type (i.e. the |
| 32 | 81 | output type of parsing or input type of serialization) or the language |
| 33 |
recognized by the parsing. |
|
| 82 | recognized by the parsing. | |
| 34 | 83 | |
| 35 | 84 | ## Representing S-expressions |
| 36 | 85 | |
| 37 | 86 | There are three built-in representations of S-expression lists: two of them |
| 38 | 87 | are isomorphic, as one or the other might be better for processing |
| 39 | 88 | S-expression data in a particular circumstance, and the third represents |
| 40 |
only |
|
| 89 | only the well-formed subset of possible S-expressions. | |
| 41 | 90 | |
| 42 | 91 | ~~~~.haskell |
| 43 | 92 | -- cons-based representation |
| 71 | 120 | functions. |
| 72 | 121 | |
| 73 | 122 | ~~~~.haskell |
| 74 |
> |
|
| 123 | >>> decode basicParser "(a b)" | |
| 75 | 124 | Right [SCons (SAtom "a") (SCons (SAtom "b") SNil)] |
| 76 |
> |
|
| 125 | >>> decode (asRich basicParser) "(a b)" | |
| 77 | 126 | Right [RSList [RSAtom "a",RSAtom "b"]] |
| 78 |
> |
|
| 127 | >>> decode (asWellFormed basicParser) "(a b)" | |
| 79 | 128 | Right [WFSList [WFSAtom "a",WFSAtom "b"]] |
| 80 |
> |
|
| 129 | >>> decode basicParser "(a . b)" | |
| 81 | 130 | Right [SCons (SAtom "a") (SAtom "b")] |
| 82 |
> |
|
| 131 | >>> decode (asRich basicParser) "(a . b)" | |
| 83 | 132 | Right [RSDotted [RSAtom "a"] "b"] |
| 84 |
> |
|
| 133 | >>> decode (asWellFormed basicParser) "(a . b)" | |
| 85 | 134 | Left "Found atom in cdr position" |
| 86 | 135 | ~~~~ |
| 87 | 136 | |
| 93 | 142 | you plan on working with: |
| 94 | 143 | |
| 95 | 144 | ~~~~.haskell |
| 96 | > A 2 ::: A 3 ::: A 4 ::: Nil | |
| 97 | SCons (SCons (SCons (SAtom 2) (SAtom 3)) (SAtom 4)) SNil | |
| 98 | ~~~~ | |
| 99 | ||
| 100 | ~~~~.haskell | |
| 101 | > L [A 1,A 2,A 3] | |
| 145 | >>> import Data.SCargot.Repr.Basic | |
| 146 | >>> A 2 ::: A 3 ::: A 4 ::: Nil | |
| 147 | SCons (SAtom 2) (SCons (SAtom 3) (SComs (SAtom 4) SNil)) | |
| 148 | ~~~~ | |
| 149 | ||
| 150 | ~~~~.haskell | |
| 151 | >>> import Data.SCargot.Repr.WellFormed | |
| 152 | >>> L [A 1,A 2,A 3] | |
| 102 | 153 | WFSList [WFSAtom 1,WFSAtom 2,WFSAtom 3] |
| 103 | > let sexprSum (L xs) = sum (map sexprSum xs); sexprSum (A n) = n | |
| 104 | > :t sexprSum | |
| 154 | >>> let sexprSum (L xs) = sum (map sexprSum xs); sexprSum (A n) = n | |
| 155 | >>> :t sexprSum | |
| 105 | 156 | sexprSum :: Num a => WellFormedSExpr a -> a |
| 106 |
> |
|
| 157 | >>> sexprSum (L [A 2, L [A 3, A 4]]) | |
| 107 | 158 | 9 |
| 108 | 159 | ~~~~ |
| 109 | 160 | |
| 161 | If you are using GHC 7.10, several of these will be powerful | |
| 162 | bidirectional pattern synonyms that allow both constructing and | |
| 163 | pattern-matchhing on s-expressions in non-trivial ways: | |
| 164 | ||
| 165 | ~~~~.haskell | |
| 166 | >>> import Data.SCargot.Repr.Basic | |
| 167 | >>> L [ A 2, A 3, A 4 ] | |
| 168 | SCons (SAtom 2) (SCons (SAtom 3) (SComs (SAtom 4) SNil)) | |
| 169 | ~~~~ | |
| 170 | ||
| 110 | 171 | ## Atom Types |
| 111 | 172 | |
| 112 | 173 | Any type can serve as an underlying atom type provided that it has |
| 113 |
a |
|
| 174 | a Parsec parser or a serializer (i.e. a way of turning it | |
| 114 | 175 | into `Text`.) For these examples, I'm going to use a very simple |
| 115 | 176 | serializer that is roughly like the one found in `Data.SCargot.Basic`, |
| 116 | 177 | which parses symbolic tokens of letters, numbers, and some |
| 118 | 179 | is just the identity function: |
| 119 | 180 | |
| 120 | 181 | ~~~~.haskell |
| 121 | spec :: SExprSpec Text (SExpr Text) | |
| 122 | spec = mkSpec (pack <$> many1 (alphaNum <|> oneOf "+-*/!?")) id | |
| 182 | parser :: SExprParser Text (SExpr Text) | |
| 183 | parser = mkParser (pack <$> many1 (alphaNum <|> oneOf "+-*/!?")) | |
| 184 | ||
| 185 | printer :: SExprPrinter Text (SExpr Text) | |
| 186 | printer = flatPrint id | |
| 123 | 187 | ~~~~ |
| 124 | 188 | |
| 125 | 189 | A more elaborate atom type would distinguish between different |
| 139 | 203 | sAtom (Ident t) = t |
| 140 | 204 | sAtom (Num n) = pack (show n) |
| 141 | 205 | |
| 142 | mySpec :: SExprSpec Atom (SExpr Atom) | |
| 143 | mySpec = mkSpec pAtom sAtom | |
| 206 | myParser :: SExprParser Atom (SExpr Atom) | |
| 207 | myParser = mkParser pAtom | |
| 208 | ||
| 209 | myPrinter :: SExprPrinter Atom (SExpr Atom) | |
| 210 | myPrinter = flatPrint sAtom | |
| 144 | 211 | ~~~~ |
| 145 | 212 | |
| 146 | 213 | We can then use this newly created atom type within an S-expression |
| 147 | 214 | for both parsing and serialization: |
| 148 | 215 | |
| 149 | 216 | ~~~~.haskell |
| 150 |
> |
|
| 217 | >>> decode myParser "(foo 1)" | |
| 151 | 218 | Right [SCons (SAtom (Ident "foo")) (SCons (SAtom (Num 1)) SNil)] |
| 152 | > encode mySpec [SCons (SAtom (Num 0)) SNil] | |
| 153 | "(0)" | |
| 219 | >>> encode mySpec [L [A (Num 0), A (Ident "bar")]] | |
| 220 | "(0 bar)" | |
| 154 | 221 | ~~~~ |
| 155 | 222 | |
| 156 | 223 | ## Carrier Types |
| 188 | 255 | the `SExprSpec`: |
| 189 | 256 | |
| 190 | 257 | ~~~~.haskell |
| 191 |
> |
|
| 258 | >>> let parser' = setCarrier toExpr (asRich myParser) | |
| 259 | >>> :t parser' | |
| 260 | SExprParser Atom Expr | |
| 261 | >>> decode parser' "(+ 1 2)" | |
| 192 | 262 | Right [Add (Num 1) (Num 2)] |
| 193 |
> |
|
| 263 | >>> decode parser' "(0 1 2)" | |
| 194 | 264 | Left "Unrecognized s-expr" |
| 195 | 265 | ~~~~ |
| 196 | 266 | |
| 197 | 267 | ## Comments |
| 198 | 268 | |
| 199 |
By default, an S-expression |
|
| 269 | By default, an S-expression parser does not include a comment syntax, but | |
| 200 | 270 | the provided `withLispComments` function will cause it to understand |
| 201 | 271 | traditional Lisp line-oriented comments that begin with a semicolon: |
| 202 | 272 | |
| 203 | 273 | ~~~~.haskell |
| 204 |
> |
|
| 274 | >>> decode basicParser "(this ; has a comment\n inside)\n" | |
| 205 | 275 | Left "(line 1, column 7):\nunexpected \";\"\nexpecting space or atom" |
| 206 |
> |
|
| 276 | >>> decode (withLispComments basicParser) "(this ; has a comment\n inside)\n" | |
| 207 | 277 | Right [SCons (SAtom "this") (SCons (SAtom "inside") SNil)] |
| 208 | 278 | ~~~~ |
| 209 | 279 | |
| 218 | 288 | For example, the following adds C++-style comments to an S-expression format: |
| 219 | 289 | |
| 220 | 290 | ~~~~.haskell |
| 221 | > let cppComment = string "//" >> manyTill newline >> return () | |
| 222 | > decode (setComment cppComment spec) "(a //comment\n b)\n" | |
| 291 | >>> let cppComment = string "//" >> manyTill newline >> return () | |
| 292 | >>> decode (setComment cppComment basicParser) "(a //comment\n b)\n" | |
| 223 | 293 | Right [SCons (SAtom "a") (SCons (SAtom "b") SNil)] |
| 224 | 294 | ~~~~ |
| 225 | 295 | |
| 226 | 296 | ## Reader Macros |
| 227 | 297 | |
| 228 |
A _reader macro_ is a Lisp macro |
|
| 298 | A _reader macro_ is a Lisp macro---a function that operates on syntactic | |
| 299 | structures---which is invoked during the scanning phase of a Lisp parser. This | |
| 229 | 300 | allows the _lexical_ syntax of a Lisp to be modified. The most commonly |
| 230 | seen reader macro is the quote, which allows the syntax `'expr` to stand | |
| 231 | in for the s-expression `(quote expr)`. The S-Cargot library accomodates | |
| 232 | this by keeping a map of characters to Parsec parsers that can be used as | |
| 301 | seen reader macro is the quote, which allows the syntax `'expr` to stand as sugar | |
| 302 | for the s-expression `(quote expr)`. The S-Cargot library accomodates | |
| 303 | this by keeping a map from characters to Haskell functions that can be used as | |
| 233 | 304 | readers. There is a special case for the aforementioned quote, but that |
| 234 | 305 | could easily be written by hand as |
| 235 | 306 | |
| 236 | 307 | ~~~~.haskell |
| 237 | > let quoteExpr c = SCons (SAtom "quote") (SCons c SNil) | |
| 238 | > let withQuote = addReader '\'' (\ p -> fmap quoteExpr p) | |
| 239 |
> |
|
| 308 | >>> let quote expr = SCons (SAtom "quote") (SCons expr SNil) | |
| 309 | >>> let addQuoteReader = addReader '\'' (\ parse -> fmap quoteExpr parse) | |
| 310 | >>> decode (addQuoteReader basicParser) "'foo" | |
| 240 | 311 | Right [SCons (SAtom "quote") (SCons (SAtom "foo") SNil)] |
| 241 | 312 | ~~~~ |
| 242 | 313 | |
| 243 | 314 | A reader macro is passed the parser that invoked it, so that it can |
| 244 |
perform recursive calls |
|
| 315 | perform recursive calls into the parser, and can return any `SExpr` it would like. It | |
| 245 | 316 | may also take as much or as little of the remaining parse stream as it |
| 246 | 317 | would like; for example, the following reader macro does not bother |
| 247 | 318 | parsing anything else and merely returns a new token: |
| 248 | 319 | |
| 249 | 320 | ~~~~.haskell |
| 250 | > let qmReader = addReader '?' (\ _ -> pure (SAtom "huh")) | |
| 251 | > decode (qmReader mySpec) "(?1 2)" | |
| 321 | >>> let qmReader = addReader '?' (\ _ -> pure (SAtom "huh")) | |
| 322 | >>> decode (qmReader basicParser) "(?1 2)" | |
| 252 | 323 | Right [SCons (SAtom "huh") (SCons (SAtom "1") (SCons (SAtom "2") SNil))] |
| 253 | 324 | ~~~~ |
| 254 | 325 | |
| 255 |
Reader macros in S-Cargot can be used to define |
|
| 326 | Reader macros in S-Cargot can be used to define bits of Lisp | |
| 256 | 327 | syntax that are not typically considered the purview of S-expression |
| 257 | 328 | parsers. For example, to allow square brackets as a subsitute for |
| 258 |
proper lists, we could define a reader macro that is in |
|
| 329 | proper lists, we could define a reader macro that is indicated by the | |
| 259 | 330 | `[` character and repeatedly calls the parser until a `]` character |
| 260 | 331 | is reached: |
| 261 | 332 | |
| 262 | 333 | ~~~~.haskell |
| 263 | > let pVec p = (char ']' *> pure SNil) <|> (SCons <$> p <*> pVec p) | |
| 264 | > let vec = addReader '[' pVec | |
| 265 |
> |
|
| 334 | >>> let vec p = (char ']' *> pure SNil) <|> (SCons <$> p <*> vec p) | |
| 335 | >>> let withVecReader = addReader '[' vec | |
| 336 | >>> decode (asRich (withVecReader basicParser)) "(1 [2 3])" | |
| 266 | 337 | Right [RSList [RSAtom "1",RSList [RSAtom "2",RSAtom "3"]]] |
| 267 | 338 | ~~~~ |
| 268 | 339 | |
| 340 | ## Pretty-Printing and Indentation | |
| 341 | ||
| 342 | The s-cargot library also includes a simple but often adequate | |
| 343 | pretty-printing system for s-expressions. A printer that prints a | |
| 344 | single-line s-expression is created with `flatPrint`: | |
| 345 | ||
| 346 | ~~~~.haskell | |
| 347 | >>> let printer = flatPrint id | |
| 348 | >>> :t printer | |
| 349 | SExprPrinter Text (SCargot Text) | |
| 350 | >>> Text.putStrLn $ encode printer [L [A "foo", A "bar"]] | |
| 351 | (foo bar) | |
| 352 | ~~~~ | |
| 353 | ||
| 354 | A printer that tries to pretty-print an s-expression to fit | |
| 355 | attractively within an 80-character limit can be created with | |
| 356 | `basicPrint`: | |
| 357 | ||
| 358 | ~~~~.haskell | |
| 359 | >>> let printer = basicPrint id | |
| 360 | >>> let sentence = "this stupendously preposterously supercalifragilisticexpialidociously long s-expression" | |
| 361 | >>> let longSexpr = L [A word | word <- Text.words sentence ] | |
| 362 | >>> Text.putStrLn $ encodeOne printer longSexpr | |
| 363 | (this | |
| 364 | stupendously | |
| 365 | preposterously | |
| 366 | supercalifragilisticexpialidociously | |
| 367 | long | |
| 368 | s-expression) | |
| 369 | ~~~~ | |
| 370 | ||
| 371 | A printer created with `basicPrint` will "swing" things that are too | |
| 372 | long onto the subsequent line, indenting it a fixed number of spaces. | |
| 373 | We can modify the number of spaces with `setIndentAmount`: | |
| 374 | ||
| 375 | ~~~~.haskell | |
| 376 | >>> let printer = setIndentAmount 4 (basicPrint id) | |
| 377 | >>> Text.putStrLn $ encodeOne printer longSexpr | |
| 378 | (this | |
| 379 | stupendously | |
| 380 | preposterously | |
| 381 | supercalifragilisticexpialidociously | |
| 382 | long | |
| 383 | s-expression) | |
| 384 | ~~~~ | |
| 385 | ||
| 386 | We can also modify what counts as the 'maximum width', which for a | |
| 387 | `basicPrint` printer is 80 by default: | |
| 388 | ||
| 389 | ~~~~.haskell | |
| 390 | >>> let printer = setMaxWidth 8 (basicPrint id) | |
| 391 | >>> Text.putStrLn $ encodeOne printer (L [A "one", A "two", A "three"]) | |
| 392 | (one | |
| 393 | two | |
| 394 | three) | |
| 395 | ~~~~ | |
| 396 | ||
| 397 | Or remove the maximum, which will put the whole s-expression onto one | |
| 398 | line, regardless of its length: | |
| 399 | ||
| 400 | ~~~~.haskell | |
| 401 | >>> let printer = removeMaxWidth (basicPrint id) | |
| 402 | >>> Text.putStrLn $ encodeOne printer longSexpr | |
| 403 | (this stupendously preposterously supercalifragilisticexpialidociously long s-expression) | |
| 404 | ~~~~ | |
| 405 | ||
| 406 | We can also specify an _indentation strategy_, which decides how to | |
| 407 | indent subsequent expressions based on the head of a given | |
| 408 | expression. The default is to always "swing" subsequent expressions | |
| 409 | to the next line, but we could also specify the `Align` constructor, which | |
| 410 | will print the first two expressions on the same line and then any subsequent | |
| 411 | expressions horizontally aligned with the second one, like so: | |
| 412 | ||
| 413 | ~~~~.haskell | |
| 414 | >>> let printer = setIndentStrategy (\ _ -> Align) (setMaxWidth 8 (basicPrint id)) | |
| 415 | >>> Text.putStrLn $ encodeOne printer (L [A "one", A "two", A "three", A "four"]) | |
| 416 | (one two | |
| 417 | three | |
| 418 | four) | |
| 419 | ~~~~ | |
| 420 | ||
| 421 | Or we could choose to keep some number of expressions on the same line and afterwards | |
| 422 | swing the subsequent ones: | |
| 423 | ||
| 424 | ~~~~.haskell | |
| 425 | >>> let printer = setIndentStrategy (\ _ -> SwingAfter 1) (setMaxWidth 8 (basicPrint id)) | |
| 426 | >>> Text.putStrLn $ encodeOne printer (L [A "one", A "two", A "three", A "four"]) | |
| 427 | (one two | |
| 428 | three | |
| 429 | four) | |
| 430 | ~~~~ | |
| 431 | ||
| 432 | For lots of situations, we might want to choose a different indentation strategy based | |
| 433 | on the first expression within a proper list: for example, Common Lisp source code is often | |
| 434 | formatted so that, following a `defun` token, we have the function name and arguments | |
| 435 | on the same line, and then the body of the function indented some amount subsequently. | |
| 436 | We can express an approximation of that strategy like this: | |
| 437 | ||
| 438 | ~~~~.haskell | |
| 439 | >>> let strategy (A ident) | "def" `Text.isPrefixOf` ident = SwingAfter 2; strategy _ = Align | |
| 440 | >>> let printer = setIndentStrategy strategy (setMaxWidth 20 (basicPrint id)) | |
| 441 | >>> let fact = L [A "defun", A "fact", L [A "x"], L [A "product", L [A "range", A "1", A "x"]]] | |
| 442 | >>> Text.putStrLn $ encodeOne printer fact | |
| 443 | (defun fact (x) | |
| 444 | (product (range 1 x))) | |
| 445 | >>> let app = L [A "apply", L [A "lambda", L [A "y"], L [A "fact", A "y"]], L [A "+", A "2", A "3"]] | |
| 446 | (apply (lambda (y) (fact y) | |
| 447 | (+ 2 3)) | |
| 448 | ~~~~ | |
| 449 | ||
| 269 | 450 | ## Putting It All Together |
| 270 | 451 | |
| 271 | 452 | Here is a final example which implements a limited arithmetic language |
| 272 |
with Haskell-style line comments and a special reader |
|
| 453 | with Haskell-style line comments and a special reader macro to understand hex | |
| 273 | 454 | literals: |
| 274 | 455 | |
| 275 | 456 | ~~~~.haskell |
| 457 | -- Our operators are going to represent addition, subtraction, or | |
| 458 | -- multiplication | |
| 276 | 459 | data Op = Add | Sub | Mul deriving (Eq, Show) |
| 460 | ||
| 461 | -- The atoms of our language are either one of the aforementioned | |
| 462 | -- operators, or positive integers | |
| 277 | 463 | data Atom = AOp Op | ANum Int deriving (Eq, Show) |
| 464 | ||
| 465 | -- Once parsed, our language will consist of the applications of | |
| 466 | -- binary operators with literal integers at the leaves | |
| 278 | 467 | data Expr = EOp Op Expr Expr | ENum Int deriving (Eq, Show) |
| 279 | 468 | |
| 280 |
-- Conversions |
|
| 469 | -- Conversions to and from our Expr type | |
| 281 | 470 | toExpr :: SExpr Atom -> Either String Expr |
| 282 | 471 | toExpr (A (AOp op) ::: l ::: r ::: Nil) = EOp op <$> l <*> r |
| 283 | 472 | toExpr (A (ANum n)) = pure (ENum n) |
| 284 |
toExpr sexpr = Left (" |
|
| 473 | toExpr sexpr = Left ("Unable to parse expression: " ++ show sexpr) | |
| 285 | 474 | |
| 286 | 475 | fromExpr :: Expr -> SExpr Atom |
| 287 | 476 | fromExpr (EOp op l r) = A (AOp op) ::: fromExpr l ::: fromExpr r ::: Nil |
| 300 | 489 | sAtom (AOp Mul) = "*" |
| 301 | 490 | sAtom (ANum n) = T.pack (show n) |
| 302 | 491 | |
| 303 |
-- Our comment syntax |
|
| 492 | -- Our comment syntax is going to be Haskell-like: | |
| 304 | 493 | hsComment :: Parser () |
| 305 | 494 | hsComment = string "--" >> manyTill newline >> return () |
| 306 | 495 | |
| 307 |
-- Our custom reader macro |
|
| 496 | -- Our custom reader macro: grab the parse stream and read a | |
| 497 | -- hexadecimal number from it: | |
| 308 | 498 | hexReader :: Reader Atom |
| 309 | 499 | hexReader _ = (Num . readHex . T.unpack) <$> takeWhile1 isHexDigit |
| 310 | 500 | where isHexDigit c = isDigit c || c `elem` "AaBbCcDdEeFf" |
| 311 | 501 | rd = readHex . head . fst |
| 312 | 502 | |
| 313 | -- Our final s-expression family | |
| 314 | myLangSpec :: SExprSpec Atom Expr | |
| 315 | myLangSpec | |
| 503 | -- Our final s-expression parser and printer: | |
| 504 | myLangParser :: SExprParser Atom Expr | |
| 505 | myLangParser | |
| 316 | 506 | = setComment hsComment -- set comment syntax to be Haskell-style |
| 317 | 507 | $ addReader '#' hexReader -- add hex reader |
| 318 | $ convertSpec toExpr fromExpr -- convert final repr to Expr | |
| 319 | $ mkSpec pAtom sAtom -- create spec with Atom type | |
| 508 | $ setCarrier toExpr -- convert final repr to Expr | |
| 509 | $ mkParser pAtom -- create spec with Atom type | |
| 510 | ||
| 511 | mkLangPrinter :: SexprPrinter Atom Expr | |
| 512 | mkLangPrinter | |
| 513 | = setFromCarrier fromExpr | |
| 514 | $ setIndentStrategy (const Align) | |
| 515 | $ basicPrint sAtom | |
| 320 | 516 | ~~~~ |
| 321 | 517 | |
| 322 | 518 | Keep in mind that you often won't need to write all this by hand, |