More explanation on the README
Getty Ritter
9 years ago
25 | 25 | someSpec :: SExprSpec atom carrier |
26 | 26 | ~~~~ |
27 | 27 | |
28 | Various functions will be provided that modify the carrier type (i.e. the | |
29 | output type of parsing or input type of serialization) or the language | |
30 | recognized by the parsing. Examples will be shown below. | |
31 | ||
32 | ## Representing S-expressions | |
33 | ||
28 | 34 | There are three built-in representations of S-expression lists: two of them |
29 | 35 | are isomorphic, as one or the other might be better for processing |
30 | 36 | S-expression data, and the third represents only a subset of possible |
31 | 37 | S-expressions. |
32 | 38 | |
33 | 39 | ~~~~ |
34 | 40 | -- cons-based representation |
41 | data SExpr atom | |
42 | = SCons (SExpr atom) (SExpr atom) | |
43 | | SNil | |
44 | | SAtom atom | |
36 | 45 | |
37 | 46 | -- list-based representation |
38 | 47 | data RichSExpr atom |
39 | 48 | = RSList [RichSExpr atom] |
49 | | RSDotted [RichSExpr atom] atom | |
50 | | RSAtom atom | |
51 | ||
52 | -- well-formed representation | |
53 | data WellFormedSExpr atom | |
54 | = WFSList [WellFormedSExpr atom] | |
55 | | WFSAtom atom | |
40 | 56 | ~~~~ |
57 | ||
58 | In the above, an `RSList [a, b, c]` and a | |
59 | `WFSList [a, b, c]` both correspond to the structure | |
60 | `SCons a (SCons b (SCons c SNil))`, which corresponds to an | |
61 | S-expression which can be written as | |
62 | `(a b c)` or as `(a . (b . (c . ())))`. An `RSDotted` | |
63 | corresponds to a sequence of conses that does not terminate | |
64 | in an empty list, e.g. `RSDotted [a, b] c` corresponds to | |
65 | `SCons a (SCons b (SAtom c))`, which in turn corresponds to | |
66 | a structure like `(a b . c)` or `(a . (b . c))`. | |
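
The correspondence described above can be sketched as plain conversion functions. This is a minimal, self-contained illustration that redeclares the three types locally; the function names `toRich`, `fromRich`, and `toWellFormed` are assumptions made for this sketch and are not taken from this README, and the dotted-list constructor is spelled `RSDotted`, as in the decoder output.

```haskell
-- Local stand-ins for the three representations (not imported from S-Cargot).
data SExpr atom
  = SCons (SExpr atom) (SExpr atom)
  | SNil
  | SAtom atom
  deriving (Eq, Show)

data RichSExpr atom
  = RSList [RichSExpr atom]
  | RSDotted [RichSExpr atom] atom
  | RSAtom atom
  deriving (Eq, Show)

data WellFormedSExpr atom
  = WFSList [WellFormedSExpr atom]
  | WFSAtom atom
  deriving (Eq, Show)

-- Collapse a chain of conses into a list, remembering how the chain ends.
toRich :: SExpr atom -> RichSExpr atom
toRich (SAtom a) = RSAtom a
toRich SNil = RSList []
toRich (SCons x xs) = go [toRich x] xs
  where
    go acc SNil = RSList (reverse acc)          -- proper list: (a b c)
    go acc (SAtom a) = RSDotted (reverse acc) a -- dotted list: (a b . c)
    go acc (SCons y ys) = go (toRich y : acc) ys

-- Rebuild the cons chain; the final cdr is SNil or the trailing atom.
fromRich :: RichSExpr atom -> SExpr atom
fromRich (RSAtom a) = SAtom a
fromRich (RSList xs) = foldr (SCons . fromRich) SNil xs
fromRich (RSDotted xs a) = foldr (SCons . fromRich) (SAtom a) xs

-- Only proper lists survive; a dotted pair is reported as an error.
toWellFormed :: SExpr atom -> Either String (WellFormedSExpr atom)
toWellFormed (SAtom a) = Right (WFSAtom a)
toWellFormed SNil = Right (WFSList [])
toWellFormed (SCons x xs) = do
  x' <- toWellFormed x
  rest <- toWellFormed xs
  case rest of
    WFSList ys -> Right (WFSList (x' : ys))
    WFSAtom _ -> Left "Found atom in cdr position"
```

In this sketch `fromRich (toRich e)` gives back `e` for every cons-based `e`, which is the sense in which the first two representations are isomorphic, while `toWellFormed` is partial because dotted lists have no well-formed counterpart.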
67 | ||
68 | Functions for converting back and forth between | |
69 | representations are provided, but you can also modify a | |
70 | `SExprSpec` to parse to or serialize from a particular | |
71 | representation using the `asRich` and `asWellFormed` | |
72 | functions. | |
73 | ||
74 | ~~~~ | |
75 | *Data.SCargot.General> decode spec "(a b c)" | |
76 | Right [SCons (SAtom "a") (SCons (SAtom "b") (SCons (SAtom "c") SNil))] | |
77 | *Data.SCargot.General> decode (asRich spec) "(a b c)" | |
78 | Right [RSList [RSAtom "a",RSAtom "b",RSAtom "c"]] | |
79 | *Data.SCargot.General> decode (asWellFormed spec) "(a b c)" | |
80 | Right [WFSList [WFSAtom "a",WFSAtom "b",WFSAtom "c"]] | |
81 | *Data.SCargot.General> decode spec "(a . b)" | |
82 | Right [SCons (SAtom "a") (SAtom "b")] | |
83 | *Data.SCargot.General> decode (asRich spec) "(a . b)" | |
84 | Right [RSDotted [RSAtom "a"] "b"] | |
85 | *Data.SCargot.General> decode (asWellFormed spec) "(a . b)" | |
86 | Left "Found atom in cdr position" | |
87 | ~~~~ | |
88 | ||
89 | ## Comments | |
90 | ||
91 | By default, an S-expression spec does not include a comment syntax, but | |
92 | the provided `withSemicolonComments` function will cause it to understand | |
93 | traditional Lisp line-oriented comments that begin with a semicolon: | |
94 | ||
95 | ~~~~ | |
96 | *Data.SCargot.General> decode spec "(this ; has a comment\n inside)\n" | |
97 | Left "Failed reading: takeWhile1" | |
98 | *Data.SCargot.General> decode (withSemicolonComments spec) "(this ; has a comment\n inside)\n" | |
99 | Right [SCons (SAtom "this") (SCons (SAtom "inside") SNil)] | |
100 | ~~~~ | |
101 | ||
102 | Additionally, you can provide your own comment syntax in the form of an | |
103 | AttoParsec parser. Any AttoParsec parser can be used, so long as it meets | |
104 | the following criteria: | |
105 | - it is capable of failing (as it is called repeatedly until S-Cargot | |
106 | determines that there are no more comments) | |
107 | - it does not consume any input in the case of failure, which may involve | |
108 | wrapping the parser in a call to `try` | |
109 | ||
110 | For example, the following adds C++-style comments to an S-expression format: | |
111 | ||
112 | ~~~~ | |
113 | *Data.SCargot.General> let cppComment = string "//" >> takeWhile (/= '\n') >> return () | |
114 | *Data.SCargot.General> decode (setComment cppComment spec) "(a //comment\n b)\n" | |
115 | Right [SCons (SAtom "a") (SCons (SAtom "b") SNil)] | |
116 | ~~~~ | |
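
The two requirements on a comment parser can be illustrated with a small stand-alone sketch. It is written against `Text.ParserCombinators.ReadP` from `base` rather than AttoParsec so that it runs without extra packages, and every name in it (`Comment`, `blockComment`, `skipJunk`, `tokens`) is hypothetical and not part of S-Cargot.

```haskell
import Data.Char (isAlphaNum)
import Text.ParserCombinators.ReadP

-- A comment parser consumes exactly one comment and returns nothing.
type Comment = ReadP ()

-- C-style block comment: fails unless the input starts with "/*"
-- and a closing "*/" is eventually found.
blockComment :: Comment
blockComment = string "/*" *> manyTill get (string "*/") *> pure ()

-- Skip whitespace and comments until the comment parser fails.
-- If `comment` could succeed without consuming input, this would loop forever.
skipJunk :: Comment -> ReadP ()
skipJunk comment = skipSpaces *> ((comment *> skipJunk comment) <++ pure ())

-- Split the input into alphanumeric words, ignoring comments between them.
tokens :: Comment -> String -> Maybe [String]
tokens comment input =
  case [ws | (ws, "") <- readP_to_S parser input] of
    (ws : _) -> Just ws
    [] -> Nothing
  where
    parser = skipJunk comment *> many (munch1 isAlphaNum <* skipJunk comment) <* eof
```

Here `tokens blockComment "a /* note */ b"` yields `Just ["a","b"]`. An unterminated comment such as `"a /* oops"` makes `blockComment` fail, and because ReadP backtracks, the input it had looked at is still available to the surrounding parser, which then rejects the stray `/`.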
117 | ||
118 | ## Reader Macros | |
119 | ||
120 | A _reader macro_ is a Lisp macro which is invoked during read time. This | |
121 | allows the _lexical_ syntax of a Lisp to be modified. The most commonly | |
122 | seen reader macro is the quote, which allows the syntax `'expr` to stand | |
123 | in for the S-expression `(quote expr)`. The S-Cargot library enables this | |
124 | by keeping a map of characters to AttoParsec parsers that can be used as | |
125 | readers. There is a special case for the aforementioned quote, but that | |
126 | could easily be written by hand as | |
127 | ||
128 | ~~~~ | |
129 | *Data.SCargot.General> let mySpec = addReader '\'' (fmap go) spec | |
130 | where go c = SCons (SAtom "quote") (SCons c SNil) | |
131 | *Data.SCargot.General> decode (asRich mySpec) "(1 2 '(3 4))" | |
132 | Right [RSList [RSAtom "1",RSAtom "2",RSList [RSAtom "quote",RSList [RSAtom "3",RSAtom "4"]]]] | |
133 | ~~~~ | |
134 | ||
135 | A reader macro is passed the parser that invoked it, so that it can | |
136 | perform recursive calls, and can return any `SExpr` it would like. It | |
137 | may also take as much or as little of the remaining parse stream as it | |
138 | would like; for example, the following reader macro does not bother | |
139 | parsing anything else and merely returns a new token: | |
140 | ||
141 | ~~~~ | |
142 | *Data.SCargot.General> decode (addReader '?' (const (pure (SAtom "huh"))) mySpec) "(?1 2)" | |
143 | Right [SCons (SAtom "huh") (SCons (SAtom "1") (SCons (SAtom "2") SNil))] | |
144 | ~~~~ | |
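
To make the mechanism concrete, here is a minimal sketch of a table of reader macros in which each reader receives the expression parser that invoked it. It uses `Text.ParserCombinators.ReadP` from `base` instead of AttoParsec so that it is self-contained, atoms are restricted to alphanumeric strings, and none of the names (`Reader`, `ReaderTable`, `sexpr`, `parseSExpr`, `quoteReader`, `huhReader`) are S-Cargot's own API.

```haskell
import Data.Char (isAlphaNum)
import Text.ParserCombinators.ReadP

data SExpr = SCons SExpr SExpr | SNil | SAtom String
  deriving (Eq, Show)

-- A reader is handed the expression parser so that it can recurse.
type Reader = ReadP SExpr -> ReadP SExpr

-- Hypothetical reader table: trigger character paired with its reader.
type ReaderTable = [(Char, Reader)]

-- Parse one expression, trying the reader table before lists and atoms.
sexpr :: ReaderTable -> ReadP SExpr
sexpr table = skipSpaces *> (macro <++ list <++ atom)
  where
    self = sexpr table
    macro = do
      c <- get
      case lookup c table of
        Just reader -> reader self -- hand control to the reader macro
        Nothing -> pfail           -- not a trigger character: backtrack
    list = char '(' *> listBody
    listBody =
      skipSpaces *> ((char ')' *> pure SNil) <++ (SCons <$> self <*> listBody))
    atom = SAtom <$> munch1 isAlphaNum

-- 'expr becomes (quote expr) by parsing one expression recursively.
quoteReader :: Reader
quoteReader p = fmap (\e -> SCons (SAtom "quote") (SCons e SNil)) p

-- Ignores the parser it is given and just produces a fixed atom.
huhReader :: Reader
huhReader _ = pure (SAtom "huh")

parseSExpr :: ReaderTable -> String -> Maybe SExpr
parseSExpr table input =
  case [e | (e, "") <- readP_to_S (sexpr table <* skipSpaces <* eof) input] of
    (e : _) -> Just e
    [] -> Nothing
```

With the table `[('\'', quoteReader), ('?', huhReader)]`, parsing `"(?1 2)"` in this sketch produces the same cons structure as the example above: the `?` reader yields `SAtom "huh"` without consuming anything further, so `1` is read as the next list element.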
145 | ||
146 | Reader macros in S-Cargot can be used to define common bits of Lisp | |
147 | syntax that are not typically considered the purview of S-expression | |
148 | parsers. For example, to allow square brackets as a substitute for | |
149 | proper lists, we could define a reader macro that is triggered by the | |
150 | `[` character and repeatedly calls the parser until a `]` character | |
151 | is reached: | |
152 | ||
153 | ~~~~ | |
154 | *Data.SCargot.General> let pVec p = (char ']' *> pure SNil) <|> (SCons <$> p <*> pVec p) | |
155 | *Data.SCargot.General> let vec = addReader '[' pVec | |
156 | *Data.SCargot.General> decode (asRich (vec mySpec)) "(1 [2 3])" | |
157 | Right [RSList [RSAtom "1",RSList [RSAtom "2",RSAtom "3"]]] | |
158 | ~~~~ |