gdritter repos s-cargot / 04fa58a
More explanation on the README Getty Ritter 9 years ago
1 changed file(s) with 119 addition(s) and 1 deletion(s).
2525 someSpec :: SExprSpec atom carrier
2626 ~~~~
2727
28 Various functions will be provided that modify the carrier type (i.e. the
29 output type of parsing or input type of serialization) or the language
30 recognized by the parsing. Examples will be shown below.
31
32 ## Representing S-expressions
33
2834 There are three built-in representations of S-expression lists: two of them
2935 are isomorphic, as one or the other might be better for processing
3036 S-expression data, and the third represents only a subset of possible
3238
3339 ~~~~
3440 -- cons-based representation
35 data SExpr atom = SAtom atom | SCons (SExpr atom) (SExpr atom) | SNil
41 data SExpr atom
42 = SCons (SExpr atom) (SExpr atom)
43 | SNil
44 | SAtom atom
3645
3746 -- list-based representation
3847 data RichSExpr atom
3948 = RSList [RichSExpr atom]
49 | RSDotted [RichSExpr atom] atom
50 | RSAtom atom
51
52 -- well-formed representation
53 data WellFormedSExpr atom
54 = WFSList [WellFormedSExpr atom]
55 | WFSAtom atom
4056 ~~~~
57
58 In the above, an `RSList [a, b, c]` and a
59 `WFSList [a, b, c]` both correspond to the structure
60 `SCons a (SCons b (SCons c SNil))`, which corresponds to an
61 S-expression which can be written as
62 `(a b c)` or as `(a . (b . (c . ())))`. An `RSDotted`
63 corresponds to a sequence of conses that does not terminate
64 in an empty list, e.g. `RSDotted [a, b] c` corresponds to
65 `SCons a (SCons b (SAtom c))`, which in turn corresponds to
66 a structure like `(a b . c)` or `(a . (b . c))`.
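
The correspondence described above can be sketched as a pair of conversions. This is a minimal illustration, not the library's own code: the data types are redeclared locally so the block stands alone, and the names `toRichSketch`, `fromRichSketch`, and `toWellFormedSketch` are hypothetical. The error string is borrowed from the session output shown in this README.

```haskell
-- Local copies of the three representations, so that this sketch does
-- not depend on the library itself.
data SExpr atom
  = SCons (SExpr atom) (SExpr atom)
  | SNil
  | SAtom atom
  deriving (Eq, Show)

data RichSExpr atom
  = RSList [RichSExpr atom]
  | RSDotted [RichSExpr atom] atom
  | RSAtom atom
  deriving (Eq, Show)

data WellFormedSExpr atom
  = WFSList [WellFormedSExpr atom]
  | WFSAtom atom
  deriving (Eq, Show)

-- Hypothetical helper: collapse a chain of conses into a list-based value.
toRichSketch :: SExpr atom -> RichSExpr atom
toRichSketch (SAtom a) = RSAtom a
toRichSketch SNil      = RSList []
toRichSketch expr      = go [] expr
  where
    go acc (SCons x rest) = go (toRichSketch x : acc) rest
    go acc SNil           = RSList (reverse acc)     -- chain ended in ()
    go acc (SAtom a)      = RSDotted (reverse acc) a -- chain ended in an atom

-- Hypothetical helper: expand a list-based value back into conses.
fromRichSketch :: RichSExpr atom -> SExpr atom
fromRichSketch (RSAtom a)      = SAtom a
fromRichSketch (RSList xs)     = foldr (SCons . fromRichSketch) SNil xs
fromRichSketch (RSDotted xs a) = foldr (SCons . fromRichSketch) (SAtom a) xs

-- Hypothetical helper: succeeds only when no cons chain ends in an atom.
toWellFormedSketch :: SExpr atom -> Either String (WellFormedSExpr atom)
toWellFormedSketch (SAtom a) = Right (WFSAtom a)
toWellFormedSketch SNil      = Right (WFSList [])
toWellFormedSketch expr      = WFSList <$> go expr
  where
    go (SCons x rest) = (:) <$> toWellFormedSketch x <*> go rest
    go SNil           = Right []
    go (SAtom _)      = Left "Found atom in cdr position"
```

Under these definitions, the cons chain for `(a b . c)` becomes an `RSDotted` value, while the well-formed conversion rejects it, since that representation has no way to express a dotted tail.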
67
68 Functions for converting back and forth between
69 representations are provided, but you can also modify a
70 `SExprSpec` to parse to or serialize from a particular
71 representation using the `asRich` and `asWellFormed`
72 functions.
73
74 ~~~~
75 *Data.SCargot.General> decode spec "(a b c)"
76 Right [SCons (SAtom "a") (SCons (SAtom "b") (SCons (SAtom "c") SNil))]
77 *Data.SCargot.General> decode (asRich spec) "(a b c)"
78 Right [RSList [RSAtom "a",RSAtom "b",RSAtom "c"]]
79 *Data.SCargot.General> decode (asWellFormed spec) "(a b c)"
80 Right [WFSList [WFSAtom "a",WFSAtom "b",WFSAtom "c"]]
81 *Data.SCargot.General> decode spec "(a . b)"
82 Right [SCons (SAtom "a") (SAtom "b")]
83 *Data.SCargot.General> decode (asRich spec) "(a . b)"
84 Right [RSDotted [RSAtom "a"] "b"]
85 *Data.SCargot.General> decode (asWellFormed spec) "(a . b)"
86 Left "Found atom in cdr position"
87 ~~~~
88
89 # Comments
90
91 By default, an S-expression spec does not include a comment syntax, but
92 the provided `withSemicolonComments` function will cause it to understand
93 traditional Lisp line-oriented comments that begin with a semicolon:
94
95 ~~~~
96 *Data.SCargot.General> decode spec "(this ; has a comment\n inside)\n"
97 Left "Failed reading: takeWhile1"
98 *Data.SCargot.General> decode (withSemicolonComments spec) "(this ; has a comment\n inside)\n"
99 Right [SCons (SAtom "this") (SCons (SAtom "inside") SNil)]
100 ~~~~
101
102 Additionally, you can provide your own comment syntax in the form of an
103 AttoParsec parser. Any AttoParsec parser can be used, so long as it meets
104 the following criteria:
105 - it is capable of failing (as it is called repeatedly until SCargot believes that there
106 are no more comments)
107 - it does not consume any input in the case of failure, which may involve
108 wrapping the parser in a call to `try`
109
110 For example, the following adds C++-style comments to an S-expression format:
111
112 ~~~~
113 *Data.SCargot.General> let cppComment = string "//" >> takeWhile (/= '\n') >> return ()
114 *Data.SCargot.General> decode (setComment cppComment spec) "(a //comment\n b)\n"
115 Right [SCons (SAtom "a") (SCons (SAtom "b") SNil)]
116 ~~~~
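
The two criteria for comment parsers listed above can be illustrated with a toy parser type. This is only a sketch under stated assumptions: it uses a hand-rolled `Parser` over `String` rather than AttoParsec, and `lineComment` and `skipJunk` are hypothetical names that do not come from the library.

```haskell
import Data.Char (isSpace)

-- A toy parser: consumes a prefix of the input, or fails with Nothing.
-- Failure returns no leftover input, so the caller resumes from the
-- input it already had; in this model failure can never consume input.
newtype Parser a = Parser { runParser :: String -> Maybe (a, String) }

-- Hypothetical comment parser for "//" line comments. It fails on any
-- input that does not start with "//", which satisfies the first criterion.
lineComment :: Parser ()
lineComment = Parser $ \input -> case input of
  '/' : '/' : rest -> Just ((), dropWhile (/= '\n') rest)
  _                -> Nothing

-- Skip whitespace, then try the comment parser; repeat until it fails.
-- A comment parser that could succeed without consuming anything
-- would make this loop run forever.
skipJunk :: Parser () -> String -> String
skipJunk comment input =
  let trimmed = dropWhile isSpace input
  in case runParser comment trimmed of
       Just ((), rest) -> skipJunk comment rest
       Nothing         -> trimmed
```

With a real AttoParsec parser the second criterion is not automatic as it is here, which is why the text above suggests wrapping the parser in `try` where needed.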
117
118 # Reader Macros
119
120 A _reader macro_ is a Lisp macro which is invoked during read time. This
121 allows the _lexical_ syntax of a Lisp to be modified. The most commonly
122 seen reader macro is the quote, which allows the syntax `'expr` to stand
123 in for the S-expression `(quote expr)`. The S-Cargot library enables this
124 by keeping a map of characters to AttoParsec parsers that can be used as
125 readers. There is a special case for the aforementioned quote, but that
126 could easily be written by hand as
127
128 ~~~~
129 *Data.SCargot.General> let mySpec = addReader '\'' (fmap go) spec
130 where go c = SCons (SAtom "quote") (SCons c SNil)
131 *Data.SCargot.General> decode (asRich mySpec) "(1 2 '(3 4))"
132 Right [RSList [RSAtom "1",RSAtom "2",RSList [RSAtom "quote",RSList [RSAtom "3",RSAtom "4"]]]]
133 ~~~~
134
135 A reader macro is passed the parser that invoked it, so that it can
136 perform recursive calls, and can return any `SExpr` it would like. It
137 may also take as much or as little of the remaining parse stream as it
138 would like; for example, the following reader macro does not bother
139 parsing anything else and merely returns a new token:
140
141 ~~~~
142 *Data.SCargot.General> decode (addReader '?' (const (pure (SAtom "huh"))) mySpec) "(?1 2)"
143 Right [SCons (SAtom "huh") (SCons (SAtom "1") (SCons (SAtom "2") SNil))]
144 ~~~~
145
146 Reader macros in S-Cargot can be used to define common bits of Lisp
147 syntax that are not typically considered the purview of S-expression
148 parsers. For example, to allow square brackets as a substitute for
149 proper lists, we could define a reader macro that is triggered by the
150 `[` character and repeatedly calls the parser until a `]` character
151 is reached:
152
153 ~~~~
154 *Data.SCargot.General> let pVec p = (char ']' *> pure SNil) <|> (SCons <$> p <*> pVec p)
155 *Data.SCargot.General> let vec = addReader '[' pVec
156 *Data.SCargot.General> decode (asRich (vec mySpec)) "(1 [2 3])"
157 Right [RSList [RSAtom "1",RSList [RSAtom "2",RSAtom "3"]]]
158 ~~~~
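
To make the mechanism behind the examples above concrete, here is a toy reader written against plain `String` input. It is a sketch under stated assumptions, not the library's implementation: the reader table is an association list rather than a map, atoms are restricted to alphanumeric characters, and every name (`parseWith`, `listUntil`, `quoteReader`, `huhReader`, `vecReader`, `toyReaders`) is hypothetical.

```haskell
import Data.Char (isAlphaNum, isSpace)

data SExpr = SCons SExpr SExpr | SNil | SAtom String
  deriving (Eq, Show)

-- A toy parser: returns the parsed value and the remaining input.
type Parser a = String -> Maybe (a, String)

-- A reader receives the parser that invoked it, so it can recurse.
type Reader = Parser SExpr -> Parser SExpr

-- Parse one expression, consulting the reader table before falling
-- back to atoms.
parseWith :: [(Char, Reader)] -> Parser SExpr
parseWith readers input =
  case dropWhile isSpace input of
    []         -> Nothing
    '(' : rest -> listUntil ')' self rest
    c : rest
      | Just reader <- lookup c readers -> reader self rest
      | otherwise ->
          case span isAlphaNum (c : rest) of
            ([], _)       -> Nothing
            (name, rest') -> Just (SAtom name, rest')
  where
    self = parseWith readers

-- Collect expressions into a cons chain until the closing character.
listUntil :: Char -> Parser SExpr -> Parser SExpr
listUntil close p input =
  case dropWhile isSpace input of
    c : rest | c == close -> Just (SNil, rest)
    trimmed -> do
      (x, rest)   <- p trimmed
      (xs, rest') <- listUntil close p rest
      Just (SCons x xs, rest')

-- 'expr becomes (quote expr): one recursive call to the invoking parser.
quoteReader :: Reader
quoteReader p input = do
  (x, rest) <- p input
  Just (SCons (SAtom "quote") (SCons x SNil), rest)

-- Consumes nothing further and returns a fixed atom.
huhReader :: Reader
huhReader _ input = Just (SAtom "huh", input)

-- [a b c] becomes a list: call the invoking parser until ']' is seen.
vecReader :: Reader
vecReader = listUntil ']'

toyReaders :: [(Char, Reader)]
toyReaders = [('\'', quoteReader), ('?', huhReader), ('[', vecReader)]
```

In this sketch the character that triggers a reader is consumed before the reader runs, so `vecReader` only has to look for the closing bracket, much like `pVec` above.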