gdritter repos ndbl / 2a6233b
Added note about regexes Getty Ritter 9 years ago
1 changed file(s) with 26 addition(s) and 0 deletion(s).
but in the event that such structures arise, it would be better to switch from
NDBL to a proper data storage format.

A Regular Language?
-------------------

I asserted above that NDBL is a regular language. This is true in the sense
that valid NDBL documents can be _recognized_ by a regular expression. However,
they cannot be _parsed_ by the kind of regular expressions that you have in
most programming environments: this is because there is no well-defined way
of capturing every match of a group underneath a Kleene star, and most engines
keep only the last repetition. We can construct a regular expression that
matches a group:

~~~~
([^# \t\r\n=][^ \t\r\n=]*)=("(?:[^"\\]|\\\\|\\")*"|[^ \t\r\n=]*)
~~~~

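As a rough illustration, the group regex can be tried against single pairs with Python's standard `re` module. This is only a sketch: the helper name `match_pair` is hypothetical, and the quoted-value branch is written as an alternation over `\\` and `\"`, which is an assumption about the intended escape rules rather than something NDBL is confirmed to specify.

```python
import re

# Key, '=', then either a quoted or a bare value.
# Assumption: inside quotes, only \\ and \" are escape sequences.
GROUP = r'([^# \t\r\n=][^ \t\r\n=]*)=("(?:[^"\\]|\\\\|\\")*"|[^ \t\r\n=]*)'


def match_pair(text):
    """Return (key, value) if text is exactly one key=value pair, else None."""
    m = re.fullmatch(GROUP, text)
    return (m.group(1), m.group(2)) if m else None


print(match_pair('host=example.org'))     # ('host', 'example.org')
print(match_pair('name="Getty Ritter"'))  # ('name', '"Getty Ritter"')
print(match_pair('# a comment'))          # None
```

Note that the quoted value comes back with its surrounding quotes and escapes intact; unescaping would be a separate step.
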
We could also write a regular expression that matches entire NDBL documents,
using here `{group}` as shorthand for the regex above:

~~~~
({group}((#[^\r\n]*[\r\n]*[ \t]*|[\r\n]*[ \t])*{group})*[\r\n]*)*
~~~~

But without
[structural regular expressions](http://9p.io/sources/contrib/steve/other-docs/struct-regex.pdf)
we cannot use this regex to actually pick apart the structure described
by an NDBL document.

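The limitation can be seen in a small Python sketch. It uses a simplified stand-in for `{group}` (bare `key=value` pairs only, separated by spaces or tabs), so `PAIR`, `LINE`, `last_captures`, and `all_pairs` are hypothetical names for illustration and not part of NDBL or its API. When a capture group sits under a star, Python's `re` reports only the final repetition:

```python
import re

# Simplified stand-in for {group}: bare key=value pairs only (assumption).
PAIR = r'([^# \t\r\n=]+)=([^ \t\r\n=]*)'
# One pair followed by any number of whitespace-separated pairs.
LINE = rf'{PAIR}(?:[ \t]+{PAIR})*'


def last_captures(text):
    """Match a whole line; the starred groups keep only their last repetition."""
    m = re.fullmatch(LINE, text)
    return m.groups() if m else None


def all_pairs(text):
    """Workaround: scan for one pair at a time rather than using one big regex."""
    return [(m.group(1), m.group(2)) for m in re.finditer(PAIR, text)]


print(last_captures('a=1 b=2 c=3'))  # ('a', '1', 'c', '3'), b=2 is lost
print(all_pairs('a=1 b=2 c=3'))      # [('a', '1'), ('b', '2'), ('c', '3')]
```

Scanning pair by pair recovers every pair, but it does not validate the document and it discards which pairs belonged to the same group, so the structure still has to be rebuilt by ordinary code around the regex.
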
Haskell API
-----------
