# Xleb

The `xleb` library defines a simple monadic language for parsing XML structures. It does not parse the XML itself; it relies on the [`xml`](http://hackage.haskell.org/package/xml) library for the underlying types and parser. What it adds is a simple monad with helper functions that make defining XML-based structures quick and straightforward, of roughly equal complexity to defining a `FromJSON` instance for [`aeson`](http://hackage.haskell.org/package/aeson).

## Basic Usage

The `Xleb` monad describes both parsing _and_ traversing a given XML structure: several of the functions to produce `Xleb` computations take other `Xleb` computations, which are run on various sub-parts of the XML tree. Consequently, instead of decomposing an XML structure and passing it around to various functions, the `Xleb` language treats "the current location in the tree" as an implicit piece of data in the `Xleb` monad.

You will generally want to identify your root node with the `elem` function to ensure that it has the tag you expect. Children of that node can be accessed with the `child` or `children` functions, which apply a `Xleb` computation either to one unambiguously identified child element or to every child element that matches a given selector.

~~~~.haskell
a <- X.child (X.byTag "a") parseA
b <- X.children (X.byTag "b") parseB
~~~~
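
To make the idea of an implicit current location more concrete, here is a toy model written against `base` only. It is a sketch for illustration and is not how `xleb` is implemented: the names `Node`, `Walk`, `childOf`, and `textHere` are all hypothetical. It shows how a sub-computation can be run on a child node without that node ever being passed around by hand.

~~~~.haskell
-- Toy model only: this is NOT xleb's implementation.
-- Node, Walk, childOf and textHere are hypothetical names.
data Node = Node
  { nodeTag  :: String
  , nodeKids :: [Node]
  , nodeText :: String
  }

-- A computation that reads the current node implicitly and may fail.
newtype Walk a = Walk { runWalk :: Node -> Either String a }

instance Functor Walk where
  fmap f (Walk g) = Walk (fmap f . g)

instance Applicative Walk where
  pure x = Walk (const (Right x))
  Walk f <*> Walk g = Walk (\n -> f n <*> g n)

instance Monad Walk where
  Walk g >>= k = Walk (\n -> g n >>= \a -> runWalk (k a) n)

-- Run a sub-computation with the current node moved to the unique
-- child that carries the given tag.
childOf :: String -> Walk a -> Walk a
childOf t (Walk g) = Walk $ \n ->
  case filter ((== t) . nodeTag) (nodeKids n) of
    [c] -> g c
    cs  -> Left ("expected one <" ++ t ++ ">, found " ++ show (length cs))

-- The text of whichever node is current.
textHere :: Walk String
textHere = Walk (Right . nodeText)

-- No node is mentioned explicitly in this computation.
pair :: Walk (String, String)
pair = do
  a <- childOf "a" textHere
  b <- childOf "b" textHere
  return (a, b)
~~~~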

Leaf data tends to come in two forms in XML: attribute values (like `<tag attr="value">`) or tag content (like `<tag>value</tag>`). In both cases, the `Xleb` functions allow you to parse that content however you'd like by providing an arbitrary function of type `String -> Either String a`. The `xleb` library provides several built-in functions of this type for common situations.

~~~~.haskell
c <- X.attr "index" X.number
d <- X.contents X.string
~~~~
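
Because any function of type `String -> Either String a` will do, writing your own parser for leaf data needs nothing beyond `base`. The following is a minimal sketch; the name `boolean` and the set of accepted spellings are assumptions made for illustration and are not part of `xleb`.

~~~~.haskell
import Data.Char (toLower)

-- Hypothetical parser for boolean-valued attributes or contents.
-- It has the shape described above: String -> Either String a.
boolean :: String -> Either String Bool
boolean s = case map toLower s of
  "true"  -> Right True
  "1"     -> Right True
  "false" -> Right False
  "0"     -> Right False
  _       -> Left ("not a boolean: " ++ show s)
~~~~

A function like this should be usable wherever the built-in parsers are, for example `X.attr "published" boolean`, where the `published` attribute is made up for this example.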

Finally, the `Xleb` monad has an `Alternative` instance, which allows for concise expression of optional values or multiple possibilities.

~~~~.haskell
e <- X.children X.any (parseA <|> parseB)
f <- optional (X.attr "total" X.number)
~~~~
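
If the `Alternative` combinators are unfamiliar, their behaviour can be tried out with `Maybe` standing in for `Xleb`. This is only a sketch of what `<|>` and `optional` do in general; the `Value` type and both function names are hypothetical, and nothing here touches `xleb` itself.

~~~~.haskell
import Control.Applicative (optional, (<|>))
import Text.Read (readMaybe)

-- Hypothetical sum type with two possible shapes.
data Value = IntV Int | BoolV Bool
  deriving (Eq, Show)

-- Try the first parser; fall back to the second only if the first fails.
parseValue :: String -> Maybe Value
parseValue s = (IntV <$> readMaybe s) <|> (BoolV <$> readMaybe s)

-- optional never fails: a failing parser becomes a successful Nothing.
parseTotal :: String -> Maybe (Maybe Int)
parseTotal s = optional (readMaybe s)
~~~~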

## Simple Example

Say we want to parse a simple XML feed format that looks like the following, with the extra caveat that we'd like the `author` field to be optional:

~~~~.xml
<feed>
  <title>Feed Name</title>
  <author>Pierre Menard</author>
  <entry title="Entry 01">First Post</entry>
  <entry title="Entry 02">Second Post Post</entry>
</feed>
~~~~

We can write a `Xleb` computation which is capable of parsing this structure in a handful of lines, here written in a slightly unusual way in order to show off some features of the library:

~~~~.haskell
import Control.Applicative (optional)
import qualified Text.XML.Xleb as X

feed :: X.Xleb (String, Maybe String, [(String, String)])
feed = X.elem "feed" $ do
  feedTitle   <- X.child (X.byTag "title") $
                   X.contents X.string
  feedAuthor  <- optional $ X.child (X.byTag "author") $
                   X.contents X.string
  feedEntries <- X.children (X.byTag "entry") entry
  return (feedTitle, feedAuthor, feedEntries)

entry :: X.Xleb (String, String)
entry = (,) <$> X.attr "title" X.string <*> X.contents X.string
~~~~
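
The same parser can also be written in applicative style against named record types instead of tuples. The sketch below uses only the `xleb` functions that already appear in this README; the `Feed` and `Entry` types and the `textOf` helper are hypothetical names introduced for this example, and the code assumes the `xleb` package is installed.

~~~~.haskell
import Control.Applicative (optional)
import qualified Text.XML.Xleb as X

-- Hypothetical record types; these names are not part of xleb.
data Entry = Entry
  { entryTitle :: String
  , entryBody  :: String
  } deriving (Show)

data Feed = Feed
  { feedTitle   :: String
  , feedAuthor  :: Maybe String
  , feedEntries :: [Entry]
  } deriving (Show)

-- Hypothetical helper: the text contents of the single child element
-- with the given tag.
textOf :: String -> X.Xleb String
textOf tag = X.child (X.byTag tag) (X.contents X.string)

feed :: X.Xleb Feed
feed = X.elem "feed" $
  Feed <$> textOf "title"
       <*> optional (textOf "author")
       <*> X.children (X.byTag "entry") entry

entry :: X.Xleb Entry
entry = Entry <$> X.attr "title" X.string <*> X.contents X.string
~~~~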

For a larger example, look at the [Atom-parsing example](examples/atom/Main.hs), which is both more idiomatic and more complete.