Getty Ritter
7 years ago
# Adnot

The *Adnot* format is a simple data and configuration format intended
to have a slightly richer data model than JSON or s-expressions while
retaining the comparative simplicity of those formats. Unlike JSON,
Adnot avoids redundant structural punctuation such as colons and
commas; unlike s-expressions, Adnot values natively express a wider
range of basic data types.

*Adnot* is not intended to be a data interchange format, but rather to
be a richer and more convenient syntax for certain kinds of data
description that might otherwise be done in more unwieldy formats like
YAML. As a first approximation, Adnot may be treated as a more human-
and version-control-friendly version of JSON whose data model is
intended to resemble the data model of statically typed functional
programming languages.

A given Adnot value is either one of four basic types (an integer, a
double, a string, or a symbol) or one of three composite types: a
mapping of symbols to values, a tagged sequence of values that begins
with a symbol, or a plain sequence of values:

```
expr ::= "{" (symbol expr)* "}"
       | "(" symbol expr* ")"
       | "[" expr* "]"
       | string
       | symbol
       | integer
       | double
```
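As a rough illustration of this data model, the seven alternatives in
the grammar could be mirrored by the following Python types. This is
only a sketch: the names `Symbol`, `Tagged`, and `Value` are
hypothetical and do not come from any Adnot implementation.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Symbol:
    """A bare symbol such as `x` or `some_tag` (hypothetical type)."""
    name: str


@dataclass(frozen=True)
class Tagged:
    """A tagged sequence: a symbol followed by zero or more values."""
    tag: Symbol
    args: tuple


# One Python type per grammar alternative: integer, double, string,
# symbol, list, map, and tagged expression.
Value = Union[int, float, str, Symbol, list, dict, Tagged]

# {x 2 y 3 z 4}
point = {Symbol("x"): 2, Symbol("y"): 3, Symbol("z"): 4}

# [ 2 "foo" bar ]
items = [2, "foo", Symbol("bar")]

# (some_tag blah 7.8 "??")
tagged = Tagged(Symbol("some_tag"), (Symbol("blah"), 7.8, "??"))
```

Keeping `Symbol` separate from `str` preserves the distinction between
the symbol `bar` and the string `"bar"`.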

Strings are understood in the same way as JSON strings, with the same
encoding and the same set of escapes. Symbols are unquoted strings
that start with a Unicode character that has the `XID_Start` property
and continue with characters that have the `XID_Continue` property,
and thus should resemble the identifier syntax of a large number of
C-like languages.
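
One way to approximate the symbol rule is with Python's
`str.isidentifier()`, which is based on the same `XID_Start` and
`XID_Continue` properties. This is a sketch, not Adnot's own check: the
helper name `is_symbol` is hypothetical, and rejecting a leading
underscore is an assumption, since the text above does not say whether
`_x` counts as a symbol.

```python
def is_symbol(text: str) -> bool:
    """Return True if text looks like an Adnot symbol under the XID rules."""
    # str.isidentifier() additionally accepts a leading underscore, which
    # is XID_Continue but not XID_Start, so that case is rejected here
    # (assumption: the format description does not cover it).
    return text.isidentifier() and not text.startswith("_")


candidates = ["some_tag", "x", "2x", "_x", "foo-bar", ""]
symbols = [c for c in candidates if is_symbol(c)]
```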

The three kinds of composite types are meant to resemble familiar
constructs: maps resemble records, tagged expressions resemble sum or
variant types, and lists resemble lists. Zero or more
symbol-expression pairs inside curly brackets form a _map_:

```
# a basic map
{
  x 2
  y 3
  z 4
}
```

Pairs do not include colons and are not separated by commas. A map
_must_ contain an even number of subexpressions, and every
odd-numbered subexpression (the first, third, and so on) _must_ be a
symbol. (This restriction might be lifted in the future.) Whitespace
is ignored except as a separator between tokens, so the above map is
equivalent to:

```
{x 2 y 3 z 4}
```
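
The whitespace rule can be seen by splitting both spellings into
tokens. The following Python sketch uses a hypothetical `tokenize`
helper; treating `#` as the start of a comment that runs to the end of
the line is an assumption based on the examples in this README, not a
documented rule.

```python
import re

TOKEN = re.compile(r'''
    \#[^\n]*              # comment to end of line (assumed syntax)
  | [{}()\[\]]            # structural brackets
  | "(?:[^"\\]|\\.)*"     # JSON-style string with backslash escapes
  | [^\s{}()\[\]"#]+      # bare atom: symbol, integer, or double
''', re.VERBOSE)


def tokenize(source):
    """Split Adnot source into tokens, dropping whitespace and comments."""
    return [t for t in TOKEN.findall(source) if not t.startswith("#")]


multiline = """
# a basic map
{
  x 2
  y 3
  z 4
}
"""
compact = "{x 2 y 3 z 4}"

# Both layouts produce the same token stream.
same = tokenize(multiline) == tokenize(compact)
```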

A _list_ is represented by square brackets with zero or more
possibly-heterogeneous expressions:

```
# a basic list
[ 2 "foo" bar ]
```

A _tagged expression_ is represented by parentheses with a single
symbol followed by zero or more possibly-heterogeneous expressions:

```
# a basic tagged expression
(some_tag blah 7.8 "??")
```

This is how tagged data types are traditionally represented: because
the first element inside the parentheses _must_ be a symbol, it can
correspond to a constructor of a data type in an ML-like language.
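
Putting the pieces together, the grammar and the rules for maps, lists,
and tagged expressions can be sketched as a small recursive parser in
Python. Everything here is illustrative rather than a reference
implementation: the names `Symbol`, `parse`, `parse_expr`, and
`parse_atom` are hypothetical, `#` comments are assumed from the
examples, numeric literals are assumed to follow Python's `int()` and
`float()` syntax, and error handling is minimal.

```python
import json
import re

# Comments, brackets, JSON-style strings, and bare atoms, in that order.
TOKEN = re.compile(r'#[^\n]*|[{}()\[\]]|"(?:[^"\\]|\\.)*"|[^\s{}()\[\]"#]+')
CLOSERS = {"{": "}", "(": ")", "[": "]"}


class Symbol(str):
    """A bare symbol, kept distinct from a quoted string (hypothetical type)."""


def parse_atom(token):
    if token.startswith('"'):
        return json.loads(token)  # strings reuse JSON's escapes
    if token.isidentifier():
        return Symbol(token)
    # Assumption: numeric literals follow Python's int()/float() syntax.
    try:
        return int(token)
    except ValueError:
        return float(token)  # raises ValueError for anything else


def parse_expr(tokens, pos):
    """Parse one expression at tokens[pos]; return (value, next position)."""
    if pos >= len(tokens):
        raise ValueError("unexpected end of input")
    opener = tokens[pos]
    if opener in CLOSERS.values():
        raise ValueError(f"unexpected {opener!r}")
    if opener not in CLOSERS:
        return parse_atom(opener), pos + 1
    items, pos = [], pos + 1
    while pos < len(tokens) and tokens[pos] != CLOSERS[opener]:
        item, pos = parse_expr(tokens, pos)
        items.append(item)
    if pos >= len(tokens):
        raise ValueError(f"missing {CLOSERS[opener]!r}")
    pos += 1  # consume the closing bracket
    if opener == "[":
        return items, pos
    if opener == "(":
        if not items or not isinstance(items[0], Symbol):
            raise ValueError("a tagged expression must start with a symbol")
        return (items[0], items[1:]), pos
    # Otherwise this is a map: alternating symbol keys and values.
    keys, values = items[0::2], items[1::2]
    if len(items) % 2 or not all(isinstance(k, Symbol) for k in keys):
        raise ValueError("a map must consist of symbol/value pairs")
    return dict(zip(keys, values)), pos


def parse(source):
    tokens = [t for t in TOKEN.findall(source) if not t.startswith("#")]
    value, pos = parse_expr(tokens, 0)
    if pos != len(tokens):
        raise ValueError("trailing input after the expression")
    return value


tag, args = parse('(some_tag blah 7.8 "??")')
```

In this sketch a tagged expression becomes a `(tag, args)` tuple, a map
becomes a `dict` keyed by symbols, and a list becomes a Python `list`.
A consumer could dispatch on `tag` the way an ML-like language
pattern-matches on a constructor.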