More docs
Getty Ritter
9 years ago
5 | 5 | |
6 | 6 | There are two main tasks for a feed reader: _fetching_ and _viewing_. |
7 | 7 | These two tasks, in the `lektor` system, are split apart into different |
8 | components, mediated by a `lektordir` system. There are two main parts | |
9 | to the `lektor` architecture: the `lektor-dir` format and the | |
10 | `lektor-entry` format. | |
8 | components, mediated by a `lektor-dir` system. A `lektor-dir` contains | |
9 | two kinds of information: information about feeds (sources of new | |
10 | entries) and information about entries themselves. | |
11 | ||
12 | # At A Glance | |
13 | ||
14 | A given user has their own `lektor-dir`. A `lektor-dir` contains both | |
15 | "feeds" and "entries". Two kinds of programs operate on `lektor-dir`s | |
16 | in two different capacities: a _fetcher_ produces entries for one or | |
17 | more feeds, and a _viewer_ manages entries once produced and shows | |
18 | them to some user. A given `lektor-dir` can have multiple fetchers | |
19 | and multiple viewers operating on it. | |
20 | ||
21 | The rationale for these decisions is this: | |
22 | ||
23 | - Separating fetchers from viewers means that a user can easily | |
24 | mix-and-match different front-ends and back-ends. | |
25 | - Allowing multiple fetchers allows different entry sources to be | |
26 | handled independently, ideally allowing those programs to be | |
27 | simpler. | |
28 | - Allowing multiple viewers means that a user can track multiple | |
29 | feeds but view the information from those feeds in ways which | |
30 | are more or less appropriate. | |
31 | - Keeping this information split apart in the file system, rather | |
32 | than in a database or text file, both improves the ability to | |
33 | operate concurrently on different parts of a `lektor-dir` and | |
34 | lifts the burden of parsing information from the implementer. | |
35 | The file system is generally used here as a kind of hierarchical | |
36 | key-value store. | |
37 | - The overall design is lifted straight from the `maildir` format, | |
38 | which is a time-tested and well-understood format for email. This | |
39 | modifies it slightly and adds a richer structure for RSS-like | |
40 | applications. | |
11 | 41 | |
12 | 42 | ## `lektor-feed` |
13 | 43 | |
17 | 47 | inside a `lektor-dir`. Information about a given feed is stored inside |
18 | 48 | `src/$hash`, where `$hash` is the SHA-1 hash of the `feed`'s `id`. |
19 | 49 | |
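As a rough sketch of how the `src/$hash` name could be derived: the helper name `feed_hash` below is hypothetical, and the sketch assumes the `id` is hashed without a trailing newline, as the fetcher example in this document does.

```bash
# Hypothetical helper: print the SHA-1 hex digest of a feed id.
feed_hash() {
    # printf '%s' avoids the trailing newline that echo would add;
    # sha1sum prints "<digest>  -", so keep only the first field.
    printf '%s' "$1" | sha1sum | awk '{ print $1 }'
}

# The directory for this feed would then be src/<digest>.
feed_hash 'http://example.com/rss.xml'
```

Hashing the `id` with a trailing newline would give a different digest, so fetchers sharing a `lektor-dir` need to agree on this detail.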
20 | Obligatory elements | |
50 | Obligatory elements for a `feed` include: | |
21 | 51 | |
22 | 52 | - `id`: The URI which identifies the feed. In the case of |
23 | 53 | RSS/Atom/ActivityStream feeds, this will generally be the URL at |
27 | 57 | - `name`: The human-readable name of the feed. This is |
28 | 58 | produced by the fetcher and should not be changed by a viewer. |
29 | 59 | |
30 | Optional elements include: | |
31 | ||
32 | - `description`: A human-readable description describing | |
33 | the feed. | |
60 | Optional elements for a `feed` include: | |
61 | ||
62 | - `description`: A human-readable description of the feed. | |
34 | 63 | - `language`: The language the feed is written in. |
35 | - `image`: An image that can be optionally displayed with | |
36 | the channel. | |
64 | - `image`: An image that can be optionally displayed with the channel. | |
37 | 65 | - `copyright`: The copyright notice for the feed. |
38 | 66 | - `author`: Authorship information for the feed. |
39 | 67 | |
41 | 69 | |
42 | 70 | A minimal feed might look like |
43 | 71 | |
44 | ```.bash | |
45 | cd $LEKTORDIR | |
46 | HASH=$(printf 'http://example.com/rss.xml' | sha1sum) | |
47 | mkdir -p $HASH | |
48 | ||
49 | echo http://example.com/rss.xml >$HASH/id | |
50 | echo Example Feed >$HASH/name | |
51 | ``` | |
72 | ~~~{.bash} | |
73 | # $HASH is sha1sum('http://example.com/rss.xml') | |
74 | HASH=80af8e84e5ef7ae6b68acb8d1987e58e3e5731dd | |
75 | cd $HASH | |
76 | ||
77 | echo 'http://example.com/rss.xml' >id | |
78 | echo 'Example Feed' >name | |
79 | ~~~ | |
52 | 80 | |
53 | 81 | A feed with more entries might look like |
54 | 82 | |
55 | ```.bash | |
56 | cd $LEKTORDIR | |
57 | HASH=$(printf 'http://example.com/rss.xml' | sha1sum) | |
58 | mkdir -p $HASH | |
59 | ||
60 | echo http://example.com/rss.xml >$HASH/id | |
61 | echo Example Feed >$HASH/name | |
62 | echo 'An example feed.' >$HASH/description | |
63 | echo en-us >$HASH/language | |
64 | echo http://example.com/image.png >$HASH/image | |
65 | echo Copyright 2015, Getty Ritter >$HASH/copyright | |
66 | echo 'Getty Ritter <gdritter@gmail.com>' >$HASH/author | |
67 | ``` | |
83 | ~~~{.bash} | |
84 | # $HASH is sha1sum('http://example.com/rss.xml') | |
85 | HASH=80af8e84e5ef7ae6b68acb8d1987e58e3e5731dd | |
86 | cd $HASH | |
87 | ||
88 | echo 'http://example.com/rss.xml' >id | |
89 | echo 'Example Feed' >name | |
90 | echo 'An example feed.' >description | |
91 | echo 'en-us' >language | |
92 | echo 'http://example.com/image.png' >image | |
93 | echo 'Copyright 2015, Getty Ritter' >copyright | |
94 | echo 'Getty Ritter <gdritter@gmail.com>' >author | |
95 | ~~~ | |
68 | 96 | |
69 | 97 | ## `lektor-entry` |
70 | 98 | |
71 | 99 | In contrast to `maildir`, entries in a `lektor-dir` are not files |
72 | 100 | but directories adhering to a particular structure. |
73 | 101 | |
74 | Obligatory elements | |
102 | Obligatory elements for an `entry` include: | |
75 | 103 | |
76 | 104 | - `title`: The title of the entry. |
77 | 105 | - `id`: The URI which identifies the entry. This will often be a |
78 | 106 | URL at which the resource corresponding to the entry is available, |
79 | 107 | but may also be an opaque identifier. |
80 | - `content`: | |
108 | - `content`: **TBD** | |
81 | 109 | - `feed`: A directory that contains all the information about the |
82 | source `feed`. This will generally be a symlink | |
83 | ||
84 | Optional elements include: | |
110 | source `feed`. This will generally be a soft link to the relevant | |
111 | `feed` directory, but programs should not assume that it is. | |
112 | ||
113 | Optional elements for an `entry` include: | |
85 | 114 | |
86 | 115 | - `author`: Names and email addresses of the authors of the entry. |
87 | 116 | - `pubdate`: When the entry was published. |
117 | - `type`: The MIME type of the content. If `type` is not present, | |
118 | the assumed content type is `text/plain`. | |
119 | ||
120 | ### Entry example | |
121 | ||
122 | A minimal entry might look like | |
123 | ||
124 | ~~~{.bash} | |
125 | # $FEED is sha1sum('http://example.com/rss.xml') | |
126 | FEED=80af8e84e5ef7ae6b68acb8d1987e58e3e5731dd | |
127 | echo 'Example Entry' >title | |
128 | echo 'http://example.com/example' >id | |
129 | echo 'A sample entry.' >content | |
130 | ln -s "$LEKTORDIR/src/$FEED" feed | |
131 | ~~~ | |
132 | ||
133 | A full entry might look like | |
134 | ||
135 | ~~~{.bash} | |
136 | # $FEED is sha1sum('http://example.com/rss.xml') | |
137 | FEED=80af8e84e5ef7ae6b68acb8d1987e58e3e5731dd | |
138 | echo 'Example Entry' >title | |
139 | echo 'http://example.com/example' >id | |
140 | echo 'A sample entry.' >content | |
141 | echo 'Getty Ritter <gettyritter@gmail.com>' >author | |
142 | echo '2015-06-23T13:06:22Z' >pubdate | |
143 | echo 'text/html' >type | |
144 | ln -s "$LEKTORDIR/src/$FEED" feed | |
145 | ~~~ | |
88 | 146 | |
89 | 147 | ## `lektor-dir` |
90 | 148 | |
91 | A `lektor | |
149 | A `lektor-dir` is a directory with at least four subdirectories: `tmp`, | |
92 | 150 | `new`, `cur`, and `src`. A _fetcher_ is responsible for examining a feed |
93 | and adding new entries the `lektordir` according to the following process: | |
94 | ||
95 | - The fetcher `chdir()`s to the `lektordir` directory. | |
151 | and adding new entries to the `lektor-dir` according to the following process: | |
152 | ||
153 | - The fetcher `chdir()`s to the `lektor-dir` directory. | |
96 | 154 | - The fetcher `stat()`s the name `tmp/$feed/$time.$pid.$host`, where |
97 | 155 | `$feed` is the hash of the feed's `id` value, `$time` |
98 | 156 | is the number of seconds since the beginning of 1970 GMT, `$pid` is the |
103 | 161 | - The fetcher creates the directory `tmp/$feed/$time.$pid.$host`. |
104 | 162 | - The fetcher writes the entry contents (according to the `lektor-entry` |
105 | 163 | format) to the directory. |
106 | - The fetcher | |
164 | - The fetcher moves the directory to `new/$feed/$time.$pid.$host`. At that | |
107 | 165 | instant, the entry has been successfully created. |
108 | 166 | |
109 | 167 | A _viewer_ is responsible for displaying new feed entries to a user |
110 | 168 | through some mechanism. A viewer looks through the `new` directory for |
111 | 169 | new entries. If there is a new entry, `new/$feed/$unique`, the viewer may: |
112 | 170 | |
113 | - Display the contents of `new/$feed/$unique` | |
114 | - Delete `new/$feed/$unique` | |
115 | - Rename `new/$feed/$unique`. | |
116 | ||
117 | A `lektordir` can contain arbitrary other directories, but for the sake | |
118 | of compatibility, these should attempt to adhere to the following | |
119 | schema: | |
120 | ||
121 | - If the extra directory contains configuration or other information | |
122 | for a given feed, it | |
171 | - Display the contents of `new/$feed/$unique`. | |
172 | - Delete `new/$feed/$unique`. | |
173 | - Rename `new/$feed/$unique` to `cur/$feed/$unique;$info`. | |
174 | ||
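The rename step might be sketched as follows. The helper name `mark_seen` is hypothetical, and because this section does not define the format of `$info`, the value is simply passed through as an opaque string.

```bash
# Hypothetical helper, not part of the format itself.
# usage: mark_seen LEKTORDIR FEED UNIQUE INFO
mark_seen() {
    # make sure the per-feed directory under cur/ exists
    mkdir -p "$1/cur/$2"
    # within one file system this mv is a single rename of the entry directory
    mv "$1/new/$2/$3" "$1/cur/$2/$3;$4"
}
```

A viewer could call `mark_seen "$LEKTORDIR" "$FEED" "$ENTRY" seen` after displaying an entry, where `seen` is only a placeholder for whatever `$info` convention the viewer adopts.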
175 | A `lektor-dir` can contain other information not specified here, but that | |
176 | information should attempt to adhere to these guidelines: | |
177 | ||
178 | - If the extra information pertains to a particular feed, it should appear | |
179 | in the directory `src/$feed/etc`. | |
180 | - If the extra information pertains to a fetcher, it should appear in the | |
181 | directory `etc/fetch`. | |
182 | - If the extra information pertains to a viewer, it should appear in the | |
183 | directory `etc/view`. | |
184 | ||
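Putting the layout together, an empty `lektor-dir` could be created with something like the sketch below. The helper name `init_lektordir` is hypothetical, and creating `etc/fetch` and `etc/view` up front is a convenience rather than a requirement; only `tmp`, `new`, `cur`, and `src` are required.

```bash
# Hypothetical helper: create the skeleton of a lektor-dir.
# usage: init_lektordir PATH
init_lektordir() {
    # the four required subdirectories
    mkdir -p "$1/tmp" "$1/new" "$1/cur" "$1/src"
    # suggested locations for fetcher- and viewer-specific extras
    mkdir -p "$1/etc/fetch" "$1/etc/view"
}
```

Per-feed extras under `src/$feed/etc` are left to whichever program creates the feed, since the feed hash is not known in advance.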
185 | ## Possibilities for `lektor` | |
186 | ||
187 | Lektor lends itself well to web syndication (e.g. RSS, Atom, | |
188 | ActivityStreams, &c) but could be used for any kind of stream of | |
189 | information. For example, a fetcher might serve as a mediated logging | |
190 | service for other information such as regular load information on a | |
191 | running web service, pushing updates into a shared `lektor-dir` on a | |
192 | regular basis. It would also be trivial to write custom fetchers for | |
193 | services that no longer expose RSS or other syndication formats, such | |
194 | as Twitter. | |
195 | ||
196 | Here is a trivial fetcher that provides a feed of timestamps every | |
197 | hour: | |
198 | ||
199 | ~~~{.bash} | |
200 | #!/bin/bash -e | |
201 | ||
202 | cd $LEKTORDIR | |
203 | ||
204 | # the feed information | |
205 | ID='tag:example.com:timekeeper' | |
206 | HASH=$(printf '%s' "$ID" | sha1sum | awk '{ print $1; }' ) | |
207 | ||
208 | # other metadata | |
209 | HOST=$(hostname) | |
210 | MAX=10 | |
211 | ||
212 | # create the feed | |
213 | mkdir -p src/$HASH | |
214 | echo $ID >src/$HASH/id | |
215 | echo Timekeeper >src/$HASH/name | |
216 | ||
217 | mkdir -p "tmp/$HASH" | |
218 | mkdir -p "new/$HASH" | |
219 | ||
220 | # create entries every hour | |
221 | while true; do | |
222 | TIME=$(date '+%s') | |
223 | ENTRY="$HASH/$TIME.$$.$HOST" | |
224 | ||
225 | # if the entry already exists in tmp, wait two seconds and try again | |
226 | RETRY=0 | |
227 | while [ -e "tmp/$ENTRY" ] | |
228 | do | |
229 | # if we've waited more than $MAX times, then | |
230 | # give up | |
231 | if [ $RETRY -gt $MAX ]; then | |
232 | exit 1 | |
233 | fi | |
234 | sleep 2 | |
235 | RETRY=$(expr $RETRY + 1); TIME=$(date '+%s'); ENTRY="$HASH/$TIME.$$.$HOST" | |
236 | done | |
237 | ||
238 | # create the entry | |
239 | mkdir -p tmp/$ENTRY | |
240 | ||
241 | # create entry values | |
242 | echo 'Current Time' >tmp/$ENTRY/title | |
243 | echo $TIME >tmp/$ENTRY/content | |
244 | echo "tag:example.com:timekeeper#$TIME" >tmp/$ENTRY/id | |
245 | ln -s $LEKTORDIR/src/$HASH tmp/$ENTRY/feed | |
246 | ||
247 | # move the entry to the new location | |
248 | mv tmp/$ENTRY new/$ENTRY | |
249 | ||
250 | # wait for an hour and do it again | |
251 | sleep 3600 | |
252 | done | |
253 | ~~~ | |
254 | ||
255 | Additionally, multiple viewers can act on the same `lektor-dir`. A | |
256 | given viewer need not show every piece of information: for example, | |
257 | a viewer may sniff the `type` attribute of entries and only display | |
258 | entries of a given type, or selectively choose which feeds to display, | |
259 | or even select entries at random to display. It also has full control | |
260 | over how to display those entries. | |
261 | ||
262 | Here is a trivial viewer that shows a small digest of each entry in | |
263 | `new` and then moves those entries to `cur`: | |
264 | ||
265 | ~~~{.bash} | |
266 | #!/bin/bash -e | |
267 | ||
268 | cd $LEKTORDIR | |
269 | ||
270 | for FEED in $(ls new) | |
271 | do | |
272 | mkdir -p cur/$FEED | |
273 | ||
274 | # print feed header | |
275 | echo "In feed $(cat src/$FEED/name):" | |
276 | echo | |
277 | ||
278 | for ENTRY in $(ls new/$FEED) | |
279 | do | |
280 | # print entry | |
281 | echo "$(cat new/$FEED/$ENTRY/title)" | |
282 | cat new/$FEED/$ENTRY/content | head -n 4 | |
283 | echo | |
284 | ||
285 | # move entry to `cur` | |
286 | mv new/$FEED/$ENTRY cur/$FEED/$ENTRY | |
287 | done | |
288 | done | |
289 | ~~~ |