More docs
Getty Ritter
10 years ago
| 5 | 5 | |
| 6 | 6 | There are two main tasks for a feed reader: _fetching_ and _viewing_. |
| 7 | 7 | These two tasks, in the `lektor` system, are split apart into different |
| 8 | components, mediated by a `lektordir` system. There are two main parts | |
| 9 | to the `lektor` architecture: the `lektor-dir` format and the | |
| 10 | `lektor-entry` format. | |
| 8 | components, mediated by a `lektor-dir` system. A `lektor-dir` contains | |
| 9 | two kinds of information: information about feeds (sources of new | |
| 10 | entries) and information about entries themselves. | |
| 11 | ||
| 12 | # At A Glance | |
| 13 | ||
| 14 | A given user has their own `lektor-dir`. A `lektor-dir` contains both | |
| 15 | "feeds" and "entries". Two kinds of programs operate on `lektor-dir`s | |
| 16 | in two different capacities: a _fetcher_ produces entries for one or | |
| 17 | more feeds, and a _viewer_ manages entries once produced and shows | |
| 18 | them to some user. A given `lektor-dir` can have multiple fetchers | |
| 19 | and multiple viewers operating on it. | |
| 20 | ||
| 21 | The rationale for these decisions is this: | |
| 22 | ||
| 23 | - Separating fetchers from viewers means that a user can easily | |
| 24 | mix-and-match different front-ends and back-ends. | |
| 25 | - Allowing multiple fetchers allows different entry sources to be | |
| 26 | handled independently, ideally allowing those programs to be | |
| 27 | simpler. | |
| 28 | - Allowing multiple viewers means that a user can track multiple | |
| 29 | feeds but view the information from those feeds in ways which | |
| 30 | are more or less appropriate. | |
| 31 | - Keeping this information split apart in the file system, rather | |
| 32 | than in a database or text file, both improves the ability to | |
| 33 | operate concurrently on different parts of a `lektor-dir` and | |
| 34 | lifts the burden of parsing information from the implementer. | |
| 35 | The file system is generally used here as a kind of hierarchical | |
| 36 | key-value store. | |
| 37 | - The overall design is lifted straight from the `maildir` format, | |
| 38 | which is a time-tested and well-understood format for email. This | |
| 39 | modifies it slightly and adds a richer structure for RSS-like | |
| 40 | applications. | |
| 11 | 41 | |
| 12 | 42 | ## `lektor-feed` |
| 13 | 43 | |
| 17 | 47 | inside a `lektor-dir`. Information about a given feed is stored inside |
| 18 | 48 | `src/$hash`, where `$hash` is the SHA-1 hash of the `feed`'s `id`. |
| 19 | 49 | |
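As a minimal sketch of the naming rule above, the helper below derives a feed's directory from its `id`. The function name `feed_dir` is hypothetical and not part of the format; the sketch also assumes the hash is taken over the `id` with no trailing newline, which is what the `printf`-based examples in this document do.

```bash
# feed_dir ID: print the path, relative to the lektor-dir, that holds
# the metadata for the feed identified by ID.
feed_dir() {
    # printf '%s' keeps a trailing newline out of the hashed input
    feed_hash=$(printf '%s' "$1" | sha1sum | awk '{ print $1 }')
    printf 'src/%s\n' "$feed_hash"
}
```

A fetcher could then run `mkdir -p "$(feed_dir 'http://example.com/rss.xml')"` before writing `id` and `name`.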
| 20 | Obligatory elements | |
| 50 | Obligatory elements for a `feed` include: | |
| 21 | 51 | |
| 22 | 52 | - `id`: The URI which identifies the feed. In the case of |
| 23 | 53 | RSS/Atom/ActivityStream feeds, this will generally be the URL at |
| 27 | 57 | - `name`: The human-readable name of the feed. This is |
| 28 | 58 | produced by the fetcher and should not be changed by a viewer. |
| 29 | 59 | |
| 30 | Optional elements include: | |
| 31 | ||
| 32 | - `description`: A human-readable description describing | |
| 33 | the feed. | |
| 60 | Optional elements for a `feed` include: | |
| 61 | ||
| 62 | - `description`: A human-readable description of the feed. | |
| 34 | 63 | - `language`: The language the feed is written in. |
| 35 | - `image`: An image that can be optionally displayed with | |
| 36 | the channel. | |
| 64 | - `image`: An image that can be optionally displayed with the channel. | |
| 37 | 65 | - `copyright`: The copyright notice for the feed. |
| 38 | 66 | - `author`: Authorship information for the feed. |
| 39 | 67 | |
| 41 | 69 | |
| 42 | 70 | A minimal feed might look like |
| 43 | 71 | |
| 44 | ```.bash | |
| 45 | cd $LEKTORDIR | |
| 46 | HASH=$(printf 'http://example.com/rss.xml' | sha1sum) | |
| 47 | mkdir -p $HASH | |
| 48 | ||
| 49 | echo http://example.com/rss.xml >$HASH/id | |
| 50 | echo Example Feed >$HASH/name | |
| 51 | ``` | |
| 72 | ~~~{.bash} | |
| 73 | # $HASH is sha1sum('http://example.com/rss.xml') | |
| 74 | HASH=80af8e84e5ef7ae6b68acb8d1987e58e3e5731dd | |
| 75 | cd $HASH | |
| 76 | ||
| 77 | echo 'http://example.com/rss.xml' >id | |
| 78 | echo 'Example Feed' >name | |
| 79 | ~~~ | |
| 52 | 80 | |
| 53 | 81 | A feed with more entries might look like |
| 54 | 82 | |
| 55 | ```.bash | |
| 56 | cd $LEKTORDIR | |
| 57 | HASH=$(printf 'http://example.com/rss.xml' | sha1sum) | |
| 58 | mkdir -p $HASH | |
| 59 | ||
| 60 | echo http://example.com/rss.xml >$HASH/id | |
| 61 | echo Example Feed >$HASH/name | |
| 62 | echo 'An example feed.' >$HASH/description | |
| 63 | echo en-us >$HASH/language | |
| 64 | echo http://example.com/image.png >$HASH/image | |
| 65 | echo Copyright 2015, Getty Ritter >$HASH/copyright | |
| 66 | echo 'Getty Ritter <gdritter@gmail.com>' >$HASH/author | |
| 67 | ``` | |
| 83 | ~~~{.bash} | |
| 84 | # $HASH is sha1sum('http://example.com/rss.xml') | |
| 85 | HASH=80af8e84e5ef7ae6b68acb8d1987e58e3e5731dd | |
| 86 | cd $HASH | |
| 87 | ||
| 88 | echo 'http://example.com/rss.xml' >id | |
| 89 | echo 'Example Feed' >name | |
| 90 | echo 'An example feed.' >description | |
| 91 | echo 'en-us' >language | |
| 92 | echo 'http://example.com/image.png' >image | |
| 93 | echo 'Copyright 2015, Getty Ritter' >copyright | |
| 94 | echo 'Getty Ritter <gdritter@gmail.com>' >author | |
| 95 | ~~~ | |
| 68 | 96 | |
| 69 | 97 | ## `lektor-entry` |
| 70 | 98 | |
| 71 | 99 | In contrast to `maildir`, entries in a `lektor-dir` are not files |
| 72 | 100 | but directories adhering to a particular structure. |
| 73 | 101 | |
| 74 | Obligatory elements | |
| 102 | Obligatory elements for an `entry` include: | |
| 75 | 103 | |
| 76 | 104 | - `title`: The title of the entry. |
| 77 | 105 | - `id`: The URI which identifies the entry. This will often be a |
| 78 | 106 | URL at which the resource corresponding to the entry is available, |
| 79 | 107 | but may also be an opaque identifier. |
| 80 | - `content`: | |
| 108 | - `content`: **TBD** | |
| 81 | 109 | - `feed`: A directory that contains all the information about the |
| 82 | source `feed`. This will generally be a symlink | |
| 83 | ||
| 84 | Optional elements include: | |
| 110 | source `feed`. This will generally be a soft link to the relevant | |
| 111 | `feed` directory, but programs should not assume that it is. | |
| 112 | ||
| 113 | Optional elements for an `entry` include: | |
| 85 | 114 | |
| 86 | 115 | - `author`: Names and email addresses of the authors of the entry. |
| 87 | 116 | - `pubdate`: When the entry was published. |
| 117 | - `type`: The MIME type of the content. If `type` is not present, | |
| 118 | the assumed content type is `text/plain`. | |
| 119 | ||
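The `type` fallback described above can be sketched as a small helper that a viewer might call before deciding how to render `content`. The name `entry_type` is hypothetical; only the `text/plain` default comes from the text.

```bash
# entry_type ENTRY_DIR: print the MIME type of the entry stored in
# ENTRY_DIR, falling back to text/plain when no `type` file is present.
entry_type() {
    if [ -f "$1/type" ]; then
        cat "$1/type"
    else
        echo 'text/plain'
    fi
}
```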
| 120 | ### Entry example | |
| 121 | ||
| 122 | A minimal entry might look like | |
| 123 | ||
| 124 | ~~~{.bash} | |
| 125 | # $FEED is sha1sum('http://example.com/rss.xml') | |
| 126 | FEED=80af8e84e5ef7ae6b68acb8d1987e58e3e5731dd | |
| 127 | echo 'Example Entry' >title | |
| 128 | echo 'http://example.com/example' >id | |
| 129 | echo 'A sample entry.' >content | |
| 130 | ln -s "$LEKTORDIR/src/$FEED" feed | |
| 131 | ~~~ | |
| 132 | ||
| 133 | A full entry might look like | |
| 134 | ||
| 135 | ~~~{.bash} | |
| 136 | # $FEED is sha1sum('http://example.com/rss.xml') | |
| 137 | FEED=80af8e84e5ef7ae6b68acb8d1987e58e3e5731dd | |
| 138 | echo 'Example Entry' >title | |
| 139 | echo 'http://example.com/example' >id | |
| 140 | echo 'A sample entry.' >content | |
| 141 | echo 'Getty Ritter <gettyritter@gmail.com>' >author | |
| 142 | echo '2015-06-23T13:06:22Z' >pubdate | |
| 143 | echo 'text/html' >type | |
| 144 | ln -s "$LEKTORDIR/src/$FEED" feed | |
| 145 | ~~~ | |
| 88 | 146 | |
| 89 | 147 | ## `lektor-dir` |
| 90 | 148 | |
| 91 | A `lektor | |
| 149 | A `lektor-dir` is a directory with at least four subdirectories: `tmp`, | |
| 92 | 150 | `new`, `cur`, and `src`. A _fetcher_ is responsible for examining a feed |
| 93 | and adding new entries the `lektordir` according to the following process: | |
| 94 | ||
| 95 | - The fetcher `chdir()`s to the `lektordir` directory. | |
| 151 | and adding new entries to the `lektor-dir` according to the following process: | |
| 152 | ||
| 153 | - The fetcher `chdir()`s to the `lektor-dir` directory. | |
| 96 | 154 | - The fetcher `stat()`s the name `tmp/$feed/$time.$pid.$host`, where |
| 97 | 155 | `$feed` is the hash of the feed's `id` value, `$time` |
| 98 | 156 | is the number of seconds since the beginning of 1970 GMT, `$pid` is the |
| 103 | 161 | - The fetcher creates the directory `tmp/$feed/$time.$pid.$host`. |
| 104 | 162 | - The fetcher writes the entry contents (according to the `lektor-entry` |
| 105 | 163 | format) to the directory. |
| 106 | - The fetcher | |
| 164 | - The fetcher moves the directory to `new/$feed/$time.$pid.$host`. At that | |
| 107 | 165 | instant, the entry has been successfully created. |
| 108 | 166 | |
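The delivery steps above can be sketched as a single shell function. This is an illustration under stated assumptions, not a reference implementation: the name `deliver_entry`, its argument order, the retry limit of 10, and the use of `uname -n` for `$host` are all choices made for the example. It expects to be run from inside the `lektor-dir`.

```bash
# deliver_entry FEED_HASH TITLE ID CONTENT
# Writes one entry under tmp/, then renames it into new/.
# Prints the path of the delivered entry on success.
deliver_entry() {
    feed=$1
    tries=0
    while :; do
        name="$(date '+%s').$$.$(uname -n)"
        # the name must not already be in use under tmp/
        [ -e "tmp/$feed/$name" ] || break
        tries=$((tries + 1))
        # give up after 10 attempts (an arbitrary limit for this sketch)
        [ "$tries" -le 10 ] || return 1
        sleep 2
    done

    mkdir -p "tmp/$feed/$name" "new/$feed" || return 1
    printf '%s\n' "$2" >"tmp/$feed/$name/title"
    printf '%s\n' "$3" >"tmp/$feed/$name/id"
    printf '%s\n' "$4" >"tmp/$feed/$name/content"
    ln -s "$PWD/src/$feed" "tmp/$feed/$name/feed"

    # the rename is the step that makes the entry visible to viewers
    mv "tmp/$feed/$name" "new/$feed/$name" || return 1
    printf 'new/%s/%s\n' "$feed" "$name"
}
```

Because the entry is fully written under `tmp` before the rename, a viewer scanning `new` should never see a half-written entry, provided `tmp` and `new` live on the same file system.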
| 109 | 167 | A _viewer_ is responsible for displaying new feed entries to a user |
| 110 | 168 | through some mechanism. A viewer looks through the `new` directory for |
| 111 | 169 | new entries. If there is a new entry, `new/$feed/$unique`, the viewer may: |
| 112 | 170 | |
| 113 | - Display the contents of `new/$feed/$unique` | |
| 114 | - Delete `new/$feed/$unique` | |
| 115 | - Rename `new/$feed/$unique`. | |
| 116 | ||
| 117 | A `lektordir` can contain arbitrary other directories, but for the sake | |
| 118 | of compatibility, these should attempt to adhere to the following | |
| 119 | schema: | |
| 120 | ||
| 121 | - If the extra directory contains configuration or other information | |
| 122 | for a given feed, it | |
| 171 | - Display the contents of `new/$feed/$unique`. | |
| 172 | - Delete `new/$feed/$unique`. | |
| 173 | - Rename `new/$feed/$unique` to `cur/$feed/$unique;$info`. | |
| 174 | ||
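The rename step can be sketched as follows. The function name `mark_seen` is hypothetical, and the contents of `$info` are not defined in this part of the document, so any value passed as the third argument (for example `S`) is purely a placeholder. Note that the semicolon has to be quoted, since the shell would otherwise treat it as a command separator.

```bash
# mark_seen FEED UNIQUE INFO
# Moves new/FEED/UNIQUE to cur/FEED/UNIQUE;INFO.
# Run from inside the lektor-dir.
mark_seen() {
    mkdir -p "cur/$1" || return 1
    mv "new/$1/$2" "cur/$1/$2;$3"
}
```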
| 175 | A `lektor-dir` can contain other information not specified here, but that | |
| 176 | information should attempt to adhere to these guidelines: | |
| 177 | ||
| 178 | - If the extra information pertains to a particular feed, it should appear | |
| 179 | in the directory `src/$feed/etc`. | |
| 180 | - If the extra information pertains to a fetcher, it should appear in the | |
| 181 | directory `etc/fetch`. | |
| 182 | - If the extra information pertains to a viewer, it should appear in the | |
| 183 | directory `etc/view`. | |
| 184 | ||
| 185 | ## Possibilities for `lektor` | |
| 186 | ||
| 187 | Lektor lends itself well to web syndication (e.g. RSS, Atom, | |
| 188 | ActivityStreams, &c) but could be used for any kind of stream of | |
| 189 | information. For example, a fetcher might serve as a mediated logging | |
| 190 | service for other information such as regular load information on a | |
| 191 | running web service, pushing updates into a shared `lektor-dir` on a | |
| 192 | regular basis. It would also be trivial to write custom fetchers for | |
| 193 | services that no longer expose RSS or other syndication formats, such | |
| 194 | as Twitter. | |
| 195 | ||
| 196 | Here is a trivial fetcher that provides a feed of timestamps every | |
| 197 | hour: | |
| 198 | ||
| 199 | ~~~{.bash} | |
| 200 | #!/bin/bash -e | |
| 201 | ||
| 202 | cd $LEKTORDIR | |
| 203 | ||
| 204 | # the feed information | |
| 205 | ID='tag:example.com:timekeeper' | |
| 206 | HASH=$(printf '%s' "$ID" | sha1sum | awk '{ print $1; }' ) | |
| 207 | ||
| 208 | # other metadata | |
| 209 | HOST=$(hostname) | |
| 210 | MAX=10 | |
| 211 | ||
| 212 | # create the feed | |
| 213 | mkdir -p src/$HASH | |
| 214 | echo $ID >src/$HASH/id | |
| 215 | echo Timekeeper >src/$HASH/name | |
| 216 | ||
| 217 | mkdir -p "tmp/$HASH" | |
| 218 | mkdir -p "new/$HASH" | |
| 219 | ||
| 220 | # create entries every hour | |
| 221 | while true; do | |
| 222 | TIME=$(date '+%s') | |
| 223 | ENTRY="$HASH/$TIME.$$.$HOST" | |
| 224 | ||
| 225 | # if the entry name is already taken in tmp, wait two seconds and try again | |
| 226 | RETRY=0 | |
| 227 | while [ -e "tmp/$ENTRY" ] | |
| 228 | do | |
| 229 | # if we've waited more than $MAX times, then | |
| 230 | # give up | |
| 231 | if [ $RETRY -gt $MAX ]; then | |
| 232 | exit 1 | |
| 233 | fi | |
| 234 | sleep 2; TIME=$(date '+%s'); ENTRY="$HASH/$TIME.$$.$HOST" | |
| 235 | RETRY=$(expr $RETRY + 1) | |
| 236 | done | |
| 237 | ||
| 238 | # create the entry | |
| 239 | mkdir -p tmp/$ENTRY | |
| 240 | ||
| 241 | # create entry values | |
| 242 | echo 'Current Time' >tmp/$ENTRY/title | |
| 243 | echo $TIME >tmp/$ENTRY/content | |
| 244 | echo "tag:example.com:timekeeper#$TIME" >tmp/$ENTRY/id | |
| 245 | ln -s $LEKTORDIR/src/$HASH tmp/$ENTRY/feed | |
| 246 | ||
| 247 | # move the entry to the new location | |
| 248 | mv tmp/$ENTRY new/$ENTRY | |
| 249 | ||
| 250 | # wait for an hour and do it again | |
| 251 | sleep 3600 | |
| 252 | done | |
| 253 | ~~~ | |
| 254 | ||
| 255 | Additionally, multiple viewers can act on the same `lektor-dir`. A | |
| 256 | given viewer need not show every piece of information: for example, | |
| 257 | a viewer may sniff the `type` attribute of entries and only display | |
| 258 | entries of a given type, or selectively choose which feeds to display, | |
| 259 | or even select entries at random to display. It also has full control | |
| 260 | over how to display those entries. | |
| 261 | ||
| 262 | Here is a trivial viewer that shows a small digest of each entry in | |
| 263 | `new` and then moves those entries to `cur`: | |
| 264 | ||
| 265 | ~~~{.bash} | |
| 266 | #!/bin/bash -e | |
| 267 | ||
| 268 | cd $LEKTORDIR | |
| 269 | ||
| 270 | for FEED in $(ls new) | |
| 271 | do | |
| 272 | mkdir -p cur/$FEED | |
| 273 | ||
| 274 | # print feed header | |
| 275 | echo "In feed $(cat src/$FEED/name):" | |
| 276 | echo | |
| 277 | ||
| 278 | for ENTRY in $(ls new/$FEED) | |
| 279 | do | |
| 280 | # print entry | |
| 281 | echo "$(cat new/$FEED/$ENTRY/title)" | |
| 282 | cat new/$FEED/$ENTRY/content | head -n 4 | |
| 283 | echo | |
| 284 | ||
| 285 | # move entry to `cur` | |
| 286 | mv new/$FEED/$ENTRY cur/$FEED/$ENTRY | |
| 287 | done | |
| 288 | done | |
| 289 | ~~~ | |