# agg, the news aggregator

agg is a news aggregator for POSIX-compliant systems.

It follows the UNIX philosophy and simply reads a news feed
from stdin and creates or updates a filesystem
representation of that feed.

No command line parameters, no user interface, not even

* 2011-04-16 agg-0.1.1 released
* 2011-04-08 agg-0.1.0 released
* 2011-04-01 development started

### 2011-04-16 agg-0.1.1

* Included proper README.
* Included nomtime in make targets.

### 2011-04-08 agg-0.1.0
To install somewhere else, see Make.config.

Please run the test suites; they've been written for *you*
and take only two seconds on a 500 MHz CPU anyway.
### Writing file names that are specified in the feed? What about security?

agg removes all slashes from file and directory names
before they are written, so everything ends up where it
belongs. You should run it in a dedicated directory,
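
The effect of stripping slashes can be sketched in shell. This is only an illustration of the idea, not agg's actual code; the function name `sanitize_name` is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper, not part of agg: drop every slash from a name
# so that it can no longer address another directory.
sanitize_name() {
    printf '%s' "$1" | tr -d '/'
}

sanitize_name '../../etc/passwd'   # prints ....etcpasswd
echo
```

A title that tries to escape the working directory is reduced to a plain, if odd-looking, file name.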
### But a malicious feed could use up all space/inodes.

Depends on your operating system (configuration). It's not
the job of a news aggregator to enforce quotas.

### Why no download mechanism?

Because it's a news aggregator, not a
download-and-news-aggregation-program.

### Why no user interface?

Because it's a news aggregator, not a
download-and-news-aggregation-and-news-reader-program.
The file system hierarchy created is pretty much usable
using the default UNIX tools. Feel free to write your own
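
As a rough sketch of what working with the default tools can look like, the function below lists the most recent items of a feed directory using only `ls` and `head`. It assumes one directory per feed with one file per item, and that each item's mtime reflects its publication date. Neither assumption is confirmed here, and `newest_items` is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper: print the N most recently modified entries of a
# feed directory, newest first (N defaults to 10).
# Assumption: an item's mtime corresponds to its publication date.
newest_items() {
    dir=$1
    count=${2:-10}
    ls -t "$dir" | head -n "$count"
}
```

Calling `newest_items some-feed 5` would then print the five newest item names of the (hypothetical) directory `some-feed`.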
### No way! This program writes HTML!

Yes, I like to be able to subscribe to xkcd and similar,
even if it means I have to launch a graphical browser once
in a while. Anyway, there's

    cat "$item" | elinks -dump

### But do I have to download the feed by hand?
### But this wastes traffic when there are no new items!

agg quits when it assumes that there are no new items (see
bugs). How much data is read beyond that point depends on
the ratio of processing rate to download rate.

    wget "$URL" -O - --limit-rate=10K | agg

### Okay. But it only works on a single feed!

    for feed in $(cat feeds); do
        (wget "$feed" -qO - --limit-rate=10K | agg) &
    done
### How to fetch only new items from feeds that don't use publication dates?

Not supported by agg itself, since it would require a
second-level storage that contains (hashes of) everything
the agg directory contained -- including items you
explicitly deleted. You can easily build such functionality
on top using a few lines of shell code.
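
A minimal sketch of such bookkeeping follows. It is not part of agg. The function name `is_new_item` and the store location `SEEN_FILE` are assumptions made for this example, and it relies on `sha256sum` from GNU coreutils.

```shell
#!/bin/sh
# Assumed location of the hash store; override SEEN_FILE as needed.
SEEN_FILE=${SEEN_FILE:-.agg-seen}

# Hypothetical helper: read one item's text from stdin and succeed
# (exit 0) only the first time that exact text is seen.
is_new_item() {
    # Keep only the hex digest, not the trailing file name column.
    digest=$(sha256sum | cut -d ' ' -f 1)
    touch "$SEEN_FILE"
    if grep -qxF "$digest" "$SEEN_FILE"; then
        return 1    # seen before, even if the item was deleted since
    fi
    printf '%s\n' "$digest" >> "$SEEN_FILE"
    return 0
}
```

One possible arrangement is to let agg write into a scratch directory and move over only those items for which `is_new_item < "$file"` succeeds. Because the store keeps hashes rather than files, items you deleted stay deleted.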

* Currently only tested on GNU/Linux.
* Uses fixed size buffers to simplify code. May lead to
  cut-off news texts. The chances for this to happen are
  rather low and the consequences minor (you can always
  follow the link). If you encounter a link that is larger
  than 8 KiB, let me know.
* Assumes items are ordered descending by publication date
  (newest items on top). Processing is stopped as soon as
  an old item is encountered.
* Assumes items only change if their publication date
  changes. Again, for simplicity.
* Creation of a "sub-feed" directory if the channel
  contained an element that had a title tag but is not an
* Supports only dates that have their time zone formatted
  as +xxxx, not as their abbreviation.
* Item titles may conflict, especially if they were too
  long and have been cut off.
* Items will always be (over-)written in the order they
  are placed in the feed.
* HTML output is formatted badly.
* Default mtime for items without pubDate should be now.
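
The time zone limitation listed above can be worked around by rewriting the feed before it reaches agg. The following is a minimal sketch under stated assumptions: the feed is RSS with dates in `pubDate` elements, and only a handful of RFC 822 zone abbreviations need handling. `normalize_tz` is a made-up name.

```shell
#!/bin/sh
# Hypothetical filter: replace a few RFC 822 time zone abbreviations at
# the end of <pubDate> elements with their numeric offsets.
# Dates that already use a numeric offset pass through unchanged.
normalize_tz() {
    sed -e 's| GMT</pubDate>| +0000</pubDate>|g' \
        -e 's| UT</pubDate>| +0000</pubDate>|g' \
        -e 's| EST</pubDate>| -0500</pubDate>|g' \
        -e 's| EDT</pubDate>| -0400</pubDate>|g' \
        -e 's| PST</pubDate>| -0800</pubDate>|g' \
        -e 's| PDT</pubDate>| -0700</pubDate>|g'
}
```

It would sit between the download and agg, for example `wget "$URL" -qO - | normalize_tz | agg`.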

* Andreas Waidler <arandes@programmers.at>

* git://repo.or.cz/agg.git
* http://www.repo.or.cz/w/agg.git

* http://programmers.at/work/on/agg

* http://programmers.at/work/on/agg/agg-0.1.1.tar.gz
* http://programmers.at/work/on/agg/agg-0.1.0.tar.gz

Copyright (C) 2011 Andreas Waidler <arandes@programmers.at>

Permission to use, copy, modify, and/or distribute this
software for any purpose with or without fee is hereby
granted, provided that the above copyright notice and this
permission notice appear in all copies.

THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS
ALL WARRANTIES WITH REGARD TO THIS SOFTWARE INCLUDING ALL
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO
EVENT SHALL THE AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT,
INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS,
WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER
TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE
USE OR PERFORMANCE OF THIS SOFTWARE.