
NNTP combine-and-forward



I noticed Aaron is moving forward with some metadata-over-NNTP, which 
is something I think can work out very nicely.  I've been thinking of 
something that could help this work better, but am not sure yet if it 
is unworkable, so I'm hoping some people here will have some ideas.

Suppose that we are distributing RSS feeds by posting them to a 
newsgroup, something like alt.metadata.rss.  Obviously some RSS 
postings would have just a few items, while others would be larger.  
Ideally, the newsgroup would become just one huge river of RSS items 
describing stories or content that people were submitting, and 
indexers could passively monitor the newsgroup and get content of 
interest without the extra step of pulling from a site with the 
appropriate feed.  Maybe with RSS, sheer volume wouldn't be a 
problem, but let's pretend that we have enough authors participating 
that we are getting tens of thousands of individual posts (with two 
or three items each) per day.  One useful service to be performed at the 
NNTP layer would be for a news server to combine multiple small 
messages into larger individual messages before forwarding around 
USENET.  Assuming that you could never get all of USENET to 
cooperate, you would have to use the existing mechanisms to make this 
work.  So your own personal "indexing agent" could just read a bunch 
of messages, combine, and "re-post" a new message containing the 
merged RSS.  The main problem you need to solve is that, when you 
combine all of these messages into one, you do not want the "big" 
message to propagate back to the servers that you got the little ones 
from (you'll piss off lots of USENET admins).  You want to propagate 
the "combined" message forward, but *stop* propagating the little ones 
at that point.
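
To make the "combine" step concrete, here is a rough sketch of what an indexing agent might do with the bodies of several small postings.  It assumes each body is a simple RSS 0.91-style document (<rss><channel><item>), and it treats two items with the same <link> as duplicates; the function name merge_rss and the default title are made up for illustration, not part of any existing tool.

```python
import xml.etree.ElementTree as ET

def merge_rss(feeds, title="Combined feed"):
    """Merge the <item> elements of several small RSS documents into one.

    `feeds` is an iterable of RSS documents as strings (assumed to be
    RSS 0.91-style: <rss><channel><item>...</item></channel></rss>).
    """
    rss = ET.Element("rss", version="0.91")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title

    seen = set()  # keys of items already copied into the combined feed
    for xml_text in feeds:
        root = ET.fromstring(xml_text)
        for item in root.iter("item"):
            # Assumption: the <link> identifies an item; fall back to the
            # serialized item when there is no link.
            link = item.findtext("link")
            key = link if link else ET.tostring(item)
            if key in seen:
                continue
            seen.add(key)
            channel.append(item)

    return ET.tostring(rss, encoding="unicode")
```

The returned string would become the body of the single "big" article; how that article is then posted and propagated is the hard part discussed in the rest of this message.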

One possible idea hinges on the way that NNTP posts carry a Path 
header listing the servers they've been through (separated by the ! 
symbol).  You could aggregate/merge a batch of RSS entries that all 
originated from the same source, and tack on the original source 
chain to the "combined" message before posting back.  I know this 
used to work on USENET, but I am not sure whether some of the attempts 
to stop spammers now regard such headers on POST with 
suspicion.  And maybe ISPs are disallowing this now?  I have no idea.
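
As a sketch of the Path idea: a server normally does not offer an article to a peer whose name already appears in the article's Path, so carrying the source chain over to the combined article should keep it from flowing back upstream.  The helper names below (path_hosts, would_offer, combined_path) and the host names in the usage note are hypothetical, and whether a real server accepts a pre-filled Path on POST is exactly the open question above.

```python
def path_hosts(path):
    """Split a Path header value into its '!'-separated entries."""
    return [h for h in path.strip().split("!") if h]

def would_offer(article_path, peer):
    """True if a server would normally offer the article to `peer`,
    i.e. the peer's name is not already present in the Path."""
    return peer not in path_hosts(article_path)

def combined_path(own_host, source_path):
    """Build a Path for the combined article: our own host first,
    followed by the chain the original small articles arrived with."""
    hosts = [own_host] + [h for h in path_hosts(source_path) if h != own_host]
    return "!".join(hosts)
```

For example, with a source Path of "news.b.example!news.a.example!not-for-mail", combined_path("agent.example", ...) gives "agent.example!news.b.example!news.a.example!not-for-mail", and would_offer() on that value is False for news.a.example but True for a server that has not seen the originals.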

The second piece would be to post CANCEL messages for all of the 
messages that had been merged, with the headers specifying that the 
CANCEL had already been seen by the source systems (not true, but 
would prevent the little guys from getting blasted on previous 
servers).  Maybe there are some more restrictions on forging cancels 
now, too?  (It used to be great fun on Usenet: when you got in a big 
flame war with someone, you would start cancelling all of their posts 
at the source and forge posts from that person recanting their evil 
ways and apologizing for ever thinking you wrong.  Just make sure the 
header has their home NNTP server listed so they never see their 
apocryphal concessions ... I wonder if it is still possible? :-))
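
For reference, a cancel is just an ordinary article with a Control header naming the Message-ID to remove.  The sketch below only builds such an article for a message the agent has already merged; it does not post anything.  The function name build_cancel, the body text, and the optional pre-filled path argument are assumptions for illustration, and whether servers would honor the result is the question raised above.

```python
from email.message import EmailMessage

def build_cancel(message_id, newsgroup, sender, path=None):
    """Build (but do not post) a cancel control article.

    message_id: Message-ID of the merged article, e.g. "<1@news.a.example>"
    path:       optional pre-filled Path value (the idea from this post;
                servers may well reject or overwrite it)
    """
    msg = EmailMessage()
    msg["From"] = sender
    msg["Newsgroups"] = newsgroup
    msg["Subject"] = f"cmsg cancel {message_id}"
    msg["Control"] = f"cancel {message_id}"
    if path:
        msg["Path"] = path
    msg.set_content("Superseded by a combined RSS article.\n")
    return msg
```

One cancel would be needed per merged article, which is worth weighing against the traffic the merge was meant to save.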