News::Archive 0.14 review

License: Perl Artistic License
File size: 33K
Developer: Tim Skirvin
0 stars award from rbytes.net

News::Archive is a Usenet news archiving package for downloading and later accessing news articles in bulk.

It can load articles laid out in INN format, retrieve them from a running news server, or take articles one by one. The News::Archive module is compatible with News::Web and Net::NNTP::Server, so the articles can be shared either via the Web or via NNTP.

SYNOPSIS

    use News::Archive;
    use News::Article;

    my $archive = News::Archive->new( 'basedir' => '/home/tskirvin/kiboze' );

    # Read a news article from standard input
    my $article = News::Article->new(\*STDIN);
    my $msgid   = $article->header('message-id');

    # Stop if this article is already in the archive
    die "Already processed '$msgid'\n"
        if ($archive->article( $msgid ));

    # Get the list of groups we're supposed to be saving the article into
    my @groups = split( /\s*,\s*/, $article->header('newsgroups') );
    s/\s+//g for @groups;    # strip any remaining whitespace

    # Make sure we're subscribed to these groups
    foreach (@groups) { $archive->subscribe($_) }

    # Actually save the article.
    my $ret = $archive->save_article(
        [ @{$article->rawheaders}, '', @{$article->body} ], @groups );
    print $ret ? "Accepted article $msgid\n"
               : "Couldn't save article $msgid\n";

News::Archive keeps several files to keep track of its archives:

active file

Keeps track of all newsgroups we are "subscribed" to and all of the information that changes regularly - the number of articles we have archived, the current first and last article numbers, etc.

Watched over with News::Active.

history database

A simple database keeping track of articles by Message-ID. It makes access by ID easy, and ensures that we don't save the same article twice. The database used to maintain it is chosen by the user.
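
The duplicate check described above can be sketched with any tied hash. The example below is only an illustration, not News::Archive's own code: it assumes SDBM_File (picked here because it ships with Perl), a temporary directory, a hypothetical helper called remember(), and a made-up "group/number" string as the stored value.

```perl
use strict;
use warnings;
use Fcntl;                       # O_RDWR, O_CREAT
use SDBM_File;                   # assumption: any DBM module would do here
use File::Temp qw(tempdir);

# Keep the demo database in a throwaway directory
my $dir = tempdir( CLEANUP => 1 );

# Tie a hash to an on-disk database keyed by Message-ID
tie my %history, 'SDBM_File', "$dir/history", O_RDWR | O_CREAT, 0644
    or die "Couldn't open history database: $!\n";

# Hypothetical helper: record a Message-ID and where the article lives.
# Returns 1 if the ID was new, 0 if it had been seen before.
sub remember {
    my ($msgid, $location) = @_;
    return 0 if exists $history{$msgid};
    $history{$msgid} = $location;
    return 1;
}

my $first  = remember( '<abc@example.com>', 'rec.games.mecha/1' );
my $second = remember( '<abc@example.com>', 'rec.games.mecha/2' );
print "first=$first second=$second\n";
```

The second call is rejected because the Message-ID is already a key in the database, which is the property that prevents an article from being saved twice.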

newsgroup file

Keeps track of more static information about the newsgroups we are subscribed to - descriptions, creation dates, etc.

Watched over with News::GroupInfo.

archive directory

Directory structure of all articles, with each article saved as a single text file. The directories follow the group name, one component of the name per directory level, such as "rec/games/mecha". Crossposts are hard-linked into the directory structures of the other groups.
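
As a rough sketch of the hard-linking idea, the following writes an article once and links it into a second group's directory using Perl's built-in link(). The helper names group_dir() and store_crosspost() are hypothetical, the article numbers are made up, and the layout is simplified (it leaves out the numbered sub-directories); it is not taken from News::Archive itself and needs a filesystem that supports hard links.

```perl
use strict;
use warnings;
use File::Path qw(make_path);
use File::Temp qw(tempdir);

# Hypothetical helper: map "rec.games.mecha" to "<base>/rec/games/mecha"
sub group_dir {
    my ($base, $group) = @_;
    return join '/', $base, split /\./, $group;
}

# Hypothetical helper: write the article text once, then hard-link it
# into every other group. %number_for maps group name => article number.
# Returns the paths created, in sorted group order.
sub store_crosspost {
    my ($base, $text, %number_for) = @_;
    my @paths;
    for my $group ( sort keys %number_for ) {
        my $dir = group_dir( $base, $group );
        make_path($dir);
        my $path = "$dir/$number_for{$group}";
        if (@paths) {
            # Later groups share the first file's data via a hard link
            link $paths[0], $path or die "Couldn't link $path: $!\n";
        } else {
            open my $fh, '>', $path or die "Couldn't write $path: $!\n";
            print {$fh} $text;
            close $fh or die "Couldn't close $path: $!\n";
        }
        push @paths, $path;
    }
    return @paths;
}

my $base  = tempdir( CLEANUP => 1 );
my @paths = store_crosspost(
    $base, "Subject: test\n\nbody\n",
    'rec.games.mecha' => 12, 'rec.games.misc' => 7,
);
print "$_\n" for @paths;
```

Because both names point at the same file on disk, a crossposted article takes up the space of a single copy.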

Articles are actually divided into sub-directories containing up to 500 articles each, to avoid Unix performance problems with very large directories. An individual article is thus stored in a file such as "rec/games/mecha/1.500/1".
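
The path arithmetic can be illustrated with a small function. This is a sketch under assumptions: article_path() is a hypothetical helper, and the naming of later sub-directories (for instance "501.1000") is inferred from the "1.500" example rather than confirmed by the documentation.

```perl
use strict;
use warnings;

# Hypothetical helper: build the relative path for an article number,
# with $per_dir articles per sub-directory (500 unless given).
sub article_path {
    my ($group, $number, $per_dir) = @_;
    $per_dir ||= 500;

    # First and last article number covered by this sub-directory
    my $first = int( ( $number - 1 ) / $per_dir ) * $per_dir + 1;
    my $last  = $first + $per_dir - 1;

    # One directory level per component of the group name
    ( my $dir = $group ) =~ tr{.}{/};

    return "$dir/$first.$last/$number";
}

print article_path( 'rec.games.mecha', 1 ),   "\n";  # rec/games/mecha/1.500/1
print article_path( 'rec.games.mecha', 500 ), "\n";  # rec/games/mecha/1.500/500
print article_path( 'rec.games.mecha', 501 ), "\n";  # rec/games/mecha/501.1000/501 (assumed naming)
```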

Each newsgroup also contains overview information, watched over with News::Overview. This overview file goes in the top of the structure, such as "rec/games/mecha/.overview".

You may note that these files are very similar to how INN does its work. This is intentional - this package is meant to act in many ways like a lighter-weight INN.

Usage:

Global Variables

The following variables are set within News::Archive, and are global throughout all invocations.

$News::Active::DEBUG
Default value for "debug()" in new objects.

$News::Active::HOSTNAME
Default value for "hostname()" in new objects. Obtained using
"Sys::Hostname::hostname()".

$News::Active::HASH
The number of articles to keep in each directory. Default is 500;
change this at your own peril, since things may get screwed up later
if you change it after archiving any articles!
