mhonarc-users

Re: can $ENV(DESCRIPTION) be sticky?

2000-12-16 14:51:39
Since using $ENV$ to store item-specific information was unproductive, 
I moved on.

Now I am doing this with $NOTE$, and it seems to work fine.

I've included the new script I am using below.  Others might use it as a
basis for doing something similar.  If you have comments or suggestions,
please let me know.

-Michael

#!/usr/bin/perl -wT
use strict;

# parse a single message on STDIN and write a summary
# to a file in $notedir/$msgid

# Index of this script: 

# - A) config and programmer notes
# - B) parse header            (look for message-id in headers)
# - C) msgid cleanup           (define a safe variable $notedir/$msgid)
# - D) parse body              (and assemble $note)
# - E) create file $notedir/$msgid  

# if you want a NOTE that looks different from mine, edit D) 

# -------------- A) config and programmer notes

# the only configuration you _need_ is to change $myhome
my $myhome = '/home/root204';
my $sep = '/';
my $notedir = $myhome . $sep . 'notes';
-d $notedir or die "please create $notedir"; 

# and in your .mrc file
# 1. define <notedir>path</notedir>, where path eq $notedir
# 2. use $NOTE$ somewhere, like
#    <meta name="description" content="$NOTE$">

# if you want to process STDIN, run it like
#    cat msg | msg2note.pl
# if you want to process a mail folder like this, run it like
#    formail -s msg2note.pl < mailfolder

my ($overwrite);  # will overwrite any existing notes if 1
# $overwrite = 1;

my ($quiet); # will be quiet about warning if 1
$quiet = 1;

# -------------- B) parse header
# read until we leave the header, looking for msgid
my (%fields);
while ( <> ) {
    last if /^$/;
    chomp;
    $fields{ message } = $1 if /^message-id: (.*)/i;
    $fields{ msg } = $1 if /^msg-id: (.*)/i;
    $fields{ content } = $1 if /^content-id: (.*)/i;
}

# -------------- C) msgid cleanup
my ($msgid);
$msgid = $fields{message} || $fields{msg} || $fields{content};
if (defined($msgid)) {
    if ($msgid =~ /<([^>]*)>/) {
        $msgid = $1;
    } else {
        $msgid =~ s/^\s+//; # strip leading whitespace
        $msgid =~ s/\s+$//; # strip trailing whitespace
    }
} else {
    print "could not find a message-id, NOTE will be empty\n" unless $quiet;
    exit;
}

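# since we're opening a writeable file handle using the message-id,
# let's only let some characters in -- I could be wrong about this.

# keep word characters, '.', '@' and '-'; drop everything else
$msgid =~ s/[^\w.\@-]//g;
# collapse runs of dots so the name can never contain '..'
$msgid =~ s/\.{2,}/./g;
# untaint: under -T, open() for writing rejects a name derived from STDIN
$msgid =~ /^([\w.\@-]+)$/ or die "no usable message-id";
$msgid = $1;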

my $notefile = $notedir . $sep . $msgid;

# sanity check
if (-e $notefile and ! $overwrite ) {  
   print "$notefile ... exists\n" unless $quiet;
   exit;
 }

# -------------- D) parse body 

# Here I parse the mail message body, looking for a good summary.
# Different types of messages will have different looking 'good' summaries.
# On the news releases I saw, most of the good summaries ended with a period, 
# followed by a blank line.  
# This isn't perfect, but it worked for most of my messages.
# I may want to revise this once I understand MIME better.

$/ = ""; # read by paragraphs in the body

my $note = '';
while ( <> ) {
    last if ( length($note) > 200 );  # numeric compare; 'gt' compares as strings
    # skip paragraphs until one ends in '.' or '?' and does not start
    # with a 'Label: '; after that, keep appending the paragraphs that follow.
    next unless ( $note or (! /^[^ ]+: /  and /(\.|\?)"?\s*$/ ));
    $note .= $_;
    # remove stupid bylines 
    $note =~ s/---+[^-]*---+//g;
    # be sure to remove <>"\ from the note, or it could mess up HTML
    $note =~ tr/"/'/;      # double quotes become single quotes
    $note =~ tr/<>\\//d;   # angle brackets and backslashes are deleted
}

# -------------- E) write $notedir/$msgid

open (my $fh, '>', $notefile) or die "could not create $notefile: $!";
print $fh substr ( $note, 0,600);
close ($fh) or die "could not close $notefile: $!";

__END__
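
If you want to experiment with the summary heuristics from section D without writing any files, something like the sketch below may help. The helper name summarize_body and the sample message are made up for illustration. It also assumes the 'Label: ' test is meant to match a whole word followed by a colon, and it splits on blank lines instead of setting $/.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the script above): build a note
# from a message body using the section D heuristics.
sub summarize_body {
    my ($body, $max) = @_;
    $max = 200 unless defined $max;

    my $note = '';
    # Split on runs of blank lines, roughly like reading with $/ = "".
    for my $para (split /\n{2,}/, $body) {
        last if length($note) > $max;
        # Skip paragraphs until one ends in '.' or '?' (optionally
        # followed by a closing quote) and is not a "Label: value" line.
        next unless $note ne ''
            or ($para !~ /^[^ ]+: / and $para =~ /[.?]"?\s*$/);
        $note .= "$para\n\n";
    }
    $note =~ s/---+[^-]*---+//g;   # drop dashed bylines
    $note =~ tr/"/'/;              # double quotes become single quotes
    $note =~ tr/<>\\//d;           # delete < > and backslash
    $note =~ s/\s+$//;             # trim trailing whitespace
    return substr($note, 0, 600);
}

# Made-up sample body, for illustration only.
my $body = <<'END';
FOR IMMEDIATE RELEASE

Contact: Jane Doe

The widget factory opened today.

More details will follow "soon."
END

print summarize_body($body), "\n";
```

With this sample, the first two paragraphs are skipped and the note starts at "The widget factory opened today."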

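The message-id cleanup from section C can be tried on its own in the same way. This is only a sketch: msgid_to_filename is a hypothetical name, the whitelist (word characters, '.', '@', '-') is my reading of what the script intends, and the leading-dot rule is an extra assumption of mine.

```perl
use strict;
use warnings;

# Hypothetical helper: reduce a raw Message-ID header value to a string
# that should be safe to use as a file name inside $notedir.
sub msgid_to_filename {
    my ($raw) = @_;
    return unless defined $raw;

    # Prefer the part between angle brackets, else use the whole value.
    my $id = $raw =~ /<([^>]*)>/ ? $1 : $raw;
    $id =~ s/^\s+|\s+$//g;     # trim surrounding whitespace

    $id =~ s/[^\w.\@-]//g;     # keep word characters, '.', '@' and '-'
    $id =~ s/\.{2,}/./g;       # collapse runs of dots, so no '..'
    $id =~ s/^\.+//;           # assumption: no leading dot either

    # A capture from an anchored whitelist match also untaints under -T.
    return $id =~ /^([\w.\@-]+)$/ ? $1 : undef;
}

# Made-up inputs, for illustration only.
for my $raw ('<20001216.1451@example.com>', '  ../../etc/passwd ', '<>') {
    my $name = msgid_to_filename($raw);
    printf "%-30s => %s\n", "'$raw'", defined $name ? $name : '(rejected)';
}
```

An empty result is rejected rather than written, since $notedir/ with nothing after it is not a usable note file.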