mhonarc-users

Re: Converting message body in a rc file

2002-02-28 13:43:48
On February 27, 2002 at 12:50, Institute for Social Ecology wrote:

I am wondering if there is any simple way of generating a "chopped"
message body in a RC file.  To elaborate a bit, I am working on a RC file
that generates a RSS file alongside the regular index pages.  I would like
to populate the RSS file with a short description of the last 10 posts.
This description would need to be in a text format, and chopped to only
include the first 100 words in the post.

www.mail-archive.com does something similar to this, but it does
not try to include message body text.

Is this possible? The only remote solution I came across while going
through the MHonArc docs would deal with hacked up MIME filters, but this
does not seem to be realistic.  Ideally, I'd like to just plop in a
resource variable that is defined to include the message body, but it
appears that such a variable does not exist.

Any suggestions?

Post-process the HTML message pages.  There are well-defined
comment declarations in the message pages that delimit the
message body.  One approach is to have an OTHERINDEXES index
that generates your RSS file, but includes a special tag that
you define, carrying the message page filename, at each place where
you want the message body text to appear.  For example:

  <x-mesg-text filename="$MSG$">

MHonArc will then expand it to something like:

  <x-mesg-text filename="msg03456.html">

Then after MHonArc is done, you post-process the RSS file to
replace all occurrences of <x-mesg-text> with the first 100 words of
the message body text.  The attribute signals which file
to open to extract the text.  NOTE: Remember that the body
text will be in HTML, so you will have to deal with HTML tags and
character entity references when extracting the first 100 words.
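
The extraction step could be sketched along these lines.  This is
only an illustration, not part of MHonArc: the routine name
first_words and the small entity table are made up for the example,
and the delimiter strings are the body comment declarations as I
would expect to find them in a message page, so check them against
your own pages before relying on them.

```perl
use strict;
use warnings;

# Assumed body delimiters; verify against your own message pages.
my $BODY_START = '<!--X-Body-of-Message-->';
my $BODY_END   = '<!--X-Body-of-Message-End-->';

# Small entity table for illustration only; it is not exhaustive.
my %ENTITY = (amp => '&', lt => '<', gt => '>', quot => '"', nbsp => ' ');

# Return the first $max words of the message body as plain text.
# (first_words is a hypothetical helper, not a MHonArc routine.)
sub first_words {
    my ($html, $max) = @_;

    # Keep only the text between the body delimiters.
    my ($body) = $html =~ /\Q$BODY_START\E(.*?)\Q$BODY_END\E/s;
    return '' unless defined $body;

    $body =~ s/<!--.*?-->/ /gs;       # drop any remaining comments
    $body =~ s/<[^>]*>/ /g;           # drop HTML tags
    $body =~ s/&#(\d+);/chr($1)/ge;   # decimal character references
    # Named entities; unknown names are left untouched.
    $body =~ s/&(\w+);/exists $ENTITY{$1} ? $ENTITY{$1} : "&$1;"/ge;

    my @words = split ' ', $body;     # split on runs of whitespace
    $#words = $max - 1 if @words > $max;
    return join ' ', @words;
}
```

The result is plain text, so it still has to be escaped again
before it is written into the RSS file.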

To help make the process more efficient, create a Perl program
that invokes mhonarc directly.  For example:

  #!/usr/bin/perl
  require 'mhamain.pl';
  mhonarc::initialize();
  if (!mhonarc::process_input()) {
    exit($mhonarc::CODE);
  }
  ## ... include RSS post-processing here: open the
  ##     RSS file specified via OTHERINDEXES and replace
  ##     the special tags with message text ...

In this case, you invoke your program just like you would
mhonarc, but get the additional behavior of generating a RSS
file in the format you desire.
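
As a rough sketch of the tag replacement itself, assuming the
placeholder has exactly the form shown earlier: the names xml_escape
and fill_placeholders are invented for this example, and the
callback is a stand-in for whatever code you use to open the message
page and pull out the first 100 words.

```perl
use strict;
use warnings;

# Escape plain text for use as XML character data.
sub xml_escape {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

# Replace every <x-mesg-text filename="..."> placeholder in $rss
# with the escaped text returned by $get_text->($filename).
# (Both routine names are hypothetical, not part of MHonArc.)
sub fill_placeholders {
    my ($rss, $get_text) = @_;
    $rss =~ s{<x-mesg-text\s+filename="([^"]*)"\s*/?>}
             {xml_escape($get_text->($1))}ge;
    return $rss;
}
```

You would read the generated RSS file into a string, pass it
through fill_placeholders with a callback that reads the named
message page from the archive directory, and write the result back.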

Now, if you played with the mhonarc API and internals, you could
potentially create the RSS file directly instead of using
OTHERINDEXES (you would still need to extract message text from
the message files).  However, you may not want to mess with
that approach.

Another possible solution is to borrow an idea from a fairly recent
discussion of a different problem, where the proposed solution may fit
your needs.  The solution leverages the annotation features of
MHonArc.  See
<http://www.xray.mpe.mpg.de/mailing-lists/mhonarc/2002-01/msg00047.html>.
You can pre-process a message before it is passed into MHonArc to
extract the first part of the message body text and assign it as an
annotation for the message.  Then use the annotation-related resource
variables to include the text in a mhonarc generated file.

IMO, I prefer the first approach mentioned above since it should
work independent of how input is fed into mhonarc.

Hope this helps,

--ewh
