I could use some advice from the Perl-savvy (which I ain't!) on this
list regarding the format of message bodies. Our page format includes a
"left margin" stripe of color down the side of the page, so we don't
have the usual screen width for text display. Thus the message text
bleeds over the right side of the page unless the lines happen to be
very short. I downloaded a program called txt2html to use as a filter.
I've done the following, hoping to force the message body to be
HTML-ized:
<MIMEFilters>
...
message/partial:m2h_text_plain'filter:txt2html.pl
text/*:m2h_text_plain'filter:txt2html.pl
...
text/plain:m2h_text_plain'filter:txt2html.pl
text/richtext:m2h_text_plain'filter:txt2html.pl
...
</MIMEFilters>
This will not work. MHonArc will look for a routine called
"m2h_text_plain'filter" in txt2html.pl, and there is none. In
order to use txt2html.pl, you will need to write a MHonArc filter
wrapper to interface with txt2html.pl. How to write MHonArc filters
and how to hook them into the program is described in the
MIMEFILTERS resource page of the documentation.
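To give a rough idea of what such a wrapper involves, here is a minimal sketch. It assumes the same calling convention that m2h_text_plain'filter uses in the library further down. The package name m2h_txt2html and the routine convert_text() are hypothetical; convert_text() is only a stand-in for whatever entry point txt2html.pl really offers, which I have not checked.

```perl
#!/usr/bin/perl
# Sketch of a MHonArc filter wrapper (hypothetical names throughout).
package m2h_txt2html;

# Stand-in for the real txt2html conversion: escape the HTML special
# characters, then wrap each blank-line-separated block in <P>...</P>.
sub convert_text {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    my @paras = grep { /\S/ } split(/\n\s*\n/, $text);
    return join('', map { "<P>\n$_\n</P>\n" } @paras);
}

# Assumed calling convention (copied from mhtxtplain.pl): header text,
# globs for the parsed header fields and the message body, a decode flag,
# and the MIMEARGS option string. Returns the HTML as a one-element list.
sub filter {
    local($header, *fields, *data, $isdecode, $args) = @_;
    return (&convert_text($data));
}

package main;

# Quick check outside MHonArc: pass globs the way the library expects.
# When saving the wrapper as a library file, drop this demo and end the
# file with "1;" so that it loads with a true value.
%fields = ('content-type' => 'text/plain');
$body   = "a < b\n\nsecond paragraph";
($html) = &m2h_txt2html::filter('', *fields, *body, 0, '');
print $html;
```

Following the pattern of the entries above, a wrapper like this would presumably be registered with a line such as text/plain:m2h_txt2html'filter:yourfile.pl, where yourfile.pl is whatever file holds the wrapper.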
Since you stated you were not Perl savvy, you may not have the time
to learn Perl to do what you need. Therefore, I have included below a
version of the mhtxtplain.pl library that may be able to do something
to suit your needs (this library, or a reasonable facsimile, will be
included in the next release of MHonArc). Read the comments
in the code to see the new options available to the filter. Use
the MIMEARGS resource to define the options you desire.
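As an illustration, a resource file fragment along these lines should register the filter and pass it options. The MIMEFILTERS entry is the one given in the library header below. The MIMEARGS entry assumes a content-type:arguments layout, and the width of 60 is an arbitrary value chosen for a narrowed page, so check the MIMEARGS resource page for the exact syntax of your release.

```
<MIMEFILTERS>
text/plain:m2h_text_plain'filter:mhtxtplain.pl
</MIMEFILTERS>
<MIMEARGS>
text/plain:maxwidth=60 quote
</MIMEARGS>
```

With these settings, lines longer than 60 characters would be broken across multiple lines and quoted text would be italicized.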
##---------------------------------------------------------------------------##
## File:
## @(#) mhtxtplain.pl 1.7 97/04/11 19:57:26 @(#)
## Author:
## Earl Hood ehood(_at_)medusa(_dot_)acs(_dot_)uci(_dot_)edu
## Description:
## Library defines routine to filter text/plain body parts to HTML
## for MHonArc.
## Filter routine can be registered with the following:
## <MIMEFILTERS>
## text/plain:m2h_text_plain'filter:mhtxtplain.pl
## </MIMEFILTERS>
##---------------------------------------------------------------------------##
## MHonArc -- Internet mail-to-HTML converter
## Copyright (C) 1995-1997 Earl Hood, ehood(_at_)medusa(_dot_)acs(_dot_)uci(_dot_)edu
##
## This program is free software; you can redistribute it and/or modify
## it under the terms of the GNU General Public License as published by
## the Free Software Foundation; either version 2 of the License, or
## (at your option) any later version.
##
## This program is distributed in the hope that it will be useful,
## but WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
## GNU General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with this program; if not, write to the Free Software
## Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
##---------------------------------------------------------------------------##
package m2h_text_plain;
$Url = '(http://|https://|ftp://|afs://|wais://|telnet://' .
'|gopher://|news:|nntp:|mid:|cid:|mailto:|prospero:)';
$UrlExp = $Url . q%[^\s\(\)\|<>"']*[^\.;,"'\|\[\]\(\)\s<>]%;
$HUrlExp = $Url . q%[^\s\(\)\|<>"'\&]*[^\.;,"'\|\[\]\(\)\s<>\&]%;
$QuoteChars = '>|[\|\]+:]';
$HQuoteChars = '&gt;|[\|\]+:]';
##---------------------------------------------------------------------------##
## Text/plain filter for mhonarc. The following filter arguments
## are recognized ($args):
##
## nourl -- Do not hyperlink URLs
## quote -- Italicize quoted message text
## nonfixed -- Use normal typeface
## keepspace -- Preserve whitespace if nonfixed
## maxwidth=# -- Set the maximum width of lines. Lines
## exceeding the maxwidth will be broken
## up across multiple lines.
## asis=set1:set2:... -- Colon separated lists of charsets
## to leave as-is. Only HTML special
## characters will be converted into
## entities.
##
## All arguments should be separated by at least one space
##
sub filter {
local($header, *fields, *data, $isdecode, $args) = @_;
local($ctype, $charset, $nourl, $doquote, $igncharset, $nonfixed,
$keepspace, $maxwidth);
local(%asis) = ();
$nourl = ($'NOURL || ($args =~ /nourl/i));
$doquote = ($args =~ /quote/i);
$nonfixed = ($args =~ /nonfixed/i);
$keepspace = ($args =~ /keepspace/i);
if ($args =~ /maxwidth=(\d+)/) {
$maxwidth = $1;
} else {
$maxwidth = 0;
}
## Grab charset parameter (if defined)
$ctype = $fields{'content-type'};
($charset) = $ctype =~ /charset=(\S+)/;
$charset =~ s/['"]//g; $charset =~ tr/A-Z/a-z/;
## Check if certain charsets should be left alone
if ($args =~ /asis=(\S+)/i) {
local(@a) = split(':', $1);
foreach (@a) {
tr/A-Z/a-z/;
$asis{$_} = 1;
}
}
## Check MIMECharSetConverters if charset should be left alone
if ($main'MIMECharSetConverters{$charset} eq "-decode-") {
$asis{$charset} = 1;
}
## Check if max-width set
if ($maxwidth) {
$* = 1;
$data =~ s/^(.*)$/&break_line($1, $maxwidth)/ge;
$* = 0;
}
## Convert data according to charset
if (!$asis{$charset}) {
## Japanese message
if ($charset =~ /iso-2022-jp/i) {
return (&jp2022(*data));
## Latin 2-6, Greek, Hebrew, Arabic
} elsif ($charset =~ /iso-8859-([2-9]|10)/i) {
$data = &iso_8859'str2sgml($data, $charset);
## ASCII, Latin 1, Other
} else {
&esc_chars_inplace(*data);
}
} else {
&esc_chars_inplace(*data);
}
## Check for quoting
if ($doquote) {
$data =~ s@\n(${HQuoteChars})(.*)@\n$1<I>$2</I>@go;
}
## Check if using nonfixed font
if ($nonfixed) {
$data =~ s/(\r?\n)/<br>$1/g;
if ($keepspace) {
$* = 1;
$data =~ s/^(.*)$/&preserve_space($1)/ge;
$* = 0;
}
} else {
$data = "<PRE>\n" . $data . "</PRE>\n";
}
## Convert URLs to hyperlinks
$data =~ s@($HUrlExp)@<A HREF="$1">$1</A>@gio unless $nourl;
($data);
}
##---------------------------------------------------------------------------##
## Function to convert ISO-2022-JP data into HTML. Function is based
## on the following RFCs:
##
## RFC-1468 I
## J. Murai, M. Crispin, E. van der Poel, "Japanese Character
## Encoding for Internet Messages", 06/04/1993. (Pages=6)
##
## RFC-1554 I
## M. Ohta, K. Handa, "ISO-2022-JP-2: Multilingual Extension of
## ISO-2022-JP", 12/23/1993. (Pages=6)
##
## Author of function:
## NIIBE Yutaka gniibe(_at_)mri(_dot_)co(_dot_)jp
## (adapted for mhtxtplain.pl by Earl Hood <ehood(_at_)medusa(_dot_)acs(_dot_)uci(_dot_)edu>)
##
sub jp2022 {
local(*body) = shift;
local(@lines) = split(/\r?\n/,$body);
local($ret, $ascii_text);
$ret = "<PRE>\n";
for ($i = 0; $i <= $#lines; $i++) {
$_ = $lines[$i];
# Process preceding ASCII text
while(1) {
if (/^[^\033]+/) { # ASCII plain text
$ascii_text = $&;
$_ = $';
# Replace meta characters in ASCII plain text
$ascii_text =~ s%\&%\&amp;%g;
$ascii_text =~ s%<%\&lt;%g;
$ascii_text =~ s%>%\&gt;%g;
## Convert URLs to hyperlinks
$ascii_text =~ s%($HUrlExp)%<A HREF="$1">$1</A>%gio
unless $'NOURL;
$ret .= $ascii_text;
} elsif (/\033\.[A-F]/) { # G2 Designate Sequence
$_ = $';
$ret .= $&;
} elsif (/\033N[ -]/) { # Single Shift Sequence
$_ = $';
$ret .= $&;
} else {
last;
}
}
# Process Each Segment
while(1) {
if (/^\033\([BJ]/) { # Single Byte Segment
$_ = $';
$ret .= $&;
while(1) {
if (/^[^\033]+/) { # ASCII plain text
$ascii_text = $&;
$_ = $';
# Replace meta characters in ASCII plain text
$ascii_text =~ s%\&%\&amp;%g;
$ascii_text =~ s%<%\&lt;%g;
$ascii_text =~ s%>%\&gt;%g;
## Convert URLs to hyperlinks
$ascii_text =~ s%($HUrlExp)%<A HREF="$1">$1</A>%gio
unless $'NOURL;
$ret .= $ascii_text;
} elsif (/\033\.[A-F]/) { # G2 Designate Sequence
$_ = $';
$ret .= $&;
} elsif (/\033N[ -]/) { # Single Shift Sequence
$_ = $';
$ret .= $&;
} else {
last;
}
}
} elsif (/^\033\$[\@AB]|\033\$\([CD]/) { # Double Byte Segment
$_ = $';
$ret .= $&;
while(1) {
if (/^([!-~][!-~])+/) { # Double Char plain text
$_ = $';
$ret .= $&;
} elsif (/\033\.[A-F]/) { # G2 Designate Sequence
$_ = $';
$ret .= $&;
} elsif (/\033N[ -]/) { # Single Shift Sequence
$_ = $';
$ret .= $&;
} else {
last;
}
}
} else {
# Something wrong in text
$ret .= $_;
last;
}
}
$ret .= "\n";
}
$ret .= "</PRE>\n";
($ret);
}
##---------------------------------------------------------------------------##
sub esc_chars_inplace {
local(*foo) = shift;
$foo =~ s@\&@\&amp;@g;
$foo =~ s@<@\&lt;@g;
$foo =~ s@>@\&gt;@g;
1;
}
##---------------------------------------------------------------------------##
sub preserve_space {
local($str) = shift;
1 while $str =~ s/\t+/' ' x (length($&) * 8 - length($`) % 8)/e;
# $str =~ s/ {2,}/'&nbsp;' x length($&)/ge;
$str =~ s/ /\&nbsp;/g;
$str;
}
##---------------------------------------------------------------------------##
sub break_line {
local($str) = shift;
local($width) = shift;
local($q, $new) = ('', '');
local($try, $trywidth);
## Translate tabs to spaces
1 while $str =~ s/\t+/' ' x (length($&) * 8 - length($`) % 8)/e;
## Do nothing if str <= width
return $str if length($str) <= $width;
## See if str begins with a quote char
if ($str =~ s/^($QuoteChars)//) {
$q = $1;
--$width;
}
## Create new string by breaking up str
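while ($str ne '') {
# remaining text fits within width: emit it and stop, so that the
# whitespace search below cannot split the last word
if (length($str) <= $width) {
$new .= $q . $str;
last;
}
# handle case where no-whitespace line larger than width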
if (($str =~ /^\S+/) && (length($&) >= $width)) {
$new .= $q . $&;
$str = $';
next;
}
$try = '';
$trywidth = $width;
$try = substr($str, 0, $trywidth);
if ($try =~ /\S+$/) {
$trywidth -= length($&);
$new .= $q . substr($str, 0, $trywidth);
} else {
$new .= $q . $try;
}
substr($str, 0, $trywidth) = '';
} continue {
$new .= "\n" if $str ne '';
}
$new;
}
##---------------------------------------------------------------------------##
1;
--ewh