Re: files embedded in text messages, extensions, etc.

2000-06-10 14:37:33
On June 6, 2000 at 13:49, Uffe Henrik Engberg wrote:

1) It would be useful if MHonArc considered more than just the last
   filename extension. To reduce mail size people often compress files
   before attaching them. Using only the last extension, e.g. .gz, of
   the file name means that the browser does not know how
   to handle them after they have been saved. If .ps.gz is used, the
   browser knows that it just has to uncompress the file and call a
   PostScript viewer.

Your regex is not sufficient.  What about a filename like
"SomePrg2.01.tar"?  What does "01" mean?  The better approach is to
recognize compression filename extensions like ".gz", ".Z", etc. and
handle those cases specially.
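
To illustrate what I mean by handling them specially, here is a rough
sketch, not MHonArc code.  The sub name split_extensions and the
suffix list are made up for the example; the idea is to peel off one
known compression suffix first and only then look at the last
remaining extension.

```perl
use strict;
use warnings;

# Hypothetical list of compression suffixes; not taken from MHonArc.
my %COMPRESS_EXT = map { $_ => 1 } qw(gz Z bz2);

# Split a filename into (content extension, compression extension).
# Either element is undef when absent.
sub split_extensions {
    my ($name) = @_;
    my $compress;
    # Peel off one trailing compression suffix, if present.
    if ($name =~ /\.(\w+)$/ && $COMPRESS_EXT{$1}) {
        $compress = $1;
        $name =~ s/\.\w+$//;
    }
    # Whatever extension remains describes the actual content.
    my ($ext) = $name =~ /\.(\w+)$/;
    return ($ext, $compress);
}

for my $file (qw(paper.ps.gz SomePrg2.01.tar notes.Z)) {
    my ($ext, $compress) = split_extensions($file);
    printf "%-16s ext=%s compress=%s\n", $file,
        defined $ext      ? $ext      : '-',
        defined $compress ? $compress : '-';
}
```

With this, "SomePrg2.01.tar" is never asked what "01" means, because
only the last extension is examined once any compression suffix is gone.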

2) 12 Aug 1999 Earl Hood announced that he had added support for
   decoding uuencoded data within text messages in
   MHonArc v2.4.2. I would like MHonArc to be able to handle other
   kinds of embedded files. Preferably specified through resources.

I added the uuencode support reluctantly, but since uuencoding is still
common in newsgroups, I did it.  Handling other non-standard file
attachment schemes is a slippery slope that I think provides very little
benefit.  You also may have to deal with cases where users do not want
all-or-nothing detection, but selective detection.  For example, I want
to recognize PostScript data, but no other.  So the extraction code will
need a list of what it should detect instead of blindly checking for
all data types it is designed to check.
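
A rough sketch of that kind of selective detection, purely as an
assumption about how it could look: the %DETECTORS table, the type
names, and detect_embedded are all hypothetical and are not part of
MHonArc.  The caller passes the list of types it wants checked, and
nothing else is looked for.

```perl
use strict;
use warnings;

# Hypothetical detectors keyed by type name; each pattern matches the
# line that starts an embedded block.  Not MHonArc's actual code.
my %DETECTORS = (
    uuencode   => qr/^begin [0-7]{3,4} \S+/m,
    postscript => qr/^%!PS-Adobe/m,
);

# Return the enabled types whose start marker appears in $text.
# @$enabled is the user's list of types to look for; unknown
# type names are silently ignored.
sub detect_embedded {
    my ($text, $enabled) = @_;
    return grep { exists $DETECTORS{$_} && $text =~ $DETECTORS{$_} }
           @$enabled;
}

# The body contains both PostScript and uuencoded data, but only
# PostScript detection was requested.
my $body = "See below.\n%!PS-Adobe-3.0\n...\nbegin 644 data.bin\n...\nend\n";
my @found = detect_embedded($body, ['postscript']);
print "detected: @found\n";
```

The patterns only spot where a block starts; real extraction would
also need to find where each block ends.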

I prefer to take a contributed filter approach.  If someone has a
filter to handle non-standard cases like embedded PostScript data in a
text/plain message, and the person believes it will be useful to others,
they can document it, and contribute it.  I have an extras/ directory
in the distribution for any useful contributions.

3) It is possible to tell MHonArc if it should gzip files it creates.
   It would be nice if one could instruct MHonArc to uncompress
   attached compressed files. Some of our users are on platforms where
   they do not necessarily have the tools to make the needed

I have considered this in the past.  One problem is performance.
Another is knowing the proper media type of the uncompressed data,
since the content-type setting will be application/x-gzip.  And since
MHonArc does not key off filenames by default, and a filename may not
even be present, what then?

I think some users use Apache, and Apache can be configured to
auto-decompress gzipped data.

Question: which platforms do you refer to?  I have had no problems
getting software to handle gzip data on the different platforms I have
worked on.

4) Many attached files are not "web friendly", i.e. they are in a
   format one cannot expect the user to be able to view directly.
   However, our MHonArc server can turn many formats into pdf. If one
   could specify some post processing/hook filter MHonArc should call
   after saving an attachment, these converters could be activated.

Another custom mimefilter.  Note, the issue of format and users not
being able to read it is something the sender should consider.

Regarding 1) it is sufficient to replace the line:

       ($nameparm =~ /\.(\w+)$/)) {     # filename has an extention

with the line:

       ($nameparm =~ /\.(\w+(\.\w+)*)$/)) { # filename has extension(s)

No.  See comment above.

Regarding 2) I have made a slight modification (patch
enclosed) which not only extracts embedded uuencoded files but also
embedded PostScript and LaTeX files. The modification is activated via
an "extract" option.

As it is now, these patterns are hard wired, but ideally they should
be specified through dedicated resource types (MIMEARGS are not
suitable). MHonArc has the ability to hook in one's own custom message
filters, but unfortunately one cannot hook in one's own custom resource

How is MIMEARGS not sufficient for your case?

Note, I have thought about custom resources, but the/my need has never
been strong enough to actually figure out how to implement it.  Any
solution would require a pre-registration step defining the custom
resources to recognize.  It's doable, but does it really need to be


