mhonarc-commits

CVS: mhonarc/MHonArc/doc/resources mimefilters.html,1.18,1.19

2002-11-22 21:10:54
Update of /cvsroot/mhonarc/mhonarc/MHonArc/doc/resources
In directory subversions:/tmp/cvs-serv14330/doc/resources

Modified Files:
	mimefilters.html 
Log Message:
* Added subdir option to mhtxtplain.pl and mhtxthtml.pl filters since
  the filters can create derived files.
* Updated creation of "subdir" directory to be resistant to symlink
  attacks.
* Javascript URLs are now munged by the HTML filter, as further
  protection against XSS attacks.
* <a href>'s preserved by HTML filter, even if only cid: URLs allowed.
  This prevents regular hyperlinks from being stripped, which would entice
  users to use allownoncidurls as a workaround (which then opens
  up XSS vulnerabilities).  With the javascript URL munging, preserving
  <a href>'s should be safe.
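
As a rough illustration of the new option, the resource-file fragment
below sketches how subdir might be enabled for both filters.  The option
names (subdir, uudecode, usename) are the ones documented in the diff
that follows; the particular combination is only an assumed example,
not a default or recommended setting.

```
<MIMEArgs>
m2h_text_html::filter; subdir
m2h_text_plain::filter; uudecode usename subdir
</MIMEArgs>
```

Per the updated documentation, subdir on the text/plain filter only
applies when uudecode is specified, and the documentation suggests
considering subdir whenever usename is used.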


Index: mimefilters.html
===================================================================
RCS file: /cvsroot/mhonarc/mhonarc/MHonArc/doc/resources/mimefilters.html,v
retrieving revision 1.18
retrieving revision 1.19
diff -C2 -r1.18 -r1.19
*** mimefilters.html	20 Nov 2002 23:53:08 -0000	1.18
--- mimefilters.html	23 Nov 2002 04:10:40 -0000	1.19
***************
*** 95,102 ****
  </P>
  
! <table border=0 cellpadding=4>
  <tr valign=top>
  <td><strong>NOTE</strong></td>
! <td><p>As of v2.5 of MHonArc, the API for filters is different from
  v2.4.x, and earlier.  The following describes the v2.5 API for
  filters.
--- 95,102 ----
  </P>
  
! <table class="note" width="100%">
  <tr valign=top>
  <td><strong>NOTE</strong></td>
! <td width="100%"><p>As of v2.5 of MHonArc, the API for filters is different from
  v2.4.x, and earlier.  The following describes the v2.5 API for
  filters.
***************
*** 112,116 ****
  routine is as follows: </P>
  
! <PRE>
  sub <var>filter</var> {
      my(<b>$fields_hash_ref</b>, <b>$body_data_ref</b>, <b>$is_decoded</b>, <b>$filter_args</b>) = @_;
--- 112,116 ----
  routine is as follows: </P>
  
! <PRE class="code">
  sub <var>filter</var> {
      my(<b>$fields_hash_ref</b>, <b>$body_data_ref</b>, <b>$is_decoded</b>, <b>$filter_args</b>) = @_;
***************
*** 136,140 ****
  content-type, if defined, you would do:
  </p>
! <pre>
    $fields_hash_ref-&gt;{'content-type'}[0]</pre>
  <p>Values for a fields are stored in arrays since
--- 136,140 ----
  content-type, if defined, you would do:
  </p>
! <pre class="code">
    $fields_hash_ref-&gt;{'content-type'}[0]</pre>
  <p>Values for a fields are stored in arrays since
***************
*** 192,199 ****
  </p>
  
! <table border=0 cellpadding=4>
  <tr valign=top>
  <td><strong>NOTE</strong></td>
! <td><p>If you want MHonArc to treat the data as filtered, but not
  have anything displayed on the page, just return a string with
  a single space character.
--- 192,199 ----
  </p>
  
! <table class="note" width="100%">
  <tr valign=top>
  <td><strong>NOTE</strong></td>
! <td width="100%"><p>If you want MHonArc to treat the data as filtered, but not
  have anything displayed on the page, just return a string with
  a single space character.
***************
*** 202,209 ****
  </tr>
  </table>
! <table border=0 cellpadding=4>
  <tr valign=top>
  <td><strong>NOTE</strong></td>
! <td><p>If the filter creates a subdirectory with files, the filter
  only needs to return the subdirectory in the return list.  If
  the message gets removed, MHonArc will delete the entire
--- 202,209 ----
  </tr>
  </table>
! <table class="note" width="100%">
  <tr valign=top>
  <td><strong>NOTE</strong></td>
! <td width="100%"><p>If the filter creates a subdirectory with files, the filter
  only needs to return the subdirectory in the return list.  If
  the message gets removed, MHonArc will delete the entire
***************
*** 229,236 ****
  location of the mail archive.
  </P>
! <table border=0 cellpadding=4>
  <tr valign=top>
  <td><strong>NOTE</strong></td>
! <td><p><strong>Do not include</strong> 
  <strong><CODE>$mhonarc::OUTDIR</CODE></strong> as part as the
  filename that is returned to MHonArc.  If the filter
--- 229,236 ----
  location of the mail archive.
  </P>
! <table class="note" width="100%">
  <tr valign=top>
  <td><strong>NOTE</strong></td>
! <td width="100%"><p><strong>Do not include</strong> 
  <strong><CODE>$mhonarc::OUTDIR</CODE></strong> as part as the
  filename that is returned to MHonArc.  If the filter
***************
*** 270,277 ****
  <h2><a name="default">Default Setting</a></h2>
  
! <table border=0 cellpadding=4>
  <tr valign=top>
  <td><strong>NOTE</strong></td>
! <td><p>It is important to have an explicit entry for
  <b><tt>application/octet-stream</tt></b> for handling unknown
  media-types.
--- 270,277 ----
  <h2><a name="default">Default Setting</a></h2>
  
! <table class="note" width="100%">
  <tr valign=top>
  <td><strong>NOTE</strong></td>
! <td width="100%"><p>It is important to have an explicit entry for
  <b><tt>application/octet-stream</tt></b> for handling unknown
  media-types.
***************
*** 281,285 ****
  </table>
  
! <PRE>
  <b>&lt;MIMEFilters&gt;</b>
  application/octet-stream;  <a href="#m2h_external">m2h_external::filter</a>;	mhexternal.pl
--- 281,285 ----
  </table>
  
! <PRE class="code">
  <b>&lt;MIMEFilters&gt;</b>
  application/octet-stream;  <a href="#m2h_external">m2h_external::filter</a>;	mhexternal.pl
***************
*** 310,320 ****
  </p>
  
! <table border=0 cellpadding=4>
  <tr valign=top>
  <td><strong>NOTE</strong></td>
! <td><p>The default filters use a <a href="mimeargs.html">MIMEARGS</a>
  argument style similiar to HTML attribute values.  For example:
  </p>
! <pre>
  <b>&lt;MIMEArgs&gt;</b>
  m2h_text_plain::filter; attachcheck fancyquote maxwidth=80
--- 310,320 ----
  </p>
  
! <table class="note" width="100%">
  <tr valign=top>
  <td><strong>NOTE</strong></td>
! <td width="100%"><p>The default filters use a <a href="mimeargs.html">MIMEARGS</a>
  argument style similiar to HTML attribute values.  For example:
  </p>
! <pre class="code">
  <b>&lt;MIMEArgs&gt;</b>
  m2h_text_plain::filter; attachcheck fancyquote maxwidth=80
***************
*** 372,383 ****
  <tr valign=top>
  <td><strong><tt>excludeexts=</tt></strong><var>ext1</var><tt>,</tt>...</td>
! <td><p>A comma separated list of message specified filename extensions
  to exclude.  I.e.  If the filename extension matches an extension
  in excludeexts, the content will not be written.  The return markup
  will contain the name of the attachment, but no link to the data.
  This option is best used with application/octet-stream to exclude
! unwanted data that is not tagged with the proper content-type.  The
! <a href="#m2h_null"><tt>m2h_null::filter</tt></a> can be used to
! exclude content by content-type.
  </p>
  </td>
--- 372,381 ----
  <tr valign=top>
  <td><strong><tt>excludeexts=</tt></strong><var>ext1</var><tt>,</tt>...</td>
! <td width="100%"><p>A comma separated list of message specified filename extensions
  to exclude.  I.e.  If the filename extension matches an extension
  in excludeexts, the content will not be written.  The return markup
  will contain the name of the attachment, but no link to the data.
  This option is best used with application/octet-stream to exclude
! unwanted data that is not tagged with the proper content-type.
  </p>
  </td>
***************
*** 405,409 ****
  <tr valign=top>
  <td><strong><tt>frame</tt></strong></td>
! <td><p>Draw a frame around the attachment link.
  </p></td>
  </tr>
--- 403,407 ----
  <tr valign=top>
  <td><strong><tt>frame</tt></strong></td>
! <td><p>Draw a border around the attachment link.
  </p></td>
  </tr>
***************
*** 467,475 ****
  of derived file.
  </p>
! <p><strong><font color="red">CAUTION</font></strong>:
  Use this option with caution
  since it can lead to filename conflicts and
  security problems.
! </p></td>
  </tr>
  <tr valign=top>
--- 465,478 ----
  of derived file.
  </p>
! <table class="caution" width="100%">
! <tr valign=top>
! <td><strong style="color: red;">CAUTION</strong></td>
! <td width="100%"><p>
  Use this option with caution
  since it can lead to filename conflicts and
  security problems.
! </p>
! </td></tr></table>
! </td>
  </tr>
  <tr valign=top>
***************
*** 478,485 ****
  of derived file.
  </p>
! <p><strong><font color="red">CAUTION</font></strong>:
  Use this option with caution since it can lead to
  security problems.
! </p></td>
  </tr>
  </table>
--- 481,493 ----
  of derived file.
  </p>
! <table class="caution" width="100%">
! <tr valign=top>
! <td><strong style="color: red;">CAUTION</strong></td>
! <td width="100%"><p>
  Use this option with caution since it can lead to
  security problems.
! </p>
! </td></tr></table>
! </td>
  </tr>
  </table>
***************
*** 1435,1442 ****
  </p>
  
! <table border=0 cellpadding=4>
  <tr valign=top>
  <td><strong>NOTE</strong></td>
! <td><p>Only the ISO-8859-[1-10] character sets are recognized.
  </p>
  </td>
--- 1443,1450 ----
  </p>
  
! <table class="note" width="100%">
  <tr valign=top>
  <td><strong>NOTE</strong></td>
! <td width="100%"><p>Only the ISO-8859-[1-10] character sets are recognized.
  </p>
  </td>
***************
*** 1447,1454 ****
  <H3><a name="m2h_text_html">m2h_text_html::filter</a></h3>
  
! <table border=0 cellpadding=4>
  <tr valign=top>
! <td><strong><font color="red">CAUTION</font></strong></td>
! <td><p>If you are worried about security, it is recommended that you disable
  support of HTML messages in your mail archives.  There is no
  guarantee that this filter is robust enough to eliminate all possible
--- 1455,1462 ----
  <H3><a name="m2h_text_html">m2h_text_html::filter</a></h3>
  
! <table class="caution" width="100%">
  <tr valign=top>
! <td><strong style="color: red;">CAUTION</strong></td>
! <td width="100%"><p>If you are worried about security, it is recommended that you disable
  support of HTML messages in your mail archives.  There is no
  guarantee that this filter is robust enough to eliminate all possible
***************
*** 1483,1487 ****
  </li>
  <li><p>Any markup related to scripting is removed for security
! reasons.  The following tags are removed:
  <tt>&lt;applet&gt;</tt>,
  <tt>&lt;base&gt;</tt>,
--- 1491,1496 ----
  </li>
  <li><p>Any markup related to scripting is removed for security
! reasons.  Javascript URLs are munged to make them ineffective.
! At a minimum, the following tags are stripped:
  <tt>&lt;applet&gt;</tt>,
  <tt>&lt;base&gt;</tt>,
***************
*** 1500,1504 ****
  <tt>&lt;style&gt;</tt>,
  <tt>&lt;textarea&gt;</tt>.
! The following attributes are removed:
  <tt>onload</tt>,
  <tt>onunload</tt>,
--- 1509,1513 ----
  <tt>&lt;style&gt;</tt>,
  <tt>&lt;textarea&gt;</tt>.
! At a minimum, the following attributes are removed:
  <tt>onload</tt>,
  <tt>onunload</tt>,
***************
*** 1537,1544 ****
  disable that behavior.
  </p>
! <p><strong><font color="red">CAUTION</font></strong>:
  Using this option can open up security vulnerabilies, including the
  potential accessing of server resources.
! </p></td>
  </tr>
  <tr valign=top>
--- 1546,1558 ----
  disable that behavior.
  </p>
! <table class="caution" width="100%">
! <tr valign=top>
! <td><strong style="color: red;">CAUTION</strong></td>
! <td width="100%"><p>
  Using this option can open up security vulnerabilies, including the
  potential accessing of server resources.
! </p>
! </td></tr></table>
! </td>
  </tr>
  <tr valign=top>
***************
*** 1547,1551 ****
  Normally, any URL-based attribute -- <tt>href</tt>, <tt>src</tt>,
  <tt>background</tt>, <tt>classid</tt>, <tt>data</tt>, <tt>longdesc</tt>
! -- will be stripped if it is not a cid: URL.  This is to prevent
  malicious URLs that verify mail addresses for spam purposes, secretly
  set cookies, or gather some statistical data automatically with the
--- 1561,1566 ----
  Normally, any URL-based attribute -- <tt>href</tt>, <tt>src</tt>,
  <tt>background</tt>, <tt>classid</tt>, <tt>data</tt>, <tt>longdesc</tt>
! -- will be stripped if it is not a cid: URL.
! This is to prevent
  malicious URLs that verify mail addresses for spam purposes, secretly
  set cookies, or gather some statistical data automatically with the
***************
*** 1553,1557 ****
  <tt>IMG</tt>, <tt>BODY</tt>, <tt>IFRAME</tt>, <tt>FRAME</tt>,
  <tt>OBJECT</tt>, <tt>SCRIPT</tt>, <tt>INPUT</tt>.
! </p></td>
  </tr>
  <tr valign=top>
--- 1568,1583 ----
  <tt>IMG</tt>, <tt>BODY</tt>, <tt>IFRAME</tt>, <tt>FRAME</tt>,
  <tt>OBJECT</tt>, <tt>SCRIPT</tt>, <tt>INPUT</tt>.
! </p>
! <table class="note" width="100%">
! <tr valign=top>
! <td><strong>NOTE</strong></td>
! <td width="100%"><p><tt>Href</tt> attributes for anchor, <tt>&lt;A&gt;</tt>,
! elements are preserved, even if <tt>allownoncidurls</tt> is
! not specified.  This is so regular hyperlinks will function.  Javascript
! URLs will still be munged (unless the <tt>allowscript</tt> option is
! specified).
! </p>
! </td></tr></table>
! </td>
  </tr>
  <tr valign=top>
***************
*** 1560,1568 ****
  This includes elements and attributes related to scripting.
  </p>
! <p><strong><font color="red">CAUTION</font></strong>:
  Use of this option can open up your archives to cross-site scripting (XSS)
  attacks.  It is highly recommended to not use this option, especially
  for publicly accessible archives.
! </p></td>
  </tr>
  <tr valign=top>
--- 1586,1599 ----
  This includes elements and attributes related to scripting.
  </p>
! <table class="caution" width="100%">
! <tr valign=top>
! <td><strong style="color: red;">CAUTION</strong></td>
! <td width="100%"><p>
  Use of this option can open up your archives to cross-site scripting (XSS)
  attacks.  It is highly recommended to not use this option, especially
  for publicly accessible archives.
! </p>
! </td></tr></table>
! </td>
  </tr>
  <tr valign=top>
***************
*** 1575,1587 ****
  with a link to it from the message page.
  </p>
! <p><strong><font color="red">CAUTION</font></strong>:
  If <tt>attachcheck</tt> is specified, the HTML
  data is saved "as-is".  For example, no stripping of scripting-based
! markup is performed and no resolution of cid URLs are performed.
  </p>
  </tr>
  <tr valign=top>
  <td><strong><tt>nofont</tt></strong></td>
! <td><p>Remove font tags and styles.
  </p></td>
  </tr>
--- 1606,1626 ----
  with a link to it from the message page.
  </p>
! <table class="caution" width="100%">
! <tr valign=top>
! <td><strong style="color: red;">CAUTION</strong></td>
! <td width="100%"><p>
  If <tt>attachcheck</tt> is specified, the HTML
  data is saved "as-is".  For example, no stripping of scripting-based
! markup is performed and no resolution of cid URLs are performed, allowing
! your archive to be used for cross-site scripting (XSS) exploits.
! This option <strong>SHOULD NOT BE USED</strong> for archives unless all mail
! from all senders can be trusted.
  </p>
+ </td></tr></table>
+ </td>
  </tr>
  <tr valign=top>
  <td><strong><tt>nofont</tt></strong></td>
! <td><p>Remove font tags and styles, including CSS styles.
  </p></td>
  </tr>
***************
*** 1594,1597 ****
--- 1633,1643 ----
  body.  The <tt>notitle</tt> argument disables this behavior.
  </p></td>
+ <tr valign=top>
+ <td><strong><tt>subdir</tt></strong></td>
+ <td><p>Place attachment files in a subdirectory of the archive.
+ This option is used for MHTML messages that include inline images
+ are other referenced data.
+ </p></td>
+ </tr>
  </table>
  
***************
*** 1612,1619 ****
  </p>
  
! <table border=0 cellpadding=4>
  <tr valign=top>
  <td><strong>NOTE</strong></td>
! <td><p>The functions registered via the
  <a href="charsetconverters.html">CHARSETCONVERTERS</a> are used
  to handle character set processing.
--- 1658,1665 ----
  </p>
  
! <table class="note" width="100%">
  <tr valign=top>
  <td><strong>NOTE</strong></td>
! <td width="100%"><p>The functions registered via the
  <a href="charsetconverters.html">CHARSETCONVERTERS</a> are used
  to handle character set processing.
***************
*** 1650,1655 ****
  <td><p>Colon separated lists of charsets to leave as-is.
  Only HTML special characters will be converted into entities.
- The default is "<tt>us-ascii:iso-8859-1</tt>".
  </p>
  </tr>
  <tr valign=top>
--- 1696,1709 ----
  <td><p>Colon separated lists of charsets to leave as-is.
  Only HTML special characters will be converted into entities.
  </p>
+ <table class="note" width="100%">
+ <tr valign=top>
+ <td><strong>NOTE</strong></td>
+ <td width="100%"><p>The <tt>asis</tt> option is deprecated.
+ It is recommended to use the
+ <a href="charsetconverters.html">CHARSETCONVERTERS</a> resource to
+ control how character data is converted.
+ </p>
+ </td></tr></table>
  </tr>
  <tr valign=top>
***************
*** 1675,1680 ****
  formatting.
  </p>
! <p><strong>Note:</strong> The effects the <tt>fancyquote</tt> and
  <tt>quote</tt> will still occur if either option is specified.
  </tr>
  <tr valign=top>
--- 1729,1740 ----
  formatting.
  </p>
! <table class="note" width="100%">
! <tr valign=top>
! <td><strong>NOTE</strong></td>
! <td width="100%"><p>
! The effects of <tt>fancyquote</tt> and
  <tt>quote</tt> will still occur if either option is specified.
+ </p>
+ </td></tr></table>
  </tr>
  <tr valign=top>
***************
*** 1688,1694 ****
  for flowed text data, but without the flowed text wrapping sematics.
  </p>
! <p><strong>Note:</strong> If message text is denoted as flowed text by
! message headers, the text will always be rendered with flowed sematics.
  </p>
  </tr>
  <tr valign=top>
--- 1748,1760 ----
  for flowed text data, but without the flowed text wrapping sematics.
  </p>
! <table class="note" width="100%">
! <tr valign=top>
! <td><strong>NOTE</strong></td>
! <td width="100%"><p>
! If message text is denoted as flowed text by
! message headers, the text will always be rendered with flowed sematics
! unless the <tt>disableflowed</tt> option is specified.
  </p>
+ </td></tr></table>
  </tr>
  <tr valign=top>
***************
*** 1699,1702 ****
--- 1765,1771 ----
  data looks like the start of an HTML document.
  </p>
+ <p>If the data looks like HTML, the <a href="#m2h_text_html">HTML filter</a>
+ will be invoked on it.
+ </p>
  </tr>
  <tr valign=top>
***************
*** 1704,1708 ****
  <td><p>A comma separated list of message specified filename
  extensions to treat as inline data.
! Applicable only when <b><tt>uudecode</tt></b> is specified.
  </p>
  </tr>
--- 1773,1778 ----
  <td><p>A comma separated list of message specified filename
  extensions to treat as inline data.
! Applicable only when <b><tt>uudecode</tt></b> and
! <b><tt>usename</tt></b> are specified.
  </p>
  </tr>
***************
*** 1720,1723 ****
--- 1790,1800 ----
  wrapped.
  </p>
+ <table class="note" width="100%">
+ <tr valign=top>
+ <td><strong>NOTE</strong></td>
+ <td width="100%"><p>A line that contains no whitespace and is longer
+ than <tt>maxwidth</tt>, will <strong>NOT</strong> be wrapped.
+ </p>
+ </td></tr></table>
  </tr>
  <tr valign=top>
***************
*** 1757,1760 ****
--- 1834,1843 ----
  </tr>
  <tr valign=top>
+ <td><strong><tt>subdir</tt></strong></td>
+ <td><p>Place attachment files in a subdirectory of the archive.
+ This option is only applicable if <b><tt>uudecode</tt></b> is specified.
+ </p></td>
+ </tr>
+ <tr valign=top>
  <td><strong><tt>target=</tt></strong><var>name</var></td>
  <td><p>Set the TARGET attribute of anchors generated from
***************
*** 1767,1774 ****
  when write the data to disk instead of just the filename extension.
  </p>
! <p><strong><font color="red">CAUTION</font></strong>:
! Be aware that there are potential security problems when
! using this option.
! </p></td>
  </tr>
  <tr valign=top>
--- 1850,1864 ----
  when write the data to disk instead of just the filename extension.
  </p>
! <table class="caution" width="100%">
! <tr valign=top>
! <td><strong style="color: red;">CAUTION</strong></td>
! <td width="100%"><p>
! Use this option with caution
! since it can lead to filename conflicts and
! security problems.  If you plan on using <tt>usename</tt>, consider
! using the <tt>subdir</tt> option with it.
! </p>
! </td></tr></table>
! </td>
  </tr>
  <tr valign=top>
***************
*** 1817,1821 ****
  excludes all images:
  </p>
! <pre>
  <b>&lt;MIMEFilters&gt;</b>
  image/*; m2h_null::filter</a>; mhnull.pl
--- 1907,1911 ----
  excludes all images:
  </p>
! <pre class="code">
  <b>&lt;MIMEFilters&gt;</b>
  image/*; m2h_null::filter</a>; mhnull.pl
***************
*** 1825,1829 ****
  of what was excluded.  Examples:
  </p>
! <pre>
  &lt;&lt;attachment: HelloWorld.jpg&gt;&gt;
  &lt;&lt;application/postscript&gt;&gt;
--- 1915,1919 ----
  of what was excluded.  Examples:
  </p>
! <pre class="code">
  &lt;&lt;attachment: HelloWorld.jpg&gt;&gt;
  &lt;&lt;application/postscript&gt;&gt;
***************
*** 1835,1848 ****
  </p>
  
! <table border=0 cellpadding=4>
  <tr valign=top>
  <td><strong>NOTE</strong></td>
! <td><p>The sematics of using the
  <a href="mimeexcs.html">MIMEEXCS</a> resource vs this filter are different.
  When using this filter, MHonArc considers the data
  successfully converted.  But with <a href="mimeexcs.html">MIMEEXCS</a>,
  MHonArc does not consider the data converted.  This is relevant when
! MHonArc processes <tt>multipart/alternative</tt> entities and is
! determing which alternative parts will be used.
  </p>
  <p>The <tt>m2h_null::filter</tt> existed before
--- 1925,1938 ----
  </p>
  
! <table class="note" width="100%">
  <tr valign=top>
  <td><strong>NOTE</strong></td>
! <td width="100%"><p>The sematics of using the
  <a href="mimeexcs.html">MIMEEXCS</a> resource vs this filter are different.
  When using this filter, MHonArc considers the data
  successfully converted.  But with <a href="mimeexcs.html">MIMEEXCS</a>,
  MHonArc does not consider the data converted.  This is relevant when
! MHonArc determines which alternative part to use for an
! <tt>multipart/alternative</tt> entity.
  </p>
  <p>The <tt>m2h_null::filter</tt> existed before
***************
*** 1868,1872 ****
  <tt>text/tab-separated-values</tt> into an HTML table:
  </p>
! <pre>
  package m2h_text_tsv;
  
--- 1958,1962 ----
  <tt>text/tab-separated-values</tt> into an HTML table:
  </p>
! <pre class="code">
  package m2h_text_tsv;
  
