I spent some time today improving the mhonarc.pl filter to
deal with variations in message header formatting and to extract
information from the MHonArc <!--X-...--> comments. With this
filter, I now get message-id searching for my archives. This is a
useful feature, since it allows for URLs to messages that are
resistant to changes in MHonArc message numbering.
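To illustrate the idea of pulling fields out of the <!--X-...--> comments, here is a minimal sketch in Python. It is not the Perl code from the attached filter. It assumes the comments have the form <!--X-Name: value -->, and the function name, sample page, and message-id are made up for the example. Any encoding MHonArc applies to the values is not decoded here.

```python
import re

# Assumed comment shape: <!--X-Name: value -->
# Comments without a colon (e.g. <!--X-Head-End-->) are skipped.
X_COMMENT = re.compile(r"<!--X-([A-Za-z0-9-]+):\s*(.*?)\s*-->")


def extract_x_comments(page):
    """Return a dict mapping lower-cased X- field names to their values."""
    fields = {}
    for name, value in X_COMMENT.findall(page):
        # Keep the first occurrence of each field name.
        fields.setdefault(name.lower(), value)
    return fields


# Hypothetical sample; a real archive page contains more comments.
sample = """<!--X-Subject: test message -->
<!--X-Message-Id: 12345@example.org -->
<!--X-Head-End-->
"""

fields = extract_x_comments(sample)
print(fields.get("message-id"))
```

Indexing the extracted message-id as its own field is what would make a message searchable by id rather than by its archive number.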
Since MHonArc allows one to customize a message header in radical
ways, it is possible that header fields will not be extracted (although
the <!--X-...--> comments will always be extracted). However, I did test the
filter against the table formatting style used by default in MHArc
and MHonArc's built-in default formatting. The key dependency for
the filter to handle the HTML message header is that field names and
field values are logically separated by a colon, ':'.
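The colon dependency can be sketched as follows, again in Python rather than the filter's Perl. The two HTML fragments are assumed approximations of a list-style and a table-style header, not output copied from MHonArc or MHArc, and parse_header_line is a hypothetical helper name. The sketch strips the markup and then splits on the first colon, so colons inside the value (as in "Re:") are preserved.

```python
import html
import re

TAG = re.compile(r"<[^>]+>")


def parse_header_line(fragment):
    """Strip markup, then split on the first colon into (name, value).

    Returns None when no colon separates a field name from a value.
    """
    # Replace tags with spaces so adjacent cells do not run together.
    text = html.unescape(TAG.sub(" ", fragment))
    text = " ".join(text.split())  # collapse runs of whitespace
    name, sep, value = text.partition(":")
    if not sep or not name.strip():
        return None
    return name.strip().lower(), value.strip()


# Assumed examples of two header layouts (illustrative only).
list_style = "<li><em>Subject</em>: Re: filter update</li>"
table_style = "<tr><td><b>From</b>:</td><td>someone@example.org</td></tr>"

print(parse_header_line(list_style))
print(parse_header_line(table_style))
```

A layout that drops the colon, or places the name and value in unrelated parts of the page, would return None here, which mirrors the limitation described above.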
Attached is a diff against the mhonarc.pl I grabbed from CVS. Since
the changes are significant, I have also attached the full modified
mhonarc.pl for those who may want to try it out directly.
All that is needed is to copy mhonarc.pl into the namazu filter
I hope this modified version can make it into the standard Namazu
Description: mhonarc.pl patch
Description: new mhonarc.pl