mhonarc-dev

[bug #15415] mhonarc eats part of a message

2006-08-25 18:41:29

Follow-up Comment #9, bug #15415 (project mhonarc):

The HTML is malformed, causing the HTML filter to remove
almost all of the content.

The HTML has duplicate <head>...</head> sections, with the
dup at the very end of the HTML part of the message.  MHonArc
removes the <head> component, and the regex is greedy, so
it deletes everything 'til the last </head> tag.  Unfortunately,
for this message, the last </head> tag appears near the
end of the message since the dup <head>...</head>
occurs after the </body>.
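
To illustrate the effect, here is a minimal sketch in Python.
It is not MHonArc's actual filter code (MHonArc is written in
Perl); the pattern and the sample HTML are made up for
illustration, assuming only what is described above: a greedy
match from the first <head> to the last </head>.

```python
import re

# Hypothetical malformed HTML part: a duplicate <head> section
# appears after </body>, as in the message described above.
html = (
    "<html><head><title>one</title></head>"
    "<body><p>The actual message text.</p></body>"
    "<head><title>dup</title></head></html>"
)

# Greedy pattern (illustrative, not MHonArc's real regex):
# ".*" extends to the LAST </head>, so the body is swallowed too.
greedy = re.sub(r"<head\b[^>]*>.*</head\s*>", "", html,
                flags=re.I | re.S)

print(greedy)
```

With this input, everything between the first <head> and the
final </head> is removed, including the whole <body>.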

I'm unsure how you would like to proceed.  The HTML filter
could be made more robust to deal with such abominations, but
it may impact performance.
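
One possible way to make such a filter more tolerant is a
non-greedy match, so that each <head>...</head> pair is removed
on its own.  The sketch below is in Python, not MHonArc's Perl,
and strip_head_sections is a hypothetical helper; whether this
is a suitable change for MHonArc, and what it costs in
performance, is not something the sketch settles.

```python
import re

def strip_head_sections(html: str) -> str:
    """Remove each <head>...</head> pair separately.

    Hypothetical helper for illustration; ".*?" is non-greedy, so
    each match stops at the nearest </head> instead of the last.
    """
    return re.sub(r"<head\b[^>]*>.*?</head\s*>", "", html,
                  flags=re.I | re.S)

# Made-up sample with a duplicate <head> section after </body>.
sample = (
    "<html><head><title>one</title></head>"
    "<body><p>The actual message text.</p></body>"
    "<head><title>dup</title></head></html>"
)

cleaned = strip_head_sections(sample)
print(cleaned)
```

Here both <head> sections are dropped and the <body> content
survives.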

Since this problem is separate from what this bug item is
initially about (and this item is currently closed), I
request you submit a new bug.  You can use my explanation
of the real problem above as the description.


    _______________________________________________________

Reply to this item at:

  <http://savannah.nongnu.org/bugs/?15415>

_______________________________________________
  Message sent via/by Savannah
  http://savannah.nongnu.org/

---------------------------------------------------------------------
To sign-off this list, send email to majordomo(_at_)mhonarc(_dot_)org with the
message text UNSUBSCRIBE MHONARC-DEV
