Re: [Bug #2080] Redundant substitution

2002-12-30 19:22:09
On December 30, 2002 at 09:10, Gunnar Hjalmarsson wrote:

Okay, with the help of your test message I now see that, if you comment out
line 495, some blank lines are lost in a Gecko browser when doing 
'fancyquote' conversion. But your test message also proves that this 
issue must be taken care of using some other method.

Faulty logic.  Technically, the existing s/// operation generates
legal HTML and, semantically, it properly reflects the original
formatting of the message.

Now, I do agree that adding the <br>'s is "ugly" (which is one of the
reasons I used "<br\n>", so the raw HTML would still be readable instead
of being one long line), but their existence is valid and legal.
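
To illustrate what the existing substitution emits, here is a minimal
sketch.  The sample text in $chunk is made up for illustration and is
not taken from the test message.

```perl
use strict;
use warnings;

# Hypothetical sample chunk with one blank line in the middle.
my $chunk = "first line\n\nthird line\n";

# The existing substitution: each newline becomes "<br", a newline, then ">".
# The raw HTML keeps its line structure because the newline sits inside the tag.
(my $html = $chunk) =~ s/\n/<br\n>/g;

print $html, "\n";
```

Each source line still ends on its own line in the raw HTML, with the
tag split across the line break.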

Attached to this message you find a .zip file with two fancyquote 
conversion examples where your test message was used. The first example 
was converted using the current code, while I used a modified file when
converting the second example.

Even if the source code differs, you don't notice a difference between 
the examples in a Gecko browser. But if you view example 1 with Internet 
Explorer, the page is screwed up.

MSIE is buggy.  The HTML is legal.  I hate IE.

In the modified file I used, I had commented out line 495, 
and added this "post-processing cleanup" code for fancyquote:

     $ret =~ s/(<pre[^>]*>\s*)([^<])/$1\n$2/g;

A more efficient fix follows, as a diff:

RCS file: /cvsroot/mhonarc/mhonarc/MHonArc/lib/,v
retrieving revision 2.35
diff -u -p -r2.35
---       19 Dec 2002 05:14:23 -0000      2.35
+++       31 Dec 2002 01:47:35 -0000
@@ -492,7 +492,9 @@ sub filter {
                    $chunk =~ s/^(.*)$/&preserve_space($1)/gem;
            } else {
-               $chunk =~ s/\n/<br\n>/g;
+               # GUI browsers ignore first \n after <pre>, so we double it
+               # to make sure a blank line is rendered
+               $chunk =~ s/\A\n/\n\n/;
                $chunk = $startfixq . $chunk . $endfixq;

As the comment notes, the real problem is that GUI browsers ignore the
first newline after the start <pre> tag, so the patch fixes this
directly with a simple s/// operation.  Unfortunately, text browsers
are less consistent.  Lynx actually honors the first \n after
<pre>, so adding the extra \n causes an additional line break.  w3m is
off too, since it appears to add extra line breaks for the
<blockquote>'s, so trying to cater to w3m wrt line breaks is
practically hopeless.  Of course, those who want to cater to text
browsers should not use fancyquote and should use disableflowed
instead.
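
As a rough sketch of what the patched line does in isolation: the
wrap_fixed() helper and the literal "<pre>" / "</pre>" strings below are
stand-ins I made up for illustration, since the real values of
$startfixq and $endfixq are not shown in the diff.

```perl
use strict;
use warnings;

# Hypothetical helper; "<pre>" and "</pre>" stand in for $startfixq/$endfixq.
sub wrap_fixed {
    my ($chunk) = @_;
    # \A anchors at the very start of the string, so only a leading
    # newline is doubled; newlines inside the chunk are left alone.
    $chunk =~ s/\A\n/\n\n/;
    return "<pre>" . $chunk . "</pre>";
}

print wrap_fixed("\nquoted text\n"), "\n";   # chunk starts with a blank line
print wrap_fixed("plain\n\ntext\n"), "\n";   # chunk does not
```

Only the first call changes its input: the chunk that begins with a
newline gets a second one, and everything else passes through as-is.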

I think I did the s/\n/<br\n>/g initially since I wanted to
be explicit about the formatting semantics and not be subject
to EOL issues.  Reviewing the HTML 4.0 standard, the following
is relevant:

  B.3.1 Line breaks

  SGML (see [ISO8879], section 7.6.1) specifies that a line break
  immediately following a start tag must be ignored, as must a
  line break immediately before an end tag. This applies to all
  HTML elements without exception.

So the above patch is consistent with the HTML 4.0 standard: the
line break right after the start <pre> tag is ignored, and the
second \n is what renders the blank line.
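
The effect of that rule can be modeled roughly in Perl.  This is only a
sketch of the quoted B.3.1 text, not of how any particular browser
parses HTML, and rendered_pre_text() is a name I made up for the
example.

```perl
use strict;
use warnings;

# Hypothetical model of B.3.1 applied to a single <pre> element:
# drop one line break right after the start tag and one right
# before the end tag, then return what would be left to render.
sub rendered_pre_text {
    my ($html) = @_;
    my ($text) = $html =~ m{<pre[^>]*>(.*?)</pre>}s or return;
    $text =~ s/\A\r?\n//;   # line break immediately after the start tag
    $text =~ s/\r?\n\z//;   # line break immediately before the end tag
    return $text;
}

my $single  = rendered_pre_text("<pre>\nquoted\n</pre>");
my $doubled = rendered_pre_text("<pre>\n\nquoted\n</pre>");
```

Under this model the single leading \n disappears entirely, while the
doubled one leaves a \n in front of the text, i.e. the blank line
survives.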

Of course, this all has the side-effect of avoiding IE's crappy
rendering and a potential flood of bogus bug reports to MHonArc :-P

