perl-unicode

Re: Strange characters when displaying html files saved in UTF-8 (BOM)

2001-12-12 05:32:46
Jarkko Hietaniemi wrote on 2001-12-11 21:44 UTC:
> My spec is at home but I think it's illegal in subsequent text.
> (Blindly concatenating text from several files could of course
> lead to such a situation.)

The BOM is not illegal anywhere. It is a perfectly normal Unicode
character, namely U+FEFF ZERO WIDTH NO-BREAK SPACE. Browsers must display
it exactly as such (that is, show nothing rather than a strange
character) wherever it appears. When you test this in your browser, it
is also a good opportunity to check that the Plane 14 tag characters
are not displayed either.
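
To make the point concrete, here is a minimal Python sketch (not Perl, and
not taken from the original thread). It checks the character's Unicode
name and shows how blindly concatenating two BOM-prefixed UTF-8 files
leaves a U+FEFF in the middle of the text. The helper names and the sample
byte strings are hypothetical, and stripping the leading BOM is only one
possible policy, shown here as an assumption.

```python
import codecs
import unicodedata

BOM = "\ufeff"  # U+FEFF, the code point used as the byte order mark


def describe_bom():
    """Return the Unicode name of U+FEFF and its UTF-8 byte sequence."""
    return unicodedata.name(BOM), BOM.encode("utf-8")


def concat_blindly(parts):
    """Join raw byte strings and decode; any embedded BOMs are kept."""
    return b"".join(parts).decode("utf-8")


def concat_stripping_bom(parts):
    """Decode each part with 'utf-8-sig', which drops one leading BOM
    if present, then join the resulting strings."""
    return "".join(p.decode("utf-8-sig") for p in parts)


# Hypothetical file contents: two HTML fragments saved as UTF-8 with BOM.
file_a = codecs.BOM_UTF8 + "<p>one</p>".encode("utf-8")
file_b = codecs.BOM_UTF8 + "<p>two</p>".encode("utf-8")

blind = concat_blindly([file_a, file_b])
clean = concat_stripping_bom([file_a, file_b])

# The blind result has a U+FEFF at index 0 and another before "<p>two</p>".
positions = [i for i, ch in enumerate(blind) if ch == BOM]
```

Either result is legal Unicode text. The difference only matters to
software that renders a mid-text U+FEFF as a visible glyph instead of as
a zero-width character.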

Some recommendations for treating the BOM under Unix and in encoding
converters are in

  http://www.cl.cam.ac.uk/~mgk25/unicode.html

Markus

-- 
Markus G. Kuhn, Computer Laboratory, University of Cambridge, UK
Email: mkuhn at acm.org,  WWW: <http://www.cl.cam.ac.uk/~mgk25/>