On Fri, Aug 29 2008 13:06:32 +0100, davidc(_at_)nag(_dot_)co(_dot_)uk wrote:
Ken,
The Unicode characters  through  are specifically
"non-characters", which means they must not be used to represent
Everybody's understanding of these code points has been changing over
the years. They are not singled out in either the Unicode Standard,
Version 2.0, or the Unicode Standard, Version 3.0 as being blessed (or
damned) as never having a character assigned to them. I don't have time
to trace when they became special, but they are so mentioned in the
Unicode Standard, Version 5.0, and in the draft XML 1.0 Fifth Edition
[1].
characters in a data stream between sender and receiver. This means
that two trading partners must not use them in XML documents, which
makes them available for XSLT users for this character mapping
technique without interfering with user data.
I'm not sure I see it that way; these noncharacters are not actually
banned by XML systems so (like private use characters) their use (or non
use) is constrained by convention rather than technology.
Given that XSLT files are XML, it would seem that this convention would
suggest that they not be used here as well. If you say that the
convention will ensure that this character definitely won't appear in an
XML source file, what happens if someone uses the XSLT document as
input?
Naturally the use of such an unconventional feature would be thoroughly
commented in the XSLT document!
If you don't understand a noncharacter, you can delete it:
If a noncharacter that does not have a specific internal use is
unexpectedly encountered in processing, an implementation may signal
an error or delete or ignore the noncharacter. If these options are
not taken, the noncharacter should be treated as an unassigned code
point. For example, an API that returned a character property value
for a noncharacter would return the same value as the default value
for an unassigned code point. [2]
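As a rough illustration of the "delete" option in the passage quoted above, here is a minimal Python sketch. It assumes the noncharacters are U+FDD0 through U+FDEF plus the last two code points of each of the 17 planes (66 in total). The function names are made up for this example and do not come from any XSLT processor or library.

```python
def is_noncharacter(cp: int) -> bool:
    """True for the 66 Unicode noncharacters (assumed set, see lead-in)."""
    # U+FDD0..U+FDEF, plus U+xFFFE and U+xFFFF at the end of every plane.
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


def strip_noncharacters(text: str) -> str:
    """Return text with every noncharacter deleted."""
    return "".join(ch for ch in text if not is_noncharacter(ord(ch)))


sample = "a\uFDD0b\uFFFFc"
print(strip_noncharacters(sample))  # prints "abc"
```

Signalling an error instead of deleting would be an equally valid reading of the quoted text; deletion is simply the shortest option to show.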
I'd expect most (all, really) XSLT processors to handle the
noncharacters, since as you point out, they are allowed in XML (even if
they could be frowned upon in future).
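To make the "allowed in XML" point concrete, here is a small Python sketch of the Char production from XML 1.0, applied to a handful of noncharacter code points. The function name and the sample code points are my own choices for illustration only.

```python
def is_xml10_char(cp: int) -> bool:
    """True if code point cp matches the XML 1.0 Char production."""
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


# A few noncharacter code points, picked only as examples.
for cp in (0xFDD0, 0xFDEF, 0xFFFE, 0xFFFF, 0x1FFFE):
    print(f"U+{cp:04X} matches Char: {is_xml10_char(cp)}")
```

Under this production only U+FFFE and U+FFFF are excluded; the other noncharacters match Char, so a conforming parser has no grounds to reject them.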
I'm not aware of people actually using private characters for
interchange
We (in MathML 1) got our figures severely wrapped
with a bow of a ship?
for using private use
characters for math (because the math characters were not added until
Unicode 3.1 and 3.2, many years later) on the grounds that specifying
"private" uses for private use characters would greatly hamper usage of
the standard in Asia, particularly, where apparently a certain well
known operating system used large chunks of the private use area for
commonly used document code pages.....
And a large font company had corralled a different chunk.
Perhaps you should have volunteered for John Cowan's ConScript registry
at the time.
Nevertheless using non-characters for this XSLT stylesheet character
mapping seems to me to be better guidance than using private-use
characters.
But could be construed as breaking the convention that says that
non-characters not be used in documents.
Unicode discourages their use in interchange, which is not quite the
same as never using them, though somehow "interchange" isn't defined in
the Unicode glossary.
XML 1.0 Fifth Edition goes/will go only so far as to say they are
"discouraged".
So presumably it's okay to use them between consenting pieces of
software.
Regards,
Tony Graham
Tony(_dot_)Graham(_at_)MenteithConsulting(_dot_)com
Director W3C XSL FO SG Invited Expert
Menteith Consulting Ltd
XML, XSL and XSLT consulting, programming and training
Registered Office: 13 Kelly's Bay Beach, Skerries, Co. Dublin, Ireland
Registered in Ireland - No. 428599 http://www.menteithconsulting.com
-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
xmlroff XSL Formatter http://xmlroff.org
xslide Emacs mode http://www.menteith.com/wiki/xslide
Unicode: A Primer urn:isbn:0-7645-4625-2
[1] http://www.w3.org/TR/2008/PER-xml-20080205/
[2] http://www.unicode.org/versions/Unicode5.0.0/ch03.pdf
from http://www.unicode.org/versions/Unicode5.1.0/