1 and do the translation
of non-printable characters to &#nnn; format myself
Actually, why do you care? Any conforming HTML or XML product will be
quite happy to have the nbsp in UTF-8 format (and most XML and all HTML
systems will also accept Latin-1).
If your processor is translating it to "?" then that's a bug. But a
conforming response to an nbsp plus a request for encoding="us-ascii"
would be to die and say it doesn't support ASCII: only UTF-8 and UTF-16
need be supported. So you may not be any happier with a conforming system.
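
To make the three representations concrete, here is a small sketch in
Python (not XSLT, and not what any particular processor does internally).
The helper name encode_for_xml is made up for illustration; it relies only
on the standard "xmlcharrefreplace" error handler of str.encode.

```python
NBSP = "\u00a0"  # no-break space, the character HTML names &nbsp;

def encode_for_xml(text: str, encoding: str) -> bytes:
    """Hypothetical helper: encode text, writing any character the
    target encoding cannot represent as a numeric character reference."""
    return text.encode(encoding, errors="xmlcharrefreplace")

sample = "a" + NBSP + "b"

utf8 = encode_for_xml(sample, "utf-8")      # nbsp becomes two bytes
latin1 = encode_for_xml(sample, "latin-1")  # nbsp becomes one byte
ascii_ = encode_for_xml(sample, "ascii")    # nbsp becomes &#160;

print(utf8, latin1, ascii_)
```

In the first two cases the nbsp goes out as plain character data; only the
ASCII case needs a character reference, and a "?" never appears.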
Despite my question above, I do sometimes care and use US-ASCII as output
from Saxon (and then post-process with sed to remove the encoding
declaration, as I have XML parsers that don't accept US-ASCII and so die
on the first line if it is there). But my question remains: if getting
character data is easy and getting a numeric character reference is hard,
why not just go with the flow and take the character data?
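
For what it's worth, the post-processing step can be sketched roughly as
follows. This is a Python stand-in for the sed step, not the actual sed
command; the function name and the regular expression are assumptions made
for illustration. Dropping the declaration is safe here because pure ASCII
output is also valid UTF-8, which is what a parser assumes by default.

```python
import re

# Assumed pattern: an encoding pseudo-attribute inside the XML declaration.
_ENCODING = re.compile(
    r"""(<\?xml[^>]*?)\s+encoding\s*=\s*(["'])[A-Za-z][A-Za-z0-9._-]*\2"""
)

def strip_encoding_declaration(xml_text: str) -> str:
    """Hypothetical helper: remove encoding="..." from the XML
    declaration, leaving the rest of the document untouched."""
    return _ENCODING.sub(r"\1", xml_text, count=1)

before = '<?xml version="1.0" encoding="US-ASCII"?>\n<doc>a&#160;b</doc>'
after = strip_encoding_declaration(before)
print(after)
```

A document with no encoding declaration passes through unchanged.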
David
--
The LaTeX Companion
http://www.awprofessional.com/bookstore/product.asp?isbn=0201362996
http://www.amazon.co.uk/exec/obidos/tg/detail/-/0201362996/202-7257897-0619804