ietf

Re: I-D file formats and internationalization

2005-12-01 15:10:24
On Dec 1, 2005, at 12:16 PM, Keith Moore wrote:

Also, the vast majority of printers in use don't natively support
printing of utf-8, thus forcing users to layer each of their computer
systems with more and more buggy cruft just to do simple tasks like
printing plain text.  Perhaps those are buggy also?

Uh, I print UTF-8 documents all the time.  Normally I do it from the  
app in which I'm viewing them (word processor, web browser, RSS  
reader, xml editor, whatever).

The point is that the apps have to support UTF-8, or leverage OS support
for UTF-8, in order to print those documents.  Because the printer
doesn't support UTF-8, it's no longer as simple as sending the
characters to a printer.   And while this might work fine for
you, and seem like a transparent change, it's inherently a much more
fragile setup.  (Which isn't really an argument against UTF-8, just a
digression about what is or is not "buggy".)

These days, your best bet for getting utf-8 files to print is to use a
web browser's print command, which is doable but can be fairly
cumbersome as compared to typing a simple "lpr" command.

Hmm; control-P, enter.

Well, sure, if you spend all of your time inside a particular web
browser.  

Unfortunately, most web browsers fail to preserve page breaks (FF
characters) when printing flat text files, which makes the resulting
documents hard to read.

Turn this around; when printing HTML, the browser inserts appropriate  
page breaks depending on the combination of font, styling, and paper  
size that's in effect.  This has the effect that when you're arguing  
about some text, you have to say "Look at 5.2.1.3, 2nd para" rather  
than "Look at page 13, 2nd para".  It's not clear that this is any  
better or worse.

It's worse, because you really do want page numbers for when you print
the document and it's quite natural to reference them.  IMHO you really
want the HTML for an RFC to preserve pagination and page numbers and
make them visible (but not annoyingly so) even in a browser, while
causing page breaks when printed and still printing correctly on either
us-letter or a4 paper.  But I'm not sure that this can actually be done.

HTML with utf-8 actually displays and prints more portably than plain
text with utf-8, though it's not clear how many browsers support the
style sheet extensions enough to print page breaks in the right
places.
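
As a rough illustration of the idea above (HTML that keeps the plain-text
pagination and asks the browser to break pages in the same places), here
is a minimal sketch.  It assumes the source text marks page breaks with FF
characters.  The function name text_to_paged_html and the "page" class
name are hypothetical, chosen only for this example.  It relies on the CSS
page-break-after property, and as noted above it is unclear how many
browsers honor that when printing.

```python
import html


def text_to_paged_html(text: str) -> str:
    """Wrap each FF-delimited page of plain text in its own <pre> block.

    Hypothetical helper for illustration; not an IETF or xml2rfc tool.
    """
    pages = text.split("\f")
    # A trailing form feed leaves an empty final page; drop it.
    if pages and pages[-1].strip() == "":
        pages.pop()
    # Escape markup characters so the text is displayed literally.
    body = "\n".join(
        '<pre class="page">{}</pre>'.format(html.escape(p)) for p in pages
    )
    return (
        "<!DOCTYPE html>\n"
        '<html><head><meta charset="utf-8">\n'
        # Ask the browser to start a new printed page after each block.
        "<style>pre.page { page-break-after: always; }</style>\n"
        "</head><body>\n" + body + "\n</body></html>\n"
    )
```

This does not address fitting a page on both us-letter and a4 paper; that
depends on the font size and margins the browser uses when printing.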

Given the above, I agree with the first half of the sentence.  In  
fact, I am sitting behind a desk on which there's a macintosh and an  
Ubuntu linux box, and I wouldn't really know how to print plain-ASCII  
text on either of them, and when I've tried, the page breaks usually  
come out wrong. 

Sometimes this is because there are no FFs in the source document
(especially for internet-drafts) and sometimes this is because the app
that you're using to print doesn't respect them (most web browsers
botch this).  OTOH, the command-line apps tend to do this right - not
because they are tied to a command-line but because they don't have
to deal with moby GUI libraries.
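
To make the two failure cases above concrete (no FFs in the source, or a
tool that ignores them), here is a minimal sketch of what respecting FFs
amounts to.  The function names has_page_breaks and split_pages are
hypothetical and exist only for this example.

```python
from typing import List

FORM_FEED = "\f"  # ASCII 0x0C, the page-break character in paginated text


def has_page_breaks(text: str) -> bool:
    """Return True if the text contains at least one form feed."""
    return FORM_FEED in text


def split_pages(text: str) -> List[str]:
    """Split plain text into pages at form-feed characters."""
    pages = text.split(FORM_FEED)
    # A trailing form feed leaves an empty final page; drop it.
    if pages and pages[-1].strip() == "":
        pages.pop()
    return pages
```

A document for which has_page_breaks() is false has no pagination to
preserve, so any page breaks in the printout come from the printing tool.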

The biggest problems with HTML are (a) no way to include images in the
document without external links (yes I know about MHTML but it's not as
widely supported); (b) difficulty in finding authoring tools that will
produce output in a subset of HTML that we define; (c) avoiding
the temptation to make the documents pretty rather than readable.

I grant problem (a).  (b) and (c) can be solved using automated tools  
and compulsory stylesheets (or by using xml2rfc).

Well, sure, if we can demand that everyone use the same tools we can
define the file format however we want.   What we want is to make the
RFCs editable and displayable with existing tools.  Compulsory
stylesheets don't make problem (c) go away - they just move part of the
problem.  Plain text has the nice attribute that it encourages you to
concentrate on substance rather than appearance.

It's hard to escape the conclusion that we're trying very hard to make
our document processing much more complex for a very marginal gain.

There are two large populations for whom the gain is not "marginal".

1. Those, like me, who can't print ASCII files easily

I suspect it's not because you can't do so, but rather because you want
to be able to do so without changing the tools you use.  (After all,
lpr works fine to print text files on every UNIX/Linux system I've seen
and also on my Mac.)  Guess what, nobody else wants to change tools
either - and different kinds of people need different tools.
Changing the file format won't solve this problem; it will just move it.

2. Those whose names can't be spelled properly in ASCII.

It's a valid concern, but surely it's more important to communicate the
protocol specification clearly than the authors' names?  I certainly
don't object to making the native spellings of authors' names visible
if this can be done "for free", but any change to our document format
needs careful consideration, and the ability to print authors' names is
way down on the list of things to consider.

Keith

_______________________________________________
Ietf mailing list
Ietf(_at_)ietf(_dot_)org
https://www1.ietf.org/mailman/listinfo/ietf