ietf-822
Re: UTF-8 over RFC 2047 (Re: Call for Usefor to recharter)

2003-01-11 13:23:01

[Usefor folks: Apologies for sending this twice, but the first attempt
didn't make it to ietf-822, as I am not a subscriber there.]


Dave Crocker <dcrocker(_at_)brandenburg(_dot_)com> writes:
Hence, UTF-16 and UTF-8 are methods of encoding a larger bit space into a
smaller representation space, producing variable-length strings.  One crams
the larger space into a 16-bit world.  The other crams it into an 8-bit
world.

Depending on the situation, there can be processing or space
efficiencies gained by one encoding over another.

But there is no theoretical or aesthetic superiority that can be claimed by
one over the other.

Cramming those bits into a 7-bit environment is just one more cramming
effort.  It stands equal to the others as an alternative that has benefits
and detriments.
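
The "cramming" described above can be made concrete with a small sketch. It assumes Python's built-in codecs, and the sample characters and the helper name encoded_sizes are chosen purely for illustration. It encodes the same code points into 16-bit, 8-bit and 7-bit forms and reports how many bytes each one takes.

```python
# Illustrative sketch: the same code points squeezed into different unit sizes.
# The sample characters and the function name are assumptions for illustration.

def encoded_sizes(ch: str) -> dict:
    """Return the encoded byte length of one character per encoding."""
    return {
        "utf-16": len(ch.encode("utf-16-be")),  # 16-bit units, no BOM
        "utf-8": len(ch.encode("utf-8")),       # 8-bit units
        "utf-7": len(ch.encode("utf-7")),       # 7-bit safe bytes
    }

samples = ["A", "\u00e9", "\u20ac", "\U0001D11E"]  # A, e-acute, euro sign, G clef

for ch in samples:
    print(f"U+{ord(ch):04X}", encoded_sizes(ch))

# Every byte of the UTF-7 form stays within the 7-bit range.
assert all(b < 128 for ch in samples for b in ch.encode("utf-7"))
```

All three forms are variable-length: the G clef needs two 16-bit units in UTF-16 and four bytes in UTF-8.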

Why let it stop with 7 bits? Why not cram it into one bit, while you're
at it?

The confusion on this issue probably stems from the fact that you can use
existing data viewers -- such as text editors -- to view the result of a
7-bit encoding and cannot use such "legacy" services for viewing UTF-8 or
UTF-16.

If you do not have UTF-8 or UTF-16 tools, you cannot view the data at all.

Incorrect. If you have an 8-bit editor that does not understand UTF-8,
you will see the text, but it will look "ugly".

The same applies if you have an any-bit editor that does not understand
RFC2047.

But:

1) UTF-8 looks less ugly than RFC2047 when incorrectly displayed.
2) UTF-8 editors are commonly available. RFC2047-capable editors are not.
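
To illustrate point 1, here is a rough sketch of what each form looks like in a tool that does not understand it. It assumes an 8-bit viewer that treats bytes as ISO 8859-1, and it uses Python's email.header module to build the RFC2047 form. The sample word and the variable names are only examples.

```python
# Illustrative sketch: the same text as seen by tools that do not understand
# the encoding. The sample word and variable names are assumptions.
from email.header import Header, decode_header

text = "R\u00e4ksm\u00f6rg\u00e5s"  # a Swedish word with three non-ASCII letters

# An 8-bit viewer that assumes ISO 8859-1 shows each UTF-8 byte as one character.
utf8_misread = text.encode("utf-8").decode("latin-1")

# A viewer that does not decode RFC2047 shows the raw encoded-word as is.
rfc2047_raw = Header(text, charset="utf-8").encode()

print(utf8_misread)
print(rfc2047_raw)

# An RFC2047-aware tool recovers the original text from the encoded-word.
raw_bytes, charset = decode_header(rfc2047_raw)[0]
assert raw_bytes.decode(charset) == text
```

In the misread UTF-8 form the ASCII letters stay in place and only the accented letters turn into two-character pairs. The encoded-word is wrapped in =?charset?encoding?...?= delimiters, and depending on whether base64 or quoted-printable is chosen, the ASCII letters may not be readable at all.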

And if you only have a 7-bit editor? Then you are grossly out of date
anyway.
--
Erland Sommarskog, Stockholm, sommar(_at_)algonet(_dot_)se