ietf-smime

Re: How to do UTF-8

1998-02-21 01:48:25
On Fri, 20 Feb 1998, Phillip H. Griffin wrote:

may be difficult to translate into a buffer size for variable-length

Nothing to it. The upper bound is simply the maximum
number of characters to expect. There may always be some 
slack in an implementation though for a given string, since 
each UTF8String character must be encoded in the smallest 
number of octets possible for a given character. Escape
sequences and announcers are not allowed.

Of course implementors are free to choose any buffer size 
they wish, but it would be arguably prudent to choose 3*max 
to handle the worst case. In a mostly ASCII environment you
waste some space, but you'd not likely crash and burn.
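
As a rough check on the 3*max suggestion, the Python sketch below compares
the encoded size of a string against that bound. It assumes the string holds
only characters that BMPString could carry; the name worst_case_octets is
made up for illustration and comes from no standard or library.

```python
def worst_case_octets(max_chars: int, octets_per_char: int = 3) -> int:
    """Upper bound on encoded size; the default of 3 assumes BMP-only text."""
    return max_chars * octets_per_char


max_chars = 16
bound = worst_case_octets(max_chars)

ascii_text = "A" * max_chars       # 1 octet per character in UTF-8
euro_text = "\u20ac" * max_chars   # U+20AC needs 3 octets per character

for text in (ascii_text, euro_text):
    used = len(text.encode("utf-8"))
    print(f"{used} of {bound} octets used")
```

The ASCII string leaves two thirds of the buffer unused, which is the wasted
space mentioned above; the second string fills the buffer exactly.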

   NOTE: I'm 'fairly' certain that three is the max character
   length (2 for BMPString, 4 for Universal) but I was unable
   to find this absolutely specified in X.690 at the location
   in the standard where I expected it to be. If it's decided
   that a constraint should be used, this should be checked.

For almost all purposes UTF8String characters are three bytes max,
but in theory can range to 6 bytes per character.  Only when carrying
characters that lie outside the range of those carried by BMPString
does it exceed three bytes per character.  You may want to look at: 

ftp://ftp.informatik.uni-erlangen.de/pub/doc/ISO/charsets/ISO-10646-UTF-8.html
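
To make the byte counts concrete, here is a small Python sketch that maps a
code point to the number of octets it needs under the 1- to 6-octet UTF-8
scheme. The helper name utf8_octets is hypothetical. Note that Python's own
encoder stops at U+10FFFF and so never produces the 5- and 6-octet forms;
the cross-check below only covers values it can encode.

```python
def utf8_octets(code_point: int) -> int:
    """Octets needed for one code point under the 1- to 6-octet UTF-8 scheme."""
    if code_point < 0:
        raise ValueError("code point must be non-negative")
    # Largest code point that fits in 1, 2, 3, 4, 5 and 6 octets respectively.
    limits = [0x7F, 0x7FF, 0xFFFF, 0x1FFFFF, 0x3FFFFFF, 0x7FFFFFFF]
    for octets, limit in enumerate(limits, start=1):
        if code_point <= limit:
            return octets
    raise ValueError("code point exceeds 31 bits")


# Cross-check against Python's encoder for a few sample code points.
for cp in (0x41, 0xE9, 0x20AC, 0x10000):
    encoded = chr(cp).encode("utf-8")
    assert len(encoded) == utf8_octets(cp)
    print(f"U+{cp:04X}: {len(encoded)} octet(s)")
```

Everything up to U+FFFF, the BMPString range, fits in three octets; the
first code point past it, U+10000, is where the fourth octet appears.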

UTF8 characters), but this SIZE clause has a different purpose:
to force the string to have at least one character.
An omitted OPTIONAL variable length item (SEQUENCE OF, xxxString, etc)
can have two possible encodings - absent, or zero length.  By forcing

In ASN.1 an omitted OPTIONAL variable length item has a single encoding -
absent.  In ASN.1 a value with a missing item is entirely different from
one where the length is zero.
 
the item in question to have at least one element, the encoding
ambiguity is eliminated.  I've gotten in the habit of including the
SIZE clause as boilerplate.
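
The difference between an absent item and a zero-length one can be seen in
the encoded octets. The Python sketch below hand-encodes a made-up type,
SEQUENCE { version INTEGER, label UTF8String OPTIONAL }, in DER. The type,
the function names, and the enforce_size switch are all assumptions for
illustration, and only short-form lengths are handled.

```python
SEQUENCE_TAG = 0x30
INTEGER_TAG = 0x02
UTF8STRING_TAG = 0x0C


def tlv(tag: int, content: bytes) -> bytes:
    """Encode one tag-length-value triple (short-form length only)."""
    if len(content) > 127:
        raise ValueError("this sketch handles short-form lengths only")
    return bytes([tag, len(content)]) + content


def encode_record(version: int, label=None, enforce_size: bool = True) -> bytes:
    """Encode the hypothetical SEQUENCE { version, label OPTIONAL }."""
    if not 0 <= version <= 127:
        raise ValueError("this sketch handles single-octet INTEGERs only")
    body = tlv(INTEGER_TAG, bytes([version]))
    if label is not None:
        # Mimics a SIZE (1..MAX) constraint: present means non-empty.
        if enforce_size and len(label) == 0:
            raise ValueError("label violates SIZE (1..MAX)")
        body += tlv(UTF8STRING_TAG, label.encode("utf-8"))
    return tlv(SEQUENCE_TAG, body)


print(encode_record(1).hex())                          # label absent
print(encode_record(1, "", enforce_size=False).hex())  # label present, empty
```

The first value carries no UTF8String element at all, while the second
carries the element 0C 00. With enforce_size left on, the empty string is
rejected, so only the absent form remains for "no label".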

A very good habit. 

A good habit indeed!
 

I believe someone mentioned some time ago that this was a problem that
should be addressed in X.680/690, instead of in every application protocol.
But AFAIK it has not been addressed yet.

No, X.680 does not put a limit on MAX.  Some standards choose to use
a fixed upper bound instead.

--------------------------------------------------------------------------
Bancroft Scott                                Toll Free    :1-888-OSS-ASN1
Open Systems Solutions, Inc.                  International:1-609-987-9073
baos(_at_)oss(_dot_)com                       Tech Support :1-732-249-5107
http://www.oss.com                            Fax          :1-732-249-4636


