Re: Comment on ESS and Privacy Marks

Blake Ramsdell wrote:


On Monday, February 16, 1998 1:08 PM, Phillip H. Griffin
[SMTP:asn1(_at_)mindspring(_dot_)com] wrote:

Type UTF8String has UNIVERSAL 12 as its tag. X.680 states
that "In the value notation all BMPString values are valid
UniversalString and UTF8String values", and notes that the
notation for defining individual character values is the
same for these types.


If this is the case, and UTF8String has UNIVERSAL 12 as its tag, I
recommend that we use that tag instead of OCTET STRING to package our
UTF8 strings.  I don't know what PKIX is doing regarding this, but I
suspect that they should do the same thing.

As far as profiling UTF8String goes (should we constrain this to
ISO 10646-1 UCS-2 == BMPString == UNICODE?), suggestions are welcome.


The following definition:

  BMP ::= UTF8String (FROM({0, 0, 0, 0}..replacementCharacter))

would do the trick.

I might mention, for those folks who might find it odd that
ASN.1 *is* sometimes simple (:-0), that individual national
language characters can be defined, say for test purposes,
with the same value notation in each string type. Each of the
following ASN.1 value notations produces a distinct encoding
of exactly the same value:

  aval BMPString       ::= {0, 0, 4, 56}
  aval UTF8String      ::= {0, 0, 4, 56}
  aval UniversalString ::= {0, 0, 4, 56}
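
As a rough illustration of "distinct encodings of the same
value", here is a minimal Python sketch (not taken from X.680
or from any ASN.1 tool) that builds DER octets for the character
{0, 0, 4, 56} in each of the three types. The helper name
der_encode and the two lookup tables are hypothetical. The tag
numbers are UNIVERSAL 12, 28 and 30 for UTF8String,
UniversalString and BMPString, and the sketch assumes short-form
lengths and characters inside the BMP.

```python
# Hypothetical helper: DER-encode one string value as tag, length, contents.
UNIVERSAL_TAGS = {"UTF8String": 12, "UniversalString": 28, "BMPString": 30}

# Contents octets: UTF-8, four octets per character, two octets per character.
CODECS = {
    "UTF8String": "utf-8",
    "UniversalString": "utf-32-be",
    "BMPString": "utf-16-be",  # assumes BMP-only input (no surrogate pairs)
}


def der_encode(type_name: str, text: str) -> bytes:
    content = text.encode(CODECS[type_name])
    if len(content) > 127:
        raise ValueError("this sketch handles short-form lengths only")
    return bytes([UNIVERSAL_TAGS[type_name], len(content)]) + content


# Quadruple {0, 0, 4, 56} is row 4, cell 56, i.e. U+0438.
aval = chr((4 << 8) | 56)

for name in ("BMPString", "UTF8String", "UniversalString"):
    print(name, der_encode(name, aval).hex(" "))
# Octets: BMPString 1e 02 04 38, UTF8String 0c 02 d0 b8,
#         UniversalString 1c 04 00 00 04 38
```

The abstract value is the same single character in all three
cases. Only the tag and the contents octets differ.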

If you consider, say, characters from a recent informative
RFC on the KOI8-U Ukrainian character set, you could define
such values by coding:

  cyrillicSmallLetterEn  UTF8String ::= {0, 0, 4, 61} -- 206 CE U043D
  cyrillicSmallLetterO   UTF8String ::= {0, 0, 4, 62} -- 207 CF U043E
  cyrillicSmallLetterPe  UTF8String ::= {0, 0, 4, 63} -- 208 D0 U043F

or

  cyrillicSmallLetterEn  BMPString ::= {0, 0, 4, 61} -- 206 CE U043D
  cyrillicSmallLetterO   BMPString ::= {0, 0, 4, 62} -- 207 CF U043E
  cyrillicSmallLetterPe  BMPString ::= {0, 0, 4, 63} -- 208 D0 U043F

Note here that the comments relate to the tables in the appendix
of draft-rfced-info-koi8-u-03.txt, and that the "U" numbers are
the Unicode character numbers referenced in that draft. So it
is possible to use such value notation to generate test data for
any language whose characters are in ISO 10646.
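
For anyone who wants to cross-check the numbers in the comments
above, here is a small Python sketch. It converts a
{group, plane, row, cell} quadruple to a code point and compares
the result with Python's own "koi8_u" codec. The function name
quad_to_code_point and the table layout are made up for this
example, and the codec is Python's table, which I assume agrees
with the draft for these three letters.

```python
def quad_to_code_point(group: int, plane: int, row: int, cell: int) -> int:
    """Combine an ASN.1 quadruple into a single ISO 10646 code point."""
    for octet in (group, plane, row, cell):
        if not 0 <= octet <= 255:
            raise ValueError("each quadruple component must be 0..255")
    return (group << 24) | (plane << 16) | (row << 8) | cell


# (value name, quadruple, KOI8-U octet from the comment column)
rows = [
    ("cyrillicSmallLetterEn", (0, 0, 4, 61), 206),
    ("cyrillicSmallLetterO", (0, 0, 4, 62), 207),
    ("cyrillicSmallLetterPe", (0, 0, 4, 63), 208),
]

for name, quad, koi8u_octet in rows:
    code_point = quad_to_code_point(*quad)
    # Decode the KOI8-U octet and confirm it is the same character.
    decoded = bytes([koi8u_octet]).decode("koi8_u")
    assert decoded == chr(code_point), name
    print(f"{name}: U+{code_point:04X} = KOI8-U 0x{koi8u_octet:02X}")
```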

For your favorite visible ASCII character (I note that "ASCII"
is an acronym that never appears in the ASN.1 standards), the
following example is provided in X.680:

  space BMPString ::= {0, 0, 0, 32}
  exclamationMark BMPString ::= {0, 0, 0, 33}
  quotationMark BMPString ::= {0, 0, 0, 34}
   ...  -- and so on
  tilde BMPString ::= {0, 0, 0, 126}

Note again that you can substitute UTF8String for BMPString
(Unicode) in all of these definitions.

It is also quite easy to create your own language types
that are constrained to a specific permitted alphabet, which
some encoder/decoder tools will enforce. An old copy of the
ASN.1 character set module lists the following, for example:

  BasicLatin ::=  BMPString (FROM(space..tilde))

  Hangul ::= BMPString 
      (FROM(hangulSyllableKiyeokA..hangulSyllableHieuhIIeung))

  Katakana ::= BMPString (FROM({0, 0, 48, 160}..{0, 0, 48, 255}))

  Bopomofo ::= BMPString (FROM({0, 0, 49, 0}..{0, 0, 49, 47}))

  Bmp ::= BMPString (FROM({0, 0, 0, 0}..replacementCharacter))

Notice that you can use defined character names like "space",
if you've defined such a character, or you can simply use 
numbers.
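
To give a feel for what "enforcing" a permitted alphabet amounts
to, here is a minimal Python sketch of the check a tool might
apply for a FROM(low..high) constraint. The names quad, PERMITTED
and conforms are hypothetical and are not the API of any real
ASN.1 compiler. Only the ranges written with numbers above are
included.

```python
def quad(group: int, plane: int, row: int, cell: int) -> str:
    """Return the character named by an ASN.1 quadruple."""
    return chr((group << 24) | (plane << 16) | (row << 8) | cell)


# Permitted alphabets as (lowest, highest) character pairs.
PERMITTED = {
    "BasicLatin": (quad(0, 0, 0, 32), quad(0, 0, 0, 126)),  # space..tilde
    "Katakana": (quad(0, 0, 48, 160), quad(0, 0, 48, 255)),
    "Bopomofo": (quad(0, 0, 49, 0), quad(0, 0, 49, 47)),
}


def conforms(type_name: str, value: str) -> bool:
    """True if every character of value lies in the type's FROM range."""
    low, high = PERMITTED[type_name]
    return all(low <= ch <= high for ch in value)


print(conforms("BasicLatin", "Hello, world!"))       # True
print(conforms("Katakana", "\u30ab\u30bf\u30ab\u30ca"))  # True
print(conforms("BasicLatin", "\u30ab"))              # False
```

A real tool would run this kind of check at encode or decode
time and reject a value that falls outside the range.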

Phil


Blake
--
Blake C. Ramsdell
Worldtalk Corporation
For current info, check http://www.deming.com/users/blaker
Voice +1 425 882 8861 x103  Fax +1 425 882 8060


-- 
Phillip H. Griffin         Griffin Consulting
asn1(_at_)mindspring(_dot_)com        ASN.1-SET-Java-Security
919.828.7114               1625 Glenwood Avenue
919.832.7008 [mail]        Raleigh, North Carolina 27608 USA
------------------------------------------------------------
          Visit  http://www.fivepointsfestival.com
------------------------------------------------------------