ietf-822

Re: Non-ASCII Internet addresses?

1993-04-30 08:19:53
Liam R. E. Quin <lee(_at_)sqlee(_dot_)sq(_dot_)com> writes
(Thu, 29 Apr 1993 22:23:09 -0400):

> Clearly < > ( and ) are all used up by RFC822, and , (comma) and : (colon)
> are used to delimit multiple addresses, and also used in the sendmail mail
> alias file format.  You can't have a : in a Unix user name or mail name,
> because /etc/passwd, which contains the list of user names, uses colons to
> separate fields.  The ! is used by uucp.
>
> The remainder -- `;&|^ -- are not prevented, but it would be very irritating
> to be using Unix with a usercode that included any of them, as you'd have
> to quote them all the time when using the shell.
>
> In any event, I'd advise avoiding punctuation where possible.
> You really *can't* use < > ( ) = / ! @ $ : " ' \ . safely in mail addresses,
> as they all mean things to do with mail routing or address parsing on various
> systems. There are probably other characters to avoid, too.
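
The two character lists in the quoted text can be turned into a quick
check of a candidate user name. This is only a sketch: the names
MAIL_UNSAFE, SHELL_AWKWARD and risky_chars are made up for illustration,
and the sets contain exactly the characters listed above, nothing more.

```python
# Characters the quoted message says cannot be used safely in mail addresses.
MAIL_UNSAFE = set("<>()=/!@$:\"'\\.")

# Characters that are permitted but would need quoting in a Unix shell.
SHELL_AWKWARD = set("`;&|^")


def risky_chars(name):
    """Return (mail_unsafe, shell_awkward) characters found in name, each sorted."""
    found = set(name)
    return sorted(found & MAIL_UNSAFE), sorted(found & SHELL_AWKWARD)


if __name__ == "__main__":
    for candidate in ["olle_j", "a=b;c", "x!y@z"]:
        print(candidate, risky_chars(candidate))
```

An empty pair of lists means the name avoids every character mentioned
in the quoted message; it does not prove the name is safe everywhere.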

I have proposed using five characters with special meaning
to encode non-ASCII characters in addresses:  *&'_=

As an alternative to the first four, one that can be used in
X.400(84) addresses, I suggested:  =/'+

If "/" and "'" were also to be excluded (I don't see why "="
should be regarded as Unix-unsafe), the usable special
characters would be uncomfortably few.  The best I can think of
right now is this pessimistic modification of my proposal:

                         Optimistic  Pessimistic
                         ----------  -----------
Uppercase prefix              _          +
Switch to prefix repr.        =          =
Switch to 2-octet repr.       *          ==
Switch to ASCII repr.         '          ?
Switch to 4-octet repr.       &          =?


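Here is a summary of the different categories of special characters,
useful not least for those who want to play the "find the most
neglected special characters" game.  Within each group, the "Added"
column lists the characters that are new relative to the preceding row: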

                             __All_characters_in_the_group___  _Added________

1a) printable-string chars:        '() +,-./:  = ?             
1b) EBCDIC-safe chars:           %&'()*+,-./:;<=>?     _       %&*;<>_
1c) invariant 7-bit chars:   !"  %&'()*+,-./:;<=>?     _       !"
1d) not usable for letters:  !"#$%&'()*+,-./:;<=>?     _       #$
1e) all ASCII special chars: !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~  @[\]^`{|}~

2a) allowed in "atom":       ! #$%&'  *+ - /   = ?    ^_`{|}~  
2b) allowed in "qtext":      ! #$%&'()*+,-./:;<=>?@[ ]^_`{|}~  (),.:;<>@[]
2c) all ASCII special chars: !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~  "\

3a) allowed in "atom":       ! #$%&'  *+ - /   = ?    ^_`{|}~  
3b) allowed in "ctext":      !"#$%&'  *+,-./:;<=>?@[ ]^_`{|}~  ",.:;<>@[]
3c) all ASCII special chars: !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~  ()\

4a) allowed in "token":      ! #$%&'  *+ -.           ^_`{|}~  
4b) all ASCII special chars: !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~  "(),/:;<=>?@[\]

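Groups 2 to 4 of the summary can be recomputed mechanically from the
grammar definitions. The following is a rough sketch, not part of the
proposal: it assumes the RFC 822 definitions of "specials", "qtext" and
"ctext", and a MIME "tspecials" set that does not contain "." (as row 4a
implies). The helper names row and added are invented here.

```python
# All 32 printable ASCII special characters, in code order (rows 2c, 3c, 4b).
ALL_SPECIALS = "".join(chr(c) for c in range(0x21, 0x7F) if not chr(c).isalnum())

# RFC 822 "specials" (excluded from atom); assumed MIME "tspecials" (excluded from token).
RFC822_SPECIALS = set('()<>@,;:\\".[]')
MIME_TSPECIALS = set('()<>@,;:\\"/[]?=')

ATOM = set(ALL_SPECIALS) - RFC822_SPECIALS
QTEXT = set(ALL_SPECIALS) - set('"\\')   # qtext excludes the quote and backslash
CTEXT = set(ALL_SPECIALS) - set('()\\')  # ctext excludes parentheses and backslash
TOKEN = set(ALL_SPECIALS) - MIME_TSPECIALS


def row(chars):
    """Render a set as a 32-column row, blank where a character is absent."""
    return "".join(c if c in chars else " " for c in ALL_SPECIALS)


def added(bigger, smaller):
    """Characters in bigger but not in smaller, in ASCII order."""
    return "".join(c for c in ALL_SPECIALS if c in bigger and c not in smaller)


if __name__ == "__main__":
    everything = set(ALL_SPECIALS)
    print('2a) atom :', row(ATOM))
    print('2b) qtext:', row(QTEXT), '', added(QTEXT, ATOM))
    print('3b) ctext:', row(CTEXT), '', added(CTEXT, ATOM))
    print('4a) token:', row(TOKEN))
    print('4b) all  :', row(everything), '', added(everything, TOKEN))
```

Swapping in a different "tspecials" definition only requires changing
MIME_TSPECIALS; the rows and the "Added" columns follow from it.
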
--
Olle Jarnefors, Royal Institute of Technology, Stockholm 
<ojarnef(_at_)admin(_dot_)kth(_dot_)se>
