Re: List of open issues with Sieve reject draft (draft-ietf-sieve-refuse-reject-02.txt)



On Jul 17, 2006, at 6:21 AM, Aaron Stone wrote:


On Sat, 2006-07-15 at 15:16 +0100, Alexey Melnikov wrote:

Kjetil Torgrim Homme wrote:


2). Non ASCII text in rejection string - should it cause creation of
DSN/MDN, runtime error or stripping of non-ASCII content?
Should we add a tagged argument to control this?
Or maybe we need another capability to enable UTF-8 clean rejection?

That capability needs to be added to SMTP, right?

Yes, that would be an SMTP extension for allowing of UTF-8 human
readable response text.


I'd rather like to see such a UTF-8 extension, instead of the
workarounds being proposed to shoehorn messages into US-ASCII.


Of course it would be nice to be able to issue other sorts of text
in SMTP responses.  But such proposals have been going nowhere for
years.  However, the IETF EAI (Email Address Internationalization)
Working Group work may finally kick this into happening.  (After
all, currently SMTP responses often include a domain name such as on
the banner line, or an e-mail address such as in a response to a
RCPT TO: or EXPN: command.  So internationalizing email addresses
leads right into at least some internationalization of SMTP sessions.)
Note that although the EAI focus is on internationalization of email
addresses, their charter is a more general examination of
internationalization of the email environment, as internationalized
email addresses rapidly bring up other issues.

Quoting from Section 1.2, Problem Statement, of

           Overview and Framework for Internationalized Email

http://www.ietf.org/internet-drafts/draft-ietf-eai-framework-01.txt

   Internationalization of email addresses is not merely a matter of
   changing the SMTP envelope, or of modifying the From, To, and Cc
   headers, or of permitting upgraded mail user agents (MUAs) to decode
   a special coding and display local characters.  To be perceived as
   usable by end users, the addresses must be internationalized, and
   handled consistently, in all of the contexts in which they occur.
   That requirement has far-reaching implications: collections of
   patches and workarounds are not adequate.  Even if they were
   adequate, that approach risks an assortment of implementations with
   different sets of patches and workarounds having been applied with
   consequent user confusion about what is actually be run and
   supported.  Instead, we need to build a fully internationalized email
   environment, focusing on permitting efficient communication among
   those who share a language or other community (see [I18Nemail-
   constraints] for an extended discussion of this optimization).

And take a look at the mentioned I18Nemail-constraints:

 Internationalization in Internet Applications: Issues, Tradeoffs, and
                            Email Addresses

http://www.ietf.org/internet-drafts/draft-klensin-ima-constraints-00.txt

especially

3.3.  Communication across languages and cultures

   All of this implies that those who communicate across language and
   cultural groups will be required to learn, if they do not understand
   already, to be quite self-aware about the use of internationalized
   identifiers, as well as other examples of characters or languages,
   across those boundaries.  There will be a lower level of demands on
   those who communicate only in a single language and within a single
   culture.  This is, of course, not an issue that originated with the
   introduction of the Internet: it has been this way since languages
   and scripts started to differentiate from each other and since
   different cultures came into contact.  As we internationalize the
   network, a user of a given language that cannot be fully expressed in
   ASCII will always be faced with a choice between insisting on the
   purism of an email address local part and domain name in the script
   associated with the local language and maximizing the number of
   people who can communicate with her conveniently.  In some cases, the
   right answer will be "local language", in others, it will be "ASCII",
   and in still others it will be "maintain two addresses".  We are not
   required, and should not try, to make that choice for users: the
   users should make the best choices for their own needs, preferably
   after understanding the consequences of the choices.  As a community,
   we will need to be very clever about user interfaces.  As an example
   much more general than email, if someone with no ability to read
   Chinese characters sees a domain name written in those characters and
   decides she wants to copy and paste it somewhere, the copy mechanism
   is probably going to need to provide for both "copy the Chinese" and
   "convert quietly to punycode and copy that".  Either choice, by
   itself, will be wrong sometimes.  Users who both want to use Chinese-
   script domain names and communicate outside that language or script
   or culture are going to either learn to understand the difference and
   relationship, or develop some good rituals that work, or the network
   will keep slapping them in the head with failed lookups or bounced
   mail until they do learn.  Of course, substantially any language or
   script could be substituted for "Chinese" in that example.

Substitute in the words "error response text" where the above discussion
talks about email addresses, and I think it still applies pretty well.

But in any case, take a look at the EAI planned timeline, which is at
this point merely for _experimental_ RFCs (to start testing out ideas
and implementations), and then keep in mind that even supposing an RFC
comes out with an SMTP extension for non-US-ASCII text in SMTP dialogue
(in particular, non-US-ASCII text in SMTP responses), there will be a
long (long!) time before one can forget about the old software that
doesn't support such.  So as far as SMTP responses are concerned, either
sticking to US-ASCII, or being able to downgrade from non-US-ASCII to
US-ASCII when dealing with pre-extension SMTP software is going to be
necessary for a long, long time.

So as far as what we can do _now_ for rejection text: sticking with
US-ASCII rejection text has an efficiency advantage (allows SMTP protocol
level rejection) for those who are willing to accept the restriction of
US-ASCII text, but some users and user communities will prefer (demand!)
to use their "own language", as they can with the "old" (original) reject
behavior, even though that means the cost of generating a notification
message (DSN or MDN) instead of being able to do SMTP protocol level
rejection.  And I would consider that completely appropriate for them to
do in non-spam rejections, though I might hope/encourage them to stick
with US-ASCII and hence make SMTP protocol level rejection possible for
believed-to-be-spam message rejections.
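
To make the tradeoff concrete, here is a rough sketch (in Python, purely
for illustration) of the decision an implementation would face for a
given rejection string.  The names RejectionPath and
choose_rejection_path are hypothetical and come from neither the draft
nor any existing implementation.  The sketch also assumes the script is
being evaluated while the SMTP transaction is still open, so that a
protocol level refusal is possible at all.

```python
from enum import Enum


class RejectionPath(Enum):
    """Hypothetical labels for the two ways a rejection can be delivered."""
    SMTP_REPLY = "smtp-reply"      # refuse within the SMTP transaction itself
    NOTIFICATION = "notification"  # generate a DSN or MDN carrying the text


def choose_rejection_path(reason: str) -> RejectionPath:
    """Pick a delivery path for a rejection reason string.

    Assumption: SMTP response text is limited to US-ASCII, so any other
    text has to travel in a notification message instead.
    """
    # str.isascii() is True only when every code point is below 128.
    if reason.isascii():
        return RejectionPath.SMTP_REPLY
    return RejectionPath.NOTIFICATION
```

With this, a reason such as "Message refused as spam" would take the SMTP
reply path, while "Meldingen ble avvist: ikke ønsket" would fall back to
a notification message.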

I would take a quote from above regarding email addresses (local language
vs. US-ASCII vs. providing both forms of email address) and apply it to
rejection text:

   We are not
   required, and should not try, to make that choice for users: the
   users should make the best choices for their own needs, preferably
   after understanding the consequences of the choices.

That is, I believe that we need to allow users -- when they wish -- to
choose which behavior they want.  And I do not believe that it is possible
to avoid at least some user education/training, at least in the form of
a "smart" user interface, especially for users who normally operate in
languages not written using US-ASCII.  While Western European language
users may be able to remain happily oblivious of the difference between
SMTP protocol rejection and rejection in the form of a notification
message, oblivious of any (desired) difference in rejecting spam vs. other
rejections, users of languages that use/need a different charset _will_
need to have at least some awareness (or the interface that generates
Sieve filters on their behalf will need some awareness) that it is much
preferable to reject believed-to-be-spam messages with the "stick to
US-ASCII-only" option, whereas more "personal" rejections for other
reasons can, if the user wishes, be rejected using more personalized text
in their own language.

Regards,

Kristin


Aaron

Re: List of open issues with Sieve reject draft (draft-ietf-sieve-refuse-reject-02.txt)