Hello,
please register the following as an Internet-Draft.
To the 822-ext list: Thanks for all the constructive comments
- I believe this should be a much better document than my previous
draft.
And a merry holiday season to you all!
Harald T. Alvestrand
------ cut here ------
draft Language Tag Dec 93
Language tags for MIME content portions
Thu Mar 18 14:03:38 MET 1993
Harald Tveit Alvestrand
UNINETT
Harald(_dot_)Alvestrand(_at_)uninett(_dot_)no
Abstract
This document describes a Content-Language: header for use with
body parts of MIME.
It also describes a new parameter to the Multipart/Alternative
type, to aid in the usage of the Content-Language: header.
Status of this Memo
This draft document is being circulated for comment.
If consensus is reached it may be submitted to the RFC editor as a
Proposed Standard protocol specificiation.
Please send comments to the author, or to the IETF-822 mailing
list <ietf-822(_at_)dimacs(_dot_)rutgers(_dot_)edu>
The following text is required by the Internet-draft rules:
This document is an Internet Draft. Internet Drafts are working
documents of the Internet Engineering Task Force (IETF), its
Areas, and its Working Groups. Note that other groups may also
distribute working documents as Internet Drafts.
Internet Drafts are draft documents valid for a maximum of six
months. Internet Drafts may be updated, replaced, or obsoleted by
other documents at any time. It is not appropriate to use
Internet Drafts as reference material or to cite them other than
Alvestrand Expires Jun 23 93 [Page 1]
draft Language Tag Dec 93
as a "working draft" or "work in progress."
Please check the I-D abstract listing contained in each Internet
Draft directory to learn the current status of this or any other
Internet Draft.
The filename of this document is draft-alvestrand-language-tag-
00.txt
Alvestrand Expires Jun 23 93 [Page 2]
draft Language Tag Dec 93
1. The Language tag
The language tag is composed of 2 parts: A language tag and a
subtag.
The syntax of this header is:
Language-Header ::= 'Content-language:' Language [',' Language]...
Language ::= ALPHA*8 [ '-' ALPHA*8 ]
The namespace of language tags and subtags is administered by the
IANA. The following registrations are predefined:
In the language tag:
- All 2-letter codes are interpreted according to ISO 639.
- All 3-letter codes are reserved for the (hopefully)
forthcoming revision to ISO 639
- The value "IANA" is reserved for IANA-defined
subregistrations
- The value "X" is reserved for private use. Subtags of "X"
will not be registered by the IANA.
- No other registration is allowed.
In the sublanguage tag:
- All 2-letter codes are interpreted as ISO 3166 country codes,
according to the rules laid down in ISO 639.
- Codes of 3 to 8 letters may be registered with the IANA by
anyone who feels a need for it. IANA has the right to reject
registrations that are felt to be misleading.
The information in the sublanguage tag may for instance be:
Alvestrand Expires Jun 23 93 [Page 3]
draft Language Tag Dec 93
- Country identification, such as en-US (this usage is
described in ISO 639)
- Dialect information, such as no-NYNORSK or en-COCKNEY
- Languages not listed in ISO 639, which can be registered with
the IANA prefix, such as IANA-CHEROKEE
If multiple languages are used in the MIME body part, they are
listed with commas between them.
2. MEANING
The meaning of the header is:
- For a single information object, it should be taken as the
set of languages that is required for a complete
comprehension of the complete object. Examples: Simple text.
- For an aggregation of information object, it should be taken
as the set of languages used inside components of that
aggregation. Examples: Document stores and libraries.
- For information objects whose purpose in life is providing
alternatives, it should be regarded as a hint that the
material inside is provided in several languages, and that
one has to inspect each of the alternatives in order to find
its language or languages. In this case, multiple languages
need not mean that one needs to be multilingual to get
complete understanding of the document. Examples: MIME
multipart/alternative.
EXAMPLES:
NOTE: NONE of the sublanguage codes shown in this document
have actually been assigned; they are used for illustration
purposes only.
Norwegian official document, with parallel text in both
official versions of Norwegian. Both versions are readable by
all Norwegians.
Alvestrand Expires Jun 23 93 [Page 4]
draft Language Tag Dec 93
Content-language: no-nynorsk, no-bokmaal
Voice recording from the London docks
Content-language: en-cockney
Document in Sami, which does not have an ISO 639 code, and is
spoken in several countries, but with about half the speakers
in Norway
Content-language: iana-sami
An English-French dictionary
Content-language: en, fr (This is a dictionary)
An official EC document
Content-language: en, fr, ge, da, gr, it
An excerpt from Star Trek dialogue
Content-language: x-klingon
3. Usage examples
Examples of protocol usage of this header are:
- WWW selection of an appropriate version of information for
display, based on a profile for the user listing languages
that are understood
- MIME usage of alternate body parts in E-mail
4. The difference parameter to multipart/alternative
As defined in RFC 1541, Multipart/Alternative only has one
parameter: boundary.
The common usage of Multipart/Alternative is to have more than one
format of the same message (f.ex. PostScript and ASCII).
Alvestrand Expires Jun 23 93 [Page 5]
draft Language Tag Dec 93
The use of language tags to differentiate between different
alternatives will certainly not lead all MIME UAs to present the
most sensible body part as default.
Therefore, a new parameter is defined, to allow the configuration
of MIME readers to handle language differences in a sensible
manner.
Name: Difference
Value: One of
content-type
content-language
Further values can be registered with IANA; it must be the name of
a header for which a definition exists in a published document.
If not present, Difference=Content-type is assumed.
The intent is that the MIME reader can look at this header of the
message component to do an intelligent choice of what to present
to the user.
(The intent of having registration with IANA of the fields used in
this context is to maintain a list of usages that a mail UA may
expect to see, not to reject usages)
MIME EXAMPLE:
Content-type: multipart/alternative; difference=content-language;
boundary="limit"
Content-language: en, fr
--limit
Content-language: fr
--limit
Content-language: en
--limit--
When composing a message, the choice of sequence may be somewhat
arbitary. However, non-MIME mail readers will show the first body
part first, meaning that this should most likely be the language
understood by most of the recipients.
Alvestrand Expires Jun 23 93 [Page 6]
draft Language Tag Dec 93
5. Security considerations
Security considerations are not considered in this memo
6. Character set considerations
Codes are always US-ASCII. The issue of deciding upon the
rendering of a character set based on the language encoding is not
addressed in this memo; however, the author cautions against
thinking that such a decision can be made correctly for all cases
(for example, a rendering engine that decides font based on
Japanese or Chinese language will fail to work when a mixed
Japanese-Chinese text is encountered)
7. Gatewaying considerations
RFC 1327 defines a Language: header. This header is not
recommended now, because it is defined to be a single 2-letter
language code, and the X.400 header it is supposed to gateway is a
list of language codes.
It is suggested that RFC 1327 be updated to produce the Content-
language: header, and to turn this header into the ISO/CCITT
specified Language components rather than the RFC-822-headers
heading extension.
8. References
ISO 639
ISO 3166
RFC 1521
RFC 1327
Alvestrand Expires Jun 23 93 [Page 7]