Hello,
I have written up a draft of a draft, showing what I feel
a "Content-language" header should look like.
In the process, I also found a need to augment multipart/alternative
with a parameter that tells readers what the difference between the
alternatives is; I found this easier than starting to define
multipart/alternative(format), multipart/alternative-language,
multipart/alternative-rating (rating=pg/r/x, for MIME video rental :-)
and so on.
It's too late to give it much thought at the current IETF, but
final authority rests with the lists anyway.
Ready, set - COMMENT!
Regards,
Harald T. Alvestrand
Internet-Draft
Language tag for MIME body parts
<blabla goes here>
This document describes a Content-Language: header for use with body
parts of MIME.
It also describes a new parameter to the Multipart/Alternative type,
to aid in the usage of the Content-Language: header.
The syntax of this header is:
Content-language: <2xAlpha>[_2xAlpha] (comment) [ , ... ]
The first 2xAlpha is an ISO 639 code for a language. If required, the
second 2xAlpha may define the country using a particular language
(such as en_GB and en_US), as per ISO 639.
If further information is needed, it is carried as RFC-822 comments
until ISO 639 is revised.
For languages that do not have an ISO 639 code, the language "xx" is
used, with an appropriate geographical area and comment. This is not
very useful for picking the correct thing, but is better than lying.
(The codes xa to xz are reserved for local use in ISO 639 <CHECK>)
The information carried in comments may include:
- Dialect information. ISO 639 does not recognize variants of a
language that do not correspond to countries.
- Languages not listed in ISO 639.
If multiple languages are used in the MIME body part, they are listed
with commas between them.
If the comments need the usage of extended character sets, RFC 1342
<should really have a STD number; it changes!> is used inside the
comment.
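The syntax above can be sketched as a small parser. This is only an illustration, not part of the proposal: the function names are invented here, nested RFC-822 comments are not handled, and normalizing the language to lower case and the country to upper case is an assumption, since the draft does not say whether the codes are case-sensitive.

```python
import re

# One tag: two-letter language, optional "_" plus two-letter country,
# optional RFC-822 comment in parentheses (nested comments not handled).
TAG_RE = re.compile(
    r"\s*([A-Za-z]{2})(?:_([A-Za-z]{2}))?\s*(?:\(([^()]*)\))?\s*$"
)

def split_tags(value):
    """Split a header value on commas that are outside parentheses."""
    parts, depth, current = [], 0, []
    for ch in value:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth = max(depth - 1, 0)
        if ch == "," and depth == 0:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return parts

def parse_content_language(value):
    """Return a list of (language, country, comment) tuples.

    country and comment are None when absent. Case normalization
    is an assumption of this sketch, not something the draft states.
    """
    tags = []
    for part in split_tags(value):
        m = TAG_RE.match(part)
        if m is None:
            raise ValueError("bad language tag: %r" % part.strip())
        lang, country, comment = m.groups()
        tags.append((
            lang.lower(),
            country.upper() if country else None,
            comment.strip() if comment else None,
        ))
    return tags
```

For instance, parse_content_language("en_GB (cockney)") yields a single tuple with language "en", country "GB" and comment "cockney".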
MEANING
The meaning of the header is:
- For a single information object, it should be taken as the set of
languages required for complete comprehension of the whole object.
Examples: Simple text.
- For an aggregation of information objects, it should be taken as the
set of languages used inside components of that aggregation.
Examples: Document stores and libraries.
- For information objects whose purpose in life is providing
alternatives, it should be regarded as a hint that the material
inside is provided in several languages, and that one has to inspect
each of the alternatives in order to find its language or languages.
In this case, multiple languages need not mean that one needs to be
multilingual to get complete understanding of the document. Examples:
MIME multipart/alternative.
EXAMPLES:
Norwegian official document, with parallel text in both official
versions of Norwegian. Both versions are readable by all Norwegians.
Content-language: no (nynorsk), no (bokmål)
Voice recording from the London docks
Content-language: en_GB (cockney)
Document in Sami, which does not have an ISO 639 code, and is spoken
in several countries, but with about half the speakers in Norway
Content-language: xx_NO (Sami)
An English-French dictionary
Content-language: en, fr (This is a dictionary)
An official EC document
Content-language: en, fr, de, da, el, it
USAGE EXAMPLES
Examples of protocol usage of this header are:
- WWW selection of an appropriate version of information for display,
based on a profile for the user listing languages that are understood
- MIME usage of alternate body parts in E-mail
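The first usage, selection based on a user profile, could look roughly like the sketch below. It assumes the reading given under MEANING for single objects: a version is usable only if the user understands every language listed for it. The function name, the file names and the ranking rule (a version is as good as its least-preferred language) are assumptions made for illustration.

```python
def pick_version(available, understood):
    """Pick the version that best fits a user's language profile.

    available:  dict mapping a version name to the list of language
                codes from its Content-language header.
    understood: list of language codes the user understands, most
                preferred first.
    Returns the chosen version name, or None if no version is fully
    readable by this user.
    """
    rank = {lang: pos for pos, lang in enumerate(understood)}
    best = None
    for name, langs in available.items():
        # Skip versions with no tag or with a language the user lacks.
        if not langs or any(lang not in rank for lang in langs):
            continue
        # A version is only as good as its least-preferred language.
        score = max(rank[lang] for lang in langs)
        if best is None or score < best[0]:
            best = (score, name)
    return None if best is None else best[1]

# Hypothetical versions of one document.
versions = {
    "report.en": ["en"],
    "report.fr": ["fr"],
    "report.bilingual": ["en", "fr"],
}
choice = pick_version(versions, ["fr", "en"])
```

With the profile shown, the French-only version is chosen, because the other two versions both depend on the less-preferred English.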
THE DIFFERENCE= PARAMETER
Currently, Multipart/Alternative only has one parameter: boundary.
The common usage of Multipart/Alternative is to have more than one
format of the same message (e.g. PostScript and ASCII).
The use of language tags to differentiate between different
alternatives will certainly not lead all MIME UAs to present the most
sensible body part as default.
Therefore, a new parameter is defined, to allow the configuration of
MIME readers to handle language differences in a sensible manner.
Name: Difference
Value: One of
content-type
content-language
Further values can be registered with IANA; each must be the name of a
header for which a definition exists in a published document.
If not present, Difference=Content-type is assumed.
The intent is that the MIME reader can look at the named header of
each message component to make an intelligent choice of what to
present to the user.
MIME EXAMPLE:
Content-type: multipart/alternative; boundary=limit;
  difference=content-language
Content-language: en, fr
--limit
Content-language: fr
--limit
Content-language: en
--limit--
In order to give a sensible display first on non-MIME readers, the
English version should usually be the first one in the list of body
parts.
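A MIME reader honoring the difference parameter might behave roughly as in the following sketch, which uses the Python standard library email package. The message text, the function names and the matching rule (compare only the two-letter language code, ignoring country and comment) are assumptions for illustration; what a reader does when no part matches is left to the caller.

```python
from email import message_from_string

# Hypothetical message; the English part comes first so that
# non-MIME readers show it first.
RAW = """\
MIME-Version: 1.0
Content-type: multipart/alternative; boundary=limit; difference=content-language
Content-language: en, fr

--limit
Content-type: text/plain
Content-language: en

Hello

--limit
Content-type: text/plain
Content-language: fr

Bonjour

--limit--
"""

def part_languages(part):
    """Two-letter language codes of one body part.

    Naive on purpose: drops country codes and comments, and would be
    confused by a comment that contains a comma.
    """
    value = part.get("Content-language", "")
    return [tag.split("(")[0].strip()[:2].lower()
            for tag in value.split(",") if tag.strip()]

def choose_by_language(msg, understood):
    """Return the body part matching the user's most preferred
    language, or None if the message does not declare
    difference=content-language or no part matches."""
    # Per the draft, a missing parameter means Content-type.
    difference = msg.get_param("difference", failobj="content-type")
    if not msg.is_multipart() or difference.lower() != "content-language":
        return None
    for wanted in understood:
        for part in msg.get_payload():
            if wanted.lower() in part_languages(part):
                return part
    return None

msg = message_from_string(RAW)
chosen = choose_by_language(msg, ["fr", "en"])
```

Here chosen is the French body part; a reader whose profile lists neither language gets None and would fall back to its ordinary multipart/alternative handling.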
SECURITY CONSIDERATIONS
Security issues are not discussed in this memo.
CHARACTER SET CONSIDERATIONS
See the remark on RFC 1342 above. Language codes are always US-ASCII.
GATEWAYING CONSIDERATIONS
RFC 1327 defines a Language: header. This header is not recommended
now, because it is defined to be a single 2-letter language code, and
the X.400 header it is supposed to gateway is a list of language
codes.
It is suggested that RFC 1327 be updated to produce the
Content-language: header, and to turn this header into the ISO/CCITT
specified Language components rather than the RFC-822-headers heading
extension.
REFERENCES
ISO 639
RFC 1341
RFC 1342
RFC 1327