ietf-822

A spec for showing language in MIME headers

1993-11-02 08:17:36
Hello,
I have written up a draft of a draft, showing what I feel that
a "Content-language" header should look like.

In the process, I also found a need to augment multipart/alternative
with a parameter that tells people what the difference between the
alternatives is; I found this easier than starting to define
multipart/alternative(format), multipart/alternative-language,
multipart/alternative-rating (rating=pg/r/x, for MIME video rental :-)
and so on.

It's too late to give it much thought at the current IETF, but
final authority rests with the lists anyway.
Ready, set - COMMENT!
Regards,

          Harald T. Alvestrand

Internet-Draft



Language tag for MIME body parts



<blabla goes here>



This document describes a Content-Language: header for use with body
parts of MIME.
It also describes a new parameter to the Multipart/Alternative type,
to aid in the usage of the Content-Language: header.



The syntax of this header is:



Content-language: <2xAlpha>[_2xAlpha] (comment) [ , ... ]



The first 2xAlpha is an ISO 639 code for a language. If required, the
second 2xAlpha may define the country using a particular language
(such as en_GB and en_US), as per ISO 639.

If further information is needed, it is carried as RFC-822 comments
until ISO 639 is revised.

For languages that do not have an ISO 639 code, the language "xx" is
used, with an appropriate geographical area and comment. This is not
very useful for picking the correct thing, but is better than lying.
(The codes xa to xz are reserved for local use in ISO 639 <CHECK>)



The further information carried in comments may include:

- Dialect information. ISO 639 does not recognize variants of a
language that do not correspond to countries.

- Languages not listed in ISO 639.



If multiple languages are used in the MIME body part, they are listed
with commas between them.
If a comment needs an extended character set, RFC 1342
<should really have a STD number; it changes!> is used inside the
comment.
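
The syntax above can be sketched as a small parser. This is only an
illustration, not part of the draft: the function names are
hypothetical, the case normalization (lower-case language, upper-case
country) is an assumption the draft does not state, and
backslash-quoted parentheses inside RFC-822 comments are not handled.

```python
import re

# One tag: two letters, optionally "_" and two more letters.
TAG_RE = re.compile(r"^([A-Za-z]{2})(?:_([A-Za-z]{2}))?$")


def split_top_level(value):
    """Split on commas that are outside parenthesised comments."""
    items, current, depth = [], [], 0
    for ch in value:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth = max(depth - 1, 0)
        if ch == "," and depth == 0:
            items.append("".join(current))
            current = []
        else:
            current.append(ch)
    items.append("".join(current))
    return [item.strip() for item in items if item.strip()]


def parse_content_language(value):
    """Return a list of (language, country, comment) tuples.

    country and comment are None when absent. Case normalization is
    an assumption of this sketch, not a rule from the draft.
    """
    result = []
    for item in split_top_level(value):
        comment = None
        start = item.find("(")
        if start != -1:
            end = item.rfind(")")
            if end < start:
                raise ValueError("unterminated comment in %r" % item)
            comment = item[start + 1:end].strip()
            item = item[:start].strip()
        match = TAG_RE.match(item)
        if match is None:
            raise ValueError("not a valid language tag: %r" % item)
        language, country = match.groups()
        result.append((language.lower(),
                       country.upper() if country else None,
                       comment))
    return result
```

Note that a comment attaches to the tag it follows, so in a value such
as "en, fr (This is a dictionary)" the sketch records the comment on
the "fr" entry only.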



MEANING



The meaning of the header is:



- For a single information object, it should be taken as the set of
languages that is required for a complete comprehension of the
complete object. Examples: Simple text.

- For an aggregation of information objects, it should be taken as the
set of languages used inside components of that aggregation.
Examples: Document stores and libraries.

- For information objects whose purpose in life is providing
alternatives, it should be regarded as a hint that the material
inside is provided in several languages, and that one has to inspect
each of the alternatives in order to find its language or languages.
In this case, multiple languages need not mean that one needs to be
multilingual to get complete understanding of the document. Examples:
MIME multipart/alternative.



EXAMPLES:



Norwegian official document, with parallel text in both official
versions of Norwegian. Both versions are readable by all Norwegians.

  Content-language: no (nynorsk), no (bokmål)



Voice recording from the London docks



  Content-language: en_GB (cockney)



Document in Sami, which does not have an ISO 639 code, and is spoken
in several countries, but with about half the speakers in Norway



  Content-language: xx_NO (Sami)



An English-French dictionary



  Content-language: en, fr (This is a dictionary)



An official EC document



  Content-language: en, fr, de, da, el, it



USAGE EXAMPLES



Examples of protocol usage of this header are:



- WWW selection of an appropriate version of information for display,
based on a profile for the user listing languages that are understood

- MIME usage of alternate body parts in E-mail



THE DIFFERENCE= PARAMETER



Currently, Multipart/Alternative only has one parameter: boundary.



The common usage of Multipart/Alternative is to have more than one
format of the same message (e.g. PostScript and ASCII).

Using language tags to differentiate between alternatives will not,
by itself, lead every MIME UA to present the most sensible body part
as the default.

Therefore, a new parameter is defined, to allow the configuration of
MIME readers to handle language differences in a sensible manner.



Name: Difference
Value: One of
     content-type
     content-language

Further values can be registered with IANA; each must be the name of a
header for which a definition exists in a published document.
If not present, Difference=Content-type is assumed.



The intent is that the MIME reader can look at the named header of
each message component to make an intelligent choice of what to
present to the user.



MIME EXAMPLE:



Content-type: multipart/alternative; boundary=limit;
    difference=content-language
Content-language: en, fr

--limit
Content-language: fr

--limit
Content-language: en

--limit--



In order to give a sensible display first on non-MIME readers, the
English version should usually be the first one in the list of body
parts.
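
As an illustration of how a MIME reader might act on the difference=
parameter, here is a minimal sketch using Python's standard email
package. Several things in it are assumptions not taken from the
draft: the function names, the "Bonjour"/"Hello" body text, the
boundary=limit parameter, the rule of matching on the bare language
code only, and the choice to fall back to the first body part when no
language matches.

```python
import re
from email import message_from_string

# Sample message; the body text in each part is invented for the sketch.
RAW = """\
Content-type: multipart/alternative; boundary=limit;
    difference=content-language
Content-language: en, fr

--limit
Content-language: fr

Bonjour
--limit
Content-language: en

Hello
--limit--
"""


def languages_of(part):
    """Bare language codes from a part's Content-language header.

    Comments and country suffixes are dropped (a simplification).
    """
    header = part.get("Content-language", "")
    header = re.sub(r"\([^()]*\)", "", header)  # strip (comments)
    return [tag.strip().split("_")[0].lower()
            for tag in header.split(",") if tag.strip()]


def choose_part(message, understood):
    """Pick a body part for a user who understands the given languages.

    `understood` is ordered by preference. Returns None when the
    alternatives do not differ by language, leaving the choice to the
    reader's ordinary format-based selection.
    """
    # The draft says Difference=Content-type is assumed when absent.
    difference = message.get_param("difference", "content-type")
    if str(difference).lower() != "content-language":
        return None
    parts = message.get_payload()
    for wanted in understood:
        for part in parts:
            if wanted.lower() in languages_of(part):
                return part
    return parts[0]  # assumed fallback: what a non-MIME reader shows first


message = message_from_string(RAW)
chosen = choose_part(message, ["en"])
print(chosen.get_payload().strip())
```

A reader configured with a profile such as ["de", "fr"] would get the
French part from the same message, since "de" matches nothing and "fr"
is tried next.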





SECURITY CONSIDERATIONS



Security issues are not discussed in this memo.



CHARACTER SET CONSIDERATIONS



See the note on RFC 1342 above. Language codes are always US-ASCII.



GATEWAYING CONSIDERATIONS



RFC 1327 defines a Language: header. This header is not recommended
now, because it is defined to be a single 2-letter language code, and
the X.400 header it is supposed to gateway is a list of language
codes.

It is suggested that RFC 1327 be updated to produce the
Content-language: header, and to turn this header into the ISO/CCITT
specified Language components rather than the RFC-822-headers heading
extension.



REFERENCES



ISO 639
RFC 1341
RFC 1342
RFC 1327