Re: language detection

1999-07-07 07:18:25
For those interested in seeing language metadata implemented for email,
one argument for it is access to information for people with disabilities.
The argument goes roughly as follows:

Email is just as world-wide in its reach as the WWW.

The W3C, in its recently released Web Content Accessibility Guidelines,
has recognized that language information is important for correct
text-to-speech processing, which is widely used by blind people to access
computer text such as email.

The reasoning by which the W3C concluded that Web documents should have
language metadata applies with the same force to other Internet documents,
such as email messages.


PS:  At one point someone was working to organize an email accessibility
activity.  At the moment I can't retrieve the reference.

At 12:51 AM 7/7/99 -0700, Carl S. Gutekunst wrote:
[Resending. My original message was filtered out as being from a non-member.]

RFC 2184 provides methods to encode language information in message
header text.

Yes. The language community has serious problems with RFC 2184, however (as
in, it cannot be parsed and it breaks compatibility with existing practice),
and no one has implemented it that I am aware of.

From my reading it is a little unclear if it can be used for specifying
language in message bodies. For example:

     Content-Type: text/plain; charset="iso-8859-1'en"

No, you would not use RFC 2184 for that; you would use RFC 1766:

      Content-Type: text/plain; charset="iso-8859-1"
      Content-Language: en
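
As a rough illustration of the RFC 1766 form, here is a minimal sketch
in Python using the standard library's email package. The body text and
the variable name are made up for illustration; only the charset and the
language tag come from this thread. Note that the language tag goes in
its own Content-Language header rather than inside the charset parameter.

```python
from email.message import EmailMessage

# Hypothetical message; the body text is invented for this sketch.
msg = EmailMessage()

# set_content() writes the Content-Type header, including the charset
# parameter. It clears any existing Content-* headers first, so the
# language header is added afterwards.
msg.set_content("Hello, world.\n", charset="iso-8859-1")

# RFC 1766 style: the language is carried in a separate header field.
msg["Content-Language"] = "en"

# Show the resulting headers and body.
print(msg.as_string())
```

A text-to-speech reader could then read the tag back with
msg["Content-Language"] and the charset with msg.get_content_charset().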

