perl-i18n

[fwd] Re: [rt-devel] Language detection bug (from: ASnare(_at_)allshare(_dot_)nl)

2003-02-07 11:13:48
So, some folks are having issues with RT's automatic language detection,
which uses Locale::Maketext's infrastructure to guess which translation
to serve up. Anyone have comments on what Andrew's seeing?

        Jesse


----- Forwarded message from Andrew Snare <ASnare(_at_)allshare(_dot_)nl> -----

To: Jesse Vincent <jesse(_at_)bestpractical(_dot_)com>
From: Andrew Snare <ASnare(_at_)allshare(_dot_)nl>
Subject: Re: [rt-devel] Language detection bug
Cc: Autrijus Tang <autrijus(_at_)autrijus(_dot_)org>,
        "THAUVIN Blaise (Dir. Informatique)" <bthauvin(_at_)clearchannel(_dot_)fr>,
        rt-devel(_at_)lists(_dot_)fsck(_dot_)com
Date: Fri, 07 Feb 2003 11:23:48 +0100

At 01:28 PM 6/02/2003 -0500, Jesse Vincent wrote:
> On Thu, Feb 06, 2003 at 04:44:40PM +0100, Andrew Snare wrote:
> > At 08:48 PM 6/02/2003 +0800, Autrijus Tang wrote:
> > > This is because 'en-us' is provided, instead of 'en'.  This is arguably
> > > wrong, since there's no US-specific things in that lexicon -- maybe just
> > > mv rt/lib/RT/I18N/en_us.po to en.po?
> >
> > While this fix may work, I think there's still a bug in the
> > language-matching. According to my reading of RFC2616, Section 14.4, if
> > 'en' is in my list, RT should be matching that against the 'en-us' that it
> > can supply.
>
> Which part of the language in that section? I'm not seeing it.
> FWIW, RT is using Locale::Maketext to do the parsing of the language
> tags. Switching from en-us to en seems to be at least _one_ of the right
> things to do. If we can make a case for anything else, I'm sure Sean
> would be happy to let us try to sell him on it.

I'm reading it again; it's a little ambiguous. The text we're discussing, 
reformatted, is:
        A language-range matches a language-tag if:
                1) it exactly equals the tag; or if
                2) it exactly equals a prefix of the tag such that the
                   first tag character following the prefix is "-".
        The special range "*", if present in the Accept-Language field,
        matches every tag not matched by any other range present in the
        Accept-Language field.

Definitions, restated to give context and also highlight any bad 
assumptions[1] I'm making, are:
        language-range:
                One of the language tags in the client-supplied
                Accept-Language header. E.g.: 'en' or 'en-au'
        language-tag:
                The language tag of the available content on the server.
                E.g.: 'en-us'
                (NOTE: This is not explicitly defined, unfortunately, and I
                 may be making an error in assuming this.)

The situation that has occurred is that people have 'en' in their
Accept-Language header, and the server has 'en-us' content available. It
seems to me that these should match, since 'en' is a prefix of 'en-us'.
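
To make the rule concrete, here is a minimal Python sketch of the Section 14.4
matching rule as quoted above. It assumes the reading given in the definitions
(the range comes from the client, the tag from the server). The function name
range_matches_tag is made up for this illustration; this is not RT's or
Locale::Maketext's code. Comparison is case-insensitive because RFC 2616
treats language tags as case-insensitive.

```python
def range_matches_tag(language_range: str, language_tag: str) -> bool:
    """Return True if a client language-range matches a server language-tag.

    Hypothetical helper illustrating RFC 2616, Section 14.4.
    """
    rng = language_range.strip().lower()
    tag = language_tag.strip().lower()
    if rng == "*":
        # "*" matches any tag; a caller should only fall back to it after
        # every other range in the header has failed to match.
        return True
    if rng == tag:
        # Rule 1: the range exactly equals the tag.
        return True
    # Rule 2: the range is a prefix of the tag and the next character is "-".
    return tag.startswith(rng + "-")
```

Under this reading, range_matches_tag('en', 'en-us') is true, while
range_matches_tag('en-us', 'en') is false.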

As you mention, switching from en-us to en is one of several possible
solutions. I'd argue against this switch, however, for the following reasons:
        1) If a user has 'en-us' on the list, but not 'en', they won't get
           the content. (The prefix rule is one-way.) This is apparently
           quite common.
        2) Although there might not be much specifically American in the
           translation, it will be American in subtle ways[2].
I hope this helps, one way or the other. Cheers,

 - Andrew
[1] My assumptions appear to at least match those in this post:
<http://groups.google.com/groups?selm=Pine.HPP.3.95a.1000121173010.24389J-100000%40hpplus01.cern.ch>
[2] For more information, see Section 2.2 of 
<http://kfa.univ.szczecin.pl/histvar/american.html>





----- End forwarded message -----

-- 
»|« http://www.bestpractical.com/rt  -- Trouble Ticketing. Free.