Fw: [idn] URL encoding in html page

2002-03-22 12:20:03

Anyone interested in seeing how URL links on a page work
can go to http://www.neteka.com/MLTest/

If the page is set to the proper charset, there should be no problem
clicking the link. If you are saying that some users may set an improper
charset so that the links won't work, that is total nonsense: if the
user cannot view or cannot UNDERSTAND what is on the page (say, a Chinese
user viewing a Korean sample page), then why would that user end up
looking at that web page?

If UTF-8 or 8-bit DNS is implemented properly, links should not be a
problem, and there is no need to change browser behaviour so that it
hides ACE and shows IDN in the UI. What happens if someone really wants to
put something that looks like ACE on their page? Would we then need another
escape sequence for that?
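
To make the concern concrete, here is a minimal sketch of a "hide ACE, show IDN" display rule. It assumes the "xn--" prefix and Python's built-in "idna" codec, both of which follow RFC 3490 and so postdate this thread; display_form and the example hostname are hypothetical names used only for illustration.

```python
ACE_PREFIX = "xn--"  # assumption: the prefix later fixed by RFC 3490

def display_form(host: str) -> str:
    """Show any ACE-looking label in its Unicode form, leave others alone."""
    shown = []
    for label in host.split("."):
        if label.lower().startswith(ACE_PREFIX):
            try:
                label = label.encode("ascii").decode("idna")
            except UnicodeError:
                pass  # not valid ACE, so show it exactly as typed
        shown.append(label)
    return ".".join(shown)

# Hypothetical hostname, converted to ACE by the codec rather than by hand.
ace_host = "한국.example".encode("idna").decode("ascii")

print(ace_host)
print(display_form(ace_host))
print(display_form("www.example.com"))
```

The rule only sees the string, so an author who deliberately typed the ACE form gets it rewritten just the same. That is the case that would need some extra escape convention.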

 David Leung
Chief Technology Officer
Neteka Inc.
T: (416) 971-4302
http://www.neteka.com

----- Original Message -----
From: "Soobok Lee" <lsb(_at_)postel(_dot_)co(_dot_)kr>
To: "IETF idn working group" <idn(_at_)ops(_dot_)ietf(_dot_)org>
Sent: Friday, March 22, 2002 5:04 AM
Subject: Re: [idn] URL encoding in html page



----- Original Message -----
From: "Bruce Thomson" <bthomson(_at_)fm-net(_dot_)ne(_dot_)jp>
To: "Soobok Lee" <lsb(_at_)postel(_dot_)co(_dot_)kr>; "IETF idn working group" <idn(_at_)ops(_dot_)ietf(_dot_)org>
Sent: Friday, March 22, 2002 6:29 PM
Subject: Re: [idn] URL encoding in html page


What if all the viewable HTML text is in English, but only the href URL
contains legacy (Korean) encoded hostnames? Chinese visitors would see a
clean English homepage, but fail to click through the Korean link.
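
A small sketch of that failure, assuming a hypothetical Hangul hostname label stored as EUC-KR bytes and a visitor whose browser defaults to GB2312 (both choices are illustrative, not taken from any real page):

```python
# Hypothetical hostname label; any Hangul text would behave the same way.
label = "한국"

# The bytes as they would sit in an EUC-KR encoded HTML file.
raw = label.encode("euc-kr")

# A browser assuming EUC-KR recovers the original label.
as_korean = raw.decode("euc-kr")

# A browser assuming GB2312 reads the same bytes as different characters
# (or as replacement characters where a byte pair is unassigned).
as_chinese = raw.decode("gb2312", errors="replace")

print(raw.hex())
print(as_korean == label)
print(as_chinese == label)
```

The English text on the page is plain ASCII and survives either guess. Only the hostname bytes change meaning, so the page looks fine while the link resolves to a different name.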

Well, that could happen, but a META tag would solve that so easily.
Personally, I often use a simple text editor to deal with HTML, and would
find it easier to use legacy encodings or UTF-8 than to cut and paste ACE
from somewhere. Of course the user could do it either way and it would work.

Yes, charset META tags help. But many homepages make assumptions about the
main audience's default character encoding and very often omit the META tag
for the encoding, such as:
  <meta http-equiv="Content-Type" content="text/html; charset=euc-kr">
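
As a rough sketch of why the tag matters, the code below decodes a page using the declared charset when a META tag is present and falls back to the browser's default otherwise. The function decode_page, the loose regular expression, the hostname, and the GB2312 default are all assumptions for illustration; real browsers use more elaborate detection.

```python
import re

# Loose pattern for charset=... inside a META tag (illustrative only).
META_CHARSET = re.compile(rb"""<meta[^>]+charset\s*=\s*["']?([\w-]+)""", re.IGNORECASE)

def decode_page(html_bytes: bytes, browser_default: str) -> str:
    """Decode with the declared charset if present, else the browser default."""
    match = META_CHARSET.search(html_bytes)
    charset = match.group(1).decode("ascii") if match else browser_default
    return html_bytes.decode(charset, errors="replace")

# Hypothetical link written in a text editor and saved as EUC-KR.
link = '<a href="http://한국.example/">link</a>'.encode("euc-kr")
meta = b'<meta http-equiv="Content-Type" content="text/html; charset=euc-kr">'

with_meta = decode_page(meta + link, browser_default="gb2312")
without_meta = decode_page(link, browser_default="gb2312")

print("한국" in with_meta)     # the declared charset is used
print("한국" in without_meta)  # the browser default is used instead
```

With the tag, the visitor's default does not matter. Without it, the hostname depends entirely on what the visitor's browser happens to assume.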

Moreover, an IDN URL could be used in a pure FRAMESET document that defines
frame URLs and contains no viewable text. Such FRAMESET documents often omit
charset META tags.
 (Look at the HTML source of http://www.freeway.co.kr/ )

AFAIK, 99.99999% of Korean homepages have an implicit or explicit legacy
Korean encoding (KS_C_5601-1987 or euc-kr). So do most Japanese and Chinese
homepages. UTF-8/UCS-2 encodings are rarely used in global web publishing.
Legacy encodings will dominate even in the future, because they are compact
and inexpensive.

If we want to make IDN truly internationally interoperable, all IDN-aware
web browsers and applications would have to contain libraries of all kinds of
legacy-to-Unicode conversion routines. That would put too much of a memory
burden on handheld devices like PDAs.
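
The conversion step in question can be sketched as follows, assuming the page charset is already known. It leans on Python's bundled codec tables and its "idna" codec (RFC 3490, which postdates this thread); href_host_to_ace and the hostname are hypothetical names for illustration.

```python
def href_host_to_ace(raw_host: bytes, charset: str) -> str:
    """Convert hostname bytes in a known charset to an ASCII-compatible form."""
    unicode_host = raw_host.decode(charset)             # legacy-to-Unicode table lookup
    return unicode_host.encode("idna").decode("ascii")  # nameprep plus ACE encoding

# The same hypothetical hostname as stored by three different page encodings.
host = "한국.example"
results = {cs: href_host_to_ace(host.encode(cs), cs) for cs in ("euc-kr", "cp949", "utf-8")}

for charset, ace in results.items():
    print(charset, ace)
```

Every charset the application wants to accept needs its own decoding table behind that first line, which is where the memory cost on a small device would come from.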

Moreover, legacy encodings are revised separately from Unicode. We may face
versioning problems as tough as those we faced in stringprep/nameprep
versioning for newly added Unicode code points. How can we guarantee the
stability and integrity of IDN operations across all combinations of the
numerous kinds and versions of IDN-aware applications and legacy encodings?

Soobok Lee







