Re: From field in ISO-2022-JP

2002-12-03 12:19:34
On December 3, 2002 at 23:11, Shinichiro HIDA wrote:

I noticed a problem with 〜: Mozilla (on Linux) could not display this
entity (it showed a white square) in mail converted with
MHonArc2002-12-02-snap. Perhaps the browser could not find a font of the
correct size. When I rewrote it as ~, Mozilla could display it.

I get a glyph, but if I try to enlarge the view, it converts to a '?'.
Hence, it appears to be a problem of having the right fonts available.

Perhaps there is some problem in the glyph(?), the font, the browser(?),
or the implementation?

I think it is mainly a glyph issue.  I checked the Unicode charts
(U3000.pdf, UFF00.pdf, U32-3000.pdf, U32-FF00.pdf), and the sample
glyphs they use for U+301C and U+FF5E are quite similar.

The following note is made about U+301C:

    This character was encoded to match JIS C 6226-1978 1-33 "wave
    dash". Subsequent revisions of the JIS standard and industry practice
    have settled on JIS 1-33 as being the fullwidth tilde character.
    -> 3030   wavy dash
    -> FF5E   fullwidth tilde

It should be noted that Unicode does try to deal with round-trip
conversion issues.  For example, consider converting euc-jp to Unicode
and then converting back to euc-jp.  Ideally, you want this to be an
identity transformation, but sometimes there are ambiguities.
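
To make the round-trip point concrete, here is a minimal sketch in
Python (not MHonArc's Perl).  It uses the euc_jp codec from Python's
standard library as a stand-in; its tables are not MHonArc's tables,
and the helper name round_trip is made up for illustration.  It decodes
the euc-jp bytes for JIS 1-33, encodes the result back, and then checks
which of the two candidate code points the same table accepts.

```python
import unicodedata

def round_trip(raw, codec):
    """Decode raw bytes to Unicode, then encode back with the same codec."""
    text = raw.decode(codec)
    return text, text.encode(codec)

# JIS 1-33 as encoded in euc-jp (the bytes discussed in this thread).
raw = b"\xa1\xc1"
text, back = round_trip(raw, "euc_jp")

# Report which code point this particular table picked, and whether
# the conversion was an identity transformation.
print(f"U+{ord(text):04X}", unicodedata.name(text), "identity:", back == raw)

# The ambiguity: see whether each candidate code point can be encoded
# by this table at all.
for candidate in ("\u301c", "\uff5e"):
    try:
        print(f"U+{ord(candidate):04X} ->", candidate.encode("euc_jp"))
    except UnicodeEncodeError:
        print(f"U+{ord(candidate):04X} -> not encodable with this table")
```

Which code point comes out of the decode step depends entirely on the
table the converter ships with, which is where the ambiguity shows up.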

With respect to MHonArc, it only goes one way: euc-jp -> unicode
(character entity references).  Hence, it does not have to address
round-trip conversion.

There may be a use in having the ability to tell MHonArc how
to map characters, overriding the default tables.  Maybe something
like the following:

  <UFF5E> \xA1\xC1

The first line signifies the character/code set name.  Subsequent
lines associate Unicode code points to code points in the specified
set name.

This would allow a user to customize MHonArc::CharEnt conversion
tables to resolve an ambiguity according to the preferences of
a particular locale.  It seems that visual appearance is the most
important issue, but one would have to be careful since making such
changes could affect other software, like search engines.
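
As a rough sketch of how such an override could work, here is some
Python (again, not MHonArc's Perl, and not how MHonArc::CharEnt is
implemented).  Everything in it is hypothetical: the DEFAULT_TABLES
contents, the function names, and the "euc-jp" first line of the spec,
which I am assuming based on the format described above.

```python
import re

# Hypothetical default table: euc-jp byte sequence -> Unicode code point.
DEFAULT_TABLES = {"euc-jp": {b"\xa1\xc1": 0x301C}}

# One mapping line: <UXXXX> followed by one or more \xHH byte escapes.
LINE_RE = re.compile(r"^<U([0-9A-Fa-f]{4,6})>\s+((?:\\x[0-9A-Fa-f]{2})+)$")

def parse_overrides(spec):
    """Return (charset, {bytes: code_point}) parsed from an override spec."""
    lines = [ln.strip() for ln in spec.splitlines() if ln.strip()]
    charset = lines[0].lower()          # first line names the character set
    overrides = {}
    for ln in lines[1:]:
        m = LINE_RE.match(ln)
        if m is None:
            raise ValueError(f"bad mapping line: {ln!r}")
        code_point = int(m.group(1), 16)
        hex_bytes = re.findall(r"\\x([0-9A-Fa-f]{2})", m.group(2))
        overrides[bytes(int(h, 16) for h in hex_bytes)] = code_point
    return charset, overrides

def apply_overrides(tables, spec):
    """Return a copy of tables with the spec's mappings layered on top."""
    charset, overrides = parse_overrides(spec)
    merged = {name: dict(table) for name, table in tables.items()}
    merged.setdefault(charset, {}).update(overrides)
    return merged

def to_entity(table, raw):
    """Render the mapped code point as a character entity reference."""
    return f"&#x{table[raw]:04X};"

SPEC = r"""
euc-jp
<UFF5E> \xA1\xC1
"""

tables = apply_overrides(DEFAULT_TABLES, SPEC)
print(to_entity(DEFAULT_TABLES["euc-jp"], b"\xa1\xc1"))  # &#x301C;
print(to_entity(tables["euc-jp"], b"\xa1\xc1"))          # &#xFF5E;
```

The defaults are copied rather than modified in place, so an override
only affects the archive that asks for it.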

Would such a feature be useful?

