Re: use Encode; # on Japanese; LONG!

jhi,

On 2002.01.10, at 11:18, Jarkko Hietaniemi wrote:

Yes, I've heard of you :-)


  Thanks.

(I'm CCing Nick Ing-Simmons, the original author of Encode, and
SADAHIRO Tomoyuki, who has worked on it a little bit, and who might
also know some Japanese :-)


  Therefore I'm cc'ing this to all recipients.

First off, I'm really thankful that you took a careful look at the
current state of Encode with respect to Japanese encodings.

And my apologies for not moving quickly enough despite being the
maintainer of Jcode. Well, I have an excuse. My 1st ($[ = 0, of
course :) child was born 18 days ahead of ETA. She was supposed to be
born on the 7th of this month but "hello world"ed on the 20th of last
month. So my holiday season schedule was in quite a disarray....

I won't (can't) comment much on the Encode details, since I'm pretty
unfamiliar with the design or the implementation; all I've done is add
some (eight-bit) encodings many moons ago.  I'm hoping Nick and Sadahiro
will join in and comment.

The surprising thing is that, however broken, it functioned somewhat.
Not bad for a character set you have virtually no idea about. It's as
miraculous as assembling the Machine out of the blueprint sent over the
stars (read/seen 'Contact' by the late Carl Sagan?).

How about "not at all"? :-)

How do you say that in Finnish? In Japanese it would be "Zenzen
Wakarimasen".

don't have to; I don't grok Finnish either :).  It takes more than a
simple table lookup to handle Japanese well enough to make native
grokkers happy.  It has to automatically detect which of many charsets
are used, it has to be robust, and most of all, it must be documented in
Japanese :)  I can do all that.


Excellent.

Or is it? As a matter of fact, the Jcode POD contains no Japanese,
since the pod parser groks no Japanese. It just has a web page in both
languages and a mailing list, however....

   If I submit Encode::Japanese, are you going to merge it as a standard
module?


Definitely, yes.  Implementation-wise you'll have to discuss with Nick
since whatever we use should work with the Tcl/Tk scheme (hence the
name Encode::Tcl, as you no doubt guessed.)  Sadahiro can comment on
both Encode and Japanese.


  I'm honored to be a gene donator of the beast!

Dan the Man with Too Many Charsets to Deal With


Sounds good :-)

One nit, though: the sooner you can start *and* finish the task,
the better.  For delivery dates, I would prefer "yesterday"... Why?
I want to release a 5.7.3 really, REALLY soon now, so that module
authors and users can test their stuff against it, so that 5.8.0 can
be released in a few months.  So I hope you haven't got any previous
commitments, like a day job or a family :-)

Okay, I'll move as quickly as possible, but if worse comes to worst I
can still upload it to CPAN (I just want to make sure the namespace
remains untouched).

If I just coded a bridging module to Jcode, that would be just a few
hours away, but I wouldn't want to do that knowing I can implement it
much more simply and elegantly.

I also believe the same scheme can be applied to other CJKV
languages/charsets. But once again I need some help to get that far. I
know some Chinese (perhaps enough to debug the code; I can at least
tell if a certain string is a sentence or line noise :) but I know
little Korean and absolutely no Vietnamese...

  Well, enough mumbling done.  Back to coding....

Dan the Man with Too Many Breeds of Camels (that is, too many versions
of Camels to babysit; I still have a customer that sticks with perl4,
y'know).