perl-unicode

Re:arabic/woyka/the whole earth characters mapping

1999-08-12 17:29:26
Hello

The whole earth character mapping, or all you can eat
Freetype and Unifont vs Truetype and Unicode

Read on...

Try ftp://dkuug.dk/i18n/charmaps/
and pick:
ASMO-708
ASMO-449
CP-1256
ISO-8859-6
MS-ARAB
IBM-868
and so on
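
Why the choice of charmap matters can be shown with a small Perl sketch. It assumes a Perl with the Encode module (core since 5.8, so newer than this post) and uses only the two charsets from the list that Encode ships, under its names cp1256 and iso-8859-6. The helper decode_candidates is a hypothetical name for illustration, not part of any of the tools mentioned here. The same four bytes give different Arabic letters depending on which table is assumed:

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper (not from the original post): decode one byte string
# under several candidate charsets and collect the results.
sub decode_candidates {
    my ($bytes, @charsets) = @_;
    my %text_for;
    for my $charset (@charsets) {
        my $copy = $bytes;    # FB_CROAK may modify its input, so decode a copy
        my $text = eval { decode($charset, $copy, Encode::FB_CROAK) };
        $text_for{$charset} = $text if defined $text;    # skip failed decodes
    }
    return \%text_for;
}

# Four bytes that spell the Arabic word "salam" when read as CP-1256.
my $bytes    = "\xD3\xE1\xC7\xE3";
my $text_for = decode_candidates($bytes, 'cp1256', 'iso-8859-6');

binmode STDOUT, ':encoding(UTF-8)';
for my $charset (sort keys %$text_for) {
    my @codepoints = map { sprintf 'U+%04X', ord($_) } split //, $text_for->{$charset};
    print "$charset: $text_for->{$charset}  (@codepoints)\n";
}
```

Only the CP-1256 reading is the intended word; the ISO-8859-6 reading maps two of the bytes to other letters. The remaining charmaps in the list (ASMO-449, IBM-868 and so on) would need their own tables, for example built from the files on the FTP site.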

Hebrew-HOWTO mentioned Arabic!
We believe only in open software, so all the customers get the complete source code. We have good relations with the leading forces in this industry, including the technical staff of X-Consortium and the technical staff of COSE.
    Eli Marmor
    El-Mar Software Ltd.
    Voice: 050-237338
    FAX: 09-984279
marmor(_at_)sunshine(_dot_)cs(_dot_)biu(_dot_)ac(_dot_)il
P.S.: The announcement of the Arabic Support for X-Windows & Motif is expected in January. English, Hebrew, and Arabic will be handled by 8 bits (!), including the full set of Arabic glyphs.

If you are serious about a Unicode Arabic solution, then do the following:
Freetype and Unifont vs Truetype and Unicode
http://www.freetype.org/
http://czyborra.com/unifont/
Try these Perl scripts, and use the following Arabic renderer, written in four lines of Perl using Unicode:
http://leb.net/archives/reader/1998/0041.html

The Arabic Unicode guru:
# Freeware license at http://czyborra.com/
# Latest version at http://czyborra.com/unicode/
# PostScript printout at http://czyborra.com/unicode/arabjoin.ps.gz
Make use of this Arabic/Farsi applet source code,
and here are all the Arabic glyphs you want (you have to wait a few minutes):
http://www.gpg.com/HT/
[If you want the source code, please email me and I will email it to you, inshaAllah.]
This is my attempt at it. There are errors in the conversion script, and I need help:
http://tehran.stanford.edu/Editors/editors.html
(1) Make a public domain Java applet that can read any page presented in an open browser and translate the upper ASCII codes to Arabic, first using

(a) CP-1256 (aka MS-ARAB)
and, if the result is not readable, give the reader the choice to reload with another set, i.e.

(b) Sorry, try ASMO-708 (aka ISO-8859-6) click_here

(c) Sorry, try ASMO-449 click_here

   And Arabic/Farsi paragraph

Where do you get these character mappings?
Try ftp://dkuug.dk/i18n/charmaps/
and pick:
ASMO-708
ASMO-449
CP-1256
ISO-8859-6
MS-ARAB
IBM-868
and so on

(2) Then advance to standardizing the sets of:
Xterm
Arabizing Linux XWindow
Motif see Solaris Solaris 7
Solaris Arabic Links
Persian/Farsi Solution for Solaris

(3) Advance to a Koran character set that applies the Koran's own dictation and letter marking according to tanween rules. The Mosque has the understanding and will help by coming up with a GIF font solution, so that the programmers can see what we mean by the Koran's own set, i.e. alif with sukoon and the small alif, which are not available in any Arabic word processor.

(4) Xerox Arabic Java applet,
or at least a simple web-based editor like the Xerox keyboard for entering Arabic words, or Big Blue Bidi Arabic Java

(5) The best commercial software today is Universal Word (www.wysiwyg.com), because it works with Arabic languages under English Windows. We should work with it so that the applet serves as a publishing converter for its Arabic documents (please download its working version).


   a. You can open one document that contains 10,000 pages
   b. It handles the languages of Muslims and non-Muslims at the same time
   c. It reads and writes Unicode
   d. It is not very expensive


----Original Message Follows----
From: Larry Wall <larry(_at_)wall(_dot_)org>
To: Tim(_dot_)Bunce(_at_)ig(_dot_)co(_dot_)uk (Tim Bunce)
CC: A D <sapprint(_at_)hotmail(_dot_)com>, larry(_at_)wall(_dot_)org, perl-unicode(_at_)perl(_dot_)org, dmulholl(_at_)cs(_dot_)indiana(_dot_)edu
Subject: Re: should \d match *all* the digits? faster with woyka
Date: Wed, 11 Aug 1999 10:26:18 -0700 (PDT)

: On Wed, Aug 11, 1999 at 08:52:46AM -0500, A D wrote:
: > Hello Larry
: >
: > Please read this woyka, it can speed up perl 100s of times
: > and then revolutionize the perl engine and unicode.
: > Please let me know in due time what you think

It won't speed up anything 100s of times.  It might speed up some
applications 2 or 3 times.  You have to realize that traditional
orthography is already somewhat Huffman encoded--the common words are
already shorter.  The word "the" is three letters.  The word "a" is
already one letter.  By the time you figure out how to encode spaces
and punctuation, the encoding itself is only going to give 50-60%
compression at the most.

The other aspect is that you don't have to spend any time finding word
boundaries.  For an application that is interested in word boundaries,
that's a win, but for applications that aren't, it's not.  Indeed, in
the general case, any application that is interested in individual
characters will be slowed down, because you'd have to decode the word
to characters internally.

Still, for some applications, this would be a reasonable optimization.

Tim Bunce writes:
: Doesn't seem particularly revolutionary. Many people, including myself,
: have already spoken of using UTF8 'characters' to represent arbitrary
: encodings and using regular expressions to search and manipulate them.

Yes.

: I do agree that it's a powerful concept that could have wide applications.
: Someone just needs to do the leg work and create a module to make it
: easy to use.

The question is how far you have to go with this. Since unicode is compatible
with ascii, you can still say

    use utf8;
    print "foo\n";

But a utf8 encoding applies to all the strings in its scope.  What should

    use utf8 'woyka_english';
    print "foo\n";

do?  Encode "foo\n" into woykan, probably, and reverse translate on print.

Or not...

Larry
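
The trade-off in Larry's reply can be made concrete with a minimal sketch of a word-level encoding. This is not the woyka scheme itself, whose details are not given in this thread. The names build_codec, encode_words and decode_words are hypothetical, and the sketch assumes words are separated by single spaces and ignores punctuation entirely.

```perl
use strict;
use warnings;

# Hypothetical word-level codec: each distinct word gets a small integer ID.
sub build_codec {
    my (@words) = @_;
    my (%id_of, @word_of);
    for my $word (@words) {
        next if exists $id_of{$word};
        $id_of{$word} = scalar @word_of;    # next free ID
        push @word_of, $word;
    }
    return (\%id_of, \@word_of);
}

# Replace every word by its ID; word boundaries become array boundaries.
sub encode_words {
    my ($text, $id_of) = @_;
    return [ map { $id_of->{$_} } split ' ', $text ];
}

# Turn a list of IDs back into space-separated text.
sub decode_words {
    my ($ids, $word_of) = @_;
    return join ' ', map { $word_of->[$_] } @$ids;
}

my $text = "the cat saw the dog";
my ($id_of, $word_of) = build_codec(split ' ', $text);
my $ids = encode_words($text, $id_of);

# A word-oriented question needs no scanning for spaces:
my $word_count = scalar @$ids;

# A character-oriented question forces a decode back to characters first:
my $char_count = length(decode_words($ids, $word_of));

print "words: $word_count, characters: $char_count\n";
```

Counting words is a single array-length lookup on the encoded form, while anything that cares about individual characters has to pay for decode_words first, which is the slowdown described above.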


