Herron> Languages do evolve
indeed, both glacially and by fiat, which makes it a challenge to
establish a useful and well-defined set of descriptive labels.
Herron> Therefore a system for tagging such things should
> most definitely have a place for placing markings as to
> the "version" of the language.
ah, but unfortunately computers demand concise and precise
information, something which is often difficult to pry out
of academics and their literature.
Herron> For the purposes which Dana mentioned (mixed text from
> different languages) it is, to my knowledge, enough to
> mark the desired character sets.
not quite enough. If we are truly to provide a system that supports
the handicapped, some audible hints should hopefully also be
available, or else:
Content-Audible-language: <None>
would be indicated.
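
A minimal sketch of how such a marking might be attached to a message,
using Python's standard email package. The header name is the one
proposed above, not (to my knowledge) a registered MIME header; the
function name and the "en" value are purely illustrative assumptions.

```python
from email.message import EmailMessage

# Hypothetical header name, taken from the proposal in this message.
AUDIBLE_HEADER = "Content-Audible-language"

def build_tagged_message(body, audible_language=None):
    """Build a text message carrying the proposed audible-language hint.

    When no audible rendering hint is known, the header is set to the
    literal "<None>" as suggested above.
    """
    msg = EmailMessage()
    # set_content() clears existing Content-* headers, so the custom
    # header has to be added afterwards.
    msg.set_content(body)
    msg[AUDIBLE_HEADER] = audible_language if audible_language else "<None>"
    return msg

if __name__ == "__main__":
    print(build_tagged_message("Bonjour tout le monde"))
    print(build_tagged_message("Hello, world", audible_language="en"))
```

A reader that does not understand the header can simply ignore it,
which is what makes a hint of this kind cheap to carry.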
Herron> To my knowledge choosing the right glyphs is driven by
> the character set.
yes, and if a font-specific markup is provided (Application...) then
there is no issue. There are a multitude of fonts available,
but they can be classified into a much smaller set of encodings
which relate to specific writing systems. A means of marking those
up seems desirable to me, both as an interim measure
whilst awaiting a workable uni-encoding, and as an alternative to it
while it achieves adoption.
rhys>> e.g. French and French-Canadian which have different
>> capitalisation rules I believe.
French and French Canadian differ in the use of accented capital
letters. I have no idea if the accent is actually encoded in
both cases and simply not shown because of the way the fonts are
drawn, or if there is actually a code point difference, which
would make difficulties for both pronunciation software and
spellcheckers. Further, I trust Murphy to have ensured that
Apple/IBM/? have implemented this inconsistently.
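
To make the "code point difference" worry concrete, here is a small
sketch in Python. It assumes a Unicode-style encoding, which is not
what the systems discussed here necessarily use, and the helper names
are my own. It shows that an accented capital can be stored in more
than one way, and that a spellchecker has to fold the forms together
before comparing words.

```python
import unicodedata

PRECOMPOSED = "\u00c9"   # accented capital E as a single code point
DECOMPOSED = "E\u0301"   # plain E followed by a combining acute accent

def same_letter(a, b):
    """Compare two strings after normalizing both to composed form (NFC)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

def strip_accents(text):
    """Drop combining marks, mimicking a convention that omits accents."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

if __name__ == "__main__":
    print(PRECOMPOSED == DECOMPOSED)             # raw comparison
    print(same_letter(PRECOMPOSED, DECOMPOSED))  # after normalization
    print(strip_accents("\u00c9COLE"))           # accent removed
```

If the accent is merely not drawn by the font, the stored text is
unchanged and none of this matters; if it is dropped from the data
itself, something like strip_accents has already happened upstream and
the original spelling cannot be recovered without a language hint.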
So what we need is a sufficient quantity of character
sets so we can discuss old high germanic names in one
paragraph, old english in the next, and russian after
that.
We shouldn't be looking to specify particular fonts so much as writing
systems; Cyrillic and Roman would cover the examples you give.
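
As a rough illustration of tagging by writing system rather than by
font, the sketch below guesses the script of each letter from its
Unicode character name. This is an assumption-laden shortcut (the
function name is mine, and the Unicode tables call the Roman script
"LATIN"), not a proposal for how a mail reader should do it.

```python
import unicodedata

def writing_systems(text):
    """Return the set of script names found among the letters of text.

    The script is guessed from the first word of each character's
    Unicode name, e.g. "LATIN SMALL LETTER THORN" gives "LATIN".
    """
    scripts = set()
    for ch in text:
        if not ch.isalpha():
            continue  # skip spaces, digits and punctuation
        name = unicodedata.name(ch, "")
        if name:
            scripts.add(name.split()[0])
    return scripts

if __name__ == "__main__":
    for sample in ("Hildebrand", "Beowulf \u00fe\u00e6t",
                   "\u041f\u0443\u0448\u043a\u0438\u043d"):
        print(sample, sorted(writing_systems(sample)))
```

Note that the Old High German and Old English samples land in the
same writing system, so this kind of label says nothing about which
language is in use; that is the gap the language marking is meant to
fill.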
Where does the need for marking the languages
come from?
A desire to have hints for spell checking and audible renderings.
Before you jump on me about the lack of need for spellchecking
an incoming document, consider the difficulty of reestablishing
language associativity while composing replies which quote
the text of said incoming message.
The subject matter is MIME, remember? That includes audible
renderings as well as visual.
Hell, I had a further bit of devilish inspiration last night. Consider:
Content-Sign: <American Sign Language>
(he said ducking :-)
--
dana s emery <de19(_at_)umail(_dot_)umd(_dot_)edu>