
Re: [Slim] IETF last call for draft-ietf-slim-negotiating-human-language (Section 5.4)

2017-02-13 15:58:40
Gunnar said:

"With some hesitation I suggest to let it mean to see a speaking person."

[BA] Is this for the purpose of enabling lip reading?

Assuming that we go that way, how would captioning be negotiated?

On Mon, Feb 13, 2017 at 1:23 PM, Gunnar Hellström <
gunnar(_dot_)hellstrom(_at_)omnitor(_dot_)se> wrote:

Bernard,

I just sent comments in which I also covered the "silly states" topic, with
views similar to yours.

On 2017-02-13 at 20:06, Bernard Aboba wrote:

Looking over Section 5.4, it seems to me that the title "Silly States" may
not be appropriate, because it mixes discussion of combinations of media
and language that have an "undefined" meaning with combinations for which
normative guidance can be provided. So rather than having a single "Silly
States" section, perhaps we can have a section on "Undefined States" (for
those combinations which have an undefined meaning) and provide normative
guidance on defined combinations elsewhere.

5.4.  Silly States
<https://tools.ietf.org/html/draft-ietf-slim-negotiating-human-language-06#section-5.4>

   It is possible to specify a "silly state" where the language
   specified does not make sense for the media type, such as specifying
   a signed language for an audio media stream.

   An offer MUST NOT be created where the language does not make sense
   for the media type.  If such an offer is received, the receiver MAY
   reject the media, ignore the language specified, or attempt to
   interpret the intent (e.g., if American Sign Language is specified
   for an audio media stream, this might be interpreted as a desire to
   use spoken English).

   A spoken language tag for a video stream in conjunction with an audio
   stream with the same language might indicate a request for
   supplemental video to see the speaker.
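
[Illustration, not draft text: a hypothetical offer fragment hitting the
"silly state" above could be

   m=audio 49170 RTP/AVP 0
   a=hlang-recv:ase

where hlang-send/hlang-recv stand in for whatever language attributes the
current draft revision defines, and "ase" is the BCP 47 tag for American
Sign Language. Per the quoted text, a receiver of this would be free to
reject the audio, ignore the attribute, or guess that spoken English
("en") was meant.]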

[BA] Rather than using terms like "might" for combinations that could have a
defined meaning, I would like to see the specification provide normative
language on these use cases. In particular, I would like the specification to
describe:

a. What it means when a spoken language tag is included for a video stream.
   Is this to be interpreted as a request for captioning?

b. What it means when a signed language tag is included for an audio stream.
   Is the meaning of this "undefined", and if so, should it be ignored?

c. What it means when a signed language tag is included for a text stream.

If some of these scenarios are not defined, the specification can say
"this combination does not have a defined meaning" or something like that.

See my recent comments for more views. I support the idea of being normative
and specific where possible.
A complication is that the same language tags are used for both written and
spoken language.
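
For example, "en" is the tag for English whether it is spoken or written;
in a hypothetical fragment (the attribute names here are only placeholders
for the draft's language attributes)

   m=audio 49170 RTP/AVP 0
   a=hlang-recv:en
   m=text 49172 RTP/AVP 98
   a=hlang-recv:en

only the media type, not the tag, says that the first is spoken and the
second written English, so a tag on video media cannot by itself
distinguish a request for captions from a request to see the speaker.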

So we have the following possible combinations and interpretations of
"silly states":

1. Spoken/written tag in video media: this can mean that a speaking person
is shown, or that captions are overlaid on the video.
With some hesitation, I suggest interpreting it as a request to see the
speaking person.
The draft adds a requirement that the same language appear in the audio
stream in the same direction for that interpretation to apply. Should that
mean that if there is another language in the audio stream, then the
spoken/written tag in the video stream should mean captions in the specified
language? That sounds useful for some cases, but it is complex to interpret
and unfair to the users who would benefit from captions in the same language
as the audio.
Summary: I think we had better use the interpretation "see a speaking
person" regardless of what language is indicated for audio.
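
A sketch of that preferred interpretation (again with placeholder
attribute names):

   m=audio 49170 RTP/AVP 0
   a=hlang-recv:en
   m=video 51372 RTP/AVP 31
   a=hlang-recv:en

i.e. "I want to receive spoken English and video in which I can see the
speaker", and the video line would keep that meaning even if the audio
line carried another language tag.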

2. Signed language tag in audio media: this can mean audio from a signing
person. That could be anything from near silence to spoken words
corresponding to the signs as far as feasible. This is usually perceived as
disturbing by sign language users, but it exists, e.g. when one person
needs to communicate with both hearing and deaf persons simultaneously.
There are also variants of signing, called sign-supported language, with
signs expressed in spoken-language word order and grammar. That can more
easily be combined with spoken language, but it would more likely be
indicated by a spoken language tag in audio media.
Summary: I am inclined to let a signed language tag in audio media mean
audio from the signing person, possibly used for the rare cases when it
has some relevance for language communication.
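
A hypothetical offer from such a signing person (placeholder attribute
names) could look like

   m=video 51372 RTP/AVP 31
   a=hlang-send:ase
   m=audio 49170 RTP/AVP 0
   a=hlang-send:ase

where the audio line would, under this interpretation, simply mean
whatever audio the signer produces while signing.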

3. Sign language tag in text media: there are some ways to represent sign
language in various kinds of symbol or text notation. Some are encoded in
Unicode. One is a system called SignWriting. Some fingerspelling methods
also have fonts corresponding to characters in code pages. There is also an
informal way to write manuscripts for signing, as words in capitals
approximately corresponding to signs, often with notation added for sign
language expressions that have no direct correspondence to words. None of
these systems is common in real-time conversation, but I have seen examples
of such use.
Summary: I think we can leave freedom here and just specify that a sign
language tag in text media means some representation of sign language, or a
corresponding fingerspelling system, in text media.
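
Under that reading, a hypothetical fragment (placeholder attribute names)
such as

   m=text 49172 RTP/AVP 98
   a=hlang-send:ase

would mean that the sender intends to transmit some textual or symbolic
representation of American Sign Language, e.g. SignWriting or a
fingerspelling alphabet, over the text stream.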

If these conclusions are accepted, we can formulate modified text. Note
that the case with a spoken/written language tag in video media is mentioned
in two places in the draft.

Regards
Gunnar




_______________________________________________
SLIM mailing list
SLIM@ietf.org
https://www.ietf.org/mailman/listinfo/slim


--
-----------------------------------------
Gunnar Hellström
Omnitor
gunnar(_dot_)hellstrom(_at_)omnitor(_dot_)se
+46 708 204 288