Thank you for the comments, and sorry about the delay in replying.
: This document describes the encoding used in plain text electronic
: mail and network news in several Japanese networks.
This should be "... used in the plain text parts of electronic
mail and network news messages in several ...".
Hmm. I can see what you're getting at, but if I use your wording,
people might get the impression that iso-2022-jp is currently also
used in the plain text parts of message headers, but it isn't. So
perhaps I should change it to read "... used in the bodies of
electronic mail and network news messages in several ...".
intention that JUNET encoding shall be allowed there too, in
accordance with the rules of RFC 1342? Either way, it may be
useful to include some text about the use or non-use of JUNET
encoding in message headers.
Very true. Currently MIME and RFC 1342 are intended to proceed
together along the standards track, so the iso-2022-jp document should
probably also mention its relationship to RFC 1342.
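For illustration only, here is a rough Python sketch of what an RFC 1342 style encoded-word carrying iso-2022-jp text might look like in a header. The helper name encoded_word is made up for this example, it relies on Python's built-in iso2022_jp codec, and it ignores the length limit on encoded-words, so treat it as a sketch rather than a reference implementation.

```python
import base64

def encoded_word(text: str, charset: str = "iso-2022-jp") -> str:
    """Wrap text as =?charset?B?base64?= (hypothetical helper, no length check)."""
    raw = text.encode(charset)  # escape sequences are part of the encoded bytes
    return "=?%s?B?%s?=" % (charset, base64.b64encode(raw).decode("ascii"))

def decode_word(word: str) -> str:
    """Inverse of encoded_word for the simple B-encoded case only."""
    _, charset, _encoding, payload, _ = word.split("?")
    return base64.b64decode(payload).decode(charset)

# Three kanji written as Unicode escapes so the source stays 7-bit clean.
sample = "\u65e5\u672c\u8a9e"
word = encoded_word(sample)
```

The point of the sketch is only that the header value stays pure ASCII while the escape sequences travel inside the base64 payload.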
: Formal Description
: This section provides a formal description of the JUNET encoding. In
: the event that this description is not consistent with the above
: informal description, this formal description shall take precedence.
The formal description is not as complete as the informal
description. It only specifies the syntax (which octet sequences
are allowed), not the semantics (which character set shall be used
to interpret a certain <segment>, or *<text> outside <segment>s).
The semantics is only specified in the informal specification.
True, but I don't feel that it is worth it to make the "Formal
Description" complete. I think I'll change the wording to make the two
sections complementary rather than alternate.
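To make the syntax/semantics split concrete, here is a small Python sketch of the semantic half: it walks an octet stream and labels each run with the character set selected by the preceding escape sequence. The function name split_segments and the charset labels are my own, and the assumption that text starts out in ASCII comes from the informal description, not from the grammar.

```python
# The four escape sequences in the draft, mapped to the character set each
# one selects. The label strings are informal names chosen for this sketch.
ESCAPES = {
    b"\x1b(B": "ASCII",
    b"\x1b(J": "JIS X 0201-Roman",
    b"\x1b$@": "JIS X 0208-1978",
    b"\x1b$B": "JIS X 0208-1983",
}

def split_segments(data: bytes):
    """Return (charset, octets) pairs; assumes the stream starts in ASCII."""
    charset = "ASCII"
    out = []
    start = 0
    i = 0
    while i < len(data):
        if data[i] == 0x1B:
            seq = data[i:i + 3]
            if seq not in ESCAPES:
                raise ValueError("unexpected escape sequence at offset %d" % i)
            if i > start:
                out.append((charset, data[start:i]))
            charset = ESCAPES[seq]
            i += 3
            start = i
        else:
            i += 1
    if start < len(data):
        out.append((charset, data[start:]))
    return out
```

Nothing in the grammar alone tells a reader how to fill in that table, which is why the two sections need each other.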
: CHAR = <any ASCII character> ; ( 0-177, 0.-127.)
: text = <any CHAR, including bare
: CR & bare LF, but NOT
: including CRLF>
I see two problems with the definition of <text>, one formal and
one substantial.
Formal problem: <text> as defined is one single character. How
can the definition then say that <text> is "any CHAR ... but not
including CRLF"? <CRLF> is a _sequence_ of two characters.
I took that part straight from RFC 822. If it's good enough for RFC
822, it's good enough for the iso-2022-jp doc. I appreciate your
concern, but I really don't want to bog down this doc with
incomprehensible gobbledeegook. :-)
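For what it's worth, the sequence-level reading of that rule is easy to sketch in Python: bare CR and bare LF pass, and only the two-octet pair CR LF is rejected. The name is_text_run is invented for this example, and the pattern is my interpretation of the RFC 822 wording, not something taken from either document.

```python
import re

# Any 7-bit octet other than CR, or a CR that is not immediately followed
# by LF. The CRLF exclusion is a property of the sequence, so it is written
# as a lookahead rather than as a restriction on a single character.
TEXT_RUN = re.compile(rb"(?:[\x00-\x0c\x0e-\x7f]|\r(?!\n))*")

def is_text_run(data: bytes) -> bool:
    """True if data is 7-bit and contains no CR LF pair."""
    return TEXT_RUN.fullmatch(data) is not None
```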
Substantial problem: Is <ESC> allowed as <text>? The present
definition implies that, but obviously at least <ESC> can't be
allowed in the context
ESC "(" ( "B" / "J" )
/ ESC "$" ( "@" / "B" )
And I guess other similar escape sequences, which would indicate
a switch to other character sets according to ISO 2022, are
disallowed too. Maybe <ESC> should be excluded from <CHAR>?
OK, I'll exclude ESC from "text".
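A quick Python sketch of what that change buys: once ESC is no longer valid text, any ESC in the stream must begin one of the four listed escape sequences, so stray ISO 2022 designations are rejected. The names below are hypothetical, and the check is deliberately loose: it does not verify that double-byte segments contain well-formed pairs or that the message ends in ASCII.

```python
import re

# Revised text: any 7-bit octet except ESC (0x1B).
TEXT = rb"[\x00-\x1a\x1c-\x7f]"
# The four escape sequences quoted from the draft.
ESCAPE = rb"\x1b(?:\([BJ]|\$[@B])"
STREAM = re.compile(rb"(?:" + TEXT + rb"|" + ESCAPE + rb")*")

def is_well_formed(data: bytes) -> bool:
    """True if every ESC in data starts one of the permitted sequences."""
    return STREAM.fullmatch(data) is not None
```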
: Additional restrictions that are difficult to describe in the above
: are as follows.
: Adjacent segments should have different escape sequences. For
: example, the following is not recommended:
: ESC $ B .... ESC $ B ....
The use of "should" indicates that this rule isn't a
"restriction" but rather a recommendation.
Actually, someone else pointed out that this part really isn't
necessary, so it'll be removed.