comments on SHAVE

1993-10-25 22:10:49
First of all, I think using SGML to represent attribute/value
information is a wonderful idea, and the general approach of SHAVE is
quite good.

But I'm wondering if we can't get away with something simpler, or at
least a simpler way of describing it.

Unfortunately, SGML is a culture in itself.  It has its own
vocabulary, and its own conventions that are strange to those in other
lands.  MIME also is a culture.  My experience so far is that it's
difficult enough for outsiders to assimilate the MIME culture, without
having to learn SGML also.

It appears that if I want to design an application using SHAVE, I have
to know enough SGML to write a DTD.  It also appears that if I want to
parse a SHAVE formatted body part, I need to either have the DTD or
know which SHAVE parameters have which kinds of values.  (else how do
I know when to look for a matching end-tag and when not to?)

What I'd like to have is a generalized parser for any SHAVE body part,
that does not need to know the DTD or equivalent information, in order
to extract the relevant parameters and values from the body part.  Of
course such a parser wouldn't be able to do full syntax checking
(e.g. it wouldn't know which attribute names were valid for a
particular element), but for many applications this would not be a
problem.

One way to do this might be to require each parameter name to have a
suffix character which indicates what kind of value it takes.  For
example, a parameter name that takes a single text value could end
with a '+', one that takes a sequence of parameter/value pairs could
end with a '/', one that takes a list of text values could end with a
'%', etc.  (I don't know which of these would work within SGML, but
surely there are a few non-alphabetic characters which are legal for
this purpose.)
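
To make the suffix idea concrete, here is a rough sketch in Python of
how a DTD-less parser could decide what kind of value follows a
parameter name.  The suffix table just repeats the examples above, and
the names SUFFIX_KINDS and classify are made up for illustration; none
of this comes from the SHAVE draft.

```python
# Hypothetical suffix table, copied from the examples in this message.
# Whether these characters are usable inside SGML names is an open question.
SUFFIX_KINDS = {
    "+": "text",   # a single text value
    "/": "pairs",  # a sequence of parameter/value pairs
    "%": "list",   # a list of text values
}


def classify(name):
    """Split a parameter name into (base name, kind of value it takes)."""
    if name and name[-1] in SUFFIX_KINDS:
        return name[:-1], SUFFIX_KINDS[name[-1]]
    # No recognized suffix: the parser cannot tell without outside help.
    return name, "unknown"
```

With something like this, the parser knows from the name alone whether
to look for a matching end-tag, which is exactly the information it
would otherwise have to get from the DTD.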

SHAVE would be easier to use if it first defined the SHAVE syntax in
non-SGML terms (say with an 822-style grammar).  A later section would
describe how SHAVE fits into the SGML world.  It would also be helpful
if there were a simple way to describe a SHAVE document, which could
be mechanically translated into an SGML DTD.

What follows are nits:

+ a limit of 8 characters on element and attribute names would be very
painful.  Does it break SGML compatibility to extend this a bit?

+ it might be nice to provide an alternate way of including arbitrary
octet-strings as values, say by encoding them in base64.  (maybe a
reserved tag/end-tag pair?)

+ if I understand rule 17, there's no way to encode a string like
"this is a string\r\n" because the trailing \r\n will be discarded
(even if there are multiple newlines).
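
The last two nits are related: if arbitrary octet-strings could be
carried in base64, a value ending in \r\n would survive even if
trailing newlines are dropped from the encoded text.  A minimal sketch
in Python, assuming a hypothetical base64 value form (the function
names are mine, not part of SHAVE):

```python
import base64


def encode_octets(value: bytes) -> str:
    """Encode an arbitrary octet-string as base64 text."""
    return base64.b64encode(value).decode("ascii")


def decode_octets(text: str) -> bytes:
    """Decode base64 text, ignoring any whitespace or line breaks
    that were added or removed around it in transit."""
    return base64.b64decode("".join(text.split()))


original = b"this is a string\r\n"
encoded = encode_octets(original)

# Newlines tacked onto (or stripped from) the encoded form do not
# affect the decoded octets, so the trailing \r\n is preserved.
assert decode_octets(encoded + "\r\n\r\n") == original
```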

Keith Moore