Re: Comments on draft-shafranovich-mime-sql-03

2013-01-23 22:37:53
Thanks for your comments. My understanding is that the SQL standards
as specified by ISO allow for variants and multiple vendor
implementations, which are not guaranteed to be compatible with one
another. This is why the ISO standard defines conformance requirements
and several levels of conformance. Even SQLite claims conformance to
SQL-92, albeit partially. I also took a quick look at several other
major DBMSs, and they all have some level of conformance to the
standard.

Since this Internet-Draft only registers a media type, the question
of which implementations exist and how compatible they are is best
left to ISO. The interoperability considerations section makes this
clear:

      While a single standard exists ([ISO.9075.2011]), vendor
      implementations of the standard vary significantly.  Implementors
      and users should make sure that the exchanged SQL files match to
      the specific database/tool and version that they are using.

This is somewhat similar to HTML, where several specs exist but
implementations vary greatly. One can argue that years ago
"IE-specific" HTML was not the same as "Netscape-specific" HTML, just
as the various SQL dialects differ from one another.

Regarding the charset issue, you are correct that when communicating
between SQL clients and SQL servers, the encoding may be specified
inline via a SQL command, or out of band via driver or session
parameters. However, this media type is not intended for that type of
communication; rather, it is intended for more mundane things like
exchanging SQL files over an HTTP connection or in an email message.
In those cases, I do not see any other way of specifying the
encoding.
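
To illustrate the email case, here is a minimal sketch (not part of
the draft) of labeling a SQL file with the proposed media type and a
charset parameter, using Python's standard email package. The function
name build_sql_part and the sample statement are made up for
illustration, and application/sql is assumed to be the registered
type.

```python
from email.message import EmailMessage


def build_sql_part(sql_text: str, charset: str = "utf-8") -> EmailMessage:
    """Wrap SQL text in a MIME entity labeled application/sql.

    The charset parameter on the Content-Type header is the only place
    the encoding is declared; nothing is added to the SQL itself.
    """
    part = EmailMessage()
    part.set_content(
        sql_text.encode(charset),      # bytes on the wire
        maintype="application",
        subtype="sql",
        params={"charset": charset},   # e.g. application/sql; charset="utf-8"
    )
    return part


# Hypothetical SQL content containing a non-ASCII character.
part = build_sql_part("INSERT INTO t (name) VALUES ('Björn');")
received = part.get_content().decode(part.get_content_charset())
```

The receiver has only the header to go on here, which is the point
being made above.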

Going with that approach, it would seem that the draft needs slight
rewording here:

   Optional parameters:

      "charset" - indicates the character set to be used.  When not
      specified, US-ASCII should be assumed.

      Implementators should note that as per section 4.2.4, part 2 of
      [ISO.9075.2011], all SQL implementations are required to support
      US-ASCII, ISO-8859-1, UTF-8, UTF-16 and UTF-32 encodings.  Those
      should be the only valid values for the charset parameter unless
      some other agreement exists between the parties.

      When using [Unicode], the [ISO.9075.2011] standard dictates a
      preference of UTF-8 over UTF-16, and UTF-16 over UTF-32.

I would omit the last two paragraphs since we are not dealing with SQL
implementations directly, and rewrite it as follows:

   Optional parameters:

      "charset" - indicates the character set to be used.  When not
      specified, US-ASCII should be assumed.

The reason I specify US-ASCII as the default is that the ABNF grammar
of the SQL specification includes 88 characters that are part of
ASCII. That seems to make it the default encoding for SQL in the
absence of any other information.
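
As a rough sketch of what a receiver might do under the wording
proposed above (assuming the US-ASCII default stays in the
registration), the charset can be read from the Content-Type value
and US-ASCII used when the parameter is absent. The helper names
sql_charset and decode_sql are hypothetical.

```python
from email.message import Message

DEFAULT_CHARSET = "us-ascii"  # assumed default, per the proposed wording


def sql_charset(content_type: str) -> str:
    """Return the charset for an application/sql Content-Type value."""
    msg = Message()
    msg["Content-Type"] = content_type
    if msg.get_content_type() != "application/sql":
        raise ValueError("not an application/sql media type")
    # get_content_charset() lowercases the value and returns the
    # fallback when no charset parameter is present.
    return msg.get_content_charset(DEFAULT_CHARSET)


def decode_sql(body: bytes, content_type: str) -> str:
    """Decode a SQL body according to its declared (or default) charset."""
    return body.decode(sql_charset(content_type))
```

For example, sql_charset("application/sql") falls back to the
default, while a body labeled charset=UTF-8 is decoded as UTF-8.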

Yakov

On Wed, Jan 23, 2013 at 3:49 PM, Bjoern Hoehrmann 
<derhoermi(_at_)gmx(_dot_)net> wrote:
Hi,

  Regarding <http://tools.ietf.org/html/draft-shafranovich-mime-sql-03>,
I think the document should be more clear that there are many variants
of the SQL format and the media type is intended as "umbrella type" for
them all, for instance, the Abstract should refer to, say, "variants of
the Structured Query Language" and not "the Structured Query Language".
The "Published specification" field in the registration template should
also note that there exist individual specifications for individual
derived formats (like "SQL As Understood By SQLite").

The proposed type has an optional 'charset' parameter. As I understand
it, SQL formats typically indicate character encoding, if at all, in
proprietary ways inline in SQL resources, or entirely out of band, and
for exchange, everybody seems to be settling on UTF-* encodings, so the
parameter is unlikely to serve a useful purpose, but does create some
problems, like how implementations should handle HTTP responses where
the HTTP header says `Content-Type: application/sql;charset=iso-8859-1`
while UTF-8 is specified inline. I think the type would do better
without this parameter.

Whether it is kept or not, the media type registration should not
define a default. If the character encoding is not explicitly specified,
then it is unknown, and it should be up to individual applications and
their conventions for their SQL variants how to handle that case.

regards,
--
Björn Höhrmann · mailto:bjoern(_at_)hoehrmann(_dot_)de · http://bjoern.hoehrmann.de
Am Badedeich 7 · Telefon: +49(0)160/4415681 · http://www.bjoernsworld.de
25899 Dagebüll · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/
