Re: RFC 2152 - UTF-7 clarification

2015-10-09 00:08:56
On Thu, Oct 08, 2015 at 09:40:25PM +0300, A. Rothman wrote:

Just in case someone missed it (I almost did): Mark added his own
detailed comments on the test cases, but they got buried within a long
quote from my original email so may have gone unnoticed. To recap, here
are the two interpretations:

+A-             empty + 6 (unnecessary) padding bits
+AA-            empty + 12 (unnecessary) padding bits
+AAA-           \U+0000, and 2 (required) padding bits
+AAAA-          \U+0000, and 8 (6 extra) padding bits
+AAAAA-         \U+0000, and 14 (12 extra) padding bits
+AAAAAA-        \U+0000\U+0000, and 4 (required) padding bits
+AAAAAAA-       \U+0000\U+0000, and 10 (6 extra) padding bits


+A-             illegal !modified base64
+AA-            illegal !a multiple of 16 bits in modified base64
+AAA-           legal   0x0000 (last 2 bits zero)
+AAAA-          illegal !a multiple of 16 bits in modified base64
+AAAAA-         illegal !modified base64
+AAAAAA-        legal   0x0000, 0x0000 (last 4 bits zero)
+AAAAAAA-       illegal !a multiple of 16 bits in modified base64


Does anyone else want to vote or comment on the two interpretations above?

Thanks for pointing this out more clearly.  Yes, they disagree.
However, the manner in which they disagree is rather simple.

They agree in all the cases where the padding is *minimal*.

The first variant always tolerates non-minimal padding, allowing
anything less than 16 bits per the specification.  The second
variant never tolerates non-minimal padding, because there's no
need to produce it.
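
To make the difference concrete, here is a rough Python sketch of a
decoder for the characters between "+" and "-", with a flag selecting
between the two readings.  The function name, the "strict" flag, and
the choice to reject non-zero padding bits in both modes are my own
assumptions for illustration; neither is taken from RFC 2152 or from
any existing implementation.  The empty body ("+-", a literal "+") is
left to the caller.

```python
import string

# Standard base64 alphabet; modified base64 simply omits '=' padding.
B64 = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def decode_shift_body(body, strict):
    """Decode the text between '+' and '-' into 16-bit code units.

    Returns a list of ints, or None if the body is rejected.
    strict=False: tolerate any amount of padding below 16 bits.
    strict=True:  accept only minimal padding (fewer than 6 bits).
    """
    acc = 0
    for ch in body:
        idx = B64.find(ch)
        if idx < 0:
            return None  # not a modified base64 character
        acc = (acc << 6) | idx

    nbits = 6 * len(body)
    pad = nbits % 16  # bits left over after the last whole code unit

    # Assumption of this sketch: padding bits must be zero in both modes.
    if acc & ((1 << pad) - 1):
        return None

    # Six or more leftover bits mean the final character (at least)
    # carried no data at all, i.e. the padding was not minimal.
    if strict and pad >= 6:
        return None

    acc >>= pad
    n = nbits // 16
    return [(acc >> (16 * (n - 1 - i))) & 0xFFFF for i in range(n)]
```

With this sketch, decode_shift_body("AAAA", strict=False) yields one
zero code unit, while the same call with strict=True rejects the
input, matching the "+AAAA-" rows of the two tables.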

It is clear that clients should produce minimal padding, and we
seem to disagree on whether to apply Postel's principle to the
decoder or not.
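
For the producing side, a minimal sketch of an encoder that never
emits more padding than needed might look like the following.  The
function name is hypothetical, and the sketch only covers the body
between "+" and "-" for a list of 16-bit code units; it is an
illustration, not a complete UTF-7 encoder.

```python
import string

# Standard base64 alphabet; modified base64 simply omits '=' padding.
B64 = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def encode_shift_body(units):
    """Encode 16-bit code units as modified base64 with minimal padding."""
    acc = 0    # bit accumulator
    nbits = 0  # number of valid bits currently held in acc
    out = []
    for u in units:
        acc = (acc << 16) | (u & 0xFFFF)
        nbits += 16
        # Emit every complete 6-bit group.
        while nbits >= 6:
            nbits -= 6
            out.append(B64[(acc >> nbits) & 0x3F])
        acc &= (1 << nbits) - 1  # keep only the bits not yet emitted
    if nbits:
        # Pad the final partial group with zero bits (at most 4 of them).
        out.append(B64[(acc << (6 - nbits)) & 0x3F])
    return "".join(out)
```

One code unit becomes three characters (2 padding bits) and two code
units become six characters (4 padding bits), which are exactly the
cases on which both interpretations agree.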

This is not a major disagreement; such differences of interpretation
are endemic whether the standard is clear or not.  Many implementors
are lazy, and stop writing code when the expected cases work.

While this is no excuse for ambiguous specifications, in this case
I don't think a revision is warranted.  Encoders that generate
sensibly minimal padding will not run into any friction with
non-broken decoders.  Encoders that get creative might find that
some decoders object, whether the standard allows their creativity
or not.

-- 
        Viktor.