Re: New Last Call: 'Tags for Identifying Languages' to BCP

2004-12-18 11:51:39
 Date: 2004-12-15 14:41
 From: "Peter Constable" <petercon(_at_)microsoft(_dot_)com>
 To: ietf-languages(_at_)alvestrand(_dot_)no, ietf(_at_)ietf(_dot_)org
[...]
How is it possible to predict ahead of time what is the worst-case
length for an RFC 3066-registered language tag?

In some contexts, the length is limited by the context
(e.g. encoded-words, Content-Language fields in an
Internet Message).
 
Neither is possible. In light of that, I think it best to make sure
implementers of the revised RFC 3066 be reminded that some
implementations may impose limits (whether those implementers be
constructing tags or passing them from one process to another), and for
implementers to incorporate robustness into their implementations so
that they can respond gracefully if an unexpectedly-long tag is
encountered -- after all, no matter what limit could be imposed in a
revision to RFC 3066, there's no way to stop malware from sending bad
data.
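
The kind of graceful handling described above might look like the following minimal sketch. The local limit of 64 characters and the function name are assumptions made for illustration only; RFC 3066 bounds each subtag at 8 characters but does not bound the number of subtags, so any overall limit is an implementation choice.

```python
import re
from typing import Optional

# Hypothetical local limit; RFC 3066 itself sets no overall maximum.
LOCAL_MAX_TAG_LENGTH = 64

# RFC 3066 grammar: a primary subtag of 1-8 letters, followed by any
# number of "-"-separated subtags of 1-8 letters or digits.
RFC3066_TAG = re.compile(r"[A-Za-z]{1,8}(-[A-Za-z0-9]{1,8})*")

def accept_language_tag(tag: str) -> Optional[str]:
    """Return the tag if it is usable here, or None so the caller can fall back."""
    # Check the length first so oversized input is rejected cheaply.
    if len(tag) > LOCAL_MAX_TAG_LENGTH:
        return None
    if RFC3066_TAG.fullmatch(tag) is None:
        return None
    return tag
```

A caller would treat None as "no usable language information" and carry on, rather than truncating the tag or aborting.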

(How *do* encoded-word parsers react if a bogus charset or language tag
that's 2k octets long is encountered?

By definition, that cannot happen. No encoded-word may be
longer than 75 octets.  A sequence longer than that limit,
even if it matches all other characteristics of an
encoded-word, is treated as ordinary ASCII text (RFC 2047,
section 6.1, paragraph marked "(1)").  No header field
line may be longer than 998 octets (not counting the
terminating CRLF pair), so 2k is simply not permitted.
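
As a rough illustration of that rule, a parser can apply the 75-character test before treating a token as an encoded-word at all. The regular expression below is a simplified stand-in for the RFC 2047 grammar (with the optional "*language" suffix from RFC 2231 falling inside the charset part), and the function name is hypothetical.

```python
import re

MAX_ENCODED_WORD = 75  # RFC 2047 limit, delimiters included

# Simplified shape: "=?" charset[*language] "?" encoding "?" encoded-text "?="
ENCODED_WORD = re.compile(r"=\?[^?\s]+\?[BbQq]\?[^?\s]*\?=")

def is_encoded_word(token: str) -> bool:
    """True only if the token has the encoded-word shape and fits in 75 characters."""
    if len(token) > MAX_ENCODED_WORD:
        # Too long: to be treated as ordinary text, not as an encoded-word.
        return False
    return ENCODED_WORD.fullmatch(token) is not None
```

Under this check a token carrying a 2k-octet charset or language tag is never decoded; it simply falls through as plain text.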

The encoded-word spec already allows for segmenting long strings;

To be a bit more precise, it permits text to be encoded to
be split across multiple encoded-words (with several
restrictions); the encoded-words themselves cannot be in
any way segmented or split.  That is because an encoded-word
is treated by a MIME-unaware application as a single RFC
[2]822 word.
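
A minimal sketch of that splitting, assuming "B" (base64) encoding and a stateless charset such as UTF-8, is shown below. The function name and defaults are hypothetical. Each encoded-word is built from whole characters only and is complete in itself, and the charset/language tag is repeated in every one, so a long tag directly reduces the room left for text.

```python
import base64

MAX_ENCODED_WORD = 75  # RFC 2047 limit, delimiters included

def encode_words(text: str, charset: str = "utf-8", language: str = "") -> list[str]:
    """Split text across several self-contained 'B' encoded-words."""
    tag = f"{charset}*{language}" if language else charset
    prefix, suffix = f"=?{tag}?B?", "?="
    # Base64 maps every 3 octets to 4 characters, so this is the octet budget per word.
    budget = (MAX_ENCODED_WORD - len(prefix) - len(suffix)) // 4 * 3
    words, chunk = [], b""
    for ch in text:
        octets = ch.encode(charset)
        if len(octets) > budget:
            raise ValueError("charset/language tag leaves no room for encoded text")
        # Start a new encoded-word rather than split a character across two.
        if len(chunk) + len(octets) > budget:
            words.append(prefix + base64.b64encode(chunk).decode("ascii") + suffix)
            chunk = b""
        chunk += octets
    if chunk:
        words.append(prefix + base64.b64encode(chunk).decode("ascii") + suffix)
    return words
```

In a header field the resulting encoded-words would be separated by linear white space, which a decoder ignores when joining adjacent encoded-words.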

could it not also be revised to allow segmenting for the parameters,
which would also make it more robust?)

If you're referring to RFC 2231 extensions to Content-Type
and Content-Disposition field parameters, that's a separate
matter.

In general, though, as MIME has been around for more than a
decade and Internet Messages for more than three decades,
with a substantial installed base of interoperating
implementations, in what has become one of the core Internet
protocols, any changes would have to be backwards compatible
or would have to be negotiated between sender and receiver
at the same protocol level, or would require a lengthy
transition period before pulling the rug out from under
existing implementations.  It's probably more likely that a
separate next-generation system would be implemented first.
