ietf-822

Re: Getting RFC 2047 encoding right

2003-12-08 11:44:52

Well, I do search, and it's extremely expensive to decode all the 
messages in order to do a search. To search, I must decode when the 
message goes into the data store.

Storing two instances of the same header is also not a good answer. 
First of all, it's bad practice. Never keep two separate variables
that are supposed to stay in sync.

Every fast search engine I know of creates indices out of the data
being searched.  I don't see why searching mail should be different.
You do have to rebuild the index for that message if the message 
is changed, but that should be a rare case.
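
As a rough sketch of that idea, assuming Python's standard email.header module: the raw header stays the only stored copy, and the index is derived from its decoded form and rebuilt when the message changes. The class name SubjectIndex and its methods are made up for illustration, and splitting on whitespace stands in for a real tokenizer.

```python
from email.header import decode_header, make_header


def decoded_subject(raw):
    """Decode any RFC 2047 encoded-words in a raw header value."""
    return str(make_header(decode_header(raw)))


class SubjectIndex:
    """Hypothetical word index derived from raw Subject headers."""

    def __init__(self):
        self.raw = {}    # msg_id -> raw header, the single source of truth
        self.index = {}  # lowercased word -> set of msg_ids

    def add(self, msg_id, raw_subject):
        # Re-adding a message rebuilds its index entries.
        self.remove(msg_id)
        self.raw[msg_id] = raw_subject
        for word in decoded_subject(raw_subject).lower().split():
            self.index.setdefault(word, set()).add(msg_id)

    def remove(self, msg_id):
        old = self.raw.pop(msg_id, None)
        if old is None:
            return
        for word in decoded_subject(old).lower().split():
            ids = self.index.get(word)
            if ids:
                ids.discard(msg_id)
                if not ids:
                    del self.index[word]

    def search(self, word):
        return set(self.index.get(word.lower(), ()))
```

Decoding happens once per message at indexing time, and a search is a dictionary lookup. The decoded text is never stored as a second copy of the header, only the index derived from it.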

Second, it only solves the common case - if I go down that path I have
problems again soon enough. Suppose I want to answer a message which 
had an overlong encoded-word. Should I blithely emit overlong 
encoded-words? 
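
For reference, RFC 2047 limits an encoded-word to 75 characters, including charset, encoding, encoded-text and delimiters. A minimal sketch of spotting overlong ones before re-emitting a header might look like this; the regular expression is a simplified matcher and the function name is hypothetical.

```python
import re

# Simplified encoded-word pattern: =?charset?encoding?encoded-text?=
ENCODED_WORD = re.compile(r"=\?[^?\s]+\?[QqBb]\?[^?\s]*\?=")
MAX_ENCODED_WORD = 75  # limit from RFC 2047, section 2


def overlong_encoded_words(raw_header):
    """Return the encoded-words in raw_header that exceed the length limit."""
    return [w for w in ENCODED_WORD.findall(raw_header)
            if len(w) > MAX_ENCODED_WORD]
```

What to do with the ones it finds (re-encode, split, or pass through) is exactly the open question here; the sketch only detects them.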

For that matter, suppose you want to answer a message that has some
other kind of invalid header field - maybe one that isn't encoded at
all, and has illegal characters.  The basic answer is that what you do
with illegal input is generally not specified - but clearly you aren't
expected to make the subject of the reply match the subject of the
message being replied to in that case.

Suppose I want to answer with "subject: re: <original> 
<ticket id>", then I risk having two encoded-words separated only by 
whitespace, and must do magic in order to preserve that space.
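
The underlying rule is that whitespace between two adjacent encoded-words is dropped on decoding (RFC 2047, section 6.2), so the space has to be carried inside one of the encoded-words. A small illustration, assuming Python's email.header module as the decoder; the sample header values are invented.

```python
from email.header import decode_header, make_header


def decode(raw):
    """Decode a raw header value into plain text."""
    return str(make_header(decode_header(raw)))


original = "=?iso-8859-1?q?caf=E9?="

# Naive join: the separating space sits between two encoded-words
# and is ignored by a conforming decoder.
naive = original + " " + "=?iso-8859-1?q?ticket?="

# The "magic": encode the space (as "_" in Q encoding) inside the
# second encoded-word so that it survives decoding.
carried = original + " " + "=?iso-8859-1?q?_ticket?="

print(repr(decode(naive)))
print(repr(decode(carried)))
```

With the standard library decoder, the first value should come out with the two words run together and the second with the space preserved.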

Why not just use an ASCII ticket id?
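
A sketch of what that would look like, again assuming Python's email.header for the decoding side. The helper reply_subject and the bracketed ticket format are made up for illustration, and line folding is ignored. Because the ticket id is ordinary text rather than an encoded-word, the whitespace in front of it is significant and survives decoding, and the original subject can be passed through without being re-encoded.

```python
from email.header import decode_header, make_header


def reply_subject(original_raw, ticket_id):
    """Build a reply subject from the raw original plus an ASCII ticket id."""
    if not ticket_id.isascii() or "=?" in ticket_id:
        raise ValueError("ticket id must be plain ASCII text")
    # The original subject is copied as-is; only plain ASCII is added.
    return f"Re: {original_raw} [{ticket_id}]"


def decode(raw):
    """Decode a raw header value into plain text."""
    return str(make_header(decode_header(raw)))


subject = reply_subject("=?iso-8859-1?q?caf=E9?=", "T-42")
print(subject)
print(decode(subject))
```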