[ietf-822] Review of and suggested changes for draft-crocker-inreply-react-01.txt

2020-10-21 08:52:25
One of the interesting things about the Internet standards process is how
often a particular implementation creates the architectural model, rather than
the other way around.

This specification is a case in point. It starts by offering three key
insights:

(1) It's possible to tease the design of a generic "reaction" facility out
    of various social media systems.
(2) Restricting the system to emoji, or even a subset of emoji, is essential
    to the viability of the facility.
(3) Given (1) and (2), it's possible to add a reaction capability to existing
    messaging systems, even one as well established as email.

All of these insights seem obvious in hindsight, but I for one have never
really thought about them, or about extending email in this way. As such, I
applaud Dave for coming up with this proposal.

However, like so many previous proposals, the specification now shifts rather
abruptly to implementation specifics, using an approach based on existing
header-based mechanisms like X-Face. And this is where things start to get a
little iffy.

First of all, I think there's at least one more key insight that we need to
pull out of existing reaction facilities: In many if not most cases they
operate independently of reply messages. Indeed, this is what makes them so
attractive: As a user I can offer an opinion of something without having to
write anything. I just click on a button, up comes a panel of emoji, I make a
selection, done. (Yes, I realize this is a UI detail, and we don't do UIs,
but it's such an important capability that it warrants our attention.)

And this isn't simply a matter of being lazy - although the virtues of
laziness often fail to be appreciated. The separation of reactions from
replies makes it possible to implement very different sorts of workflows.
The obvious example is voting - most reaction interfaces provide the
tally you need of user reactions automatically, whereas with replies the
tally has to be done manually - exactly the sort of thing computers are
supposed to do for us.

Second, as everyone knows, I am one of the authors of MIME, and one of the
key elements of MIME was backwards compatibility with non-MIME systems. I
firmly believe this is one of the things that made MIME successful, so it's
something I worry about in every new proposal that comes along.

Returning to the current document, I believe the proposed implementation
doesn't adequately address either of these points, and that there are
alternatives that do.

In particular, the current proposal creates a reaction mechanism that's
independent of replies but then requires that a reply be sent in order to
send a reaction. This linkage makes it very awkward to use the mechanism
for things like voting - having a long list of meaningless replies show up
as part of a vote is definitely not a Good Thing.

Of course it's possible to define a semantic where, say, an empty message body
in conjunction with a reaction header would be silently ignored. But this
brings us to the second point: Backwards compatibility. If this proposal starts
to deploy it will be into a world where no client is aware of the meaning of
this new reaction header field. Now consider what this will look like in an
unextended client: A bunch of empty replies. Definitely Not Good.

Other alternatives fare no better: Clients won't know what to do with an
"ignore" content-disposition, and will ignore it rather than the message.
Ditto for a text/ignore content type.

But there is one thing that does work, which is to put the reaction in the
message body. This way clients that don't implement the reaction facility will
display something, and what they display is actually pretty sensible.

Putting the reaction in the message body solves an even bigger backwards
compatibility problem: The least astonishment principle violation that occurs
when the participants in a discussion with upgraded clients see a bunch of
reactions but those without upgraded clients see nothing at all.

The current document does provide the option to do this: It says clients
MAY include a copy of the reaction in the message body. But making this
optional leads to even more unpredictability.

But if we always put the reaction in the message body, the header field is now
superfluous. So let's proceed on the assumption that the reaction will be
in the message body - and only in the message body - and see how far we get.

MIME provides us with multipart messages, so the obvious thing to do is
put the reaction in its own part. This, however, brings us face to face
with our first real problem: How do we combine a reaction with a reply? If
the reply is text/plain then there's a simple solution: Use multipart/mixed
containing the two parts. (Tests show this actually works pretty well.)

What if the reply is text/html? multipart/alternative? multipart/related?
If past experience with client MIME support is any indication, complex
nested MIME structures are best avoided. But in this case we have a
simple alternative: Combine if you can, but if you can't, send two messages.

So now the rule would be for conforming clients to look for a
reaction part and process it as such. Processing terminates if the
message consists solely of that part. If not, process the remainder as
a reply.

The next issue is really more of a detail: How do we label a part as a
reaction? One way to do it is to define a new Content- field,
say Content-Reaction. This is pretty much guaranteed to work. But it would
be nice if we could do this using existing Content- fields. And we have
at least two of those: Content-type and Content-Disposition.

A text/reaction media type makes quite a bit of sense - we are, after all,
restricting the repertoire to emoji, so this isn't really text/plain. And
there are other media types, e.g., multipart/related, that exist solely to
invoke specialized client handling. Finally, RFC 2046 section 4.1.4
quite specifically states that unknown subtypes of text should be handled
as text/plain, providing the necessary fallback behavior.
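
The fallback rule just described can be sketched as a small type-mapping
function. The name display_type and the set of text subtypes the client is
assumed to understand are hypothetical, chosen only for illustration. The RFC
also makes the fallback conditional on the client being able to handle the
charset, which this sketch leaves out.

```python
# Text subtypes this imaginary client renders natively (an assumption).
KNOWN_TEXT_SUBTYPES = {"plain", "html"}


def display_type(content_type):
    """Map a media type to the type a conforming client would render it as."""
    maintype, _, subtype = content_type.strip().lower().partition("/")
    if maintype == "text" and subtype not in KNOWN_TEXT_SUBTYPES:
        # Unknown subtype of text: fall back to plain text.
        return "text/plain"
    return f"{maintype}/{subtype}"


fallback = display_type("text/reaction")
untouched = display_type("image/png")
```

A client that follows the rule would show a text/reaction part as ordinary
text rather than offering it as an opaque attachment.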

Unfortunately a bit of testing shows that section 4.1.4 has apparently been
ignored by at least two major mail service providers - they both treat
text/reaction as application/octet-stream. (Bad Google. Bad Microsoft.)

I guess we could stand on principle that this is clearly a standards violation
and proceed with the media type, but why fight a fight we are almost certain
to lose when there are viable alternatives?

So how about a reaction content-disposition? This seems to work well -
unknown content-dispositions look like they are ignored. (Or maybe nobody
is currently paying attention to disposition information, which is also
fine for our purposes.)

My suggestion, then, is to switch from a header field to a MIME part
with a content-disposition of "reaction", a content-type of text/plain,
UTF-8 charset, and contents restricted to emoji. Combine the reaction with
a reply if you can; if you can't, send it as a separate message.
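
As a rough illustration of what a sending client might do under this
suggestion, here is a minimal sketch using Python's standard email package.
The helper name build_reaction_reply, the example Message-ID, and the choice
to place the reaction part after the reply text are all assumptions made for
the example; none of them come from the draft.

```python
from email.message import EmailMessage


def build_reaction_reply(in_reply_to, emoji, reply_text=None):
    """Build a reply that carries a reaction part (hypothetical helper)."""
    msg = EmailMessage()
    msg["In-Reply-To"] = in_reply_to
    if reply_text is None:
        # Reaction only: the whole body is the reaction part.
        msg.set_content(emoji, charset="utf-8", disposition="reaction")
    else:
        # Simple text/plain reply: set_content creates the text body, and
        # add_attachment converts the message to multipart/mixed before
        # appending the reaction as a second part.
        msg.set_content(reply_text)
        msg.add_attachment(emoji, charset="utf-8", disposition="reaction")
    return msg


reaction_only = build_reaction_reply("<original@example.org>", "\U0001F44D")
combined = build_reaction_reply(
    "<original@example.org>", "\U0001F44D", "Sounds good to me."
)
```

Only the plain-text reply case is combined here. For an HTML or otherwise
structured reply, the "send two messages" rule would mean calling the helper
once without reply_text and sending the reply separately.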

The current document is pleasantly short, so it's possible for me to
suggest a set of changes to implement this suggestion:


OLD:

   This facility defines a header field, to be used in junction with the
   In-Reply-To header field, to link one or more emojis as a summary
   reaction to a previous message.


NEW:

   This facility defines a new content-disposition, to be used in conjunction
   with the In-Reply-To header field, to specify that a part of a message
   containing one or more emojis be treated as a summary reaction to a
   previous message.


OLD:

2.  In-Reply-React

   A message sent as a reply MAY indicate the responder's summary
   reaction to the original message by including an In-Reply-React
   header field:
   The [ABNF] for the header field is:

in-reply-react = "In-Reply-React:" emoji *(lwsp emoji) CRLF

emoji = emoji_sequence
emoji_sequence = { defined in [Emoji-Seq] }

base-emojis = thumbs-up / thumbs-down / grinning-face / frowning-face /

thumbs-up = {U+1F44D}
thumbs-down = {U+1F44E}
grinning-face = {U+1F600}
frowning-face = {U+2639}
crying-face = {U+1F622}


NEW:

2.  The reaction content-disposition

   A message sent as a reply MAY include a part containing a
   content-disposition field with the value "reaction". If such a field
   is specified the content-type of the part MUST be:
     Content-type: text/plain; charset=utf-8
   The content of this part is restricted to a single line of emoji:
   part-content = emoji *(lwsp emoji) CRLF

   emoji = emoji_sequence
   emoji_sequence = { defined in [Emoji-Seq] }


OLD:

   Fully interoperable email uses 7-bit ASCII, although some email
   handling paths directly support 8-bit data.  Emoji characters are
   drawn from the space outside of 7-bit ASCII.  For email handling
   paths that are 8-bit clean, the an emoji character does not need
   special encoding.  If the path from author to recipients is not known
   to be 8-bit clean, The emoji character SHOULD be encoded using

(I don't think an explanation of body part encoding is necessary or


OLD:

   For recipient MUAs that do not support this mechanism, the header
   field might not be displayed to the recipient.  To ensure that the
   reaction is presented to the recipient, the responding MUA MAY
   automatically include a second copy of the header field in the
   message body.  This might be as the first line of the body or as the
   first mime-part.  [MIME] By making the text be the full header field,
   it also allows MUAs that do support the mechanism to identify this
   redundant information and possibly remove it from display.


NEW:

   Recipient MUAs that support this mechanism operate as follows:
   (0) If an In-Reply-To field is present, check to see if it references
       a previous message the MUA has received.

   (1) If a reference to an existing message is found, check for a part with
       a "reaction" content-disposition at either the outermost level or as
       part of a multipart at the outermost level.

   (2) If such a part is found, and the content of the part conforms to the
       restrictions outlined above, remove the part from the message and
       process it as a reaction. (The exact details of reaction processing are
       necessarily MUA-specific and beyond the scope of this specification.)
   (3) Processing terminates if no parts remain in the message. (Again, the
       handling of a message that has been successfully processed is
       MUA-specific and beyond the scope of this specification.) If parts
       remain, process the remaining message content as a reply.
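
The steps above can be sketched in Python with the standard email package.
This is only one possible reading of the procedure: the function names, the
set of known Message-IDs passed in by the caller, and especially the
ALLOWED_EMOJI set (a stand-in for the full [Emoji-Seq] grammar, limited to
the five emoji named in the original draft) are assumptions made for
illustration.

```python
from email.message import EmailMessage

# Stand-in for the [Emoji-Seq] grammar (assumption): accept only the five
# emoji named in the original draft. A real client would need a full
# emoji-sequence matcher.
ALLOWED_EMOJI = {"\U0001F44D", "\U0001F44E", "\U0001F600", "\u2639", "\U0001F622"}


def parse_reaction(text):
    """Return the emoji on a single-line reaction body, or None if invalid."""
    if len(text.splitlines()) != 1:
        return None
    tokens = text.split()
    if tokens and all(t in ALLOWED_EMOJI for t in tokens):
        return tokens
    return None


def split_reaction(msg, known_message_ids):
    """Apply steps (0)-(3); return (emoji, remaining_parts) or None."""
    # (0) Only act on replies to a message we have already received.
    ref = msg["In-Reply-To"]
    if ref is None or str(ref).strip() not in known_message_ids:
        return None
    # (1) Look at the outermost level, or one level inside a multipart.
    candidates = list(msg.iter_parts()) if msg.is_multipart() else [msg]
    for part in candidates:
        if part.get_content_disposition() != "reaction":
            continue
        if part.get_content_type() != "text/plain":
            continue
        if part.get_content_charset() != "utf-8":
            continue
        # (2) Validate the content before treating it as a reaction.
        emoji = parse_reaction(part.get_content())
        if emoji is not None:
            # (3) Whatever is left is handled as an ordinary reply; an
            # empty list means processing can stop here.
            return emoji, [p for p in candidates if p is not part]
    return None


# Usage: a combined reply and reaction, built locally for demonstration.
demo = EmailMessage()
demo["In-Reply-To"] = "<original@example.org>"
demo.set_content("I agree.")
demo.add_attachment("\U0001F44D", disposition="reaction")
result = split_reaction(demo, {"<original@example.org>"})
```

What the client then does with the returned emoji and with the leftover
parts is, as the text says, MUA-specific.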


OLD:

4.  Security Considerations

   This specification defines a distinct location for specialized
   message content.  Processing that handles the content differently
   from content in the message body might introduce vulnerabilities.
   However the mere definition or use of this mechanism does not create
   new vulnerabilities.

NEW:

4.  Security Considerations

   This specification employs message content that is a strict subset
   of existing content, and thus introduces no new content-specific
   security considerations.

   This specification defines a distinct label for specialized
   message content.  Processing that handles the content differently
   from other content in the message body might introduce vulnerabilities.


OLD:

5.  IANA Considerations

   Add to "Permanent Message Header Field Registry":

      Header field name:    In-Reply-React

      Applicable protocol:    Mail (RFC 2822)

      Status:     Experimental

      Author/Change controller:    IETF

      Specification document(s):   This specification.

      Related information:    None


NEW:

5.  IANA Considerations

 New Content-Disposition Parameter Registrations

   This document specifies a new "reaction" content disposition and its
   handling that should be added to the IANA registry.

In order for different clients to interoperate in their handling of
reactions, regardless of the mechanism used, this document should
probably include a specification of how to store reactions using
IMAP annotations.

I think that's more than enough for now.

