pem-dev

Re: PEM/MIME Encryption

1994-12-18 14:46:00
The generic problem is that in order for data to be useful at both its
source and destination, it must be representable in both those
locations.  In the context of email, when Alice sends a "text" message
to Bob, Alice assumes it will still be a "text" message when Bob
receives and displays it.  In the Internet today, this almost always
works.  It works because the mail transfer service is "helpful" while
it's transferring the message.  For example, to make this concrete, the
SMTP protocol includes record terminator canonicalization in its
operation: all messages are transferred with CRLF as the record
terminator and converted to/from local requirements at each end of the
SMTP connection.

There's a lot of slop here in practice -- MTAs often store messages
in some other format (e.g. LF-terminated, CR-terminated, counted records)
and convert incoming and outgoing material accordingly. There's even a
fairly significant deployment of noncompliant mailers that speak SMTP
with LF-only line termination.
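
To make the conversion concrete, here is a minimal sketch of what an
MTA does at each end of the connection. The names to_wire and to_local
are mine, chosen for illustration, and the sketch assumes the local
store uses a single record terminator (LF by default).

```python
import re

# Matches CRLF first, then a bare CR or a bare LF.
_ANY_EOL = re.compile(rb"\r\n|\r|\n")

def to_wire(local_text: bytes) -> bytes:
    """Rewrite every record terminator as CRLF for transfer."""
    return _ANY_EOL.sub(b"\r\n", local_text)

def to_local(wire_text: bytes, eol: bytes = b"\n") -> bytes:
    """Rewrite CRLF as the local terminator (assumed to be LF here)."""
    return wire_text.replace(b"\r\n", eol)
```

Accepting bare LF and bare CR on input is the tolerant choice, and it
is what lets the LF-only mailers mentioned above keep working.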

When you introduce MIME, the issue is exacerbated because now the
message may contain data other than text.  Prior to MIME, users
addressed this issue with explicit action, e.g., an originator could
uuencode a content which a recipient was required to uudecode before it
could be processed, optionally specifying additional, ad hoc information
in the message so a recipient would know what processing was required.
MIME provides a framework for labelling data that has all but eliminated
the need for ad hoc mechanisms.  A recipient may now receive unsolicited
messages that can be quickly separated into those that can be processed
and those that cannot.

More to the point, MIME introduces the concept of canonicalization to Internet
email. Associated with each MIME content-type definition is a canonical format
for that data. In many cases this notion seems both petty and redundant -- text
was already CRLF-terminated lines, PostScript is a stream of bytes as defined
by the PostScript specifications, and so on. But this concept is actually very
important, and provides the key to MIME interoperability.

The point to understand at this time is that an originator either:

      knows a priori what a recipient can receive and does the right
      thing within the limits of her user agent

or

      takes a chance, sends whatever is convenient for the
      originator, and hopes the recipient can deal with it.

This applies to all contents an originator sends to a recipient and is
independent of the presence or absence of security services.

This is all quite correct.

The generic problem gets much harder when you add security services.  In
the case of a digital signature, it is insufficient for the data to be
representable in each location; its representation must be precisely
identical.  It must be possible for both an originator and recipient to
calculate exactly the same message integrity check value.  In order to
do this both the originator and recipient must have exactly the same
data.  If not, the signature on the message will not be verifiable by
the recipient.

Correct, but this is a non-issue as long as you stick to the canonical
forms MIME defines. Failure to do so doesn't work, but failure to do so is
also a violation of the MIME specification.

Longtime participants in this technology will recall that very early
versions of PEM did not include a MIC-CLEAR option.  When a message was
digitally signed it was always Base64 encoded to ensure its identical
representation on both the originator and recipient machine.  However,
this meant that non-PEM aware user agents could not display signed
messages.  Ultimately, it was acknowledged that there was a good deal of
utility in being able to read a signed message even if the signature
could not be verified.  Thus was born the MIC-CLEAR option and the
requirement that recipient user agents (PEM-aware) canonicalize text
messages prior to verifying signatures.

The point to understand at this time is that the signature security
service breaks if a precise representation of the data is not chosen,
irrespective of whether the data is at its origin or its destination.

Security is just one of the things that break. Back when mail was all text
there wasn't much that could go wrong. Throw multimedia into the mix and
all sorts of things can go wrong if you don't stick to the defined canonical
forms.

Based on the requirements of backward compatibility, MIME installed
base, and PEM functionality, the PEM/MIME specification chose to require
that all data to be digitally signed be represented in "7bit".  In
addition, just prior to the signature creation and signature
verification, the record terminators on the data must be canonicalized.

Technically MIME already requires that you do this. However, as I mentioned
above, a lot of implementations "cheat" and use other sorts of line
termination. The key point here is that the signature must be computed on the
canonical form, regardless of how the data is actually stored.
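
A rough sketch of that rule follows. It is not the PEM algorithm:
SHA-256 stands in for the actual MIC algorithm, the 7bit test only
checks octet values (it ignores the line-length limit), and the
function names are invented for the example.

```python
import hashlib
import re

def canonicalize(text: bytes) -> bytes:
    """Rewrite every CRLF, bare CR, or bare LF as CRLF."""
    return re.sub(rb"\r\n|\r|\n", b"\r\n", text)

def is_7bit(data: bytes) -> bool:
    """Rough 7bit test: no NUL and no octet above 127."""
    return all(0 < octet < 128 for octet in data)

def mic(text: bytes) -> str:
    """Integrity check computed over the canonical form only."""
    if not is_7bit(text):
        raise ValueError("content must be 7bit before signing")
    # SHA-256 is a stand-in digest, assumed for illustration.
    return hashlib.sha256(canonicalize(text)).hexdigest()

stored_with_lf = b"Hello Bob,\nsee you Monday.\n"
stored_with_crlf = b"Hello Bob,\r\nsee you Monday.\r\n"
```

Hashing the two stored copies directly gives different values, while
mic(stored_with_lf) and mic(stored_with_crlf) agree, which is all the
originator and recipient need.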

Now let's examine the encryption service.  At first, one might think
that everything I've said up to this point applies to it also.  This
would lead one to believe that a representation must be chosen for all
data that is to be encrypted.  However, there is a fundamental
difference between the digital signature and encryption operations.

The encryption service itself does *NOT* fail if the data does not have
an identical representation in both the originator and recipient
environments.

Correct again. However, it doesn't remove the requirement of canonicalization
from MIME. (On the other hand, MIME always has the application/octet-stream
safety hatch available, which can be used to send any sequence of octets you
want.)
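
As a small illustration of the safety hatch, the sketch below uses
Python's standard email package (my choice of tool, nothing the
specification prescribes) to wrap arbitrary octets as
application/octet-stream and recover them unchanged at the far end.
The helper names are made up for the example.

```python
from email import message_from_bytes
from email.mime.application import MIMEApplication

def wrap_octets(data: bytes) -> bytes:
    """Label raw octets as application/octet-stream (base64 encoded)."""
    return MIMEApplication(data).as_bytes()

def unwrap_octets(raw_message: bytes) -> bytes:
    """Undo the transfer encoding and return the original octets."""
    return message_from_bytes(raw_message).get_payload(decode=True)

payload = bytes(range(256))  # every octet value, including NUL and CR/LF
recovered = unwrap_octets(wrap_octets(payload))
```

No canonical form is imposed beyond "a sequence of octets", so the
recovered payload is byte-for-byte what was sent.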

It is for this reason the specification chose a particular design.  The
PEM/MIME specification is intended to add security services to MIME.  It
was essential that the services function correctly.  The specification
does accomplish this.

What the specification of security services does not do is provide a
mechanism that guarantees a recipient will be able to process the data
that is received after the decryption or verification process has completed.
MIME, however, does provide a mechanism for an originator to label the
data such that a recipient can make an informed decision as to whether
to attempt to process the data.

Right. And of course there's nothing to say that a given recipient will be able
to process my application/something-really-esoteric data.
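
The informed decision can be sketched in a few lines. This uses
Python's standard email parser, and the SUPPORTED set is a hypothetical
list of what one particular recipient can handle.

```python
from email import message_from_string

# Hypothetical: the content types this recipient's agent can process.
SUPPORTED = {"text/plain", "application/postscript"}

def triage(raw_message: str):
    """Return (content type, can process?) for each leaf body part."""
    decisions = []
    for part in message_from_string(raw_message).walk():
        if part.is_multipart():
            continue  # containers carry no data of their own
        ctype = part.get_content_type()
        decisions.append((ctype, ctype in SUPPORTED))
    return decisions
```

The labels tell the recipient what each part claims to be. They say
nothing about whether the recipient has software that understands it.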

In summary, email without MIME or PEM functions best when an originator
and recipient have a prior relationship and can establish an ad hoc
protocol for exchanging non-textual data and sometimes textual data.
MIME provides a framework that allows an originator to label the data
such that a recipient can make an informed decision as to whether to
process the data without any prior communication between the originator
and recipient.  PEM "secures" the data exchanged between the originator
and recipient.  When encrypting text-only MIME contents, an originator
must fall back to the baseline email environment of assessing a
recipient's environment prior to sending the content, although a better
solution would be for an originator's user agent to assist in making
this decision.

Right again. MIME/PEM (or MIME for that matter) didn't create these
issues. They exist any time you send things around in formats that have not
all been universally agreed to. All MIME does is attempt to label things for
what they are, as well as attempting to provide some assurance that when you
have an object of a given type on hand it is in a single canonical form for
such objects.

                                Ned
