There are a number of apparent inconsistencies in the Crypto++ 3.1 benchmark
data, as well as a few incorrect uses of it in your explanation attached to
the previous message.
First some observations on the Crypto++ data itself. MD5 is shown as 57.2
MB/sec, though the conditions, i.e. data block size, are not given. MD5-MAC
is shown as 50.2 MB/sec while HMAC-MD5 is shown as 56.7 MB/sec. These
numbers appear inconsistent, unless the difference is due to different input
conditions.
A MAC is just the hash algorithm with a non-standard initialization vector
(the secret key), which is the same length as the standard initialization
vector, so the results for the bare hash and the MAC should be identical.
It is also very hard to imagine how HMAC-MD5 could ever be faster than
MD5-MAC when run under the same conditions. The HMAC runs the hash twice
with two "randomized" versions of the original key, though it is possible to
take advantage of some shared intermediates with MD5/SHA1 hashes. In any
case, there is nothing in the MAC that is not included in the HMAC, and the
HMAC does a few more things than the MAC, so this result is inconsistent. I
wonder why the researcher published the data with this glaring inconsistency
without comment.
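To make the extra work in the HMAC concrete, here is a minimal Python sketch of the RFC 2104 construction built by hand from two MD5 passes. The function name hmac_md5 is my own, not anything from the Crypto++ benchmark, and this only illustrates the structure; it says nothing about how Crypto++ implements it.

```python
import hashlib
import hmac

def hmac_md5(key: bytes, message: bytes) -> bytes:
    """HMAC-MD5 built by hand from two MD5 passes (RFC 2104 construction)."""
    block_size = 64  # MD5 processes its input in 64-byte blocks
    if len(key) > block_size:
        key = hashlib.md5(key).digest()   # over-long keys are hashed first
    key = key.ljust(block_size, b"\x00")  # then zero-padded to one full block

    inner_key = bytes(b ^ 0x36 for b in key)  # key XOR ipad
    outer_key = bytes(b ^ 0x5C for b in key)  # key XOR opad

    # First pass covers one key block plus the whole message.
    inner = hashlib.md5(inner_key + message).digest()
    # Second pass covers one key block plus the 16-byte inner digest.
    return hashlib.md5(outer_key + inner).digest()

key = b"secret key"
msg = b"x" * 1024
# The hand-built version should agree with the standard library's hmac module.
assert hmac_md5(key, msg) == hmac.new(key, msg, hashlib.md5).digest()
```

Compared with a single keyed pass over the message, the HMAC always hashes one additional key block in the first pass and then runs a short second pass, so on identical input it cannot do less work.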
The data also shows SHA1 at 25.4MB/sec, which is about half of that for MD5.
More common experience suggests only a 10% actual difference between these
two hashes, though it is highly dependent on the implementation and the data
block size, which is not given here. The performance on long data sets is
not particularly important, as the initialization of the hash computation
can be a high fraction of the total for short data blocks, like spam. Since
we don't know the conditions that the SHA1 data represent, it is dangerous
to extrapolate, but this comes out to 39.4usec/1KB data block.
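For anyone who wants to check the conversion, here is the arithmetic as a small Python sketch. It assumes decimal units (1 MB = 10^6 bytes) and no per-call setup cost, which is exactly the assumption that is risky for short blocks; the function name is mine.

```python
def usec_per_block(mb_per_sec: float, block_bytes: int = 1000) -> float:
    """Microseconds to hash one block at a given bulk throughput.

    Assumes 1 MB = 10**6 bytes and ignores any per-call initialization.
    """
    bytes_per_sec = mb_per_sec * 1e6
    return block_bytes / bytes_per_sec * 1e6

print(round(usec_per_block(25.4), 1))        # 39.4 for a 1000-byte block
print(round(usec_per_block(25.4, 1024), 1))  # 40.3 for a 1024-byte block
```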
My largest issue with the parts of this data I looked at is that RSA
encryption with a 512-bit key is shown as 109491 operations/30 seconds,
which is 3650 signatures/second, or 273usec/signature. RSA decryption is
shown as 6886 operations/30 seconds, which is 230 validations/second (not
including the DNS lookup), or 4.36ms/validation. Frankly, the signing being
an order of magnitude faster than the validation is not particularly
believable. I think that the absolute number for validation is believable
given the slow CPU used. I personally doubt that a single-CPU 450MHz
Celeron machine can create almost 4,000 RSA signatures/second. I also note
that the data was taken under Win2K beta 3, rather than a stable version of
the operating system.
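The conversion from the published counts to rates and per-operation times is simple enough to sketch in Python. The counts and the 30-second window are the ones quoted above; the helper name is my own.

```python
def rate_and_latency(operations: int, seconds: float = 30.0):
    """Return (operations per second, milliseconds per operation)."""
    per_second = operations / seconds
    ms_per_op = 1000.0 * seconds / operations
    return per_second, ms_per_op

# Counts as quoted from the Crypto++ 3.1 table for a 512-bit RSA key.
for label, ops in [("RSA encryption", 109491), ("RSA decryption", 6886)]:
    per_second, ms_per_op = rate_and_latency(ops)
    print(f"{label}: {per_second:.0f} ops/sec, {ms_per_op:.3f} ms/op")
```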
The Sendmail data uses dual Pentium III CPUs (512K L2 cache) at 1.3GHz with
2.3GB of memory each, running stable versions of both Linux and Sendmail.
They did not describe the disk subsystem, but it is not very relevant to
this test, since they give data for an implementation of the RSA algorithm
that does not use any disk resources. The validation data they gave
includes the DNS lookup for the public key, though the data was available
locally from a very capable resolver. Looking at the Sendmail data more
closely, here's what I see. The baseline is Sendmail + NOP milter, which in
the case of 1K message size for the signing case delivered 521
messages/second, or 1.92ms/message and in the verification case delivered
515 messages/second, or 1.94ms/message. For a NOP milter, you would expect
the two cases to be the same, so this is reasonable. For Sendmail + DK
milter + \tempfs in memory, signing 1K messages dropped the delivery rate to
302 messages/sec, or 3.31ms/message, an increase of 1.39ms/message over
baseline. This includes doing the SHA1 digest of the message. For Sendmail
+ DK milter + \tempfs in memory, verifying 1K messages dropped the delivery
rate to 305 messages/sec, or 3.28ms/message, an increase of 1.34ms/message
over baseline. Note that the verification was about the same as the
signing, even though it included a DNS lookup as well as the SHA1 digest of
the message. I find this unexpected, but believable, since we are talking
about a specific implementation. Unfortunately, I can't find anywhere in
the Sendmail experiment description what the key length was.
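The per-message overheads above come from converting each delivery rate to a time per message and subtracting the matching NOP-milter baseline. A short Python sketch of that calculation, using only the rates quoted from the Sendmail data (the function names are mine):

```python
def ms_per_message(messages_per_sec: float) -> float:
    """Convert a delivery rate into milliseconds per message."""
    return 1000.0 / messages_per_sec

def added_ms(baseline_rate: float, loaded_rate: float) -> float:
    """Extra time per message relative to the NOP-milter baseline."""
    return ms_per_message(loaded_rate) - ms_per_message(baseline_rate)

signing = added_ms(baseline_rate=521, loaded_rate=302)
verifying = added_ms(baseline_rate=515, loaded_rate=305)
print(round(signing, 2))    # 1.39 ms added per signed 1K message
print(round(verifying, 2))  # 1.34 ms added per verified 1K message
```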
So what do we make of these two benchmarks? First, we need to agree on what
can be compared. For a variety of reasons, application performance rarely
scales with processor speed in any well-behaved manner. The devil really is
in the details when it comes to extrapolating how an application would
perform on a different hardware platform. Unfortunately, this leaves us to
primarily compare numbers taken on the same hardware setup and to take some
educated guesses as to how they apply to different systems. These
extrapolations have always been a tricky business and I wouldn't rely too
heavily on them. Let me summarize the key findings of the two experiments:

                       Sendmail          Crypto++ 3.1
RSA signature          -                 0.273ms
RSA validation         -                 4.36ms
SHA1 digest            -                 0.0394ms/1K
unsigned 1K message    1.93ms            -
DK signature only      1.39ms            -
DK validation only     1.34ms            -
CPU resources          dual PIII-1.3G    Celeron-450M
main memory            dual 2.3G         -

The Sendmail platform was obviously much faster than the Crypto++ 3.1
platform, but there is still something useful to be gained by looking at
them side by side. First, I urge you to recognize that there is something
very wrong with the RSA signature measurement in the Crypto benchmark.
Aside from being unreasonable for that hardware in my own experience, it
is clearly out of line with what Sendmail measured on a faster machine that
included a SHA1 digest of a 1K message in addition to the RSA signature.
The RSA validation number from Crypto does make sense compared to the
Sendmail results, and I tend to believe both of them. I would suggest that
a number between 4 and 6 msec would be reasonable for the Crypto benchmark for
RSA signatures. The time I list above for a SHA1 digest of a 1K block of
data is an extrapolation from the Crypto++ data point of 25.4MB/sec (with no
input conditions given). This is equivalent to about 25K SHA1
digests/second with a 1K data block, which I think is at least within reason
for their hardware setup. I think it is probably faster, more like the
number they give for MD5, but this is what they gave. It does tend to
indicate that for short messages, the SHA1 digest is insignificant next to
the RSA signature, by a couple of orders of magnitude. This agrees with my
experience, and I think if you ask other people who work with DSP, you will
get a similar response.
I wish that the Sendmail data separated these two parts of the DK algorithm,
but it did not. Their 32K block data was for a 32K block _average_ size
using a particular distribution, so we really can't do anything with those
numbers to answer the question at hand. Since the RSA signature is made
over a small, fixed-size data block, at some point with very long messages,
the SHA1 digest will overshadow the RSA signature. My own prediction is that
when we measure this carefully, this will happen only at message lengths
that are far greater than typical spam. I would say that the Sendmail data
shows that for short messages, the DK signature is an additional load equal
to about 75% of what an unsigned message requires. The Crypto benchmark
data shows this is dominated by the RSA algorithm. This is neither
unexpected nor surprising in any way.
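As a rough illustration of where that crossover might sit, here is a back-of-the-envelope Python sketch. It assumes the hash cost grows linearly with message length at the quoted 25.4MB/sec with no setup cost, and that the signature costs a fixed 4 to 6 msec (my suggested range, not a measured value). The function name is mine, and a real measurement could differ considerably.

```python
def crossover_bytes(rsa_ms: float, hash_mb_per_sec: float) -> float:
    """Message length at which hashing takes as long as one RSA signature.

    Assumes hash time is linear in length (1 MB = 10**6 bytes) and the
    RSA signature time is constant regardless of message length.
    """
    return (rsa_ms / 1000.0) * hash_mb_per_sec * 1e6

for rsa_ms in (4.0, 6.0):
    kb = crossover_bytes(rsa_ms, 25.4) / 1000.0
    print(f"RSA signature {rsa_ms} ms -> SHA1 catches up near {kb:.1f} KB")
```

Under these assumptions the digest only catches up with the signature at messages on the order of 100 to 150 KB.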
I suggest that the Sendmail data is sufficient reason for you to not rely
exclusively on the Crypto benchmarks. My own perusal of their data suggests
there are a number of other inconsistencies that make the work suspect.
Since we are dealing with two specific and well-known algorithms, RSA
signatures and SHA1 hashes, I further suggest that you consider measuring
these for yourself for different size data blocks and compare results to the
benchmarks you have been using.
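As a starting point for the hash side of that measurement, here is a minimal Python sketch using only the standard library. It times MD5 and SHA1 over a few block sizes, including the setup cost of each digest. The block sizes, repeat count, and function name are arbitrary choices of mine, and timing RSA signatures would need a separate crypto library that is not shown here.

```python
import hashlib
import time

def usec_per_digest(algorithm: str, block_size: int, repeats: int = 1000) -> float:
    """Average microseconds to hash one block, including hash setup."""
    data = b"\x00" * block_size
    start = time.perf_counter()
    for _ in range(repeats):
        # A fresh hash object each time, so initialization is counted.
        hashlib.new(algorithm, data).digest()
    elapsed = time.perf_counter() - start
    return elapsed / repeats * 1e6

for size in (64, 1024, 32 * 1024):
    md5_us = usec_per_digest("md5", size)
    sha1_us = usec_per_digest("sha1", size)
    print(f"{size:>6} bytes: MD5 {md5_us:8.2f} usec, SHA1 {sha1_us:8.2f} usec")
```

Comparing the per-digest time at 64 bytes with the time at 32K gives a feel for how much of the short-message cost is fixed overhead rather than bulk throughput.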