For the most part I second jhutz's opinion on this draft.
Additional comments inline...
Jeffrey Hutzelman wrote:
My first overall thought is "why??"
Not everything needs to be wrapped in XML, and in this case, it appears
that there are few real benefits and a number of significant drawbacks.
It is difficult to tell from the document whether the authors actually
intend this format as a substitute for programming-data formats like
S-records, or as a format for transferring data dumps over the Internet.
It has a number of drawbacks which would seem to make it unsuitable for
the former case, and doesn't seem to offer much over a raw hex dump
for the latter. It would be helpful if the authors could clarify the
intended application for this format.
It appears to me that this format is primarily useful as a storage
format. The authors state that the format is not secure enough by
itself to ensure integrity when transferring between systems. In a
transport format there should be some means of ensuring that all blocks
have been received in the correct order and that no additional blocks
were added. This description does not attempt to provide those features;
therefore, I believe it is meant for storage. This should be stated
clearly.
Even for a storage format it would be nice for the authors to specify
a means for describing block order as well as a means for providing
a checksum over the entire dump.
I would also advise the authors and other interested parties to examine
draft-housley-cms-fw-wrap-09.txt, on which an IETF Last Call recently
concluded. It describes a method for securely transporting firmware
images over the internet and directly to hardware devices. While it is
too complex to be suitable for direct programming of low-level devices,
it is quite appropriate for delivery as far as the workstation, device
programmer, or bootloader. [Note that I have nothing to do with that
document, other than having recently reviewed it]
In any case, I see a number of problems, some of which are significant:
- This specification repeatedly uses the word "byte" to refer to an octet.
Further, it prohibits representation of data with word sizes which are
not multiples of 8 bits, claiming that such things are not used in
"practical present-day applications". While byte sizes other than 8 bits
and word sizes which are not multiples of 8 bits have become extremely
uncommon in general-purpose computing devices, they are still used in
more special-purpose devices, and many of the low-level devices which
are within the stated scope of this document are programmed with data
which uses "odd" word sizes.
The document provides a definition for "byte" and then defines "octet"
as a "byte" but doesn't use it after that. I would replace all
references to "byte" with "octet" and get rid of "byte" entirely.
The restriction to "word" sizes which are multiples of "octets" seems a
bit odd to me as well, especially given that the limit on the size of a
block is specified in "bits".
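To illustrate why a length in bits and non-octet word sizes can fit
together, here is a minimal sketch that packs words of an arbitrary bit
width into an octet string. The function name pack_words and the
zero-padding of the final octet are assumptions made for illustration;
neither comes from the draft.

```python
def pack_words(words, word_bits):
    """Pack unsigned words of `word_bits` bits each into octets.

    Hypothetical helper: words are concatenated most significant bit
    first, and the last octet is zero-padded on the right. Returns the
    packed octets together with the exact payload length in bits.
    """
    accumulator = 0
    total_bits = 0
    for word in words:
        if not 0 <= word < (1 << word_bits):
            raise ValueError("word does not fit in %d bits" % word_bits)
        accumulator = (accumulator << word_bits) | word
        total_bits += word_bits
    # Pad up to the next octet boundary.
    padding = (-total_bits) % 8
    accumulator <<= padding
    octets = accumulator.to_bytes((total_bits + padding) // 8, "big")
    return octets, total_bits
```

With 12-bit words, a single word occupies two octets but only 12 bits
of them are meaningful, which is exactly the case a length expressed in
bits can describe and a length expressed in octets cannot.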
- The introduction indicates an intent to provide an alternative to
formats used for "hexadecimal data" and particularly device programming
data; the de facto standard "S-record" format is mentioned by name.
However, it fails to capture a fundamental property of such formats,
which is that they are generally simple enough to send to a device or
programmer without further parsing. The authors admit that an XML
parser is "not easily deployed in hardware devices", but suggest that
instead a workstation should be used to convert data from the specified
format into one the device can actually handle.
If this is the expected use case, then I fail to see the advantage over
simply transporting the data over established file-transfer protocols
(FTP, HTTP) in a format which can be directly understood by the device.
Many devices can be programmed by sending the distributed image over an
RS-232 connection with no preprocessing; requiring a translation step
severely reduces the set of devices that can be used for this purpose.
For example, it makes it unlikely that I would be able to walk around
my machine room with a PDA, upgrading firmware in network devices or
RAID controllers.
One example that I came up with would be a driver update package in
which the dump contained different versions of the same drivers for
different platforms, perhaps one for a 32-bit version of the OS and
another for the 64-bit version, only one of which would actually be
used. In such an example, the dump is simply a storage medium and
the processing application would be selectively extracting from it
independent data streams for eventual delivery to the device.
Unfortunately, I feel like I am searching for a target application which
should be spelled out in the document.
- This specification REQUIREs the use of SHA-1, providing no means to
upgrade to an alternate hash in the future. This lack of algorithm
agility is not very forward-looking.
The checksum is currently embedded in the header for each block. The
problem I have with this is that it restricts the size of the block to
be something storable in memory and even assumes that the entire data
block must be available prior to the generation of the header. It is
very likely that the data being stored will be coming from a stream
source and there may not be enough memory to store it all before
writing to the dump media.
The checksum should be stored as a tag inside the block and the tag
should contain an attribute specifying which algorithm was used.
A similar checksum tag should be available to validate the entire dump.
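A rough sketch of the streaming case described above: the data is
hashed incrementally while it is written, and the checksum is emitted
as a trailing tag that names its algorithm. The element and attribute
names (block, checksum, algorithm) and the function write_block are
hypothetical and are not taken from the draft.

```python
import hashlib


def write_block(source, out, algorithm="sha256", chunk_size=4096):
    """Stream `source` (binary file-like) to `out` (text file-like).

    Only `chunk_size` octets are held in memory at a time, so the
    block never has to fit in memory before its checksum is known.
    """
    digest = hashlib.new(algorithm)
    out.write("<block>\n")
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        digest.update(chunk)           # hash incrementally
        out.write(chunk.hex() + "\n")  # hex-encode this chunk
    # The checksum follows the data and records which hash was used.
    out.write('<checksum algorithm="%s">%s</checksum>\n'
              % (algorithm, digest.hexdigest()))
    out.write("</block>\n")
    return digest.hexdigest()
```

Because the algorithm travels with the checksum, a reader can select
the matching hash at verification time, and a newer hash can be
introduced without changing the rest of the format.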
- In section 4.1, you say "if the value is untrue...". I suspect you
mean something like "if the value does not match...". Further,
rather than leaving the behaviour in the case of an incorrect length
up to the implementation, it should be RECOMMENDED (RFC2119) that
implementations reject such files.
- In section 4.2, you require the start_address attribute to be
provided, even though it may not be meaningful in all cases. This
attribute should be OPTIONAL.
I can see this format being used to store crash data from an application
for later debugging. In this case there may be blocks which contain
stack information or register contents which are not memory addressable.
- I don't believe 64 bits are required to represent word size.
In fact, I question whether it is necessary for this format to
represent word size at all.
I believe that word size may make sense for some types of blocks which
would be stored in the dump file but it should not be REQUIRED. I
believe the most general applications would only be interested in octet
streams.
- The number of blocks is OPTIONAL, but the block length is REQUIRED.
Further, there is a per-block checksum but no overall checksum.
These properties would seem to suggest that the intent is to allow
stream-encoding by encoding an arbitrary number of relatively small
blocks. This is fine, but lacking both a block count and an overall
checksum, there is no way to tell whether the entire dump was
transferred correctly. I would suggest adding an overall-checksum
element, to be encoded after the last block (_not_ as an attribute).
If one purpose is to allow encoding an arbitrary number of small blocks,
there should be some indication of whether order is important, whether
blocks can be dropped, etc.
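As a sketch of what a reader could check if blocks carried an explicit
sequence number and the dump ended with an overall checksum: the
function verify_dump, the per-block sequence number, and the choice of
hash are all assumptions for illustration, since the draft defines none
of them.

```python
import hashlib


def verify_dump(blocks, expected_count, expected_digest,
                algorithm="sha256"):
    """Check order, completeness and overall integrity of a dump.

    `blocks` is an iterable of (sequence_number, data) pairs in the
    order they were read; sequence numbers are assumed to start at 0.
    """
    overall = hashlib.new(algorithm)
    count = 0
    for sequence_number, data in blocks:
        if sequence_number != count:
            # A block is missing, duplicated, or out of order.
            return False
        overall.update(data)
        count += 1
    # Catch truncation and any alteration of block contents.
    return (count == expected_count
            and overall.hexdigest() == expected_digest)
```

The sequence numbers catch dropped or reordered blocks as soon as they
are encountered, while the block count and trailing checksum catch a
dump that was cut short or had blocks appended.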
- Why is the number of _bits_ in a block limited to 2^64-1?
This limitation seems unnecessary, given that everything else is
done in terms of octets.
Why bits if the word size is restricted to octets? Why not just specify
the number of words since words are already required?
- The requirement that words inside a dump be represented in network
order is silly. The contents of a dump are by their nature specific
to a particular device, and should be in whatever format is most
appropriate for that device. Again, I question whether this format
should have any notion of "words" at all.
As one of the comments in the ID Tracker stated, the byte order
representation for each block should be determined by the application.
Each block should have an attribute specifying the byte order used.
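A minimal sketch of how a consumer might honour a per-block byte order
attribute. The function decode_words and the attribute values "big" and
"little" are hypothetical; the draft as written mandates network order
only.

```python
def decode_words(data, word_octets, byte_order):
    """Split `data` into unsigned words using the block's byte order.

    `byte_order` stands in for the value of a hypothetical per-block
    attribute and must be "big" or "little".
    """
    if byte_order not in ("big", "little"):
        raise ValueError("unknown byte order: %r" % (byte_order,))
    if word_octets <= 0 or len(data) % word_octets != 0:
        raise ValueError("data is not a whole number of words")
    return [
        int.from_bytes(data[i:i + word_octets], byte_order)
        for i in range(0, len(data), word_octets)
    ]
```

The same four octets 01 02 03 04 decode as the 16-bit words 0x0102 and
0x0304 when the block is marked "big", and as 0x0201 and 0x0403 when it
is marked "little", so the attribute is all a reader needs to recover
the device's own representation.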
My biggest concern is that this format is not general enough. I fear
that, because the uses the authors were considering are not spelled out,
there are underlying assumptions embedded in the document which
will hamper its usefulness.
Jeffrey Altman
Secure Endpoints Inc.