Re: Alternative formats for IDs

2006-01-02 14:06:39
Dear Noel et al;

I trust that the whole IETF community will have a Happy New Year.

On Jan 2, 2006, at 3:24 PM, Noel Chiappa wrote:

From: John C Klensin <john-ietf(_at_)jck(_dot_)com>

the state of online collaboration and editing that we have been at for
20 or 30 years.
Finally, there is a longstanding and more or less explicit decision in the IETF community to keep the costs of participation as low as possible

There's one other thing, also tied to the IETF (and its predecessor's) long existence, which is the long-term accessibility of online documents - again, another facet in which our experience is pretty unique. Is MS-Word (or anything else) going to be 30 years from now?

In case you think this is a silly question, I just recently finished scanning / OCR'ing / proofing the oft-cited IEN-19 (Shoch, "Inter-Network Naming, Addressing, and Routing"), from January 1978 - 28 years ago. The original of this document was presumably in some Bravo format, and the printing version was in PRESS - and I somehow doubt either is supported anywhere in the world now. I only had a hardcopy, so the question's a bit moot, but I very much doubt a machine-readable version of either form would have done me much good.

"Don't re-invent the wheel" is I think generally a good engineering principle. In this case, there are a number of entities that are interested in long term archival storage of electronic documents; I think that the IETF should use their expertise and experience.

It seems that the library community has settled on PDF as its long-term storage choice, and is moving to standardize this.

From Harvard University's Report to the Digital Library Federation, October 2004:


Adobe's Portable Document Format (PDF) has become the de-facto standard for web-based delivery of electronic documents. The International Organization for Standardization (ISO) has initiated an effort to create a standard for an archival profile of PDF that is amenable to long-term preservation. This standard, PDF/A, is intended to provide an unambiguous definition of the requirements necessary for the reliable and predictable future rendering of archived PDF documents. The second draft of the PDF/A standard was released in May 2004 and is currently undergoing a comment period by experts from the constituent national bodies of ISO. Stephen Abrams, the LDI Digital Library Program Manager at Harvard University, is the project leader and document editor for the ISO PDF/A joint working group.


The next ISO meeting on this is January 25-26 in Berlin, Germany.

Here is the press release announcing the project:

A new joint activity has been initiated between NPES The Association for Suppliers of Printing, Publishing and Converting Technologies, and the Association for Information and Image Management, International (AIIM International) to develop an International standard that defines the use of the Portable Document Format (PDF) for archiving and preserving documents.

The project, currently referred to as PDF/A, will address the growing need to electronically archive documents in a way that will ensure preservation of their contents over an extended period of time, and will further ensure that those documents will be able to be retrieved and rendered with a consistent and predictable result in the future. This need exists in a growing number of international government and industry segments, including legal systems, libraries, newspapers, regulated industries, and others.

The work will address the use of PDF for multi-page documents that may contain a mixture of text, raster images and vector graphics. It will also address the features and requirements that must be supported by reading devices that will be used to retrieve and render the archived documents.

This joint committee formed under AIIM and NPES will identify issues to be addressed, as well as proposed solutions, and will develop a draft document that will then be presented to a Joint Working Group of the International Organization for Standardization (ISO) for development and approval as an International Standard.

The Library of Congress has set up a web site devoted to this issue, ,
which lists all of the formats being considered at

and the ones specifically for text at

these being

DTB, Digital Talking Book
OEBPS_1_0, Open eBook Forum Publication Structure 1.0.1
OEBPS_1_2, Open eBook Forum Publication Structure 1.2
NCBIArch_1, NCBI/NLM Journal Archiving and Interchange DTD, version 1
NITF, News Industry Text Format

PDF, Portable Document Format
PDF_1_4, PDF, Versions 1.0-1.4
PDF_1_5, PDF, Version 1.5
PDF/A, PDF for Preservation
PDF/X, PDF for Prepress Graphics File Interchange


with PDF/A being further described at : http://

(Note: There is, of course, wording that says that "Inclusion of a format does not imply that it is preferred or acceptable for Library of Congress collections. Conversely, omission of a format from the list does not imply that it is not preferred or acceptable. Descriptions will be drafted and added over time.")


My personal conclusions are:

- I am in favor of moving beyond ASCII only.
- I am against using any non-standardized format, such as Word.
- If there were a proposal to use PDF/A as standardized, I would support it.

Marshall Eubanks

ASCII may be pretty lobotomized, but it *is* timeless.

(Not that I'm per se against allowing more powerful forms, mind, but any
proprietary option is just not viable, IMO.)


