Re: draft-rfc-image-files-00.txt

Hi John,

On 2008-08-23 20:01 John C Klensin said the following:


--On Saturday, 23 August, 2008 14:01 +0200 Henrik Levkowetz
<henrik(_at_)levkowetz(_dot_)com> wrote:

Tools-wise I believe this proposal poses no great challenges.
There is, however, one point where a minor change could make a
big difference in ease of tools handling:

The proposal suggests that the pdf-format pages which contain
the figures be numbered with page numbers which follow the last
content bearing page of the base (ascii) document, with the
final boilerplate page(s) being numbered to follow the pdf
pages.

This means that in order to provide links between text references
in derived formats, such as the htmlized RFC and drafts, a tool
would need to find and parse the figure index, and use that to
create a translation table from Figure number to Page number, in
order to be able to provide links from 'Figure N' references in
the text to the actual figure.
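
As a rough illustration of the indirection described above, here is a
minimal sketch of such a tool. It assumes a figure index whose lines
look like "Figure 3 ........ 27"; that layout, the function names, and
the "#page-N" link target are all hypothetical, since the proposal does
not pin them down.

```python
import re

# ASSUMPTION: index lines look like "Figure 3 ........ 27".
# The real figure index layout is not specified in this thread.
INDEX_LINE = re.compile(r"^\s*Figure\s+(\d+)\b.*?(\d+)\s*$")

# Matches 'Figure N' references in the body text.
FIGURE_REF = re.compile(r"\bFigure\s+(\d+)\b")


def build_figure_page_table(index_text):
    """Build the translation table: figure number -> page number."""
    table = {}
    for line in index_text.splitlines():
        m = INDEX_LINE.match(line)
        if m:
            table[int(m.group(1))] = int(m.group(2))
    return table


def link_figures(text, table, pdf_name):
    """Turn 'Figure N' references into links, going through page numbers."""
    def repl(m):
        page = table.get(int(m.group(1)))
        if page is None:
            # No index entry for this figure: leave the reference as plain text.
            return m.group(0)
        # ASSUMPTION: "#page-N" is a placeholder link target, not a real scheme.
        return '<a href="%s#page-%d">%s</a>' % (pdf_name, page, m.group(0))
    return FIGURE_REF.sub(repl, text)
```

Every draft would need its own table, which is the extra step at issue
here.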


I was assuming that such a tool could reasonably be manual,
since the "image file" would presumably be fairly static.


If a 'Figure concordance file' was produced manually as part of
the publication process, maybe this would work for RFCs, but I
can't really see it working for drafts.  If I even had to spend
one minute doing manual work for each new draft, in order to
produce the htmlized draft versions, it just would not happen.

However...

It would be easier to provide correct links if the analogy with
separately printed illustration plates in books (which were
commonly numbered 'Plate 1', 'Plate 2', etc.) were carried
further, and the PDF pages were numbered "Figure-1", "Figure-2",
etc., thus eliminating the indirection through page numbers.
(This would also obviate the need to update xml2rfc to make it
produce non-consecutive page numbers for the trailing
boilerplate page(s).)
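
To illustrate why this is easier for tools, here is a minimal sketch of
the linking step when the PDF pages carry "Figure-N" labels. The
function name and the "#Figure-N" link target are hypothetical; the
point is only that no figure index has to be found or parsed.

```python
import re

# Matches 'Figure N' references in the body text.
FIGURE_REF = re.compile(r"\bFigure\s+(\d+)\b")


def link_figures_direct(text, pdf_name):
    """Link 'Figure N' straight to a page labelled 'Figure-N'."""
    def repl(m):
        # ASSUMPTION: "#Figure-N" is a placeholder link target.
        return '<a href="%s#Figure-%s">%s</a>' % (pdf_name, m.group(1), m.group(0))
    return FIGURE_REF.sub(repl, text)
```

The link target is derived from the reference text alone, so the same
code would work unchanged for every draft.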


That was more or less where we started, and something we came
back to a few times.  The problem is a different and very
practical one.   I've been advised by colleagues in the library
community that the "Plate-1" / "Figure-1" model (or even leaving
those pages unnumbered, which is also common) works because the
pages are physically bound in with the rest of the volume.   If
they are loose, they are often given different accession numbers,
separate check-out records when that is relevant, etc.,  from
the base document/book, just to be sure that both pieces can be
kept track of.  We have no direct analogy to that unless, e.g.,
we don't store "rfcN.txt" and "rfcN.img.pdf" in the archives at
all but instead (borrowing a suggestion from a private note) we
store "rfcN.zip" or "rfcN.sh" (a self-extracting script).
Storing a "get one piece, get it all" aggregate that way has
some considerable advantages, but also considerable
disadvantages (e.g., your retrieval tools would have to extract
the pieces and sort things out).
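
For a sense of what "extract the pieces and sort things out" might
involve in the zip case, here is a minimal sketch. It assumes the
bundle holds exactly one ".txt" member and one ".img.pdf" member; that
layout and the function name are assumptions, not part of the proposal.

```python
import io
import zipfile


def split_bundle(bundle_bytes):
    """Return (text_bytes, image_pdf_bytes) from a zip bundle.

    ASSUMPTION: the bundle contains exactly one '.txt' member and one
    '.img.pdf' member, mirroring the rfcN.txt / rfcN.img.pdf naming.
    """
    with zipfile.ZipFile(io.BytesIO(bundle_bytes)) as zf:
        names = zf.namelist()
        text = [n for n in names if n.endswith(".txt")]
        image = [n for n in names if n.endswith(".img.pdf")]
        if len(text) != 1 or len(image) != 1:
            raise ValueError("expected one .txt and one .img.pdf member")
        return zf.read(text[0]), zf.read(image[0])
```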


I'm not so sure the disadvantages are of any great account, and I
have also considered the advantages of making a bundle available.
I think, however, that this really is peripheral to the issue of
how to number the Figures pages.

All of that would be almost irrelevant were it not for the IPR
issue.  The "number consecutively and use common headers and
footers" idea emerged in part from a discussion with Counsel
about whether separate boilerplate would be needed for the image
file.  Even starting down the path of that discussion leads us
into the obvious debate about what should be in that
boilerplate, who could use what for which purposes, etc. -- a
debate that, speaking personally, I'd prefer to avoid at almost
any cost.


Hmm.  Understood.

By the way, another obvious alternative, which would be
numbering the pages of the image file so that they started
_after_ the boilerplate page of the text base RFC, runs into the
same problem because there are requirements in our various
documents, and perhaps legal requirements, that the "last page"
boilerplate be the last page, not buried somewhere in the middle
of the logical document.


Right.

Note that, whether it is a good idea or not, and whether it
should be done by the RFC Editor, otherwise, or not at all, the
format as specified would permit the creation of a rather
trivial merging process/script that would:

      (1) Generate the PDF-print (no additional formatting)
      version of the ASCII text (i.e., the object that now has
      the file name rfcN.txt.pdf if it exists)
      
      (2) Create a new, composite, PDF file that contained
      pages 1 .. N-1 of the output of (1), the image file, and
      then page N of the output of (1).
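
The page ordering in step (2) can be sketched as follows. This works on
plain lists standing in for pages and deliberately does not touch any
real PDF library; the function name is hypothetical, and it assumes a
single trailing boilerplate page (page N).

```python
def merge_pages(text_pages, image_pages):
    """Return pages 1..N-1 of the text, then the image pages, then page N.

    ASSUMPTION: the last text page (page N) is the only boilerplate page.
    """
    if not text_pages:
        raise ValueError("text document has no pages")
    # Slicing keeps the inputs untouched and builds a new list.
    return text_pages[:-1] + image_pages + text_pages[-1:]
```

For example, text pages t1, t2, t3 and image pages f1, f2 would come
out as t1, t2, f1, f2, t3.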

Numbering the figures/plates in a completely different way
doesn't make that harder, but it does make the result look a
little less coherent.


True.

I'm still concerned about the lack of a solid machine-readable
coupling between figure number in the base document and page
number in the pdf file.  But if my proposal of numbering the
pdf pages with figure numbers doesn't fly, I'll look for
alternatives.


Regards,

        Henrik

_______________________________________________
Ietf mailing list
Ietf(_at_)ietf(_dot_)org
https://www.ietf.org/mailman/listinfo/ietf