Identity of Documents Puzzle
2002-12-06 17:08:05
I am working on an implementation of xinclude that implements ID
rewriting so that links are authored as references to elements in their
base source locations but the links are resolved correctly in the
transcluded result (as opposed to authoring the links as references to
the elements in their transcluded locations).
In order to do this I must determine the set of documents that comprise
a single compound document (so that I can tell whether a link to a
particular file is in the same compound document or in a different one).
I call this the "xinclude BOS (bounded object set)".
This all works great except for links from an included document back to
the top-level document. At the start of xinclude processing, I calculate
the BOS by constructing a node list of document nodes, one for each
unique document in the document tree represented by the xinclude
references, including the initial document. However, a subsequent
reference to the initial document's file results in a new document node
being constructed, a node different from the one initially added to the
BOS list for the top-level document. Thus, the "is document in BOS?"
check made during the rewriting of link pointers fails and my code
treats the reference as a cross-compound-document link, so the link
fails in the transcluded result document (because the pointer is not
rewritten to reflect the location of the target in the transcluded
result document).
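
For illustration, the membership check described above might look
roughly like the following in XSLT 1.0. This is only a sketch: the names
bos-docs, in-bos, and target-uri are hypothetical, and the BOS here
holds just the initial source document rather than a full list. It
relies on the usual node-set idiom that the union with a node already in
the set does not change the set's size.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical BOS: for illustration it holds only the root node of
       the initial source document; a real BOS would also include the
       root nodes of every xincluded document. -->
  <xsl:variable name="bos-docs" select="/"/>

  <!-- Hypothetical named template: yields the string 'true' when the
       document at $target-uri is a member of the BOS, else 'false'. -->
  <xsl:template name="in-bos">
    <xsl:param name="target-uri"/>
    <xsl:variable name="target-doc" select="document($target-uri)"/>
    <!-- Node identity test: if $target-doc is already in $bos-docs,
         the union is no larger than $bos-docs itself. The leading
         $target-doc guards against an empty result from document(). -->
    <xsl:value-of
        select="$target-doc and
                count($bos-docs | $target-doc) = count($bos-docs)"/>
  </xsl:template>

</xsl:stylesheet>
```

With this sketch, a target-uri that points back at the top-level file
would yield 'false' whenever the processor builds a fresh document node
for it, which is the failure described above.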
This is all tested with Saxon 6.5.2 and depends, at a minimum, on
multiple calls to document() with the same URL returning the same
document node instance (and ideally, references to the same file system
object (e.g., an inode in *nix file systems) would also return the same
document node instance).
My question: is there any way, other than passing in the filename of the
top-level file as a parameter to the style sheet, to ensure that the
node for the document as created by the initial style sheet processing
is the same as one for a call to document() for the same file (it may or
may not be the same filename depending on the relative locations of the
files involved)? I can't think of one, but there are many subtleties of
XSLT that I have yet to master.
If I pass in the top-level filename as a parameter, then I can use
document() on that (and just ignore the initially-created document), but
that seems sort of crude. It seems like there ought to be a more
fundamental way to do this. [In HyTime, because everything being
processed is formally grovified, it is possible for there to be a
reliable identity relationship between input files and the document
nodes created from them--I don't think anything in XSLT requires that
level of precision of in-memory representation.]
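
A minimal sketch of that parameter-based workaround, assuming XSLT 1.0.
The parameter name top-level-uri and the variable name top-doc are
hypothetical. The URI passed in should be absolute, because a relative
string passed to document() is resolved against the style sheet's base
URI rather than the source document's.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical parameter: absolute URI of the top-level file.
       It must be supplied by the caller; if left empty, document('')
       would refer to the style sheet itself. -->
  <xsl:param name="top-level-uri"/>

  <!-- Load the top-level document through document() so that it is
       obtained the same way as every other member of the BOS. -->
  <xsl:variable name="top-doc" select="document($top-level-uri)"/>

  <!-- Ignore the initially-created source tree and process the copy
       obtained via document() instead. -->
  <xsl:template match="/">
    <xsl:apply-templates select="$top-doc/*"/>
  </xsl:template>

  <!-- Placeholder processing: copy everything through unchanged. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

This still only helps when later references resolve to the same absolute
URI as the one passed in; two different URIs for the same file would
presumably still produce two document nodes.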
Or is there some other way to create a reliable identity for document
nodes that doesn't depend on these sorts of implementation details (or
on, for example, putting globally-unique identifiers on document
elements)? I can't think of one off the top of my head.
Because files are objects and therefore have inherent identity, it
shouldn't be necessary, in the abstract, to add identifying metadata to
a document in order to know with certainty whether it is or is not the
same as another document, regardless of how that document is
referenced.
Thanks,
Eliot
--
W. Eliot Kimber, eliot(_at_)isogen(_dot_)com
Consultant, ISOGEN International
1016 La Posada Dr., Suite 240
Austin, TX 78752 Phone: 512.656.4139
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list