xsl-list

[xsl] Suggestions for naming files based on starting node?

2008-04-16 10:43:11
Hi -

I'm splitting a very long XML file into multiple smaller XHTML files using <xsl:result-document>, and I want to name each resulting file based on the first source-document node that the output XHTML file contains. This first node is always a container node (similar to "Sect1" ... "Sect5" in DocBook) and can have varying and often lengthy amounts of content.

There are some wrinkles in this request:

1) The source document contains intra-document hyperlinks which I'd like to preserve in the output documents. I have no control over which portion of the document a given link points to, so it's pretty much the norm that a link in one of the output XHTML documents will end up pointing to a target in another XHTML file. I have to be able to recreate the file name of the XHTML file containing the target node using only the link-to-target id/idref relationship and/or content from the link itself or its target. I already have all the linking mechanisms working, so file naming is the only remaining problem.

2) I'd like if at all possible to preserve the order in which the files were generated in the file name so that one can sort the files by name (within frontmatter/bodymatter/rearmatter divisions) and move from one document to the next without using hyperlinks while still reading the content in order.

3) Users can select the level of container at which they want to break up the source document, so a fixed naming scheme is problematic.

I tried generating file names using count(preceding::*) from the target node as part of the file name, which results in names like this:

frontmatter-121.html
frontmatter-219.html
bodymatter-1171.html

This works quite well, but as you'd expect the overhead is ridiculous. Using generate-id() also works for points 1 and 3 above, and (using Saxon-B 9.0.0.4) it *appears* to generate ascending IDs, but I don't know if I can depend on this as an ordering mechanism.
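For illustration, the count(preceding::*) naming described above might look roughly like the sketch below. The element names (sect1, link, frontmatter/bodymatter/rearmatter as ancestor elements), the id/linkend attributes, the break-at parameter, and the f:file-name helper are all assumptions made for the example, not taken from the real stylesheet. It also assumes that every link target sits inside a container at the chosen break level.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:f="urn:example:file-naming"
    xmlns="http://www.w3.org/1999/xhtml"
    exclude-result-prefixes="xs f">

  <!-- Hypothetical parameter: name of the container element to break at. -->
  <xsl:param name="break-at" select="'sect1'"/>

  <!-- Assumed id attribute name; used to find link targets. -->
  <xsl:key name="by-id" match="*[@id]" use="@id"/>

  <!-- Hypothetical helper: builds the file name for a container node.
       Every call walks the preceding axis, which is where the cost comes from. -->
  <xsl:function name="f:file-name" as="xs:string">
    <xsl:param name="container" as="element()"/>
    <xsl:variable name="division"
        select="local-name($container/ancestor::*[local-name() =
                ('frontmatter', 'bodymatter', 'rearmatter')][1])"/>
    <xsl:sequence
        select="concat($division, '-', count($container/preceding::*), '.html')"/>
  </xsl:function>

  <!-- One output file per container at the chosen level. -->
  <xsl:template match="*[local-name() = $break-at]">
    <xsl:result-document href="{f:file-name(.)}">
      <html>
        <head><title><xsl:value-of select="f:file-name(.)"/></title></head>
        <body><xsl:apply-templates/></body>
      </html>
    </xsl:result-document>
  </xsl:template>

  <!-- A link recomputes the file name of the container that holds its target. -->
  <xsl:template match="link">
    <xsl:variable name="target" select="key('by-id', @linkend)"/>
    <xsl:variable name="home"
        select="$target/ancestor-or-self::*[local-name() = $break-at][1]"/>
    <a href="{f:file-name($home)}#{@linkend}"><xsl:apply-templates/></a>
  </xsl:template>

</xsl:stylesheet>
```

Because the name is derived purely from the container node, a link and the file it points into always agree on the name, but the preceding axis is re-walked for every file and every cross-file link.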

I know there's got to be some way of doing this that doesn't require counting preceding nodes every time I have to generate a new file, but I've been looking at this for a while and I need some new perspectives. About the only thing I've come up with so far to address this is to write a pre-pass transformation which copies every node in the source and generates sequential identifier attributes on every container node which could be used to generate file names should the user choose to break at that container level. If this in itself isn't too expensive then it would be a pretty easy way to address the issue.
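One way such a pre-pass could be sketched is shown below. It avoids counting preceding nodes by walking the containers once in document order and using position() as the sequence number, then copying the source with an identity template. The container names (sect1 ... sect5) and the seq attribute name are placeholders for the example, and whether this is cheap enough in practice would have to be measured on the real document.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumption: DocBook-like container names; substitute the real ones. -->
  <xsl:variable name="containers"
      select="//sect1 | //sect2 | //sect3 | //sect4 | //sect5"/>

  <!-- Temporary lookup tree: one entry per container, built in a single walk.
       position() follows document order; zero-padding keeps names sortable. -->
  <xsl:variable name="numbering">
    <xsl:for-each select="$containers">
      <entry node="{generate-id()}"
             seq="{format-number(position(), '00000')}"/>
    </xsl:for-each>
  </xsl:variable>

  <!-- Index the lookup tree by the generated id of the source node. -->
  <xsl:key name="seq-by-node" match="entry" use="@node"/>

  <!-- Identity copy for everything that is not a container. -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Containers get the hypothetical seq attribute added while being copied. -->
  <xsl:template match="sect1 | sect2 | sect3 | sect4 | sect5">
    <xsl:copy>
      <xsl:attribute name="seq"
          select="key('seq-by-node', generate-id(), $numbering)/@seq"/>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Since every container at every level receives a number from the same sequence, the splitting stylesheet could build names such as bodymatter-00042.html from the seq attribute regardless of which break level the user selects, and a link could find its file name by reading seq from the nearest enclosing container of its target. Here generate-id() is used only as a lookup key within one transformation, not as an ordering mechanism.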

Do you smart folks out there have any other ideas?


Thanks
Chris


