All,
Thanks so much for the responses. I realize the best place to solve this
problem is at the source and this gives me incentive to have one more talk
with the editors.
-troy
Wendell Piez <wapiez(_at_)mulberrytech(_dot_)com>
04/18/2006 09:38 AM
Please respond to: xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com
To: xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com
cc:
Subject: Re: [xsl] creating tags around a string
Troy,
Listen to Jon. (Except when he says listen to me. Everything I say
should be taken with salt. I have been known to crack jokes.)
Since INX is an XML-based format, you can and should consider your
handling of it to fall well within the range of garden-variety XML
processing. Thus no "tag-writing" techniques are needed. You can and
should go with the straight stuff.
However, as Jon says, that doesn't really solve your problem. That's
because you are dealing with what we call an "upconversion", namely a
transformation that goes "up hill", adding information to the source
that was not there to begin with.
That is, in order to go from
<AUTHOR>Al Stick, Tom She, Dick Burg, and Harry Ward</AUTHOR>
to something richer and more useful like
<AUTHOR>
<name><fname>Al</fname> <lname>Stick</lname></name>,
<name><fname>Tom</fname> <lname>She</lname></name>,
<name><fname>Dick</fname> <lname>Burg</lname></name>, and
<name><fname>Harry</fname> <lname>Ward</lname></name>
</AUTHOR>
or even to something intermediate like
<AUTHOR>
<fname>Al</fname> <lname>Stick</lname>,
<fname>Tom</fname> <lname>She</lname>,
<fname>Dick</fname> <lname>Burg</lname>, and
<fname>Harry</fname> <lname>Ward</lname>
</AUTHOR>
your process has to be able to do more than split up strings and wrap
the substrings in tags (or more properly, insert them in elements).
Ultimately, it has to be able to recognize what's a name, what's an
"fname" and what's an "lname".
These are non-trivial operations, which is why thinking up the
realistic and all-too-common complex cases is an important part of
this task. Jon suggested "Anne Marie Scott", which takes a form
you'll see in almost any list of names. Then there's "Mishima Yukio"
(Japanese, like many other languages, places the family name first) or
"George Noel Gordon, Lord Byron" (not two names but one, and you'll
have to specify how it should be tagged).
XSLT 1.0 was not designed for upconversion, so you'll find even
straightforward string-wrapping operations (which I see now was the
essence of your original question) to be rather gnarly and difficult.
It is a common problem, however, and can therefore be handled using
publicly available code.
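To give a sense of what string-wrapping looks like in XSLT 1.0, here is
a minimal sketch using a recursive named template. The template name
"wrap-names" and its "list" parameter are hypothetical, chosen only for
illustration, and the sketch assumes names are separated by ", " with an
optional "and " before the last one. It only wraps each name in a <name>
element; it makes no attempt to tell an fname from an lname.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="AUTHOR">
    <AUTHOR>
      <xsl:call-template name="wrap-names">
        <xsl:with-param name="list" select="normalize-space(.)"/>
      </xsl:call-template>
    </AUTHOR>
  </xsl:template>

  <!-- Hypothetical helper: peels one comma-delimited item off the
       front of $list per call, then recurses on the remainder. -->
  <xsl:template name="wrap-names">
    <xsl:param name="list"/>
    <xsl:choose>
      <!-- More items follow: wrap the first and recurse on the rest. -->
      <xsl:when test="contains($list, ', ')">
        <name><xsl:value-of select="substring-before($list, ', ')"/></name>
        <xsl:text>, </xsl:text>
        <xsl:call-template name="wrap-names">
          <xsl:with-param name="list" select="substring-after($list, ', ')"/>
        </xsl:call-template>
      </xsl:when>
      <!-- Last item introduced by "and ": keep the word, wrap the name. -->
      <xsl:when test="starts-with($list, 'and ')">
        <xsl:text>and </xsl:text>
        <name><xsl:value-of select="substring-after($list, 'and ')"/></name>
      </xsl:when>
      <!-- Single remaining item. -->
      <xsl:otherwise>
        <name><xsl:value-of select="$list"/></name>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```

Note that this treats every comma as a name boundary, so "George Noel
Gordon, Lord Byron" would come out as two names, and a two-author list
written "A and B" with no comma would not be split at all.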
XSLT 2.0 is much better at this, and since you already appear to be
using XPath 2.0 constructs, I'd recommend you look further into the
tokenize() function along with XSLT 2.0 regular expressions.
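As a rough illustration of the tokenize() approach, the sketch below
assumes an XSLT 2.0 processor and a deliberately naive rule: the first
word of each name is the fname and everything after it is the lname.
That rule and the separator pattern are assumptions made for the
example, not a recommendation. The element names come from the sample
above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity template: copy everything else through unchanged. -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="AUTHOR">
    <AUTHOR>
      <!-- Split on a comma (optionally followed by "and"),
           or on a bare "and" between two names. -->
      <xsl:for-each select="tokenize(normalize-space(.),
                                     '\s*,\s*(and\s+)?|\s+and\s+')">
        <!-- Naive assumption: first word = fname, the rest = lname. -->
        <xsl:variable name="words" select="tokenize(., '\s+')"/>
        <name>
          <fname><xsl:value-of select="$words[1]"/></fname>
          <xsl:text> </xsl:text>
          <lname><xsl:value-of select="$words[position() gt 1]"/></lname>
        </name>
        <!-- Re-insert a plain comma separator; the original "and" is dropped. -->
        <xsl:if test="position() lt last()">
          <xsl:text>, </xsl:text>
        </xsl:if>
      </xsl:for-each>
    </AUTHOR>
  </xsl:template>

</xsl:stylesheet>
```

This handles the four simple names in the sample, but it labels "Anne
Marie Scott" as fname "Anne" and lname "Marie Scott", gets "Mishima
Yukio" backwards, and splits "George Noel Gordon, Lord Byron" in two.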
But due to the deeper issues, which have to do not with the mechanics
of string-wrapping but with semantic inferencing (getting the
processor to discriminate between the parts of your complex cases and
label them correctly), my feeling is that this is not wise even to
attempt without a clear-eyed assessment of the difficulties and
limitations. This is one of those cases where half a solution is
often worse than none, since it creates expectations that are then
bound to be disappointed.
This is probably why Jon also urges that you push the problem back
upstream. The people creating this data are in a much better position
to tag it fully and correctly to begin with. Short of that, you may
find a manual or semi-automated method is less painful than a broken
automated process that only creates bad code, which must then be
corrected by hand.
Good luck,
Wendell
At 11:53 AM 4/18/2006, Jon wrote:
On 4/18/06, TGolshan(_at_)computer(_dot_)org
<TGolshan(_at_)computer(_dot_)org> wrote:
Wendell,
Thanks for the insight. Perhaps I need to explain myself a little more.
I'd recommend paying attention to Wendell. He addressed at least one
of your problems. You need to think about generating elements, not
"tags". The code is a bit clearer when you do:
<fname><xsl:value-of select="." /></fname>
instead of
<xsl:text>&lt;fname&gt;</xsl:text>
<xsl:value-of select="."/>
<xsl:text>&lt;/fname&gt;</xsl:text>
I am taking an InDesign inx file and trying to build some structure
(i.e., an XML document) that I can then use later. I am working with an
army of editors who will not style first or last name in InDesign. They
will, however, style every name as author, so my inx file looks like this:
<AUTHOR>Al Stick, Tom She, Dick Burg, and Harry Ward</AUTHOR>
and I want to add <fname> and <lname> elements to the mix.
What is the best way to do this? I wrote the function below but realize
that this is difficult at best.
The reason you're not necessarily getting a ton of help on your
question is that it's a lot deeper and more complex than any simple
trick with XSLT. This mailing list is concerned with XSLT, while your
problem is more a fundamental problem with markup systems and
publishing....
======================================================================
Wendell Piez
mailto:wapiez(_at_)mulberrytech(_dot_)com
Mulberry Technologies, Inc.                http://www.mulberrytech.com
17 West Jefferson Street                    Direct Phone: 301/315-9635
Suite 207                                          Phone: 301/315-9631
Rockville, MD  20850                                 Fax: 301/315-8285
----------------------------------------------------------------------
Mulberry Technologies: A Consultancy Specializing in SGML and XML
======================================================================
--~------------------------------------------------------------------
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: <mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--