
Re: [xsl] creating tags around a string

2006-04-19 08:23:05
All,

Thanks so much for the responses. I realize the best place to solve this 
problem is at the source and this gives me incentive to have one more talk 
with the editors. 
-troy



Wendell Piez <wapiez(_at_)mulberrytech(_dot_)com>
04/18/2006 09:38 AM
Please respond to: xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com
To: xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com
Subject: Re: [xsl] creating tags around a string

Troy,

Listen to Jon. (Except when he says listen to me. Everything I say 
should be taken with salt. I have been known to crack jokes.)

Since INX is an XML-based format, you can and should consider your 
handling of it to fall well within the range of garden-variety XML 
processing. Thus no "tag-writing" techniques are needed. You can and 
should go with the straight stuff.

However, as Jon says, that doesn't really solve your problem. That's 
because you are dealing with what we call an "upconversion", namely a 
transformation that goes "up hill", adding information to the source 
that was not there to begin with.

That is, in order to go from

<AUTHOR>Al Stick, Tom She, Dick Burg, and Harry Ward</AUTHOR>

to something richer and more useful like

<AUTHOR>
   <name><fname>Al</fname> <lname>Stick</lname></name>,
   <name><fname>Tom</fname> <lname>She</lname></name>,
   <name><fname>Dick</fname> <lname>Burg</lname></name>, and
   <name><fname>Harry</fname> <lname>Ward</lname></name>
</AUTHOR>

or even to something intermediate like

<AUTHOR>
   <fname>Al</fname> <lname>Stick</lname>,
   <fname>Tom</fname> <lname>She</lname>,
   <fname>Dick</fname> <lname>Burg</lname>, and
   <fname>Harry</fname> <lname>Ward</lname>
</AUTHOR>

your process has to be able to do more than split up strings and wrap 
the substrings in tags (or more properly, insert them in elements). 
Ultimately, it has to be able to recognize what's a name, what's an 
"fname" and what's an "lname".

These are non-trivial operations, which is why thinking up the 
realistic and all-too-common complex cases is an important part of 
this task. Jon suggested "Anne Marie Scott", which takes a form 
you'll see in almost any list of names. Then there's "Mishima Yukio" 
(Japanese, like many other languages, places the family name first) or 
"George Noel Gordon, Lord Byron" (not two names but one, and you'll 
have to specify how it should be tagged).

XSLT 1.0 was not designed for upconversion, so you'll find even 
straightforward string-wrapping operations (which I see now was the 
essence of your original question) to be rather gnarly and difficult. 
It is a common problem, though, and can therefore be handled using 
publicly available code.
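
To give a sense of what that looks like, here is a minimal sketch of 
the usual XSLT 1.0 approach: a recursive named template that peels off 
the text before the first comma, wraps it in an element, and calls 
itself on the remainder. The template name "wrap-names" and its "list" 
parameter are made up for illustration, and the sketch assumes a 
comma-separated list whose last item may begin with "and". It only 
wraps each whole name; it makes no attempt to tell first names from 
last names.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="AUTHOR">
    <AUTHOR>
      <xsl:call-template name="wrap-names">
        <xsl:with-param name="list" select="normalize-space(.)"/>
      </xsl:call-template>
    </AUTHOR>
  </xsl:template>

  <!-- Hypothetical helper: wrap the first item of a comma-separated
       list in <name>, then recurse on whatever follows the comma. -->
  <xsl:template name="wrap-names">
    <xsl:param name="list"/>
    <xsl:choose>
      <!-- Nothing left (for example after a trailing comma): stop. -->
      <xsl:when test="$list = ''"/>
      <xsl:when test="contains($list, ',')">
        <name>
          <xsl:value-of
              select="normalize-space(substring-before($list, ','))"/>
        </name>
        <xsl:text>, </xsl:text>
        <xsl:call-template name="wrap-names">
          <xsl:with-param name="list"
              select="normalize-space(substring-after($list, ','))"/>
        </xsl:call-template>
      </xsl:when>
      <!-- Last item introduced by "and": keep the word, wrap the rest. -->
      <xsl:when test="starts-with($list, 'and ')">
        <xsl:text>and </xsl:text>
        <name><xsl:value-of select="substring-after($list, 'and ')"/></name>
      </xsl:when>
      <xsl:otherwise>
        <name><xsl:value-of select="$list"/></name>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```

Note that a two-author list written without a comma ("Al Stick and Tom 
She") falls through to the last branch and comes out as a single name, 
which is exactly the kind of case you have to think up in advance.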

XSLT 2.0 is much better at this, and since you already appear to be 
using XPath 2.0 constructs, I'd recommend you look further into the 
tokenize() function along with XSLT 2.0 regular expressions.
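
As a rough sketch of that direction (an illustration under stated 
assumptions, not a recommended solution), the stylesheet below uses 
xsl:analyze-string to separate the names from the punctuation between 
them, and tokenize() to split each name on spaces. It assumes names 
are separated by commas and/or the word "and", and that a two-word 
name is given name followed by family name. The "review" attribute is 
made up here to flag anything that does not fit that pattern.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity template: copy everything not handled below. -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="AUTHOR">
    <xsl:copy>
      <!-- The regex matches the separators: a comma (optionally
           followed by "and"), or a standalone "and". -->
      <xsl:analyze-string select="."
          regex=",\s*(and\s+)?|\s+and\s+">
        <xsl:matching-substring>
          <!-- Keep the original punctuation as it was. -->
          <xsl:value-of select="."/>
        </xsl:matching-substring>
        <xsl:non-matching-substring>
          <xsl:variable name="parts"
              select="tokenize(normalize-space(.), ' ')"/>
          <name>
            <xsl:choose>
              <!-- Assumption: exactly two words means "fname lname". -->
              <xsl:when test="count($parts) = 2">
                <fname><xsl:value-of select="$parts[1]"/></fname>
                <xsl:text> </xsl:text>
                <lname><xsl:value-of select="$parts[2]"/></lname>
              </xsl:when>
              <!-- Anything else is left untagged and flagged so a
                   person can resolve it (hypothetical attribute). -->
              <xsl:otherwise>
                <xsl:attribute name="review">yes</xsl:attribute>
                <xsl:value-of select="."/>
              </xsl:otherwise>
            </xsl:choose>
          </name>
        </xsl:non-matching-substring>
      </xsl:analyze-string>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

On the sample AUTHOR element this wraps each of the four names and 
leaves the commas and the "and" in place. It also shows the limits 
plainly: "Anne Marie Scott" is only flagged, "Mishima Yukio" is tagged 
the wrong way round without any warning, and "George Noel Gordon, Lord 
Byron" is split into two names at the comma.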

But due to the deeper issues, which have to do not with the mechanics 
of string-wrapping but with semantic inferencing (getting the 
processor to discriminate between the parts of your complex cases and 
label them correctly), my feeling is that this is not wise even to 
attempt without a clear-eyed assessment of the difficulties and 
limitations. This is one of those cases where half a solution is 
often worse than none, since it creates expectations that are then 
bound to be disappointed.

This is probably why Jon also urges that you push the problem back 
upstream. The people creating this data are in a much better position 
to tag it fully and correctly to begin with. Short of that, you may 
find a manual or semi-automated method is less painful than a broken 
automated process that only creates bad code, which must then be 
corrected by hand.

Good luck,
Wendell

At 11:53 AM 4/18/2006, Jon wrote:
On 4/18/06, TGolshan(_at_)computer(_dot_)org <TGolshan(_at_)computer(_dot_)org> wrote:
Wendell,

Thanks for the insight. Perhaps I need to explain myself a little more.

I'd recommend paying attention to Wendell.  He addressed at least one
of your problems.  You need to think about generating elements, not
"tags".  The code is a bit clearer when you do:

<fname><xsl:value-of select="." /></fname>

instead of

<xsl:text>&lt;fname&gt;</xsl:text>
<xsl:value-of select="."/>
<xsl:text>&lt;/fname&gt;</xsl:text>



I am taking an InDesign inx file and trying to build some structure (i.e.,
an XML document) that I can then use later. I am working with an army of
editors who will not style first or last name in InDesign. They will,
however, style every name as author, so my inx file looks like this:

<AUTHOR>Al Stick, Tom She, Dick Burg, and Harry Ward</AUTHOR>

and I want to add <fname> and <lname> elements to the mix.

What is the best way to do this? I wrote the below function but realize
that this is difficult at best.

The reason you're not necessarily getting a ton of help on your
question is that it's a lot deeper and more complex than any simple
trick with XSLT.  This mailing list is concerned with XSLT, while your
problem is more a fundamental problem with markup systems and
publishing....


======================================================================
Wendell Piez                            
mailto:wapiez(_at_)mulberrytech(_dot_)com
Mulberry Technologies, Inc.                http://www.mulberrytech.com
17 West Jefferson Street                    Direct Phone: 301/315-9635
Suite 207                                          Phone: 301/315-9631
Rockville, MD  20850                                 Fax: 301/315-8285
----------------------------------------------------------------------
   Mulberry Technologies: A Consultancy Specializing in SGML and XML
======================================================================


--~------------------------------------------------------------------
XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: <mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--



