xsl-list

Re: [xsl] Inverting names with Jr and Sr considered

2012-11-05 19:30:43
Hi Gerrit,
Good question about multi-part surnames. Since my source names are, for the most part, British and Czech, I may never encounter one. I suspect I will have to insert a hyphen by hand if I find one.

Since I am collecting my data from a Word-format index of a journal, pasting the four elements of each entry into an XML document, I already have to look at every piece of data anyway.

I ran the function Andrew supplied on my XML document to produce a second document with the names in the desired order. It saves an immense amount of work, since I do not have to invert them by hand.

Thanks again for your suggestion and especially your insightful question.
Mark

-----Original Message----- From: Imsieke, Gerrit, le-tex
Sent: Monday, November 05, 2012 5:21 PM
To: xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com
Subject: Re: [xsl] Inverting names with Jr and Sr considered

Hi Mark,

I first tried this regex: ((\w+\s+)+)(\w+)(\s+[JS]r)?

But its inherent greediness would always put [JS]r in the (\w+) group,
considering it as the last name.

So maybe split this up into two analyze-string passes (or two replacements;
you don’t necessarily need analyze-string):

      <!-- Outer pass: split off a trailing Jr/Sr. The non-matching part
           (the name proper) precedes the suffix, so it is emitted first. -->
      <xsl:analyze-string select="." regex="\s+([JS]r)\s*$">
        <xsl:matching-substring>
          <xsl:text> (</xsl:text>
          <xsl:value-of select="regex-group(1)"/>
          <xsl:text>)</xsl:text>
        </xsl:matching-substring>
        <xsl:non-matching-substring>
          <!-- Inner pass: the last word is the surname (group 2),
               everything before it is the given names (group 1). -->
          <xsl:analyze-string select="." regex="(.+?)\s+(\w+)\s*$">
            <xsl:matching-substring>
              <xsl:value-of select="regex-group(2)"/>
              <xsl:text>, </xsl:text>
              <xsl:value-of select="regex-group(1)"/>
            </xsl:matching-substring>
          </xsl:analyze-string>
        </xsl:non-matching-substring>
      </xsl:analyze-string>
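
For the replacement-based alternative, here is a rough sketch rather than a tested solution. It assumes an XSLT 2.0 processor; the function name my:invert-name, the urn:example:names namespace and the <name> element are made up for illustration. It normalizes whitespace first, so runs of spaces inside the name collapse to one.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:my="urn:example:names"
    exclude-result-prefixes="xs my">

  <!-- Hypothetical helper: "Given Names Surname [Jr|Sr]" to
       "Surname, Given Names [(Jr|Sr)]" -->
  <xsl:function name="my:invert-name" as="xs:string">
    <xsl:param name="name" as="xs:string"/>
    <!-- Trim the ends and collapse multiple spaces -->
    <xsl:variable name="n" select="normalize-space($name)"/>
    <!-- Pass 1: detect and strip a trailing Jr/Sr -->
    <xsl:variable name="has-suffix" select="matches($n, '\s[JS]r$')"/>
    <xsl:variable name="core" select="replace($n, '\s[JS]r$', '')"/>
    <!-- Pass 2: move the last word to the front; a one-word name
         does not match and is returned unchanged -->
    <xsl:variable name="inverted"
                  select="replace($core, '^(.+)\s(\w+)$', '$2, $1')"/>
    <!-- The suffix, when present, is the last two characters of $n -->
    <xsl:sequence select="if ($has-suffix)
                          then concat($inverted, ' (',
                                      substring($n, string-length($n) - 1), ')')
                          else $inverted"/>
  </xsl:function>

  <!-- Assumed input: each name sits in its own <name> element -->
  <xsl:template match="name">
    <name><xsl:value-of select="my:invert-name(.)"/></name>
  </xsl:template>

</xsl:stylesheet>
```

Because the suffix is removed before the inversion, the greedy (.+) can no longer take Jr or Sr for the surname.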

How do you intend to deal with multiple surname components? The way your
examples are structured, only multiple first names may occur, while the
last name is always expected to match \w+ (no spaces in between).
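
One thing to watch if multi-part surnames get joined with a hyphen: in the XPath/XSLT regex flavour, \w excludes punctuation, so (\w+) will not match a surname containing a hyphen or an apostrophe. A possible variant, sketched under the assumptions that any Jr/Sr suffix has already been stripped and that each name sits in a hypothetical <name> element, treats the last run of non-space characters as the surname instead:

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumed input: <name> holds one natural-order name, no Jr/Sr -->
  <xsl:template match="name">
    <xsl:analyze-string select="normalize-space(.)" regex="^(.+)\s(\S+)$">
      <xsl:matching-substring>
        <!-- Group 2 is the last whitespace-free token, hyphens included -->
        <xsl:value-of select="regex-group(2)"/>
        <xsl:text>, </xsl:text>
        <xsl:value-of select="regex-group(1)"/>
      </xsl:matching-substring>
      <xsl:non-matching-substring>
        <!-- One-word names pass through instead of being dropped -->
        <xsl:value-of select="."/>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:template>

</xsl:stylesheet>
```

With this pattern, a hand-hyphenated name such as John Smith-Jones would come out as Smith-Jones, John. Unhyphenated multi-word surnames would still need manual attention.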

Gerrit

On 2012-11-06 00:45, Mark wrote:
This must have been done many times, so can someone show me where to
find the answer?

I have a series of personal names in natural order that I need to
invert. The surname is always last except when followed by ‘Jr’, or ‘Sr’
(either of which may not be present). I want to represent:

J Allen Rogers –> Rogers, J Allen
Bill T Wilson Jr –> Wilson, Bill T (Jr)
A B Brown –> Brown, A B
John Victor Case Sr –> Case, John Victor (Sr)

and so on. There may be a single space or multiple spaces between some
of the elements of the name.

It looks like <xsl:analyze-string> will do this, but I do not know how
to write regex.

Thanks,
Mark


--~------------------------------------------------------------------
XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: <mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--


--
Gerrit Imsieke
Geschäftsführer / Managing Director
le-tex publishing services GmbH
Weissenfelser Str. 84, 04229 Leipzig, Germany
Phone +49 341 355356 110, Fax +49 341 355356 510
gerrit(_dot_)imsieke(_at_)le-tex(_dot_)de, http://www.le-tex.de

Registergericht / Commercial Register: Amtsgericht Leipzig
Registernummer / Registration Number: HRB 24930

Geschäftsführer: Gerrit Imsieke, Svea Jelonek,
Thomas Schmidt, Dr. Reinhard Vöckler
