I wouldn't even attempt to write any code based on this as the
specification. For this to work at all well, you're going to need to
iteratively adapt the solution to handle all the names in your dataset,
or at least a sample of a couple of thousand of them. There's just too
much variation in the names you might encounter. Are "Jr" and "Sr"
really the only suffixes, and are they always spelt this way, or do you
also get "III" and "Jnr" and "Jnr."?

If I'm wrong, and the names are all regular and follow the pattern you
describe, then I think you can just tokenize on whitespace and do
something like:
suffix := $tokens[last()][. = ('Jr', 'Sr')]
stem := if ($suffix) then remove($tokens, count($tokens)) else $tokens
value-of select="concat($stem[last()], ','), remove($stem,
count($stem)), if ($suffix) then concat('(', $suffix, ')') else ()"
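
Spelled out as a complete XSLT 2.0 stylesheet, the idea above might look
roughly like the sketch below. The function name f:invert-name, the
urn:example:names namespace and the "main" entry template are made up
for illustration. The sketch assumes the names really are as regular as
described: whitespace-separated tokens, surname last, and only 'Jr' or
'Sr' as an optional suffix. A one-token name would come out as "Name,".

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:f="urn:example:names"
    exclude-result-prefixes="xs f">

  <xsl:output method="text"/>

  <!-- f:invert-name is a hypothetical helper, not a library function -->
  <xsl:function name="f:invert-name" as="xs:string">
    <xsl:param name="name" as="xs:string"/>
    <!-- normalize-space() trims and collapses runs of whitespace,
         so splitting on a single space is enough -->
    <xsl:variable name="tokens"
        select="tokenize(normalize-space($name), ' ')"/>
    <!-- the last token, but only if it is one of the assumed suffixes -->
    <xsl:variable name="suffix"
        select="$tokens[last()][. = ('Jr', 'Sr')]"/>
    <!-- everything except the suffix -->
    <xsl:variable name="stem"
        select="if (exists($suffix))
                then remove($tokens, count($tokens))
                else $tokens"/>
    <!-- surname plus comma, then the remaining names, then "(suffix)" if any -->
    <xsl:sequence select="string-join(
        (concat($stem[last()], ','),
         remove($stem, count($stem)),
         if (exists($suffix)) then concat('(', $suffix, ')') else ()),
        ' ')"/>
  </xsl:function>

  <!-- Illustrative entry point: inverts two sample names, one per line -->
  <xsl:template name="main">
    <xsl:for-each select="('J Allen Rogers', 'Bill T   Wilson Jr')">
      <xsl:value-of select="f:invert-name(.)"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

With Saxon this can be run from the named template (for example with
-it:main on the command line), and for the two sample names it should
give "Rogers, J Allen" and "Wilson, Bill T (Jr)". Handling "III", "Jnr"
and the like would mean extending the suffix list, which is exactly
where real data tends to surprise you.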
Michael Kay
Saxonica
On 05/11/2012 23:45, Mark wrote:
This must have been done many times, so can someone show me where to
find the answer?
I have a series of personal names in natural order that I need to
invert. The surname is always last except when followed by ‘Jr’, or
‘Sr’ (either of which may not be present). I want to represent:
J Allen Rogers -> Rogers, J Allen
Bill T Wilson Jr -> Wilson, Bill T (Jr)
A B Brown -> Brown, A B
John Victor Case Sr -> Case, John Victor (Sr)
and so on. There may be a single space or multiple spaces between some
of the elements of the name.
It looks like <xsl:analyze-string> will do this, but I do not know how
to write regex.
Thanks,
Mark
--~------------------------------------------------------------------
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: <mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--