RE: regex grouping precedence.

2004-09-29 01:19:41
The groups are numbered by counting left brackets: the 5th unescaped left
bracket starts group 5, regardless of where the closing brackets are, and
regardless of other operators such as "|".
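
As a quick check of that counting rule, here is a minimal sketch in Python rather than XSLT, chosen only because it is easy to run. It assumes Python's re module, which also numbers groups by their opening bracket. The two patterns are the ones from the original question quoted below, the sample strings are made up, and re.VERBOSE stands in for the "x" flag so that the spaces in the patterns are ignored.

```python
import re

# Nested pattern from the question; re.VERBOSE ignores the spaces,
# much like the "x" flag on xsl:analyze-string.
nested = re.compile(r"(...) ( (..)(.).) ( ....)", re.VERBOSE)
nested_groups = nested.fullmatch("abcdefghijk").groups()
# Groups come out in the order of their opening brackets: 1, 2, 3, 4, 5.
print(nested_groups)  # ('abc', 'defg', 'de', 'f', 'hijk')

# Pattern with alternation: "|" does not restart or shift the numbering.
alternated = re.compile(r"(...) (....)| ( ....)", re.VERBOSE)
alt_match = alternated.fullmatch("wxyz")
# Only the right-hand alternative matched, and its group is still number 3.
print(alt_match.groups())   # (None, None, 'wxyz')
print(alt_match.lastindex)  # 3
```

Groups that belong to an alternative that did not take part in the match are simply unset; their numbers are not reused.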

Michael Kay
http://www.saxonica.com/

-----Original Message-----
From: Pawson, David [mailto:David(_dot_)Pawson(_at_)RNIB(_dot_)ORG(_dot_)UK] 
Sent: 29 September 2004 08:23
To: xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com
Subject: [xsl] regex grouping precedence.

http://www.w3.org/TR/xslt20/#element-matching-substring seems to say
little about how nested grouping is numbered.

(...) (....) ( ....)
gives regex-group (1,2,3) OK.

(...) (....)| ( ....)
  Is the third group counted as group 2 because of the alternation,
or is it still 3?

(...) ( (..)(.).) ( ....)
1     2 3   4     5
is how I would expect to number them,
but I'm totally unsure.

Using XSLT 2.0 to parse a plain text file:

Input string
500748,500748              ,Set My People Free  

regex

<xsl:for-each select='tokenize($f, "[\r]?\n")'>
  <r>
    <xsl:analyze-string flags="ix"
        regex="([0-9]{{6}})
               (,,)|(,([0-9]{{6}})\p{{Zs}}+,(.*))$"
        select=".">
      <xsl:matching-substring>
        <bibno><xsl:value-of select="regex-group(1)"/></bibno>
        <ck><xsl:value-of select="normalize-space(regex-group(4))"/></ck>
        <ttl><xsl:value-of select="normalize-space(regex-group(5))"/></ttl>
      </xsl:matching-substring>
      <xsl:non-matching-substring>
        <n><xsl:value-of select="."/></n>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </r>
</xsl:for-each>
   

output is 

      <n>500748</n>
      <bibno/>
      <ck>500748</ck>
      <ttl>Set My People Free</ttl>
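
The output above fits the counting rule once the precedence of "|" is taken into account: alternation binds more loosely than concatenation, so the regex reads as "six digits followed by ,," OR "the whole comma-led remainder up to $". The leading 500748 therefore falls outside the match and group 1 is never set. A rough Python sketch of the same behaviour follows. It assumes Python's re module as a stand-in for the XSLT processor, replaces \p{Zs} with a plain space because re has no \p classes, and reports an unset group as None where regex-group() would give an empty string. The regrouped pattern is only one possible fix (a hypothetical one, not proposed in this thread).

```python
import re

line = "500748,500748              ,Set My People Free  "

# The regex as posted (whitespace removed, \p{Zs} approximated by a space).
posted = re.compile(r"([0-9]{6})(,,)|(,([0-9]{6}) +,(.*))$")
m = posted.search(line)
unmatched_prefix = line[:m.start()]    # what xsl:non-matching-substring sees
print(unmatched_prefix)                # 500748
print(m.group(1))                      # None, i.e. the empty <bibno/>
print(m.group(4), m.group(5).strip())  # 500748 Set My People Free

# Hypothetical fix: one extra pair of brackets around the alternation.
# The extra "(" becomes group 2, so every later group number moves up by one.
regrouped = re.compile(r"([0-9]{6})((,,)|(,([0-9]{6}) +,(.*)))$")
g = regrouped.search(line)
print(g.group(1), g.group(5), g.group(6).strip())
# 500748 500748 Set My People Free
```

With the extra brackets the first six digits are part of every match, and the second number and the title move to groups 5 and 6.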


Regards DaveP.
