Are you aware that XPath 3.1 has the function parse-ietf-date() for this?
https://www.w3.org/TR/xpath-functions-31/#func-parse-ietf-date
The Saxon implementation is in Java; I haven't attempted an XPath
implementation. But you might find the spec (and its associated notes)
useful in itself, and of course the QT3 test suite has test cases.
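For illustration, a minimal sketch of calling the function from a stylesheet. It assumes an XSLT 3.0 processor that also supports the XPath 3.1 function library; the parameter name "header" is hypothetical, and its default value is one of the examples given in the spec. Note that the grammar in the spec is not the same as the RFC 2822 grammar, so it is worth checking which trailing comments and zone names it accepts before relying on it for real headers.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

    <xsl:output method="text"/>

    <!-- Hypothetical parameter; the default is an example input from the spec. -->
    <xsl:param name="header" as="xs:string"
        select="'Wed, 06 Jun 1994 07:29:35 GMT'"/>

    <!-- Run with the initial template, e.g. the -it option in Saxon. -->
    <xsl:template name="xsl:initial-template">
        <!-- parse-ietf-date() returns an xs:dateTime, or raises a dynamic
             error if the string does not match the grammar in the spec. -->
        <xsl:value-of select="parse-ietf-date($header)"/>
    </xsl:template>

</xsl:stylesheet>
```

With the default value this should output 1994-06-06T07:29:35Z, which is the result the spec gives for that input.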
I don't know how date/times in RFC 2822 relate to all the other miscellaneous
RFCs referenced in the spec. Liam Quin did most of the research for this.
What are your requirements for handling invalid values?
Michael Kay
Saxonica
On 8 Apr 2019, at 22:58, Martynas Jusevičius
martynas(_at_)atomgraph(_dot_)com
<xsl-list-service(_at_)lists(_dot_)mulberrytech(_dot_)com> wrote:
Hi,
I have an XSLT 2.0 task where I'm parsing email Date headers defined
in RFC 2822 and converting them to xsd:dateTime.
Below is a function that converts the former to the latter. I would like
to hear whether any improvements could be made.
<xsl:function name="aex:rfc2822dateTime-to-dateTime" as="xs:dateTime">
    <!-- Example input: Tue, 9 Apr 2019 00:07:24 +1200 (NZST) -->
    <xsl:param name="date-time" as="xs:string"/>
    <xsl:variable name="months" as="xs:string*"
        select="'Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'"/>
    <!-- Braces are doubled because regex is an attribute value template.
         The (?:...) non-capturing groups are XPath 3.0 regex syntax, so a
         strict XSLT 2.0 processor may reject this pattern. -->
    <xsl:analyze-string select="$date-time"
        regex="^(?:(Sun|Mon|Tue|Wed|Thu|Fri|Sat),\s+)?(0[1-9]|[1-2]?[0-9]|3[01])\s+(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)\s+(19[0-9]{{2}}|[2-9][0-9]{{3}})\s+(2[0-3]|[0-1][0-9]):([0-5][0-9])(?::(60|[0-5][0-9]))?\s+([-\+][0-9]{{2}}[0-5][0-9]|(?:UT|GMT|(?:E|C|M|P)(?:ST|DT)|[A-IK-Z]))(\s+|\(([^\(\)]+|\\\(|\\\))*\))*$">
        <xsl:matching-substring>
            <!-- Seconds (group 7) are optional, so default to 00 when absent.
                 Only numeric offsets such as +1200 are converted; a named zone
                 (GMT, EST, ...) matches the pattern but makes the cast fail. -->
            <xsl:sequence select="xs:dateTime(concat(
                format-number(xs:integer(regex-group(4)), '0000'), '-',
                format-number(index-of($months, regex-group(3)), '00'), '-',
                format-number(xs:integer(regex-group(2)), '00'), 'T',
                regex-group(5), ':', regex-group(6), ':',
                (if (regex-group(7) ne '') then regex-group(7) else '00'),
                substring(regex-group(8), 1, 3), ':', substring(regex-group(8), 4, 2)))"/>
        </xsl:matching-substring>
        <xsl:non-matching-substring>
            <xsl:message>Invalid RFC 2822 datetime: <xsl:value-of select="$date-time"/></xsl:message>
        </xsl:non-matching-substring>
    </xsl:analyze-string>
</xsl:function>
The regex pattern is taken from
https://stackoverflow.com/questions/9352003/rfc-2822-date-regex
Martynas
atomgraph.com