The relevant chunk of the spec (from http://www.w3.org/TR/xslt#number):
-------------------
The following attributes are used to control conversion of a list of
numbers into a string. The numbers are integers greater than 0. The
attributes are all optional.
The main attribute is format. The default value for the format attribute
is 1. The format attribute is split into a sequence of tokens where each
token is a maximal sequence of alphanumeric characters or a maximal
sequence of non-alphanumeric characters. Alphanumeric means any character
that has a Unicode category of Nd, Nl, No, Lu, Ll, Lt, Lm or Lo. The
alphanumeric tokens (format tokens) specify the format to be used for each
number in the list. If the first token is a non-alphanumeric token, then
the constructed string will start with that token; if the last token is
non-alphanumeric token, then the constructed string will end with that
token. Non-alphanumeric tokens that occur between two format tokens are
separator tokens that are used to join numbers in the list. The nth format
token will be used to format the nth number in the list. If there are more
numbers than format tokens, then the last format token will be used to
format remaining numbers. If there are no format tokens, then a format
token of 1 is used to format all numbers. The format token specifies the
string to be used to represent the number 1. Each number after the first
will be separated from the preceding number by the separator token
preceding the format token used to format that number, or, if there are no
separator tokens, then by . (a period character).
--------------------------------
Reading this, I'd say Xalan has it right. "If the first token is a
non-alphanumeric token, then the constructed string will start with that
token; if the last token is non-alphanumeric token, then the constructed
string will end with that token." makes it pretty clear that your example
should start with "(" and end with ")". "Each number after the first will
be separated from the preceding number by the separator token preceding
the format token used to format that number, or, if there are no separator
tokens, then by . (a period character)." makes it pretty clear that your
numbers should be separated by periods, since you specified no separator.
I can see where Mike Kay got his implementation, though: "separated from
the preceding number by the separator token preceding the format token
used to format that number". However, the "after the first" part makes me
think that the opening "(" should not apply to numbers after the first.
That, combined with the last sentence from that paragraph in the spec,
makes me think that (1.2.1.1) is the right output.
As I read this, to get the output that Saxon produced, you'd have to
specify "(1(1)", and the fully specified string for what Xalan produced
would be "(1.1)".
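To make the two readings concrete, here is a rough Python sketch of the
tokenizing rules from the paragraph quoted above. It is only my
illustration, not how either processor is actually written. The names
split_format, format_numbers and reuse_prefix are made up for this example,
and it assumes every format token is a plain decimal "1". With
reuse_prefix=False it falls back to the period when there are no separator
tokens (the Xalan-style result); with reuse_prefix=True it reuses the
leading token as the separator, which is one guess at a reading that
yields the Saxon-style result.

```python
import unicodedata
from itertools import groupby

# Unicode categories the spec counts as alphanumeric.
ALNUM_CATEGORIES = {"Nd", "Nl", "No", "Lu", "Ll", "Lt", "Lm", "Lo"}


def is_alnum(ch):
    return unicodedata.category(ch) in ALNUM_CATEGORIES


def split_format(fmt):
    """Split a format string into (prefix, format tokens, separators, suffix)."""
    # Maximal runs of alphanumeric or non-alphanumeric characters.
    tokens = [(alnum, "".join(run)) for alnum, run in groupby(fmt, key=is_alnum)]
    # A leading / trailing non-alphanumeric token starts / ends the output.
    prefix = tokens.pop(0)[1] if tokens and not tokens[0][0] else ""
    suffix = tokens.pop()[1] if tokens and not tokens[-1][0] else ""
    # With no format tokens at all, the spec says to use "1".
    formats = [text for alnum, text in tokens if alnum] or ["1"]
    separators = [text for alnum, text in tokens if not alnum]
    return prefix, formats, separators, suffix


def format_numbers(numbers, fmt, reuse_prefix=False):
    """Hypothetical helper: join numbers according to fmt.

    reuse_prefix=False: no separator tokens means "." (Xalan-style result).
    reuse_prefix=True: no separator tokens means the leading token is reused
    (an assumed reading that reproduces the Saxon-style result).
    """
    prefix, formats, separators, suffix = split_format(fmt)
    out = [prefix]
    for i, n in enumerate(numbers):
        if i > 0:
            if separators:
                # Separator preceding the format token used for this number;
                # extra numbers reuse the last format token and its separator.
                out.append(separators[min(i, len(formats) - 1) - 1])
            elif reuse_prefix and prefix:
                out.append(prefix)
            else:
                out.append(".")
        out.append(str(n))  # simplification: every format token treated as "1"
    out.append(suffix)
    return "".join(out)


print(format_numbers([1, 2, 1, 1], "(1)"))                     # (1.2.1.1)
print(format_numbers([1, 2, 1, 1], "(1)", reuse_prefix=True))  # (1(2(1(1)
print(format_numbers([1, 2, 1, 1], "(1(1)"))                   # (1(2(1(1)
print(format_numbers([1, 2, 1, 1], "(1.1)"))                   # (1.2.1.1)
```

The last two calls are the "fully specified" format strings I mentioned:
once a separator token is actually present, both readings give the same
answer, so the disagreement only shows up when the format string has a
single format token.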
I think a clarifying sentence for the case where only one number is
present in the format string but multiple numbers are to be formatted
would help. Perhaps something like "When the format string contains only
one numeric position but the output will be multiple numeric values, the
separator should be . (a period character)."
By the way, I do not wish to imply that I think ill of the spec or its
authors because of this problem. It's very hard to write something
sufficiently generic and still anticipate every case. It's easy to say
that a clarifying sentence would help after the problem has arisen. It's a
much harder writing task to anticipate the problem in the first place and
write the spec to cover it. It's a wonder these kinds of things don't pop
up more often.
My $.02.
Jay Bryant
Bryant Communication Services
(presently consulting at Synergistic Solution Technologies)
Jack Matheson <jack(_at_)snazzypost(_dot_)com>
04/15/2005 09:31 AM
Please respond to: xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com
To: xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com
cc:
Subject: [xsl] xsl:number question (XSLT 1.0)
According to the spec, when a sequence number contains more values than
there are formatting tokens, the last formatting token is used for the
excess values. Unfortunately, it is a little vague on which separator
token to use with the excess values.
It says that a '.' is to be used if no separator token exists, but does
this also apply to the case where the final formatting token is re-used
with excess sequence values?
Here is a quick test I did to try and see how different processors are
handling this:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0">
<xsl:template match="a/b/c/d">
<xsl:number level="multiple" count="*" format="(1)"/>
</xsl:template>
</xsl:stylesheet>
If my input document is...
<?xml version="1.0"?>
<a><b><c><d/></c></b></a>
...then Saxon produces this:
(1(2(1(1)
...while Xalan produces this:
(1.2.1.1)
Both answers seem perfectly reasonable to me, given the lack of clarity
in the 1.0 spec.
Can anyone help me figure out which is (more) correct?
--~------------------------------------------------------------------
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: <mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--