Here be dragons.
I agree with you that the specification of numbering sequences is very
weak. In my view it's a classic case of "benign cultural imperialism" -
the spec authors wanted to make it fully international and localisable,
but since they were a bunch of Americans plus the odd expatriate
European, they didn't really have much idea in detail how to go about
it. This situation hasn't really changed in the 2.0 working group, and
the same problem has also made it difficult to agree a spec for
format-date().
As regards the specific questions, I think the result is that
implementors have a pretty free hand to do whatever they think is right.
On collating sequences the group has adopted a different approach: leave
it all to the implementor. This is probably wiser, since implementors
who want to sell their product in a particular geographical market
probably have access to local information about the requirements of that
market. (Well, perhaps this is being optimistic - for years US vendors
produced collating sequences for German which were approved by the
grammar textbooks, but had long since been superseded in popular use;
and contrariwise, Microsoft spell-checkers still tell me that "-ize"
endings are not allowed in the UK, when the OED insists that they
are...).
Michael Kay
-----Original Message-----
From: owner-xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com
[mailto:owner-xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com] On Behalf Of
Mike Brown
Sent: 17 March 2003 05:50
To: xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com
Subject: [xsl] xsl:number
I have questions about xsl:number. This is the most poorly
specified instruction I've come across. It's really hard to
even know what questions to ask.
The way I interpret the XSLT 1.0 spec (and the 2.0 draft
doesn't help),
<xsl:number format="A"/>
must be supported, and it must produce something from the sequence
A, B, C, ..., Z, AA, AB, AC, ...
where A=1, B=2, etc.
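
One way to read this sequence is as bijective base-26 numbering over the English alphabet. The following is only a minimal Python sketch under that assumption; the name to_alpha is hypothetical and is not taken from the spec or from any processor.

```python
import string

# Assumption: the alphabet is the 26-letter English one, A..Z.
ALPHABET = string.ascii_uppercase


def to_alpha(n: int) -> str:
    """Map 1 -> A, 26 -> Z, 27 -> AA, 28 -> AB (bijective base-26)."""
    if n < 1:
        raise ValueError("alphabetic numbering starts at 1")
    letters = []
    while n > 0:
        # Subtract 1 first so that there is no "zero" digit.
        n, rem = divmod(n - 1, len(ALPHABET))
        letters.append(ALPHABET[rem])
    return "".join(reversed(letters))


print([to_alpha(i) for i in (1, 2, 26, 27, 28)])
```

Whether a processor is required to use exactly this alphabet is the open question; the sketch only fixes one possible answer.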
The way it is specified, it seems to indicate that the
alphabet must be the English alphabet: ABCDEFGHIJKLMNOPQRSTUVWXYZ.
Or perhaps it could be any alphabet that starts with ABC and
ends with Z, like the Spanish alphabet, which varies
depending on who you ask, but for computing purposes I think
is generally ABCDEFGHIJKLMNÑOPQRSTUVWXYZ.
Or perhaps everything after "A" is just an example, meaning
that it very well could be the Swedish alphabet:
ABCDEFGHIJKLMNOPQRSTUVWXYZÅÄÖ ... or perhaps Vietnamese,
which starts with A and has no Z.
Anyway, the implication is that a processor must support some
alphabet that contains "A". Or is "A" just a placeholder for
any alphabetic character?
"When numbering with an alphabetic sequence, the lang
attribute specifies which language's alphabet is to be used;
it has the same range of values as xml:lang [XML]; if no lang
value is specified, the language should be determined from
the system environment."
It seems to me that if format="A", then the value of lang,
whether determined by the processor or specified in the
stylesheet, must be a language that contains "A".
What happens if the processor supports both English and
Hebrew, and I do something like
<xsl:number format="A" lang="he"/>
? Or for that matter,
<!-- #1488 = Hebrew letter Aleph -->
<xsl:number format="א" lang="en"/>
?
What does
<xsl:number format="B"/>
mean? At the very least, I know "B" must represent 1. If the
default language is English, does this mean the sequence must be
B, C, D, ..., Z, BB, BC, BD, ...
?
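
To make that reading concrete: if the format token is taken to be the letter that stands for 1, the sequence would draw only on the letters from the token to the end of the alphabet. A rough Python sketch of this hypothetical interpretation follows. Nothing in the spec confirms it, sequence_from_token is an invented name, and the English alphabet is assumed as the default.

```python
import string


def sequence_from_token(token: str, n: int,
                        alphabet: str = string.ascii_uppercase) -> str:
    """Number n using only the tail of `alphabet` that starts at `token`."""
    if n < 1:
        raise ValueError("numbering starts at 1")
    # str.index raises ValueError if the token is not in the alphabet.
    digits = alphabet[alphabet.index(token):]
    out = []
    while n > 0:
        # Bijective numbering: no zero digit, so shift by one each round.
        n, rem = divmod(n - 1, len(digits))
        out.append(digits[rem])
    return "".join(reversed(out))


print([sequence_from_token("B", i) for i in (1, 2, 25, 26, 27)])
```

With the token "B" the usable alphabet shrinks to 25 letters, so the sequence wraps to two letters at 26 rather than 27.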
The spec also says format="I" must be supported by using
Roman numerals. What does format="I" mean when the language
is not English?
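
For reference, the conventional Roman numeral conversion can be sketched in Python as below. The name to_roman and the 1..3999 range limit are assumptions made for illustration; the spec does not give an algorithm, and this sketch says nothing about how the lang attribute should affect the result.

```python
# Value/symbol pairs in descending order, including the subtractive forms.
ROMAN = [
    (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
    (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
    (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
]


def to_roman(n: int) -> str:
    """Convert n to upper-case Roman numerals (assumed range 1..3999)."""
    if not 1 <= n <= 3999:
        raise ValueError("outside the range handled by this sketch")
    out = []
    for value, symbol in ROMAN:
        # Emit each symbol as many times as its value fits into n.
        count, n = divmod(n, value)
        out.append(symbol * count)
    return "".join(out)


print([to_roman(i) for i in (1, 2, 3, 4, 9, 14)])
```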
The spec says "In many languages there are two commonly used
numbering sequences that use letters. One numbering sequence
assigns numeric values to letters in alphabetic sequence, and
the other assigns numeric values to each letter in some other
manner traditional in that language. In English, these would
correspond to the numbering sequences specified by the format
tokens a and i."
This seems to indicate that using "I" for Roman is a
"traditional" English convention, and (reading further) that
I could use letter-value="alphabetic" to override this
interpretation. If my theory about format="B" is correct,
then format="I" with letter-value="alphabetic" would result
in I, J, K, ... sequences.
I don't know. I have more questions, but I'll just stop here.
I really hope this stuff gets cleared up in 2.0, although
that doesn't help me much in trying to properly implement 1.0.
Mike
--
Mike J. Brown | http://skew.org/~mike/resume/
Denver, CO, USA | http://skew.org/xml/
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list