dtdparse-trackers

[Dtdparse-trackers] [ dtdparse-Support Requests-3305203 ] #FIXED entity ref not expanded in HTML 4.01 DTD

2011-05-20 10:43:59
Support Requests item #3305203, was opened at 2011-05-20 10:39
Message generated for change (Comment added) made by srn3
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=399000&aid=3305203&group_id=30351

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: dtdparse
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Steven R. Newcomb (srn3)
Assigned to: Nobody/Anonymous (nobody)
Summary: #FIXED entity ref not expanded in HTML 4.01 DTD

Initial Comment:
A parameter entity reference inside a #FIXED attribute default is not being expanded.  The output is:

<attribute name="version"
           type="#FIXED"
           value="CDATA"
           default="%HTML.Version;"/>

but it should be:

<attribute name="version"
           type="#FIXED"
           value="CDATA"
           default="-//W3C//DTD HTML 4.01 Transitional//EN"/>
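
For reference, the expected value follows from recursive parameter entity expansion. The following is a minimal Python sketch of that expansion, not dtdparse's own code (dtdparse is a Perl tool). The names `PE_REF`, `ENTITIES` and `expand_parameter_entities` are hypothetical, and the entity table is copied by hand from the two declarations in the attached html.dtd.

```python
import re

# Matches a parameter entity reference such as %HTML.Version;
PE_REF = re.compile(r"%([A-Za-z][A-Za-z0-9._:-]*);")

# Hand-copied from the two <!ENTITY % ...> declarations in html.dtd.
ENTITIES = {
    "HTML.Version": "-//W3C//DTD HTML 4.01 Transitional//EN",
    "version": "version CDATA #FIXED '%HTML.Version;'",
}


def expand_parameter_entities(text, entities, max_depth=32):
    """Replace %name; references repeatedly until nothing changes.

    Unknown references are left untouched; a reference cycle is
    reported once max_depth passes have been made without settling.
    """
    for _ in range(max_depth):
        expanded = PE_REF.sub(
            lambda m: entities.get(m.group(1), m.group(0)), text
        )
        if expanded == text:
            return expanded
        text = expanded
    raise ValueError("parameter entity expansion did not terminate")


if __name__ == "__main__":
    # The default value as dtdparse currently reports it.
    print(expand_parameter_entities("%HTML.Version;", ENTITIES))
    # -> -//W3C//DTD HTML 4.01 Transitional//EN
```

Applying the same function to `%version;` gives the fully expanded attribute declaration, which is the text that appears in the `<text-expanded>` element of the run output.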

I'm attaching:

html.dtd  -- a heavily truncated version of the HTML 4.01 DTD containing only what's needed to demonstrate the bug

run.txt -- the complete output of the dtdparse run.  Curiously, the entity text is expanded correctly in the <text-expanded> element of the "version" entity, but the expanded value doesn't appear in the attribute's default value.

sgml.soc -- the catalog file invoked on the command line in the attached 'run' file.  It's very vanilla.

sgml.dcl  -- the sgml declaration referenced in sgml.soc.  It's very vanilla, too.
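
The symptom can also be checked mechanically. The following is a rough Python sketch that scans dtdparse's XML output for attribute defaults that still contain a `%name;` reference. It assumes the XML portion of the output (from the DOCTYPE line or `<dtd ...>` through `</dtd>`) has been saved on its own, without the surrounding status lines. The function name `unexpanded_defaults` and the `SAMPLE` string are hypothetical; the sample is a cut-down copy of the attlist from the run.

```python
import re
import xml.etree.ElementTree as ET

# Matches a leftover parameter entity reference such as %HTML.Version;
PE_REF = re.compile(r"%[A-Za-z][A-Za-z0-9._:-]*;")

# Cut-down copy of the relevant part of the dtdparse output.
SAMPLE = """<dtd version='1.0'>
<attlist name="HTML">
<attdecl>
  %version;
</attdecl>
<attribute name="version"
           type="#FIXED"
           value="CDATA"
           default="%HTML.Version;"/>
</attlist>
</dtd>"""


def unexpanded_defaults(xml_text):
    """Return (attlist name, attribute name, default) for every
    attribute whose default still contains a %name; reference."""
    root = ET.fromstring(xml_text)
    hits = []
    for attlist in root.iter("attlist"):
        for attribute in attlist.iter("attribute"):
            default = attribute.get("default", "")
            if PE_REF.search(default):
                hits.append(
                    (attlist.get("name"), attribute.get("name"), default)
                )
    return hits


if __name__ == "__main__":
    for owner, name, default in unexpanded_defaults(SAMPLE):
        print(f"{owner}/@{name}: unexpanded default {default!r}")
```

Only the `default` attribute is inspected, so the `%version;` reference inside `<attdecl>`, which records the declaration as written, is not flagged.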

----------------------------------------------------------------------

>Comment By: Steven R. Newcomb (srn3)
Date: 2011-05-20 10:43

Message:
Well, I was only permitted to attach one file to this, so here are the rest
of them in a less-convenient form:

#######################################################
run.txt
#######################################################
/tmp  srn@zorba% dtdparse --catalog sgml.soc html.dtd
Reading sgml.soc...
Public ID: unknown
System ID: html.dtd
SGML declaration: sgml.dcl
Parse complete.
<!DOCTYPE dtd PUBLIC "-//Norman Walsh//DTD DTDParse V2.0//EN"
              "dtd.dtd" [
]>
<dtd version='1.0'
     unexpanded='1'
     title="?untitled?"
     namecase-general="1"
     namecase-entity=""
     xml="0"
     system-id="html.dtd"
     public-id=""
     declaration="sgml.dcl"
     created-by="DTDParse V2.00"
     created-on="Fri May 20 11:27:50 2011"
>
<entity name="version"
        type="param"
>
<text-expanded>version CDATA #FIXED '-//W3C//DTD HTML 4.01 Transitional//EN'</text-expanded>
<text>version CDATA #FIXED '%HTML.Version;'</text>
</entity>

<entity name="HTML.Version"
        type="param"
>
<text-expanded>-//W3C//DTD HTML 4.01 Transitional//EN</text-expanded>
<text>-//W3C//DTD HTML 4.01 Transitional//EN</text>
</entity>

<element name="HTML" stagm="O" etagm="O"
         content-type="element">
<content-model-expanded>
  <empty/>
</content-model-expanded>
<content-model>
  <empty/>
</content-model>
</element>

<attlist name="HTML">
<attdecl>
  %version;
</attdecl>
<attribute name="version"
           type="#FIXED"
           value="CDATA"
           default="%HTML.Version;"/>
</attlist>

</dtd>
Done.
/tmp  srn@zorba% dtdparse --version
Version: dtdparse v2.00

Usage:
     dtdparse [options] [dtdfile]

/tmp  srn@zorba% 


#######################################################
html.dtd
#######################################################
<!ENTITY % HTML.Version "-//W3C//DTD HTML 4.01 Transitional//EN">

<!ENTITY % version "version CDATA #FIXED '%HTML.Version;'">

<!ELEMENT HTML O O EMPTY>
<!ATTLIST HTML
  %version;
>

#######################################################
sgml.soc
#######################################################
SGMLDECL "sgml.dcl"

#######################################################
sgml.dcl
#######################################################
<!SGML  "ISO 8879:1986"
    --
         SGML Declaration for HyperText Markup Language version 4.0

         With support for the first 17 planes of ISO 10646 and
         increased limits for tag and literal lengths etc.
    --

    CHARSET
          BASESET  "ISO Registration Number 177//CHARSET
                    ISO/IEC 10646-1:1993 UCS-4 with
                    implementation level 3//ESC 2/5 2/15 4/6"
         DESCSET 0       9       UNUSED
                 9       2       9
                 11      2       UNUSED
                 13      1       13
                 14      18      UNUSED
                 32      95      32
                 127     1       UNUSED
                 128     32      UNUSED
                 160     55136   160
                 55296   2048    UNUSED  -- SURROGATES --
                 57344   8191    57344

CAPACITY        SGMLREF
                TOTALCAP        150000
                GRPCAP          150000
                ENTCAP          150000

SCOPE    DOCUMENT
SYNTAX
         SHUNCHAR CONTROLS 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
           17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 127
         BASESET  "ISO 646IRV:1991//CHARSET
                   International Reference Version
                   (IRV)//ESC 2/8 4/2"
         DESCSET  0 128 0

         FUNCTION
                  RE            13
                  RS            10
                  SPACE         32
                  TAB SEPCHAR    9

         NAMING   LCNMSTRT ""
                  UCNMSTRT ""
                  LCNMCHAR ".-_:"
                  UCNMCHAR ".-_:"
                  NAMECASE GENERAL YES
                           ENTITY  NO
         DELIM    GENERAL  SGMLREF
                  SHORTREF SGMLREF
         NAMES    SGMLREF
         QUANTITY SGMLREF
                  ATTCNT   120     -- increased to 60 and then doubled
                                      because htmlloose.dtd uses 62 at one point --
                  ATTSPLEN 65536   -- These are the largest values --
                  LITLEN   65536   -- permitted in the declaration --
                  NAMELEN  65536   -- Avoid fixed limits in actual --
                  PILEN    65536   -- implementations of HTML UA's --
                  TAGLVL   100
                  TAGLEN   65536
                  GRPGTCNT 150
                  GRPCNT   64

FEATURES
  MINIMIZE
    DATATAG  NO
    OMITTAG  YES
    RANK     NO
    SHORTTAG YES
  LINK
    SIMPLE   NO
    IMPLICIT NO
    EXPLICIT NO
  OTHER
    CONCUR   NO
    SUBDOC   NO
    FORMAL   YES
  APPINFO NONE
>




----------------------------------------------------------------------
