Hi Folks,
Thank you for your recommendations on how to check a bunch of XHTML files for
well-formedness. Here's what I found:
1. I was unable to obtain an EXE for RXP, the XML parser that Richard Tobin
created. This page
http://www.cogsci.ed.ac.uk/~richard/rxp.html
has a link to an EXE of RXP:
ftp://ftp.cogsci.ed.ac.uk/pub/richard/rxp.exe
However, that link does not work.
Anyone know where I can get the EXE of RXP?
2. Next, I tried xmlwf. I discovered that you must first download and install
Expat:
https://libexpat.github.io/
The download is an installer: expat-win32bin-2.2.10.exe
Double-click on it and Expat will be installed on your system. Find the
folder where Expat was installed. In there is a bin folder, and in the bin
folder is xmlwf.exe.
I ran xmlwf on a folder that contains 10,000 XHTML files. Wow! It checked all
of them in a couple of seconds. However, the error messages are terse. For
example, here is one of the error messages:
xhtml\htmloutput10.xhtml:206:2: mismatched tag
Compare that to the error message I get when I run my super-simple XSLT program
on the XHTML file:
Error on line 206 column 3 of htmloutput10.xhtml:
SXXP0003 Error reported by XML parser: The element type "input" must be
terminated by the matching end-tag "</input>".
I find the latter error message to be more helpful.
Perhaps there is a flag that can be set in xmlwf to output more verbose/useful
error messages?
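
In case it helps, here is a minimal sketch of one way to get more context out
of Expat itself. It assumes Python is available (its xml.parsers.expat module
wraps the same Expat library) and keeps a stack of open elements so that a
"mismatched tag" error can also name the element that was still open. The
function names check_well_formed and check_folder are made up for
illustration, and the default folder name "xhtml" is just the one from my
example above. I have not run it against the 10,000 files.

```python
import sys
from pathlib import Path
from xml.parsers import expat


def check_well_formed(data: bytes):
    """Return None if data is well-formed XML, else a descriptive message."""
    parser = expat.ParserCreate()
    open_elements = []  # stack of (name, line) for elements not yet closed

    def start(name, attrs):
        open_elements.append((name, parser.CurrentLineNumber))

    def end(name):
        open_elements.pop()

    parser.StartElementHandler = start
    parser.EndElementHandler = end
    try:
        parser.Parse(data, True)
    except expat.ExpatError as err:
        # err.offset is the column as reported by Expat
        message = f"{err.lineno}:{err.offset}: {expat.ErrorString(err.code)}"
        if open_elements:
            name, line = open_elements[-1]
            message += f' (innermost open element: "{name}", opened on line {line})'
        return message
    return None


def check_folder(folder: str) -> int:
    """Check every *.xhtml file in folder; print problems, return their count."""
    problems = 0
    for path in sorted(Path(folder).glob("*.xhtml")):
        message = check_well_formed(path.read_bytes())
        if message is not None:
            problems += 1
            print(f"{path}:{message}")
    return problems


if __name__ == "__main__" and len(sys.argv) == 2:
    # Usage (hypothetical script name): python check_xhtml.py xhtml
    sys.exit(1 if check_folder(sys.argv[1]) else 0)
```

The idea is that Expat raises the error before the end-tag handler runs, so
the top of the stack is the element the parser expected to see closed.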
/Roger
-----Original Message-----
From: Liam R. E. Quin liam(_at_)fromoldbooks(_dot_)org
<xsl-list-service(_at_)lists(_dot_)mulberrytech(_dot_)com>
Sent: Tuesday, February 16, 2021 8:52 PM
To: xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com
Subject: [EXT] Re: [xsl] Use XSLT to check a bunch of XHTML files for well-formedness?
On Tue, 2021-02-16 at 21:42 +0000, Martin Honnen
martin(_dot_)honnen(_at_)gmx(_dot_)de
wrote:
On 16.02.2021 22:10, Liam R. E. Quin liam(_at_)fromoldbooks(_dot_)org wrote:
On Tue, 2021-02-16 at 21:04 +0000, Martin Honnen
martin(_dot_)honnen(_at_)gmx(_dot_)de
wrote:
In theory I think that should check with doc-available if the file
is well-formed or not. Haven't tested however.
It catches some problems, but will try to load the DTD.
I thought Saxon has all the important W3C DTDs internalized.
It might, but last time I did this I was testing files with other DTDs,
including JATS (various different versions, too, each needing a different
catalogue file).
--
Liam Quin, https://www.delightfulcomputing.com/
Available for XML/Document/Information Architecture/XSLT/ XSL/XQuery/Web/Text
Processing/A11Y training, work & consulting.
Barefoot Web-slave, antique illustrations: http://www.fromoldbooks.org