xsl-list

Re: [xsl] Use XSLT to check a bunch of XHTML files for well-formedness?

2021-02-17 10:52:01
Hi Folks,

Thank you for your recommendations on how to check a bunch of XHTML files for 
well-formedness. Here's what I found:

1. I was unable to obtain an EXE for RXP, the XML parser that Richard Tobin
created. This page

http://www.cogsci.ed.ac.uk/~richard/rxp.html

has a link to an EXE of RXP:

ftp://ftp.cogsci.ed.ac.uk/pub/richard/rxp.exe

However, that link does not work.

Anyone know where I can get the EXE of RXP?

2. Next, I tried xmlwf. I discovered that you must first download and install
Expat:

https://libexpat.github.io/

That results in downloading the installer: expat-win32bin-2.2.10.exe

Next, double-click on it and Expat will be installed on your system. Find the
folder where Expat was installed. In there is a bin folder, and in the bin
folder is xmlwf.exe

I ran xmlwf on a folder that contains 10,000 XHTML files. Wow! It checked all
of them in a couple of seconds. However, the error messages are terse. For
example, here is one of the error messages:

        xhtml\htmloutput10.xhtml:206:2: mismatched tag

Compare that to the error message I get when I run my super-simple XSLT program 
on the XHTML file:

Error on line 206 column 3 of htmloutput10.xhtml:
  SXXP0003  Error reported by XML parser: The element type "input" must be
  terminated by the matching end-tag "</input>".

I find the latter error message to be more helpful.
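
If the terse xmlwf lines have to be post-processed, they are at least easy to split apart. Here is a minimal sketch in Python, assuming every error line has the path:line:column: message shape shown above. The function name parse_xmlwf_line is my own, not part of xmlwf.

```python
import re

# Assumed shape of an xmlwf error line: <path>:<line>:<column>: <message>
# The greedy path group tolerates colons in the path (e.g. a drive letter).
XMLWF_LINE = re.compile(r"^(?P<path>.*):(?P<line>\d+):(?P<col>\d+): (?P<msg>.*)$")


def parse_xmlwf_line(text):
    """Split one xmlwf error line into (path, line, column, message).

    Returns None if the text does not have the assumed shape.
    """
    match = XMLWF_LINE.match(text.strip())
    if match is None:
        return None
    return match["path"], int(match["line"]), int(match["col"]), match["msg"]
```

With the four fields separated, a script can sort or group the errors, or open the file and show the offending line next to the message.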

Perhaps there is a flag that can be set in xmlwf to output more verbose/useful 
error messages?
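
In case no such flag exists, one workaround might be to call Expat from a script and add the context yourself. The following is only a sketch, assuming Python 3 is installed; its xml.parsers.expat module is a binding to the Expat library, so the message text is still the short Expat wording, but the script can print the offending source line under it. The names check_file and check_folder and the "*.xhtml" pattern are my own choices.

```python
from pathlib import Path
from xml.parsers import expat


def check_file(path):
    """Return None if the file is well-formed, else a descriptive message."""
    data = Path(path).read_bytes()
    parser = expat.ParserCreate()
    try:
        parser.Parse(data, True)  # True: this is the final chunk of input
    except expat.ExpatError as err:
        lines = data.splitlines()
        # err.lineno counts from 1; err.offset is the column, counted from 0.
        source = b""
        if 1 <= err.lineno <= len(lines):
            source = lines[err.lineno - 1]
        shown = source.decode("utf-8", "replace").strip()
        reason = expat.ErrorString(err.code)
        return f"{path}:{err.lineno}:{err.offset}: {reason}\n    {shown}"
    return None


def check_folder(folder, pattern="*.xhtml"):
    """Check every matching file in a folder; return the error messages."""
    messages = []
    for path in sorted(Path(folder).glob(pattern)):
        message = check_file(path)
        if message is not None:
            messages.append(message)
    return messages
```

Calling check_folder("xhtml") returns one message per file that is not well-formed, and an empty list if every file parses. I have not timed this against xmlwf on 10,000 files.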

/Roger

-----Original Message-----
From: Liam R. E. Quin liam(_at_)fromoldbooks(_dot_)org 
<xsl-list-service(_at_)lists(_dot_)mulberrytech(_dot_)com> 
Sent: Tuesday, February 16, 2021 8:52 PM
To: xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com
Subject: [EXT] Re: [xsl] Use XSLT to check a bunch of XHTML files for
well-formedness?

On Tue, 2021-02-16 at 21:42 +0000, Martin Honnen
martin(_dot_)honnen(_at_)gmx(_dot_)de wrote:
> On 16.02.2021 22:10, Liam R. E. Quin liam(_at_)fromoldbooks(_dot_)org wrote:
>> On Tue, 2021-02-16 at 21:04 +0000, Martin Honnen
>> martin(_dot_)honnen(_at_)gmx(_dot_)de wrote:
>>>
>>> In theory I think that should check with doc-available if the file
>>> is well-formed or not. Haven't tested however.
>>
>> It catches some problems, but will try to load the DTD.
>
> I thought Saxon has all the important W3C DTDs internalized.

It might, but last time I did this I was testing files with other DTDs,
including JATS (various different versions, too, each needing a different
catalogue file).

--
Liam Quin, https://www.delightfulcomputing.com/
Available for XML/Document/Information Architecture/XSLT/ XSL/XQuery/Web/Text 
Processing/A11Y training, work & consulting.
Barefoot Web-slave, antique illustrations:  http://www.fromoldbooks.org



