xsl-list

Re: Converting HTML to plain text

2004-06-22 17:41:17

I can constrain HTML pages to be valid XML. So, the hard part is solved. But still I don't know of a good solution to convert it to plain text. ...

If tables weren't an issue, I think "lynx -dump file.html" would work for you.
To deal with tables, you could try converting to groff format and using
groff's "tbl" pre-processor to format your tables.
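
As a rough illustration of the groff route, here is a minimal Python sketch that turns one table from a well-formed page into tbl input. The function name table_to_tbl and the sample markup are made up for this example. It assumes the elements carry no XML namespace, that tables are not nested, that no cell spans rows or columns, and that cell text never contains the "@" separator; every column is simply left-aligned.

```python
import xml.etree.ElementTree as ET


def table_to_tbl(table, sep="@"):
    """Render one <table> element as groff tbl source (hypothetical helper)."""
    rows = []
    for tr in table.iter("tr"):
        # Collect the text of each cell, collapsing runs of whitespace.
        cells = [" ".join("".join(cell.itertext()).split())
                 for cell in tr if cell.tag in ("th", "td")]
        if cells:
            rows.append(cells)
    if not rows:
        return ""
    width = max(len(row) for row in rows)
    # .TS opens the table, the options line sets the cell separator,
    # and the format line asks for one left-aligned column per cell.
    lines = [".TS", "tab(%s);" % sep, " ".join(["l"] * width) + "."]
    for row in rows:
        # Pad short rows so every data line has the same number of cells.
        lines.append(sep.join(row + [""] * (width - len(row))))
    lines.append(".TE")
    return "\n".join(lines)


SAMPLE = ("<table><tr><th>Name</th><th>Qty</th></tr>"
          "<tr><td>apple</td><td>3</td></tr></table>")

if __name__ == "__main__":
    print(table_to_tbl(ET.fromstring(SAMPLE)))
```

The output should then be something you can pass through groff with tbl enabled, for example "groff -t -Tascii", though I have only sketched the table part here and the rest of the page would still need its own conversion.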
--
Larry Kollar    k  o  l  l  a  r  @  a  l  l  t  e  l  .  n  e  t
"The hardest part of all this is the part that requires thinking."
-- Paul Tyson, on xml-doc