xsl-list

[xsl] Correcting unbound namespace prefixes

2010-08-02 11:09:00
I'm not sure this is the correct place to post. This may be a question about 
JAXP, or simply about good standard operating procedure for bad input data. 

I've got some XML that I know is invalid (strictly speaking, it is not 
namespace-well-formed), but I'm not in a position to get the customer to fix 
it. Here's what it looks like:

<document>
   <text>Four score and twenty years ago..,</text>
   <pp:metadata publication-date="2010-07-31T12:30:00Z" />
  ...

You get the idea (I hope): clearly someone began with XML in no namespace, 
extracted metadata in a post-processing step, and inserted the corresponding 
markup without ever declaring a namespace for the "pp" prefix. I don't know of 
a way to fix this through the JAXP API (i.e. by injecting the missing prefix 
mapping at parse time). Or am I better off just preprocessing this XML with 
Perl or Python before it's ever parsed?
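
For what it's worth, here is a rough sketch of what the Python preprocessing 
route might look like. It is only an illustration under several assumptions: 
the namespace URI is made up (nothing above says what "pp" should map to), the 
helper name bind_prefix is hypothetical, and the regular expression assumes 
that no comment or DOCTYPE ahead of the root element contains something that 
looks like a start tag. It adds an xmlns:pp declaration to the root start tag 
as plain text, so that a namespace-aware parser will then accept the document.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical URI: the real namespace for "pp" is unknown.
PP_NS = "urn:example:post-processing"


def bind_prefix(xml_text, prefix, uri):
    """Declare `prefix` on the root start tag unless the text already declares it."""
    # Leave the input alone if an xmlns:<prefix> declaration appears anywhere.
    if re.search(r"\bxmlns:%s\s*=" % re.escape(prefix), xml_text):
        return xml_text
    # First "<" followed by a name-start character; this skips "<?xml ...?>",
    # comments and DOCTYPE, which begin with "<?" or "<!".
    root = re.search(r"<[A-Za-z_][^\s/>]*", xml_text)
    if root is None:
        raise ValueError("no root element found")
    pos = root.end()  # position just after the root element's name
    return '%s xmlns:%s="%s"%s' % (xml_text[:pos], prefix, uri, xml_text[pos:])


broken = """<document>
   <text>Four score and twenty years ago..,</text>
   <pp:metadata publication-date="2010-07-31T12:30:00Z" />
</document>"""

fixed = bind_prefix(broken, "pp", PP_NS)
root = ET.fromstring(fixed)  # parsing `broken` directly raises ET.ParseError
meta = root.find("{%s}metadata" % PP_NS)
```

Patching the text before parsing keeps the fix independent of whichever 
parser consumes the result, but it is a textual hack rather than a real XML 
transformation, so it is probably only safe on input whose shape is known.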


Tony Nassar Ph.D. 
Palantir Technologies | Forward Deployed Engineer 


