xsl-list

Re: [xsl] Xpath Syntax Issue

2012-06-24 11:26:58
Sorry, here's my XSLT (remove.xsl):

<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:s="http://www.sitemaps.org/schemas/sitemap/0.9"
    exclude-result-prefixes="s">

    <xsl:output method="xml" encoding="UTF-8" indent="yes"/>

    <xsl:strip-space elements="*"/>

    <!-- Standard copy -->
    <xsl:template match="*">
        <xsl:copy>
            <xsl:copy-of select="@*"/>
            <xsl:apply-templates/>
        </xsl:copy>
    </xsl:template>

    <xsl:template match="s:urlset/s:url[normalize-space(s:loc) = 'URL']"/>

</xsl:stylesheet>

XML Snippet (sitemap1.xml):
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
     xsi:schemaLocation="
     http://www.sitemaps.org/schemas/sitemap/0.9
     http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd">

     <url>
          <loc>URL</loc>
          <lastmod>2012-06-23T13:37:27+00:00</lastmod>
          <changefreq>monthly</changefreq>
          <priority>1.0</priority>
     </url>
     ....
</urlset>

Command used in Linux:
xsltproc -o sitemapb.xml remove.xsl sitemap1.xml

(In case anyone is wondering why I want to remove URLs from a sitemap,
there are a few pages generated by a script, purely for crawling
reasons, as the pages don't crawl well otherwise. The sitemap feeds
the indexing engine for our website and I don't want these artificial
pages cluttering up search results. So after the sitemap is generated,
I want to run this XSLT to remove the URLs before the indexer starts.)

Thanks,
Nathan


On Sun, Jun 24, 2012 at 11:31 AM, Michael Kay <mike(_at_)saxonica(_dot_)com> wrote:


On 24/06/2012 15:35, Nathan Tallman wrote:

Is there any reason why this transformation works in Oxygen, using
Saxon and xsltproc, yet doesn't work from the Linux command line using
xsltproc? When running from the command line, all the attributes from
urlset are removed, but the unwanted URLs remain.


I for one haven't followed this thread in detail, so I'm not sure what "this
transformation" refers to.

Michael Kay
Saxonica


On Sat, Jun 23, 2012 at 10:56 PM, Nathan Tallman <ntallman(_at_)gmail(_dot_)com> wrote:

Thanks Chris. I had just found this explanation on

<http://stackoverflow.com/questions/3836121/xslt-does-not-work-when-i-include-xmlns-http-www-sitemaps-org-schemas-sitemap>
when your email came in. This takes care of it.

Much appreciation.
Nathan

On Sat, Jun 23, 2012 at 10:51 PM, Christopher R. Maden <crism(_at_)maden(_dot_)org> wrote:

On 06/23/2012 10:38 PM, Nathan Tallman wrote:

I still wasn't getting the results in my application, so I created
pets.xml and sure enough the template worked. It only works with
my original document if I remove attributes found in the root
element.

The original first 6 lines:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation=" http://www.sitemaps.org/schemas/sitemap/0.9
http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd">

I had to remove all attributes from <urlset> before the XSL would
work. Do I need to reference the schema in my XSL?

Ahh... the good ol’ namespace FAQ.

Every element type name is a pair: namespace URI and local name.

What you thought was null-namespace plus “species” is in fact
http://www.sitemaps.org/schemas/sitemap/0.9 plus “species” (often
written as {http://www.sitemaps.org/schemas/sitemap/0.9}species).  An
XPath expression matching just “species” matches {}species, which is a
*different name* than
{http://www.sitemaps.org/schemas/sitemap/0.9}species.

You need, in your XSLT, to declare something like
xmlns:sitemap="http://www.sitemaps.org/schemas/sitemap/0.9" and then
use sitemap:species in your XPath.  (A shorter prefix might be in
order, but a prefix is required for XSLT 1.0 and recommended (IMO) for
clarity for XSLT 2.0.)
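
The same (namespace URI, local name) pairing can be seen outside XSLT. As a rough illustration only, here is a small Python sketch using the standard library's xml.etree.ElementTree, which happens to write element names in the same {uri}local form used above. The two-entry sitemap and the example.org address are made up for the demonstration and are not from the original document; the "s" prefix is an arbitrary choice.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

# Hypothetical two-entry sitemap; the <loc> values are placeholders.
doc = f"""<urlset xmlns="{SITEMAP_NS}">
  <url><loc>URL</loc></url>
  <url><loc>http://example.org/keep</loc></url>
</urlset>"""

root = ET.fromstring(doc)

# The default namespace declaration puts every element in the sitemap
# namespace, so the root's full name is {namespace-uri}urlset.
root_tag = root.tag

# An unprefixed path asks for "url" in no namespace, which is a
# different name from {namespace-uri}url, so nothing matches.
unqualified = root.findall("url")

# Binding a prefix to the namespace URI makes the path match.
ns = {"s": SITEMAP_NS}
qualified = root.findall("s:url", ns)

print(root_tag)
print(len(unqualified), len(qualified))
```

The unprefixed lookup finds no elements while the prefixed one finds both, which mirrors why an unprefixed XPath in an XSLT 1.0 stylesheet matches nothing in a document that declares a default namespace.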

~Chris
- --
Chris Maden, text nerd<URL: http://crism.maden.org/>
LIVE FREE: vote for Gary Johnson, Libertarian for President.
    <URL: http://garyjohnson2012.com/>    <URL: http://lp.org/>
GnuPG fingerprint: DB08 CF6C 2583 7F55 3BE9  A210 4A51 DBAC 5C5C 3D5E


--~------------------------------------------------------------------
XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: <mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--
