xsl-list

[xsl] RE: HTML to XML

2009-06-12 13:55:52


Oops, I forgot the subject.
-----Original Message-----
From: Knight, Michel 
Sent: Friday, June 12, 2009 1:52 PM
To: xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com
Cc: Knight, Michel
Subject: 

Hi,

This sounds simple, but I'm having a problem making it work.
The processor complains about xsl:key, and from there I'm not sure where I 
should plug in the lines shown below. I plan to use the same strategy for 
several other validations, if I can only get this one working.
I've included:
File 1 -> the HTML, once it has been cleaned up
File 2 -> the references; I use this file to check whether a value exists
File 3 -> my XSLT file, which should do all of this magic

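<xsl:variable name="file2root" select="doc('file2')/root"/>
<xsl:key name="k" match="references" use="."/>

then

<xsl:template match="meta">
  <!-- the third argument makes key() search File 2 instead of the source -->
  <xsl:if test="empty(key('k', @content, $file2root))">
    <xsl:message>error</xsl:message>
  </xsl:if>
</xsl:template>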
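
A minimal sketch of where those lines could sit, assuming an XSLT 2.0 
processor and assuming ref.xml (File 2) is in the same directory as the 
stylesheet. xsl:key and the global variable are top-level declarations, so 
they go directly under xsl:stylesheet rather than inside a template. Limiting 
the check to dc.subject, and the wording of the message, are illustrative 
choices only.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xhtml="http://www.w3.org/1999/xhtml"
    exclude-result-prefixes="xhtml">

  <!-- Top-level declarations: both must be direct children of
       xsl:stylesheet. The relative URI is resolved against the
       stylesheet's own location (assumption: ref.xml sits beside it). -->
  <xsl:variable name="file2root" select="doc('ref.xml')/root"/>
  <xsl:key name="k" match="references" use="."/>

  <!-- The source elements are in the XHTML namespace, so the match
       pattern needs the xhtml: prefix. -->
  <xsl:template match="xhtml:meta[@name = 'dc.subject']">
    <!-- Three-argument key(): look the value up in File 2, not in
         the document that contains the current meta element. -->
    <xsl:if test="empty(key('k', @content, $file2root))">
      <xsl:message>
        <xsl:text>No reference found for: </xsl:text>
        <xsl:value-of select="@content"/>
      </xsl:message>
    </xsl:if>
  </xsl:template>

  <!-- Suppress the built-in copying of text so only messages appear. -->
  <xsl:template match="text()"/>

</xsl:stylesheet>
```

With File 1 and File 2 as posted, the dc.subject value "Contact Transact" is 
not among the references, so the message should be emitted for that element.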




File 1: Source File (contact.htm)
==========================================
 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
<head>
<meta name="generator" content="HTML Tidy for Windows (vers 25 January 2008), 
see www.w3.org" /><!-- CLF 2.0 TEMPLATE VERSION 1.04 | VERSION 1.04 DU GABARIT 
NSI 2.0 -->
<!-- HEADER BEGINS | DEBUT DE L'EN-TETE -->
<!-- TITLE BEGINS | DEBUT DU TITRE -->
<title>Contact Transact - Transport Canada</title>
<!-- TITLE ENDS | FIN DU TITRE -->
<!-- METADATA BEGINS | DEBUT DES METADONNEES -->
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<link rel="schema.dc" href="http://purl.org/dc/elements/1.1/" />
<link rel="schema.dcterms" href="http://purl.org/dc/terms/" />
<meta name="dc.description" content="Contact Transact" />
<meta name="description" content="Corporate Services (CS) is part of the 
Department's Administration Business Line, along with Communications and 
General Counsel. CS's role is to provide efficient and effective support 
services and functional expertise - in finance, administration, technology and 
information management, human resources and executive services - that respond 
to departmental needs." />
<meta name="keywords" content="corporate, services, corporate services, audit 
reports, audit, CS, departement, support services, functional expertise, 
finance and administration, technology and information management, human 
resources, executive services, acts and regulations, Canada's transportation 
system, departmental audit reports, evaluation services,departmental business, 
departmental business, reporting tools required by Parliament, Transport Canada 
library, regional offices, Corporate Services programs, policies and standards" 
/>
<meta name="dc.creator" content="Government of Canada - Transport Canada - 
Transact" />
<meta name="dc.title" content="Contact Transact" />
<meta name="dcterms.issued" scheme="W3CDTF" content="2009-01-06" />
<meta name="dcterms.modified" scheme="W3CDTF" content="2009-01-06" />
<meta name="dc.subject" scheme="gccore" content="Contact Transact" />
<meta name="dc.language" scheme="ISO639-2/T" content="eng" /><!-- NAVIGATION 
METADATA BEGINS | DEBUT DES METADONNEES DE NAVIGATION -->
<!-- #include 
virtual="/CLF-NSI/includes-final/en/ia/organization/what_we_do/org_chart/breadcrumb.inc"
 -->
<meta name="gc.tc.navigation.echelon.4" content="Corporate Services; 
URL=/corporate-services/menu.htm" />
<meta name="gc.tc.navigation.echelon.5" content="Contact Transact; URL=[NONE]" 
/><!-- NAVIGATION METADATA ENDS | FIN DES METADONNEES DE NAVIGATION -->
<!-- METADATA ENDS | FIN DES METADONNEES -->
<!-- ALTERNATIVE LANGUAGE LINK | LIEN ALTERNATIVE -->
<link rel="alternate" type="text/html" hreflang="fr" 
href="/services-generaux/transact/contactez.htm" title="Contactez les Services 
généraux" /><!--#include virtual="/CLF-NSI/v2-1_04/includes/2css.inc" -->
<!-- PROGRESSIVE ENHANCEMENT BEGINS | DEBUT DE L'AMELIORATION PROGRESSIVE -->

<script src="/CLF-NSI/v2-1_04/scripts/pe-ap.js" type="text/javascript">
</script>
<script type="text/javascript">
                /* <![CDATA[ */
                var params = {
                        lng:"eng",
                        pngfix:"/CLF-NSI/v2-1_04/images/inv.gif"
                };
                PE.progress(params);
                /* ]]> */
</script><!-- PROGRESSIVE ENHANCEMENT ENDS | FIN DE L'AMELIORATION PROGRESSIVE 
-->
</head>
<body>
<!--#include virtual="/CLF-NSI/v2-1_04/includes/header-eng.inc" -->
<div class="colLayout"><!-- THREE COLUMN LAYOUT BEGINS | DEBUT DE LA MISE EN 
PAGE DE TROIS COLONNES -->
<!-- LEFT SIDE MENU BEGINS | DEBUT DU MENU LATERAL GAUCHE -->
<div class="left"><!-- SIDE MENU TITLE BEGINS | DEBUT DU TITRE DU MENU LATERAL 
-->
<h1 class="navaid"><a name="il" id="il">Institutional links</a></h1>
<!-- SIDE MENU TITLE ENDS | FIN DU TITRE DU MENU LATERAL -->
<!-- #include 
virtual="/CLF-NSI/includes-final/en/ia/organization/what_we_do/nav.inc" 
--></div>
<!-- LEFT SIDE MENU ENDS | FIN DU MENU LATERAL GAUCHE -->
<!-- CONTENT BEGINS | DEBUT DU CONTENU -->
<div class="center"><!-- Optional Navigation Menu -->
<!-- #include virtual="/corporate-services/includes/cs-right.inc" -->
<!-- End of Optional Navigation Menu -->
<h1 class="flexible"><a name="cont" id="cont"><!-- CONTENT TITLE BEGINS | DEBUT 
DU TITRE DU CONTENU -->
 Contact Transact <!-- CONTENT TITLE ENDS | FIN DU TITRE DU CONTENU --></a></h1>
<p>Your comments are important to us and we will address them as quickly as 
possible.</p>
<p>If you cannot find the answer to your question on any of the pages referred 
to above, please fill in the following <a href="#form">form</a> or contact us 
at:</p>
<ul>
<li>Email: <strong>webfeedback(_at_)tc(_dot_)gc(_dot_)ca</strong></li>
<li>Phone: <strong>1-866-949-2262</strong></li>
<li>TTY:<strong>1-888-675-6863</strong></li>
<li>Fax: <strong>613-954-4731 / 613-998-8620</strong></li>
<li>Mailing Address:<br />
<strong>Transport Canada<br />
330 Sparks Street<br />
Ottawa, ON<br />
K1A 0N5</strong></li>
</ul>
<p>When contacting us by phone, please have the following information ready so 
that a Transport Canada representative can assist you more efficiently:</p>
<ul>
<li>user name;</li>
<li>customer account number; and,</li>
<li>brief description of query.</li>
</ul>
<p>When you are commenting on a specific page, please include the URL (Web 
address).</p>
<!-- adds the privacy disclaimer and personal information bank number -->
<!-- #include virtual="/includes/en/pib_privacy_PPU-079_e.inc" -->
<p><a name="form" id="form"></a></p>
<form method="post" action="/CLF-NSI/v2-1_04/feedback/feedback.asp">
<div><input type="hidden" value="e" name="x_lang" /> <input type="hidden" 
value="webfeedback(_at_)tc(_dot_)gc(_dot_)ca" name="x_mailto" /> <input 
type="hidden" value="Transact" name="x_subject" /> <input type="hidden" 
value="/corporate-services/transact/confirm/menu.htm" name="x_acknowledge" 
/></div>
<div class="fc-tbx"><label for="Comments">Comments and Questions:</label><br />
<textarea name="Comments" id="Comments" rows="11" cols="50">
</textarea></div>
<div class="fc-tbx"><label for="Name">Name:</label> (Optional)<br />
<input type="text" size="41" maxlength="400" name="x_name" id="Name" /></div>
<div class="fc-tbx"><label for="Title">Title:</label> (Optional)<br />
<input type="text" size="41" maxlength="400" name="x_title" id="Title" /></div>
<div class="fc-tbx"><label for="Organization">Organization:</label> 
(Optional)<br />
<input type="text" size="41" maxlength="400" name="Organization" 
id="Organization" /></div>
<div class="fc-tbx"><label for="email">E-mail address:</label> (Optional)<br />
<input type="text" size="41" maxlength="400" name="x_email" id="email" /></div>
<div class="fc-tbx"><label for="Telephone">Telephone:</label> (Optional)<br />
<input type="text" size="41" maxlength="400" name="Telephone" id="Telephone" 
/></div>
<div class="fc-tbx"><label for="prov"><strong>You live in: 
(Optional)</strong></label><br />
<select name="Province or territory" id="prov">
<optgroup label="Province or territory">
<option label="Alberta" value="Alberta">Alberta</option>
<option label="British Columbia" value="British Columbia">British 
Columbia</option>
<option label="Manitoba" value="Manitoba ">Manitoba</option>
<option label="New Brunswick" value="New Brunswick">New Brunswick</option>
<option label="Newfoundland and Labrador" value="Newfoundland and Labrador 
">Newfoundland and Labrador</option>
<option label="Northwest Territories" value="Northwest Territories">Northwest 
Territories</option>
<option label="Nova Scotia" value="Nova Scotia">Nova Scotia</option>
<option label="Nunavut" value="Nunavut">Nunavut</option>
<option label="Ontario" value="Ontario">Ontario</option>
<option label="Prince Edward Island" value="Prince Edward Island">Prince Edward 
Island</option>
<option label="Quebec" value="Quebec">Quebec</option>
<option label="Saskatchewan" value="Saskatchewan">Saskatchewan</option>
<option label="Yukon Territory" value="Yukon Territory">Yukon Territory</option>
</optgroup>
</select><br />
<br />
<div class="fc-tbx"><input type="submit" value="Submit" /> <input type="reset" 
value="Clear" /></div>
</div>
</form>
</div>
<!-- CONTENT ENDS | FIN DU CONTENU -->
<!-- THREE COLUMN LAYOUT ENDS | FIN DE LA MISE EN PAGE DE TROIS COLONNES -->
<!-- FOOTER BEGINS | DEBUT DU PIED DE LA PAGE -->
<!-- #include virtual="/CLF-NSI/v2-1_04/includes/en/footerbottom.inc" -->
<!-- FOOTER ENDS | FIN DU PIED DE LA PAGE --></div>
</body>
</html>
==================================================
File 2: Validation file (ref.xml)
 <root>
        <references>Corporate Services</references>
        <references>airport Services</references>
        <references>train Services</references>
        <references>Naval Services</references>
</root>
=================================================
File 3: My XSL file (myXslt.xsl)
<?xml version="1.0" encoding="iso-8859-1"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" 
xmlns:xhtml="http://www.w3.org/1999/xhtml" exclude-result-prefixes="xhtml" >
<xsl:output method="xml" indent="yes" encoding="windows-1252"/>
<xsl:strip-space elements="*"/>

<xsl:template match="xhtml:html">

<root>
<filename>
        <xsl:value-of select="base-uri()"></xsl:value-of>
</filename>     
        <matchingfile>
                <xsl:apply-templates select="xhtml:head/xhtml:link[3]/@href"/>
        </matchingfile>
<title> 
         <xsl:apply-templates select="xhtml:head//xhtml:title"/>
</title>
        <metadata> 
                <xsl:apply-templates select="//xhtml:meta"/>
        </metadata> 
</root>
</xsl:template>
<!--*****************************************-->
<xsl:template match="xhtml:head/xhtml:link[3]/@href">
                <xsl:value-of select="."/>      
</xsl:template>
<!--****************** LINK ***********************-->
<xsl:template match="xhtml:head//xhtml:link">
        <xsl:copy-of select="."/>
</xsl:template>
<!--***************** META ************************-->
 <xsl:template match="//xhtml:meta">
<!--meta name="dc.subject" scheme="gccore" content="Corporate Services" /-->
<xsl:if test ="@name">
<meta>  
        <name>
          <xsl:value-of select="@name" />
        </name>

    <xsl:if test="@scheme">
                <scheme>
                        <xsl:if test="@name = 'dc.subject'" >
                                <xsl:if test="@scheme = 'gccore'">
                                        Validate gccore look at 
validation_gccore.xml
                                </xsl:if>
                                <xsl:if test="@scheme='gctct'">
                                        Validate gctct look at 
validation_gctct.xml
                                </xsl:if>
                        </xsl:if>
                        <xsl:value-of select="@scheme"/>
                </scheme>
    </xsl:if>
        
        <content>
<xsl:if test="lower-case(@name) != 'dc.description' and lower-case(@name) != 
'keywords' and lower-case(@name) != 'dc.date' ">
                <xsl:value-of select="@content" />
                        </xsl:if>       
                <!-- dc.description rules -->
                <!--2.dc.description   limit to 250 at word (ie if text is 255 
characters, and 250 falls on a letter, not eol, chop back to the last space)-->
                        <xsl:if test="lower-case(@name) = 'dc.description' ">
                                <xsl:choose>    
                                        <xsl:when test="string-length(@content) 
&gt;= 250 ">
                        <xsl:value-of select="substring(@content,1,250)"/>
                                        </xsl:when>
                                        <xsl:otherwise><xsl:value-of 
select="@content" /> </xsl:otherwise>      
                                </xsl:choose>
                        </xsl:if>       

        <!-- keywords rules -->
    <!--3.meta name="keywords" content=Insérer les mots-clés en français | 
Insert the French keywords   Remove it -->
                        <xsl:if test="lower-case(@name) = 'keywords' ">
                                <xsl:variable name="theString">
                                                        <xsl:text>Insert the 
French keywords</xsl:text>
                                </xsl:variable>
                                <xsl:variable name="stringKeyword">
                                                <xsl:value-of 
select="normalize-space(@content)"></xsl:value-of> 
                                </xsl:variable> 
                <!-- compare to see if it match, if so remove(render empty 
value) -->           
                                <xsl:choose>
                                        <xsl:when 
test="ends-with($stringKeyword,$theString)"> 
                                                <xsl:text></xsl:text>
                                        </xsl:when>
                                        <xsl:when 
test="starts-with($stringKeyword,$theString)"> 
                                                <xsl:text></xsl:text>
                                        </xsl:when>
                <!--Not a match so render the attribute content -->     
                                        <xsl:otherwise><xsl:value-of 
select="@content" /> </xsl:otherwise>      
                                </xsl:choose>
                        </xsl:if>       
                <!-- end of keywords validation -->     
        
                <!-- dc.date rules -->
                <!-- Validation of the Date --> 
                        <xsl:if test="lower-case(@name)='dc.date'">
                                <xsl:variable name="theDate">
                                                <xsl:value-of 
select="replace(@content,' ','')"></xsl:value-of>
                                </xsl:variable>
                                <xsl:choose>
                                        <xsl:when 
test="matches($theDate,'^[1-2][0-9]+-[0-9][0-9]+-[0-9][0-9]+$')">
                                                <xsl:value-of 
select="$theDate"></xsl:value-of>
                                        </xsl:when>
                                <xsl:otherwise></xsl:otherwise><!-- Bad Date -->
                                </xsl:choose>   
                        </xsl:if>
        
        </content>
</meta>
</xsl:if>
</xsl:template>
  <!--*****************************************-->
  <xsl:template match="xhtml:html//xhtml:head//xhtml:title">
          <xsl:value-of select="." /> 
  </xsl:template>
<!--*****************************************-->
</xsl:stylesheet>
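
The dc.description rule in the comments above asks for a cut at 250 characters 
that backs up to the last space, whereas substring(@content,1,250) cuts 
mid-word. A possible sketch of that rule as an XSLT 2.0 function follows. The 
function name my:truncate-at-word and the namespace urn:example:local-functions 
are hypothetical, invented for illustration, and the function assumes the 
content attribute is present.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:my="urn:example:local-functions"
    exclude-result-prefixes="xs my">

  <!-- Hypothetical helper: return $text cut to at most $limit
       characters, backing up to the previous whitespace when the
       cut would land inside a word. -->
  <xsl:function name="my:truncate-at-word" as="xs:string">
    <xsl:param name="text" as="xs:string"/>
    <xsl:param name="limit" as="xs:integer"/>
    <!-- Take one extra character so a word that ends exactly at the
         limit is kept whole; the regex then drops the trailing
         whitespace plus any partial word after it. The outer
         substring covers text with no whitespace at all. -->
    <xsl:sequence select="
      if (string-length($text) le $limit)
      then $text
      else substring(
             replace(substring($text, 1, $limit + 1), '\s+\S*$', ''),
             1, $limit)"/>
  </xsl:function>

</xsl:stylesheet>
```

Inside the meta template, the call could then replace the plain substring, for 
example: <xsl:value-of select="my:truncate-at-word(@content, 250)"/>, with the 
function and its namespace declaration copied into myXslt.xsl.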


======================================
Michel Knight
CGI
275 Slater Street 
16th Floor
Ottawa, Ontario K1P 5H9
www.cgi.com



--~------------------------------------------------------------------
XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: <mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--
