
[SOLVED] Help with xml, xpath, namespaces.

User: "cindyharper"
New Altair Community Member
Updated by Jocelyn
Below is sample XML from GoogleCSE API:

<?xml version="1.0" encoding="UTF-8"?>
<feed gd:kind="customsearch#search" xmlns="http://www.w3.org/2005/Atom" xmlns:cse="http://schemas.google.com/cseapi/2010" xmlns:gd="http://schemas.google.com/g/2005" xmlns:opensearch="http://a9.com/-/spec/opensearch/1.1/">
<title>Google Custom Search -  Albertus Magnus College.  library  Albertus Magnus College Library intitle:newsletter albertus.edu</title>
<id>tag:www.googleapis.com,2010-09-29:/customsearch/v1?q= Albertus Magnus College.  library  Albertus Magnus College Library intitle:newsletter albertus.edu&amp;cx=008033228147187897025:-ua_scxr1uc&amp;num=7&amp;start=1&amp;safe=off</id>
<author>
 <name>Library Website Search Engine - Google Custom Search</name>
</author>
<updated>1970-01-16T11:10:30.455Z</updated>
<opensearch:Url type="application/atom+xml" template="https://www.googleapis.com/customsearch/v1?q={searchTerms}&amp;num={count?}&amp;start={startIndex?}&amp;lr={language?}&amp;safe={cse:safe?}&amp;cx={cse:cx?}&amp;cref={cse:cref?}&amp;sort={cse:sort?}&amp;filter={cse:filter?}&amp;gl={cse:gl?}&amp;cr={cse:cr?}}&amp;googlehost={cse:googleHost?}&amp;c2coff={?cse:disableCnTwTranslation}&amp;hq={cse:hq?}&amp;hl={cse:hl?}&amp;siteSearch={cse:siteSearch?}&amp;siteSearchFilter={cse:siteSearchFilter?}&amp;exactTerms={cse:exactTerms?}&amp;excludeTerms={cse:excludeTerms?}&amp;linkSite={cse:linkSite?}&amp;orTerms={cse:orTerms?}&amp;relatedSite={cse:relatedSite?}&amp;dateRestrict={cse:dateRestrict?}&amp;lowRange={cse:lowRange?}&amp;highRange={cse:highRange?}&amp;searchType={cse:searchType?}&amp;fileType={cse:fileType?}&amp;rights={cse:rights?}&amp;imgsz={cse:imgsz?}&amp;imgtype={cse:imgtype?}&amp;imgc={cse:imgc?}&amp;imgcolor={cse:imgcolor?}&amp;alt=atom"/>
<opensearch:Query role="request" title="Google Custom Search -  Albertus Magnus College.  library  Albertus Magnus College Library intitle:newsletter albertus.edu" totalResults="7" searchTerms=" Albertus Magnus College.  library  Albertus Magnus College Library intitle:newsletter albertus.edu" count="7" startIndex="1" inputEncoding="utf8" outputEncoding="utf8" cse:safe="off" cse:cx="008033228147187897025:-ua_scxr1uc"/>
<opensearch:totalResults>7</opensearch:totalResults>
<opensearch:startIndex>1</opensearch:startIndex>
<cse:context title="Library Website Search Engine"/>
<cse:searchInformation>
 <cse:searchTime>0.073074</cse:searchTime>
 <cse:formattedSearchTime>0.07</cse:formattedSearchTime>
 <cse:totalResults>7</cse:totalResults>
 <cse:formattedTotalResults>7</cse:formattedTotalResults>
</cse:searchInformation>
<cse:spelling>
 <cse:correctedQuery type="html"/>
</cse:spelling>
<entry gd:kind="customsearch#result">
 <id>http://www.albertus.edu/policy-reports/advancement-publications/documents/albertus-archive-october-2011-special-edition.pdf</id>
 <updated>1970-01-16T11:10:30.455Z</updated>
 <title type="html">Special Edition Athletics @lbertus &lt;b&gt;Newsletter&lt;/b&gt;</title>
 <link href="http://www.albertus.edu/policy-reports/advancement-publications/documents/albertus-archive-october-2011-special-edition.pdf" title="www.albertus.edu"/>
 <summary type="html">This weekend marks a busy and historic time on campus for the &lt;b&gt;Albertus&lt;/b&gt;. &lt;br&gt;  &lt;b&gt;Magnus College&lt;/b&gt; Athletics Department as both the men&amp;#39;s and women&amp;#39;s soccer &lt;b&gt;...&lt;/b&gt;</summary>
 <cse:cacheId>AJGUZgC9CVMJ</cse:cacheId>
 <cse:mime>application/pdf</cse:mime>
 <cse:fileFormat>PDF/Adobe Acrobat</cse:fileFormat>
 <cse:formattedUrl type="html">www.&lt;b&gt;albertus.edu&lt;/b&gt;/.../&lt;b&gt;albertus&lt;/b&gt;-archive-october-2011-special-edition.pdf</cse:formattedUrl>
 <cse:PageMap>
  <cse:DataObject type="metatags">
   <cse:Attribute name="creationdate" value="D:20111118135759-05&apos;00&apos;"/>
   <cse:Attribute name="producer" value="Acrobat Web Capture 8.0"/>
   <cse:Attribute name="moddate" value="D:20111118140743-05&apos;00&apos;"/>
   <cse:Attribute name="title" value="Special Edition Athletics @lbertus Newsletter"/>
  </cse:DataObject>
 </cse:PageMap>
</entry>
...

</feed>


I'm using the Generate Extract operator. I've specified the namespaces as:
      <list key="namespaces">
         <parameter key="x" value="http://www.kbcafe.com/rss/atom.xsd.xml"/>
         <parameter key="xmlns:cse" value="http://schemas.google.com/cseapi/2010"/>
         <parameter key="xmlns:gd" value="http://schemas.google.com/g/2005"/>
         <parameter key="xmlns:opensearch" value="http://a9.com/-/spec/opensearch/1.1/"/>
         <parameter key="xx" value="xml"/>
       </list>

I've tried XPath expressions such as
//x:feed
//feed
and more specific ones, but I can't seem to match anything in this feed. I'm sure the problem is in my namespaces, but I don't know where to look for the answer.

The targets I want to extract are
//x:feed/x:entry/x:title
and //x:feed/x:entry/x:link/@href.


Any help would be appreciated.
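
For comparison outside RapidMiner, here is a minimal sketch using Python's standard xml.etree.ElementTree on a trimmed copy of the feed above. It assumes the prefix x has to be bound to the same URI the feed declares as its default namespace (xmlns="http://www.w3.org/2005/Atom"). The names FEED, ATOM_NS and extract_targets are made up for this illustration, and whether Generate Extract behaves the same way is an assumption, not something confirmed here.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample feed: same default namespace, one entry.
FEED = (
    '<feed xmlns="http://www.w3.org/2005/Atom" '
    'xmlns:cse="http://schemas.google.com/cseapi/2010">'
    '<entry>'
    '<title type="html">Special Edition Athletics @lbertus Newsletter</title>'
    '<link href="http://www.albertus.edu/policy-reports/advancement-publications/'
    'documents/albertus-archive-october-2011-special-edition.pdf" '
    'title="www.albertus.edu"/>'
    '</entry>'
    '</feed>'
)

# Hypothetical constant: the URI taken from xmlns="..." on <feed>.
ATOM_NS = "http://www.w3.org/2005/Atom"


def extract_targets(xml_text, x_uri):
    """Return (titles, hrefs) with the prefix x bound to x_uri."""
    ns = {"x": x_uri}
    root = ET.fromstring(xml_text)  # root is the <feed> element itself
    titles = [t.text for t in root.findall("x:entry/x:title", ns)]
    # ElementTree's XPath subset has no @href step, so the attribute
    # is read from each matched <link> element instead.
    hrefs = [link.get("href") for link in root.findall("x:entry/x:link", ns)]
    return titles, hrefs


if __name__ == "__main__":
    # Prefix bound to the feed's own default namespace URI.
    print(extract_targets(FEED, ATOM_NS))
    # Prefix bound to the URI from the namespace list in the post.
    print(extract_targets(FEED, "http://www.kbcafe.com/rss/atom.xsd.xml"))
```

In this sketch the prefix name is arbitrary; only the URI is compared. An unprefixed path such as entry/title looks for elements in no namespace, so it does not match elements that inherit the feed's default namespace.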
