Syntax error: I am getting ? while extracting data using XPATH
Shahzad
Hi, I am trying to extract some data from the donedeal.ie website, but I am getting ? instead of values. I am not sure whether my syntax is correct.
I extracted the XPath using Google Chrome: right-click, inspect the element, and copy the XPath. For example, I extracted the following XPath:
/html/body/main/div/div[1]/div/div[2]/div[2]/div[3]/div[1]/div/div[1]/div/h1
I have tried putting h: before div and html, but it didn't help.
Can you please help?
Regards
/Shahzad
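A note on the h: prefix mentioned above: when a page is parsed as namespaced XHTML, a prefix usually has to appear on every step of the path, not only on html and div. The sketch below uses Python's standard library rather than RapidMiner, and the markup is a made-up stand-in for the real page, so treat it as an illustration of the namespace behaviour only.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified XHTML; not the actual donedeal.ie markup.
xhtml = """<html xmlns="http://www.w3.org/1999/xhtml">
  <body><main><div><h1>Sample advert title</h1></div></main></body>
</html>"""

root = ET.fromstring(xhtml)  # root is the <html> element
ns = {"h": "http://www.w3.org/1999/xhtml"}

# No prefixes: nothing matches, because every element lives in the XHTML namespace.
unprefixed = root.find("body/main/div/h1")

# A prefix on every step: the heading is found.
prefixed = root.find("h:body/h:main/h:div/h:h1", ns)

# A shorter relative path is less brittle than a long absolute one copied from the browser.
short = root.find(".//h:h1", ns)

print(unprefixed)
print(prefixed.text)
print(short.text)
```

If the value is not present in the fetched HTML at all (for example because the page fills it in from embedded JSON), no XPath will return it.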
sgenzer
Hi @Shahzad, can you please post your XML?
Scott
Shahzad
Hello Scott
The XML is pasted below. I have two processes, Adverts Process and Donedeal Process. In the Adverts process I am not able to fetch "Year"; all the other attributes are OK.
From the Donedeal process, I can't fetch any attribute from the web page. Any help would be appreciated.
Regards
/Shahzad
RapidMiner.txt
sgenzer
Hi @Shahzad, for some weird reason your .txt file has no <> symbols in it, so it is impossible to paste into RapidMiner. Can you please insert the XML into this thread by using the ¶ and then choosing "Code"?
Thank you.
Scott
Shahzad
Hello Scott
I tried to paste the code, but the web page would not let me post the comment. I have attached a file that includes the XML tags. I hope that helps.
Regards
/Shahzad
RapidMiner.txt
sgenzer
Hello @Shahzad, thank you for this. Some thoughts...
- For Adverts, if you want the year of the car, why not just create a new attribute that takes the prefix of your Vehicle Name or Description field, which already holds that information? As years are always at the beginning and four digits long, you could simply do this:
- For Donedeal, the issue is that your information is in JSON format, not XML. Just use the JSONPath option instead of XPath in your Extract Information operator:
If you're not familiar with JSONPath, this is always my go-to resource:
https://goessner.net/articles/JsonPath/
Scott
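For readers who have not used JSONPath before, here is a rough sketch of what a simple path such as `$.ad.header` resolves to. It is written in plain Python, not RapidMiner, and handles only child names and list indexes, which is a small subset of JSONPath. The payload and its field names (`ad`, `header`, `attributes`) are invented for illustration; the real donedeal.ie JSON layout is not shown in this thread.

```python
import json
import re


def json_path_get(document, path):
    """Resolve a simple path like $.a.b[0].c (child names and list indexes only)."""
    if not path.startswith("$"):
        raise ValueError("path must start with '$'")
    current = document
    # Each token is either .name or [index]; exactly one of the two groups is non-empty.
    for name, index in re.findall(r"\.([A-Za-z_]\w*)|\[(\d+)\]", path[1:]):
        current = current[name] if name else current[int(index)]
    return current


# Hypothetical payload, standing in for JSON embedded in the page.
raw = """
{"ad": {"header": "2015 Ford Focus",
        "attributes": [{"name": "year", "value": "2015"}]}}
"""
data = json.loads(raw)

header = json_path_get(data, "$.ad.header")
year = json_path_get(data, "$.ad.attributes[0].value")
print(header, year)
```

A full JSONPath implementation also supports wildcards, filters and recursive descent, which this sketch deliberately leaves out.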
kayman
http://jsonpath.com/ is an easy-to-use online tool for testing your JSONPath. Combined with Scott's link, it has already saved me a lot of time.
Shahzad
Thanks for the update, guys. In a few cases the year is not part of the Vehicle name, hence JSON won't work. I used the Cut operator to extract the year from the Vehicle name, but as mentioned, if the year is not in the vehicle title then I am back to square one.
I am not sure whether the website is badly designed or whether the information in the grid simply cannot be accessed via XPath.
Regards
/Shahzad
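One possible way around the missing-year case is to try the start of the vehicle name first and then fall back to any four-digit year in the name or the description, which Scott also mentioned as a source. This is a minimal sketch in plain Python, not a RapidMiner process; the function name `extract_year`, the sample strings and the assumption that years start with 19 or 20 are all illustrative.

```python
import re
from typing import Optional

# Assumption: model years look like 19xx or 20xx.
YEAR_AT_START = re.compile(r"^\s*((?:19|20)\d{2})\b")
YEAR_ANYWHERE = re.compile(r"\b((?:19|20)\d{2})\b")


def extract_year(name: str, description: str = "") -> Optional[int]:
    """Return the year from the name prefix, else the first year found anywhere, else None."""
    match = YEAR_AT_START.search(name)
    if match:
        return int(match.group(1))
    # Fallback: look for a standalone four-digit year in either field.
    for text in (name, description):
        match = YEAR_ANYWHERE.search(text)
        if match:
            return int(match.group(1))
    return None


print(extract_year("2015 Ford Focus"))
print(extract_year("Ford Focus", "Registered 2012, NCT until next year"))
print(extract_year("Ford Focus"))
```

The fallback can pick up other four-digit numbers that happen to look like a year, so rows filled this way are worth spot-checking.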