Altair RISE
A program to recognize and reward our most engaged community members
Nominate Yourself Now!
Home
Discussions
Community Q&A
JSON behind URL to Excel
gianluca_scheid
Dear Rapidminer Community
I have about 55 URLs each containing JSON data. How can I crawl the JSON data behind the URLs and combine/convert them to one .xlsx file?
Here are three URLs as a sample:
https://link.retailflux.com/flow/api/v2/json/FFE3234CDFF3919C5BBAFEAA63AA13E3/query/?store_alias=ochsnermultergasse&start_date=2019-10-07T08:00:00&end_date=2019-10-12T23:00:00&period=hourly
https://link.retailflux.com/flow/api/v2/json/FFE3234CDFF3919C5BBAFEAA63AA13E3/query/?store_alias=ochsnermultergasse&start_date=2019-10-14T08:00:00&end_date=2019-10-19T23:00:00&period=hourly
https://link.retailflux.com/flow/api/v2/json/FFE3234CDFF3919C5BBAFEAA63AA13E3/query/?store_alias=ochsnermultergasse&start_date=2019-10-21T08:00:00&end_date=2019-10-26T23:00:00&period=hourly
Thank you for your support
Best regards
GL
Find more posts tagged with
AI Studio
Web Mining
JSON
Excel
Accepted answers
lionelderkrikor
Hi Gianluca,
There is a third method which uses a Python script inside the RM process.
So you have first :
- to install the
Python Scripting
extension from the Marketplace.
- to install Python on your computer
More over you have to copy the link of your 55 URLs in the Excel file in attached file :
Here a screenshot of this XLS file with the 3 URLs you shared :
Secondly open the Python script and modify the 2 following variables :
-
path :
it is the absolute path where the Excel file containing your URL links is stored. Don't forget the quotes ('')
-
path_to_JSON
: it is the absolute path of the directory where all your 55 JSON files will be stored after extraction of the URL content. Don't forget the quotes ('')
Thirdly, set, in the
Write Excel
operator parameters, the filename and the directory where your final output excel file will be stored :
Finally execute the process. In addition of the generated excel output file, the data will be available as an example set inside RapidMiner.
Here a screenshot of the output excel file :
Hope this helps,
Regards,
Lionel
PS : the process is in attached file
list_URLs.xlsx
URLs_to_JSON_to_Excel.rmp
All comments
sgenzer
hi
@gianluca_scheid
I'm not sure what you mean by crawl the JSON data. If you want to convert it to a tabular format, you have a couple of choices...
- you can use the JSON to Data operator in the Text Processing extension. Unfortunately this will simply "flatten" your JSON into one huge row. You can of course use other operators or Turbo Prep to massage it further.
- you can use the much better JSON tools in the Web Automation extension. Note this is a paid extension sold by a third party (Old World Computing).
Scott
lionelderkrikor
Hi Gianluca,
There is a third method which uses a Python script inside the RM process.
So you have first :
- to install the
Python Scripting
extension from the Marketplace.
- to install Python on your computer
More over you have to copy the link of your 55 URLs in the Excel file in attached file :
Here a screenshot of this XLS file with the 3 URLs you shared :
Secondly open the Python script and modify the 2 following variables :
-
path :
it is the absolute path where the Excel file containing your URL links is stored. Don't forget the quotes ('')
-
path_to_JSON
: it is the absolute path of the directory where all your 55 JSON files will be stored after extraction of the URL content. Don't forget the quotes ('')
Thirdly, set, in the
Write Excel
operator parameters, the filename and the directory where your final output excel file will be stored :
Finally execute the process. In addition of the generated excel output file, the data will be available as an example set inside RapidMiner.
Here a screenshot of the output excel file :
Hope this helps,
Regards,
Lionel
PS : the process is in attached file
list_URLs.xlsx
URLs_to_JSON_to_Excel.rmp
Quick Links
All Categories
Recent Discussions
Activity
Unanswered
日本語 (Japanese)
한국어(Korean)
Groups