Cannot retrieve data with "Enrich Data by Webservice"

rachel_lomasky New Altair Community Member
edited November 2024 in Community Q&A

Hi,

 

I've downloaded the Web Mining extension and would like to use it to connect to a Google-provided webservice. I've constructed a GET URL, and it works fine when I paste it into a browser (a bunch of JSON is returned). However, when I run it with "Enrich Data by Webservice", I get:

Dec 3, 2016 10:31:57 AM SEVERE: Process failed: Cannot retrieve data from the specified URL 'https://www.googleapis.com/analytics/v3/data/ga'.
Dec 3, 2016 10:31:57 AM SEVERE: Here:
Dec 3, 2016 10:31:57 AM SEVERE: Process[1] (Process)
Dec 3, 2016 10:31:57 AM SEVERE: subprocess 'Main Process'
Dec 3, 2016 10:31:57 AM SEVERE: +- Retrieve questions[1] (Retrieve)
Dec 3, 2016 10:31:57 AM SEVERE: ==> +- Enrich Data by Webservice[1] (Enrich Data by Webservice)

Two questions:

1. Why doesn't it work?

2. Is there a way that I can see the query string to do debugging?

 

Thank you,

Rachel

Best Answer

  • sgenzer
    Altair Employee
    Answer ✓

    here's a sample process (it's using RM 7.3):

     

    <?xml version="1.0" encoding="UTF-8"?><process version="7.3.000">
    <context>
    <input/>
    <output/>
    <macros/>
    </context>
    <operator activated="true" class="process" compatibility="7.3.000" expanded="true" name="Process">
    <process expanded="true">
    <operator activated="true" class="generate_data_user_specification" compatibility="7.3.000" expanded="true" height="68" name="Generate Data by User Specification" width="90" x="45" y="34">
    <list key="attribute_values">
    <parameter key="foo" value="0"/>
    </list>
    <list key="set_additional_roles"/>
    </operator>
    <operator activated="true" class="web:enrich_data_by_webservice" compatibility="7.3.000" expanded="true" height="68" name="Enrich Data by Webservice" width="90" x="179" y="34">
    <parameter key="query_type" value="Regular Expression"/>
    <list key="string_machting_queries"/>
    <list key="regular_expression_queries">
    <parameter key="foo2" value=".*"/>
    </list>
    <list key="regular_region_queries"/>
    <list key="xpath_queries"/>
    <list key="namespaces"/>
    <list key="index_queries"/>
    <list key="jsonpath_queries"/>
    <parameter key="url" value="https://www.googleapis.com/analytics/v3/data/ga?ids=ga:XXXXX&amp;start-date=30daysAgo&amp;end-date=yesterday&amp;metrics=ga:sessions&amp;access_token=XXXXXX"/>
    <list key="request_properties"/>
    </operator>
    <connect from_op="Generate Data by User Specification" from_port="output" to_op="Enrich Data by Webservice" to_port="Example Set"/>
    <connect from_op="Enrich Data by Webservice" from_port="ExampleSet" to_port="result 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="sink_result 1" spacing="0"/>
    <portSpacing port="sink_result 2" spacing="0"/>
    </process>
    </operator>
    </process> 

    I just tested this with my own Google API account and it works.

     

    Scott 
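
If you paste a URL into the process XML by hand, each & in the query string has to appear as &amp; exactly once; a doubly escaped &amp;amp; would reach the server as a literal "&amp;". A minimal sketch of producing the escaped attribute value, assuming Python is at hand (the placeholder IDs and token are the same ones used in the sample, not real values):

```python
from xml.sax.saxutils import quoteattr, unescape

# Raw URL exactly as it would be pasted into a browser (placeholder values).
url = (
    "https://www.googleapis.com/analytics/v3/data/ga"
    "?ids=ga:XXXXX&start-date=30daysAgo&end-date=yesterday"
    "&metrics=ga:sessions&access_token=XXXXXX"
)

# quoteattr escapes &, < and > once and wraps the result in quotes,
# so it can be dropped straight into an XML attribute.
attr = quoteattr(url)
line = "<parameter key=\"url\" value=%s/>" % attr
print(line)

# Round trip: unescaping the attribute body gives back the raw URL.
assert unescape(attr[1:-1]) == url
```

This only covers XML escaping; whether characters such as the colon in ga: also need percent-encoding is a separate question.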

Answers

  • sgenzer
    Altair Employee

    Hi, I use the Google API all the time with this operator, and it is quite tricky to get all the settings right. First guess: did you URL-encode your URL? Can you share your parameter settings (without your key, of course)?

    The answer to your second question is no, RM does not give you the same verbose output as you would get with the terminal.  Sometimes when I can't get it right, I do a cURL at the command line, get that to work, and then go back to RM.  

    Scott
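
Since RapidMiner does not show the final query string, one workaround is to build and print the encoded URL outside RapidMiner and compare it with what you typed into the operator. A rough sketch, assuming Python is available; `build_url` is a hypothetical helper, and the parameter values are placeholders:

```python
from urllib.parse import urlencode

BASE_URL = "https://www.googleapis.com/analytics/v3/data/ga"


def build_url(base, params):
    """Return base plus a percent-encoded query string."""
    return base + "?" + urlencode(params)


# Placeholder values; substitute your own view ID and access token.
params = {
    "ids": "ga:XXXXX",
    "start-date": "30daysAgo",
    "end-date": "yesterday",
    "metrics": "ga:sessions",
    "access_token": "XXXXXX",
}

url = build_url(BASE_URL, params)
print(url)  # the colon in "ga:" comes out as %3A
```

The printed URL can be pasted into a browser or a curl call to confirm it works before it goes into the operator's url parameter.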

  • rachel_lomasky
    rachel_lomasky New Altair Community Member

    <?xml version="1.0" encoding="UTF-8"?><process version="7.2.003">
    <operator activated="true" class="retrieve" compatibility="7.2.003" expanded="true" height="68" name="Retrieve questions" width="90" x="45" y="85">
    <parameter key="repository_entry" value="../../data/import/questions"/>
    </operator>
    </process>
    <?xml version="1.0" encoding="UTF-8"?><process version="7.2.003">
    <operator activated="true" class="web:enrich_data_by_webservice" compatibility="7.2.001" expanded="true" height="68" name="Enrich Data by Webservice" width="90" x="246" y="85">
    <parameter key="query_type" value="Regular Expression"/>
    <list key="string_machting_queries"/>
    <parameter key="attribute_type" value="Nominal"/>
    <list key="regular_expression_queries"/>
    <list key="regular_region_queries"/>
    <list key="xpath_queries"/>
    <list key="namespaces"/>
    <parameter key="ignore_CDATA" value="true"/>
    <parameter key="assume_html" value="true"/>
    <list key="index_queries"/>
    <list key="jsonpath_queries"/>
    <parameter key="request_method" value="GET"/>
    <parameter key="service_method" value="reportRequests"/>
    <parameter key="url" value="https://www.googleapis.com/analytics/v3/data/ga"/>
    <parameter key="delay" value="0"/>
    <list key="request_properties">
    <parameter key="ids" value="ga:myids"/>
    <parameter key="start-date" value="30daysAgo"/>
    <parameter key="end-date" value="yesterday"/>
    <parameter key="metrics" value="ga:sessions"/>
    <parameter key="access_token" value="my access token"/>
    </list>
    <parameter key="encoding" value="SYSTEM"/>
    </operator>
    </process>

  • sgenzer
    Altair Employee

    hi ok thanks.  It was hard to figure out that XML (it's from ver 7.2 and there's some strange cut and paste there) but I think I know what you're doing.  I have not used Google Analytics API before but for a GET request, I would first try putting all the parameters in the URL, rather than in "request properties".  Don't ask me why this makes a difference, but in my experience, it does.  Try something like this in the URL:

     

    https://www.googleapis.com/analytics/v3/data/ga?ids=ga%3A<your number here>&start-date=30daysAgo&end-date=yesterday&metrics=ga%3Asessions&access_token=<your access token>

     

    I also don't see anything in your String Matching (called "Machting" in the XML!) query, so you'll need to tell RapidMiner what you want to do with the response. I would recommend just doing Regular Expression and using .* for now, just to ensure you're getting a response.

     

    Scott

     

  • rachel_lomasky
    rachel_lomasky New Altair Community Member

    Thank you, this works.  Now to figure out how to parse the response...

  • sgenzer
    Altair Employee

    <grin> should not be too bad.  There are a variety of tools to use.  Post if you need more help.

     

    Scott


  • rachel_lomasky
    rachel_lomasky New Altair Community Member

    It ain't pretty, but I got it working :).

  • khairulnizam
    khairulnizam New Altair Community Member

    Hi, I have the same problem with "Enrich Data by Webservice". I already tried the same parameters with curl, and it works. Here is my process:

     

    <?xml version="1.0" encoding="UTF-8"?><process version="7.4.000">
    <context>
    <input/>
    <output/>
    <macros/>
    </context>
    <operator activated="true" class="process" compatibility="7.4.000" expanded="true" name="Process">
    <parameter key="logverbosity" value="init"/>
    <parameter key="random_seed" value="2001"/>
    <parameter key="send_mail" value="never"/>
    <parameter key="notification_email" value=""/>
    <parameter key="process_duration_for_mail" value="30"/>
    <parameter key="encoding" value="SYSTEM"/>
    <process expanded="true">
    <operator activated="true" class="text:create_document" compatibility="7.4.001" expanded="true" height="68" name="Create Document" width="90" x="45" y="136">
    <parameter key="text" value="I love hotdogs. Hotdogs are the greatest. They are hot and delicious."/>
    <parameter key="add label" value="false"/>
    <parameter key="label_type" value="nominal"/>
    </operator>
    <operator activated="true" class="text:documents_to_data" compatibility="7.4.001" expanded="true" height="82" name="Documents to Data" width="90" x="179" y="136">
    <parameter key="text_attribute" value="text"/>
    <parameter key="add_meta_information" value="true"/>
    <parameter key="datamanagement" value="double_sparse_array"/>
    </operator>
    <operator activated="true" class="web:enrich_data_by_webservice" compatibility="7.3.000" expanded="true" height="68" name="Enrich Data by Webservice" width="90" x="313" y="136">
    <parameter key="query_type" value="Regular Expression"/>
    <list key="string_machting_queries"/>
    <parameter key="attribute_type" value="Nominal"/>
    <list key="regular_expression_queries">
    <parameter key="all" value=".*"/>
    </list>
    <list key="regular_region_queries"/>
    <list key="xpath_queries"/>
    <list key="namespaces"/>
    <parameter key="ignore_CDATA" value="true"/>
    <parameter key="assume_html" value="true"/>
    <list key="index_queries"/>
    <list key="jsonpath_queries"/>
    <parameter key="request_method" value="POST"/>
    <parameter key="body" value="text=&lt;%text%&gt;"/>
    <parameter key="url" value="https://twinword-sentiment-analysis.p.mashape.com/analyze/"/>
    <parameter key="delay" value="0"/>
    <list key="request_properties">
    <parameter key="X-Mashape-Key" value="QhBpo6d9YgmsherFsSBVfycN0czjp1rf0HIjsnooes2EdNYmao"/>
    <parameter key="Content-Type" value="application/x-www-form-urlencoded"/>
    <parameter key="Accept" value="application/json"/>
    </list>
    <parameter key="encoding" value="SYSTEM"/>
    </operator>
    <connect from_op="Create Document" from_port="output" to_op="Documents to Data" to_port="documents 1"/>
    <connect from_op="Documents to Data" from_port="example set" to_op="Enrich Data by Webservice" to_port="Example Set"/>
    <connect from_op="Enrich Data by Webservice" from_port="ExampleSet" to_port="result 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="sink_result 1" spacing="0"/>
    <portSpacing port="sink_result 2" spacing="0"/>
    </process>
    </operator>
    </process>

  • Thomas_Ott
    Thomas_Ott New Altair Community Member

    I think there's a problem with your API key. I tried your XML code and got a JSON response that says:

    {"message":"Missing Mashape application key. Go to http:\/\/docs.mashape.com\/api-keys to learn how to get your API application key."}
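
One way to check outside RapidMiner which headers and body such a POST would carry is to build the request without sending it. This is only a sketch for comparison and says nothing about what the operator itself sends; `build_request` is a hypothetical helper and "YOUR_KEY" is a placeholder:

```python
from urllib.parse import urlencode
from urllib.request import Request

URL = "https://twinword-sentiment-analysis.p.mashape.com/analyze/"


def build_request(url, api_key, text):
    """Build (but do not send) a form-encoded POST with the key header."""
    body = urlencode({"text": text}).encode("ascii")
    headers = {
        "X-Mashape-Key": api_key,
        "Content-Type": "application/x-www-form-urlencoded",
        "Accept": "application/json",
    }
    return Request(url, data=body, headers=headers, method="POST")


req = build_request(URL, "YOUR_KEY", "I love hotdogs.")

# Inspect what would go over the wire. urllib normalizes the
# capitalization of header names; HTTP treats them case-insensitively.
print(req.get_method(), req.full_url)
for name, value in req.header_items():
    print(f"{name}: {value}")
print(req.data.decode("ascii"))
```

If the key shows up here but the service still reports it as missing when called from RapidMiner, the request_properties entries in the operator are the next place to look.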

      

  • rachel_lomasky
    rachel_lomasky New Altair Community Member

    My problem was that I was quoting parameters. Everything should be non-quoted.

  • kludikovsky
    kludikovsky New Altair Community Member