Python modules not working on Linux server when they need to connect to the internet?

User: "kayman"
New Altair Community Member
Updated by Jocelyn

I've encountered a very strange and very annoying problem when trying to run some Python packages. All of them work on the local desktop, or when running the server process in local mode. But whenever I run the same process entirely on the server (Ubuntu 16.04), it fails with 'the script can not be parsed'.

 

On a Windows server setup they work fine, so my first guess was security settings. But running the same process on another Ubuntu test server, where I granted every permission, still gave problems, so I can probably rule that out.

 

Some packages work fine on the server; basically any standard Python command works, but it seems that as soon as an internet connection is required the script fails. I have two totally different scripts giving the same problem: one that I use to call the Microsoft translation APIs and another that I use to validate a language. As mentioned, they work fine on the desktop, under Windows server, and on the Linux servers outside of RapidMiner. So I'm really stuck, and this is a key aspect of our planned process.

 

I've added a simplified workflow with one sentence. The first part uses a Beautiful Soup Python script, which works fine. The second part uses langid.py to get the language. This fails, but only when executed on the server (Ubuntu).

 

I would strongly appreciate it if someone could take a look at this, as it is of extreme importance for us. We are going to make a big investment in RM, and translation to enable text mining is a huge part of the process flow. It all worked fine on a smaller Windows test server, but the final production server will be Linux.

 

<?xml version="1.0" encoding="UTF-8"?><process version="7.5.001">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="7.5.001" expanded="true" name="Process">
<process expanded="true">
<operator activated="true" class="generate_data_user_specification" compatibility="7.5.001" expanded="true" height="68" name="Generate Data by User Specification" width="90" x="179" y="187">
<list key="attribute_values">
<parameter key="data" value="&quot;Dit is een zin in het Nederlands&quot;"/>
</list>
<list key="set_additional_roles"/>
</operator>
<operator activated="true" class="nominal_to_text" compatibility="7.5.001" expanded="true" height="82" name="Nominal to Text" width="90" x="313" y="187"/>
<operator activated="true" class="python_scripting:execute_python" compatibility="7.4.000" expanded="true" height="82" name="simple py" width="90" x="447" y="187">
<parameter key="script" value="import pandas as pd&#10;from bs4 import BeautifulSoup&#10;&#10;def rm_main(data):&#10;&#9;langs=[]&#10;&#10;&#9;for index,row in data.iterrows():&#10;&#9;&#9;# we select the first interaction field to be translated, and strip eventual tags&#10;&#9;&#9;s=BeautifulSoup(row[&quot;data&quot;],&quot;lxml&quot;).get_text(&quot; \[-\] &quot;)&#10;&#9;&#9;langs.append(s)&#10;&#9;# and finally we add all the new data to the dataframe&#10;&#9;data['data']=langs&#10;&#10;&#9;return data&#10;"/>
<description align="center" color="transparent" colored="false" width="126">This works so python is installed correctly on server</description>
</operator>
<operator activated="true" class="python_scripting:execute_python" compatibility="7.4.000" expanded="true" height="82" name="get language" width="90" x="581" y="187">
<parameter key="script" value="import pandas as pd&#10;import langid&#10;&#10;def rm_main(data):&#10;&#9;langs=[]&#10;&#10;&#9;for index,row in data.iterrows():&#10;&#9;&#9;# we select the first interaction field to be translated, and strip eventual tags&#10;&#9;&#9;s=row[&quot;data&quot;]&#10;&#9;&#9;try:&#10;&#9;&#9;&#9;rl = langid.classify(s)[0]&#10;&#9;&#9;except:&#10;&#9;&#9;&#9;pass&#10;&#9;&#9;&#9;rl = &quot;undefined&quot;&#10;&#10;&#9;&#9;langs.append(rl)&#10;&#9;# and finally we add all the new data to the dataframe&#10;&#9;data['lang']=langs&#10;&#10;&#9;return data&#10;"/>
<description align="center" color="transparent" colored="false" width="126">This one fails. Using the same script in other programs, or from cmd line works fine, so the package is installed correctly. Also works fine on local machine</description>
</operator>
<operator activated="true" class="store" compatibility="7.5.001" expanded="true" height="68" name="Store" width="90" x="715" y="187">
<parameter key="repository_entry" value="result"/>
</operator>
<connect from_op="Generate Data by User Specification" from_port="output" to_op="Nominal to Text" to_port="example set input"/>
<connect from_op="Nominal to Text" from_port="example set output" to_op="simple py" to_port="input 1"/>
<connect from_op="simple py" from_port="output 1" to_op="get language" to_port="input 1"/>
<connect from_op="get language" from_port="output 1" to_op="Store" to_port="input"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
</process>
</operator>
</process>
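
The "get language" script above replaces every failure with "undefined", and a failing `import langid` stops the script before that handler is reached, so the real error never becomes visible. A minimal sketch of a variant that keeps the error text in an extra column could look like the following. The helper `classify_rows` and the column `lang_error` are hypothetical names introduced only for this illustration, and the sketch assumes `langid.classify` returns a `(language, score)` tuple, as it does in the original script.

```python
def classify_rows(texts, classify):
    """Classify each text; on failure keep the error text instead of hiding it."""
    langs, errors = [], []
    for text in texts:
        try:
            # classify is expected to return a (language, score) tuple
            langs.append(classify(text)[0])
            errors.append("")
        except Exception as exc:
            langs.append("undefined")
            errors.append("{}: {}".format(type(exc).__name__, exc))
    return langs, errors


def rm_main(data):
    try:
        # Imported inside the function so that an import failure on the
        # server ends up in the result instead of aborting the script.
        import langid
    except Exception as exc:
        data["lang"] = "undefined"
        data["lang_error"] = "{}: {}".format(type(exc).__name__, exc)
        return data

    data["lang"], data["lang_error"] = classify_rows(data["data"], langid.classify)
    return data
```

If `lang_error` reports a missing module on the server while the same script runs from the command line, the operator is probably being run by a different Python installation than the one the package was installed into.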

    User: "homburg"
    New Altair Community Member
    Accepted Answer

    Hmm, looks like the information in the referenced community post is wrong. Please always use the key value "rapidminer.python_scripting.path" - it is the same for all operating systems.
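
To check which interpreter the configured path actually resolves to, a small diagnostic script can be run in an Execute Python operator on the server and on the desktop, and the two results compared. This is only a sketch: it assumes Python 3.4 or later, and `describe_environment` is a hypothetical helper name; the module list is taken from the scripts in the question.

```python
import importlib.util
import sys


def describe_environment(modules=("pandas", "bs4", "langid")):
    """Return one dict per module saying whether this interpreter can find it."""
    rows = []
    for name in modules:
        try:
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            spec = None
        rows.append({
            "module": name,
            "found": spec is not None,
            # File the module would be loaded from; empty if not found
            "location": getattr(spec, "origin", None) or "",
            "python_executable": sys.executable,
            "python_version": sys.version.split()[0],
        })
    return rows


def rm_main(data=None):
    # pandas is imported here so the helper above stays standard-library only
    import pandas as pd
    return pd.DataFrame(describe_environment())
```

If `python_executable` on the server differs from the interpreter used on the command line, or `found` is false for `langid`, the package was most likely installed into a different Python than the one the operator runs.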