[SOLVED] Rapidminer 5.3 on Linux
Dear all,
Many thanks in advance for any hint as to what might be wrong.
I use a text mining process on Windows, consisting of
- Process documents from files
-- Tokenize
-- Filter Stopwords
-- Stem (Porter)
-- Filter Tokens by Length
-- Transform Cases
with no problems.
Now I tried to copy the process to Ubuntu 14.04
- either by copying the *.rmp file
- or by creating the process from scratch on Ubuntu
In my opinion both processes are identical; both report "No problems found".
When I start the process on Windows, I get results after about 2 minutes (Wordlist and Example Set).
When I start the process on Ubuntu, I get an empty result after 0 seconds (both the Wordlist and the Example Set are empty); the process stops without errors.
What might be going wrong?
Marco,
thanks, but this is not the problem.
What I did (System: Ubuntu 14.04 with Oracle Java 1.8.0_31):
- completely deleted RapidMiner, including .RapidMiner5 in $HOME
- downloaded RapidMiner 5.3.13 from Sourceforge
- unzipped it under my home directory
- set all dirs to 755, all files to 644
- copied 5 arbitrary English articles from the web and put them in a directory
- copied RapidMinerGUI from scripts to the installation dir
- started RapidMiner
- updated to 5.3.15
- installed extensions
- created a new process, consisting of the elements I already mentioned (XML see below)
- got no errors, but also no results after running the process; it finished within 0 s
The same process runs on Windows 7 without any problems.
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.3.015">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="5.3.015" expanded="true" name="Process">
<process expanded="true">
<operator activated="true" class="text:process_document_from_file" compatibility="5.3.002" expanded="true" height="76" name="Process Documents from Files" width="90" x="112" y="30">
<list key="text_directories">
<parameter key="Files" value="/home/XXX/Data/ParticleFoam/test.null"/>
</list>
<parameter key="prune_method" value="percentual"/>
<parameter key="prune_above_percent" value="95.0"/>
<process expanded="true">
<operator activated="true" class="text:tokenize" compatibility="5.3.002" expanded="true" height="60" name="Tokenize" width="90" x="45" y="30"/>
<operator activated="true" class="text:filter_stopwords_english" compatibility="5.3.002" expanded="true" height="60" name="Filter Stopwords (English)" width="90" x="45" y="120"/>
<operator activated="true" class="text:stem_porter" compatibility="5.3.002" expanded="true" height="60" name="Stem (Porter)" width="90" x="45" y="210"/>
<operator activated="true" class="text:filter_by_length" compatibility="5.3.002" expanded="true" height="60" name="Filter Tokens (by Length)" width="90" x="45" y="300">
<parameter key="min_chars" value="3"/>
<parameter key="max_chars" value="50"/>
</operator>
<operator activated="true" class="text:transform_cases" compatibility="5.3.002" expanded="true" height="60" name="Transform Cases" width="90" x="246" y="30"/>
<connect from_port="document" to_op="Tokenize" to_port="document"/>
<connect from_op="Tokenize" from_port="document" to_op="Filter Stopwords (English)" to_port="document"/>
<connect from_op="Filter Stopwords (English)" from_port="document" to_op="Stem (Porter)" to_port="document"/>
<connect from_op="Stem (Porter)" from_port="document" to_op="Filter Tokens (by Length)" to_port="document"/>
<connect from_op="Filter Tokens (by Length)" from_port="document" to_op="Transform Cases" to_port="document"/>
<connect from_op="Transform Cases" from_port="document" to_port="document 1"/>
<portSpacing port="source_document" spacing="0"/>
<portSpacing port="sink_document 1" spacing="0"/>
<portSpacing port="sink_document 2" spacing="0"/>
</process>
</operator>
<connect from_op="Process Documents from Files" from_port="example set" to_port="result 1"/>
<connect from_op="Process Documents from Files" from_port="word list" to_port="result 2"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
<portSpacing port="sink_result 3" spacing="0"/>
</process>
</operator>
</process>
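Since the run finishes within 0 s and both outputs are empty, one thing that may be worth ruling out is that the operator simply finds no files under the configured path on Linux. The following is only a minimal sketch, assuming the process has been exported to a file; the function name `check_text_directories` and the file name `process.xml` are hypothetical and not part of RapidMiner. It reads every entry of a `text_directories` list and reports whether the path exists, is a directory, and contains regular files.

```python
import os
import sys
import xml.etree.ElementTree as ET


def check_text_directories(process_file):
    """Return one report dict per entry of any <list key="text_directories">."""
    root = ET.parse(process_file).getroot()
    reports = []
    for lst in root.iter("list"):
        if lst.get("key") != "text_directories":
            continue
        for param in lst.findall("parameter"):
            path = param.get("value", "")
            is_dir = os.path.isdir(path)
            file_count = 0
            # Listing a directory needs both read and execute permission.
            if is_dir and os.access(path, os.R_OK | os.X_OK):
                file_count = sum(
                    1 for name in os.listdir(path)
                    if os.path.isfile(os.path.join(path, name))
                )
            reports.append({
                "class_name": param.get("key"),
                "path": path,
                "exists": os.path.exists(path),
                "is_directory": is_dir,
                "file_count": file_count,
            })
    return reports


if __name__ == "__main__":
    # Hypothetical default file name; pass the real export as an argument.
    target = sys.argv[1] if len(sys.argv) > 1 else "process.xml"
    for report in check_text_directories(target):
        print(report)
```

A path that does not exist, is not a directory, or has a file count of 0 would point to the input location rather than to the text operators. This does not check any file pattern set on the operator.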
Hi,
there is one difference which might be responsible: On Linux, you're using your local Java installation, i.e. Java 8 in your case. On Windows, a JRE is shipped with RapidMiner, so you're using a Java 7 version there. You might want to try again on Linux with Java 7.
Regards,
Marco
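To check which Java version is actually picked up on the Linux machine, a small sketch like the one below may help. It assumes the launcher uses the `java` found on the PATH, which may not hold if JAVA_HOME points elsewhere; the helper names `parse_java_major` and `installed_java_major` are hypothetical.

```python
import re
import subprocess


def parse_java_major(version_output):
    """Extract the major Java version from `java -version` output.

    Old-style strings such as "1.8.0_31" map to 8, new-style strings
    such as "11.0.2" map to 11. Returns None if nothing matches.
    """
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    first, second = match.group(1), match.group(2)
    if first == "1" and second is not None:
        return int(second)
    return int(first)


def installed_java_major(java_command="java"):
    """Run `java -version` and parse its output (written to stderr)."""
    result = subprocess.run(
        [java_command, "-version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return parse_java_major(result.stderr + result.stdout)


if __name__ == "__main__":
    print("Major Java version on PATH:", installed_java_major())
```

If this reports 8 on Linux while the Windows installation runs its bundled Java 7, the two setups differ in the way described above.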
My bet would be an encoding issue. Check your process XML for unexpected characters.
Regards,
Marco
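Checking for such characters by eye is error-prone, so here is a minimal sketch of an automated scan. It assumes the process XML has been saved to a file; the function name `find_suspicious_characters` and the file name `process.xml` are hypothetical. It reports a UTF-8 byte order mark, bytes that are not valid UTF-8, non-ASCII characters, and control characters, each with its line and column.

```python
import sys


def find_suspicious_characters(path):
    """Return (line, column, description) tuples for suspicious characters."""
    with open(path, "rb") as handle:
        raw = handle.read()
    findings = []
    if raw.startswith(b"\xef\xbb\xbf"):
        findings.append((1, 1, "UTF-8 byte order mark"))
        raw = raw[3:]
    # Undecodable bytes become U+FFFD instead of raising an error.
    text = raw.decode("utf-8", errors="replace")
    for line_number, line in enumerate(text.split("\n"), start=1):
        for column, char in enumerate(line, start=1):
            code = ord(char)
            if char == "\ufffd":
                findings.append((line_number, column, "invalid UTF-8 byte sequence"))
            elif code > 127:
                findings.append((line_number, column, "non-ASCII U+%04X" % code))
            elif code < 32 and char not in "\t\r":
                findings.append((line_number, column, "control character U+%04X" % code))
    return findings


if __name__ == "__main__":
    # Hypothetical default file name; pass the real file as an argument.
    target = sys.argv[1] if len(sys.argv) > 1 else "process.xml"
    for finding in find_suspicious_characters(target):
        print("line %d, column %d: %s" % finding)
```

Non-ASCII characters are not wrong in themselves (an umlaut in a path is legitimate), so each finding is only a place to look, not a confirmed error.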