Export wordlist into Database

guitarslinger
New Altair Community Member
Hi,
I am trying to export a wordlist into a database table or a csv.
How can I do this?
The standard operators only accept examplesets as inputs.
Can I convert a wordlist into an example set?
Thx in advance,
Martin
Answers
-
Hi Martin,
the operator "WordList to Data" should help you with that.
Greetings,
Matthias
-
Oh, thanks...
Thank god there are no stupid questions...
-
Hi!!
I am new to RapidMiner and I am trying to do the same. I ran into a problem when exporting to CSV after applying WordList to Data.
Some of my words have a "Total Occurrences" value of 100 or more, but the exported CSV only contains the words with fewer than 100 occurrences.
For example, my wordlist contains:
Word Total Occurrences
Rx 327
Dg 100
Viene 96
In the exported CSV I don't get "Dg", for example, which has 100 occurrences; I only get "Viene" and the words below it.
I don't understand why the CSV seems to treat the total occurrences column as a percentage and does not show values of 100 or more.
Does anyone have an idea how to solve this?
-
Hi,
I don't experience this problem. The process below works fine for me.
I am using the latest version available through Subversion, so it may include some relevant fixes that the official version doesn't include yet. In that case, you could perhaps try to convert the attributes containing the word count to nominal values ("Numerical to Polynominal" operator) and hope that no conversion to a percentage value takes place.
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.0">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="5.0.8" expanded="true" name="Process">
    <process expanded="true" height="607" width="787">
      <operator activated="true" class="web:get_webpage" compatibility="5.0.3" expanded="true" height="60" name="Get Page" width="90" x="45" y="30">
        <parameter key="url" value="http://www.microsoft.com/en/us/default.aspx"/>
        <parameter key="random_user_agent" value="true"/>
        <list key="query_parameters"/>
      </operator>
      <operator activated="true" class="text:process_documents" compatibility="5.0.6" expanded="true" height="94" name="Process Documents" width="90" x="179" y="30">
        <process expanded="true" height="607" width="787">
          <operator activated="true" class="text:tokenize" compatibility="5.0.6" expanded="true" height="60" name="Tokenize" width="90" x="45" y="30"/>
          <connect from_port="document" to_op="Tokenize" to_port="document"/>
          <connect from_op="Tokenize" from_port="document" to_port="document 1"/>
          <portSpacing port="source_document" spacing="0"/>
          <portSpacing port="sink_document 1" spacing="0"/>
          <portSpacing port="sink_document 2" spacing="0"/>
        </process>
      </operator>
      <operator activated="true" class="text:wordlist_to_data" compatibility="5.0.6" expanded="true" height="76" name="WordList to Data" width="90" x="313" y="75"/>
      <operator activated="true" class="write_csv" compatibility="5.0.8" expanded="true" height="60" name="Write CSV" width="90" x="447" y="75">
        <parameter key="csv_file" value="C:\test.csv"/>
      </operator>
      <connect from_op="Get Page" from_port="output" to_op="Process Documents" to_port="documents 1"/>
      <connect from_op="Process Documents" from_port="example set" to_port="result 1"/>
      <connect from_op="Process Documents" from_port="word list" to_op="WordList to Data" to_port="word list"/>
      <connect from_op="WordList to Data" from_port="example set" to_op="Write CSV" to_port="input"/>
      <connect from_op="Write CSV" from_port="through" to_port="result 2"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
      <portSpacing port="sink_result 3" spacing="0"/>
    </process>
  </operator>
</process>
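A rough sketch of the "Numerical to Polynominal" workaround, in case it helps: the fragment below would replace the direct connection from "WordList to Data" to "Write CSV" in the process above. The operator class name and its port names ("example set input", "example set output") are assumptions about how RapidMiner 5 serializes this operator, and the layout attributes are left out on the assumption that RapidMiner arranges the operator itself. Treat it as untested.

```xml
<!-- Fragment only: place inside the inner <process> element and remove the
     existing connection from "WordList to Data" to "Write CSV".
     Class name and port names are assumptions, not taken from the process above. -->
<operator activated="true" class="numerical_to_polynominal" expanded="true" name="Numerical to Polynominal"/>
<connect from_op="WordList to Data" from_port="example set" to_op="Numerical to Polynominal" to_port="example set input"/>
<connect from_op="Numerical to Polynominal" from_port="example set output" to_op="Write CSV" to_port="input"/>
```

With no parameters set, the operator should convert every numerical attribute, so the word counts would reach "Write CSV" as nominal values.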
Regards,
Matthias
-
Hi,
Does this only occur in the written CSV file, or already in the example set? Set a breakpoint to find out what the example set contains.
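If you are working from the process XML posted earlier in this thread, a breakpoint after "WordList to Data" might look roughly like the line below. The `breakpoints="after"` attribute is an assumption about how RapidMiner 5 stores breakpoints in the process file; setting "Breakpoint After" from the operator's context menu in the GUI should have the same effect.

```xml
<!-- Same "WordList to Data" operator as in the posted process, with a breakpoint added.
     The breakpoints attribute is an assumption; everything else is copied unchanged. -->
<operator activated="true" breakpoints="after" class="text:wordlist_to_data" compatibility="5.0.6" expanded="true" height="76" name="WordList to Data" width="90" x="313" y="75"/>
```

When the process pauses there, check whether the rows with counts of 100 or more are present in the example set before it reaches "Write CSV".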
Greetings,
Sebastian
-
Add a "Process Documents from Data" operator between "WordList to Data" and "Write Database".