[SOLVED] Write TSV
4of4
New Altair Community Member
Hi,
I need to write an "ExampleSet" to a TSV file.
I'm trying to use the "Write CSV" operator.
What value can I enter in the "column separator" field?
The value "\t" seems to work only in "Read CSV".
Thanks in advance for your support.
Answers
-
Are you looking to use a tab as the separator? I'm not sure about that, but if you don't want a standard separator such as a comma or semicolon, the pipe (|) is a commonly used option.
-
Thanks for the suggestion, but unfortunately it doesn't work.
Here's my process:
Bye
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.3.005">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="5.3.005" expanded="true" name="Process">
<process expanded="true">
<operator activated="true" class="retrieve" compatibility="5.3.005" expanded="true" height="60" name="Retrieve Iris" width="90" x="112" y="120">
<parameter key="repository_entry" value="//Samples/data/Iris"/>
</operator>
<operator activated="true" class="write_csv" compatibility="5.3.005" expanded="true" height="76" name="Write CSV" width="90" x="514" y="120">
<parameter key="csv_file" value="C:\a.txt"/>
<parameter key="column_separator" value="|"/>
<parameter key="quote_nominal_values" value="false"/>
</operator>
<connect from_op="Retrieve Iris" from_port="output" to_op="Write CSV" to_port="input"/>
<connect from_op="Write CSV" from_port="through" to_port="result 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
</process>
</operator>
</process>
-
Well, it works for me. Below are the first four lines that your process generates:
a1|a2|a3|a4|id|label
5.1|3.5|1.4|0.2|id_1|Iris-setosa
4.9|3.0|1.4|0.2|id_2|Iris-setosa
4.7|3.2|1.3|0.2|id_3|Iris-setosa
Which error message are you getting?
-
There is no error message.
The point is that I need the columns to be separated by a tab character to complete my data process. This is the first part of an ETL process, and the output file will be processed by another program that requires tab separators.
Bye
-
I see, you definitely need the tab as a separator. The following worked for me, though I'm not sure whether it is a real solution or just a workaround.
I manually created a tab-separated file, then read it with the "Read CSV" operator and chose "tab" as the separator (again, while READING). Then I went to the settings of this operator and copied whatever was in the "column separator" field. It looked empty, but I double-clicked in it and copied. You may also double-click between the two brackets below and copy (without the brackets):
( )
Then paste this as the column separator into your "Write CSV" operator.
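Before handing the file to the downstream program, the result can also be checked programmatically. The following is only a minimal sketch in Python: the helper name `column_counts` and the sample text are made up for illustration, and the sample simply mimics the header and first row of the Iris export with tabs between the fields.

```python
import csv
import io

def column_counts(text, delimiter="\t"):
    """Return the set of distinct field counts over all non-empty rows."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    return {len(row) for row in reader if row}

# Hypothetical sample: header and first data row of the Iris export, tab-separated.
sample = (
    "a1\ta2\ta3\ta4\tid\tlabel\n"
    "5.1\t3.5\t1.4\t0.2\tid_1\tIris-setosa\n"
)

# A correctly tab-separated Iris file has six fields in every row.
print(column_counts(sample))  # prints {6}
```

For a real file, read its contents first, for example with `open(path, newline="").read()`, and pass the text to the helper. If the result is `{1}`, the separator that ended up in the file is not a tab.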
I hope that works; it did the job for me. I opened the generated file in LibreOffice, indicated "tab" as the delimiter, and it opened as expected.
-
I just realized that whatever I pasted between the brackets got lost when posting the message, sorry about that.
But just follow the procedure as I described and copy the column separator value from the "Read CSV" operator to the "Write CSV" operator. This should do the job.
I know it doesn't look very smooth, but I hope it gets you a step forward...
Or, here's the code for the operator:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.3.008">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="5.3.008" expanded="true" name="Process">
<process expanded="true">
<operator activated="true" class="write_csv" compatibility="5.3.008" expanded="true" height="76" name="Write CSV" width="90" x="380" y="75">
<parameter key="csv_file" value="/home/macphotobiker/Desktop/tsv.tsv"/>
<parameter key="column_separator" value="&#9;"/>
</operator>
<connect from_op="Write CSV" from_port="through" to_port="result 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
</process>
</operator>
</process>
-
The problem is that the Java framework does not let you type a tab character into the input field: pressing the Tab key moves the cursor to the next field.
To get a tab character into the parameter, you have to copy it from somewhere. For example, press Tab in a plain text editor and copy the resulting (seemingly empty) character into RapidMiner.
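If you edit the process XML by hand instead of pasting into the GUI field, one more detail may matter. The XML specification tells parsers to normalize a literal tab inside an attribute value to a space, whereas the character reference `&#9;` survives as a real tab. The sketch below shows this with Python's standard XML parser. Whether RapidMiner's own loader behaves the same way is an assumption I have not checked, so treat it as a hint rather than a confirmed fix.

```python
import xml.etree.ElementTree as ET

# Literal tab typed inside the attribute value ("\t" here is a real tab
# character in the XML text): attribute-value normalization turns it into a space.
literal = ET.fromstring('<parameter key="column_separator" value="\t"/>')

# Character reference &#9;: the parser delivers an actual tab character.
escaped = ET.fromstring('<parameter key="column_separator" value="&#9;"/>')

print(repr(literal.get("value")))
print(repr(escaped.get("value")))
```

So when the separator has to travel inside an XML file, writing it as `&#9;` is the safer form, and it is also visible to whoever reads the file.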
Best regards,
Marius
-
Thank you very much, MacPhotoBiker!!!
Your solution works perfectly for my purpose!!!!!
Thanks also to Marius.
Bye
-
Perfect 4of4, glad I could help.
Good luck with your project.