[SOLVED] Write TSV

4of4
4of4 New Altair Community Member
edited November 2024 in Community Q&A
Hi,
I need to write an example set as a TSV file.
I'm trying to use the "Write CSV" operator.
What value can I enter in the "column separator" field?
The value "\t" seems to work only in "Read CSV".
Thanks in advance for your support.


Answers

  • MacPhotoBiker
    MacPhotoBiker New Altair Community Member
    Are you looking to insert a tab separator? I'm not sure about that, but if you don't want to use a standard separator like a comma or semicolon, the pipe (|) is a commonly used option.
  • 4of4
    4of4 New Altair Community Member
    Thanks for the suggestion, but unfortunately it doesn't work.
    Here's the process:
    Bye

    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.3.005">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="5.3.005" expanded="true" name="Process">
        <process expanded="true">
          <operator activated="true" class="retrieve" compatibility="5.3.005" expanded="true" height="60" name="Retrieve Iris" width="90" x="112" y="120">
            <parameter key="repository_entry" value="//Samples/data/Iris"/>
          </operator>
          <operator activated="true" class="write_csv" compatibility="5.3.005" expanded="true" height="76" name="Write CSV" width="90" x="514" y="120">
            <parameter key="csv_file" value="C:\a.txt"/>
            <parameter key="column_separator" value="|"/>
            <parameter key="quote_nominal_values" value="false"/>
          </operator>
          <connect from_op="Retrieve Iris" from_port="output" to_op="Write CSV" to_port="input"/>
          <connect from_op="Write CSV" from_port="through" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>

  • MacPhotoBiker
    MacPhotoBiker New Altair Community Member
    Well, it works for me. Below are the first four lines that your process generates:

    a1|a2|a3|a4|id|label
    5.1|3.5|1.4|0.2|id_1|Iris-setosa
    4.9|3.0|1.4|0.2|id_2|Iris-setosa
    4.7|3.2|1.3|0.2|id_3|Iris-setosa

    Which error message are you getting?
  • 4of4
    4of4 New Altair Community Member
    No error message.
    The point is that I need the columns to be separated by a tab character to complete my data process. This is the first part of an ETL process, and the output file will be processed by another program that needs tab separators.
    Bye
  • MacPhotoBiker
    MacPhotoBiker New Altair Community Member
    I see, you definitely need the tab as a separator. I found the following to work for me, but I'm not sure whether it is really a solution or just a workaround.

    I manually created a tab-separated file, read it with the "Read CSV" operator, and chose "tab" as the separator (again, while READING). Then I went to the settings of this operator and copied whatever was in the "column separator" field. It looked empty, but I double-clicked in it and then copied. You may also double-click between the two brackets below and copy (without the brackets):

    ( )

    Then paste this as the column separator into your "Write CSV" operator.

    I hope that works; it did the job for me. I opened the generated file in LibreOffice, selected "tab" as the delimiter, and it opened as expected.
  • MacPhotoBiker
    MacPhotoBiker New Altair Community Member
    I just realized that whatever I pasted between the brackets got lost when posting the message, sorry about that.

    But just follow the procedure as I described and copy the column separator value from the "Read CSV" operator to the "Write CSV" operator; this should do the job.

    I know it doesn't look very smooth, but I hope it gets you a step forward...

    Or, here's the code for the operator:
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.3.008">
     <context>
       <input/>
       <output/>
       <macros/>
     </context>
     <operator activated="true" class="process" compatibility="5.3.008" expanded="true" name="Process">
       <process expanded="true">
         <operator activated="true" class="write_csv" compatibility="5.3.008" expanded="true" height="76" name="Write CSV" width="90" x="380" y="75">
           <parameter key="csv_file" value="/home/macphotobiker/Desktop/tsv.tsv"/>
           <parameter key="column_separator" value="&#9;"/>
         </operator>
         <connect from_op="Write CSV" from_port="through" to_port="result 1"/>
         <portSpacing port="source_input 1" spacing="0"/>
         <portSpacing port="sink_result 1" spacing="0"/>
         <portSpacing port="sink_result 2" spacing="0"/>
       </process>
     </operator>
    </process>
  • MariusHelf
    MariusHelf New Altair Community Member
    The problem is that the Java framework does not let you type a tab character into the input field: pressing the Tab key moves the cursor to the next field.

    To get a tab character into the parameter, you have to copy it from somewhere. You can, for example, press Tab in a normal text editor and copy the resulting (seemingly empty) character into RapidMiner.

    Best regards,
    Marius
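
    For reference, the two processes posted above can be combined into one sketch: 4of4's Iris process with the column_separator value from MacPhotoBiker's operator. In the XML, the tab is written as the character reference &#9;, because a literal tab inside an XML attribute value is normalized to a space when the file is parsed. This assumes the process XML can be edited or pasted directly instead of going through the parameter field, and the output path is simply the one from the original post, so adjust it as needed. The comment is only an annotation and can be removed before pasting.

    ```xml
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.3.005">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="5.3.005" expanded="true" name="Process">
        <process expanded="true">
          <operator activated="true" class="retrieve" compatibility="5.3.005" expanded="true" height="60" name="Retrieve Iris" width="90" x="112" y="120">
            <parameter key="repository_entry" value="//Samples/data/Iris"/>
          </operator>
          <operator activated="true" class="write_csv" compatibility="5.3.005" expanded="true" height="76" name="Write CSV" width="90" x="514" y="120">
            <parameter key="csv_file" value="C:\a.txt"/>
            <!-- &#9; is the XML character reference for a tab (U+0009) -->
            <parameter key="column_separator" value="&#9;"/>
            <parameter key="quote_nominal_values" value="false"/>
          </operator>
          <connect from_op="Retrieve Iris" from_port="output" to_op="Write CSV" to_port="input"/>
          <connect from_op="Write CSV" from_port="through" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>
    ```

    If this behaves like the pipe-separated run earlier in the thread, the first line of the output should be the same header (a1, a2, a3, a4, id, label) with tabs in place of the pipes.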
  • 4of4
    4of4 New Altair Community Member
    Thank you very much, MacPhotoBiker!!!
    Your solution works perfectly for my purpose !!!!!
    Thanks also to Marius
    Bye
  • MacPhotoBiker
    MacPhotoBiker New Altair Community Member
    Perfect 4of4, glad I could help.

    Good luck with your project.
