"normalizing error (works backwards in workflow??)"

michaelhecht New Altair Community Member
edited November 5 in Community Q&A
Hello,

I have a workflow which starts with a

1. Excel file reader
2. then selects attributes
3. then sends the original data to a CSV writer
4. and the selected attributes to a normalizer for further processing.

If I now run the workflow and look at the written CSV file, all "real" columns are normalized, even though the normalizer is applied after the data is sent to the CSV writer. This is really strange.

So - what can I do to store the original data?


Answers

  • Marco_Boeck New Altair Community Member
    Hi,

    the normalization "branch" of your process is executed before your Write CSV operator starts working. I suggest the following quick fix:

    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.1.014">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="5.1.014" expanded="true" name="Process">
        <process expanded="true" height="235" width="681">
          <operator activated="true" class="read_excel" compatibility="5.1.014" expanded="true" height="60" name="Read Excel (2)" width="90" x="45" y="30">
            <list key="annotations"/>
            <list key="data_set_meta_data_information"/>
          </operator>
          <operator activated="true" class="write_csv" compatibility="5.1.014" expanded="true" height="60" name="Write CSV" width="90" x="179" y="30">
            <parameter key="csv_file" value="C:\Users\boeck\Desktop\Test.csv"/>
          </operator>
          <operator activated="true" class="select_attributes" compatibility="5.1.014" expanded="true" height="76" name="Select Attributes" width="90" x="313" y="30">
            <parameter key="attribute_filter_type" value="single"/>
            <parameter key="attribute" value="Test"/>
          </operator>
          <operator activated="true" class="normalize" compatibility="5.1.014" expanded="true" height="94" name="Normalize" width="90" x="447" y="30"/>
          <connect from_op="Read Excel (2)" from_port="output" to_op="Write CSV" to_port="input"/>
          <connect from_op="Write CSV" from_port="through" to_op="Select Attributes" to_port="example set input"/>
          <connect from_op="Select Attributes" from_port="example set output" to_op="Normalize" to_port="example set input"/>
          <connect from_op="Normalize" from_port="example set output" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>
    That way, your CSV gets written before anything else, and only then is your data modified.

    Regards,
    Marco
  • michaelhecht New Altair Community Member
    Thank you, this might work, but it isn't a solution to my original problem.
    Nevertheless, I'm glad (not really) that it is a bug and not my own incompetence ;)

    I've got data with several id columns that I want to pass through the workflow.
    If I don't intervene, RapidMiner selects one column to be the only id, and the selected
    column unfortunately isn't unique. Therefore I removed all unnecessary (non-unique) id columns
    before the actual workflow, but want to add them again at the end, before writing everything
    to the CSV file. To be able to understand the result of the workflow I also wanted to write
    the non-normalized columns, which didn't work. That's why I need to write the CSV at the end
    of the workflow.

    Meanwhile I found that I can join all data with the "ori" (original) output port of the
    normalizer. This seems to be a better workaround.
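
    Outside RapidMiner, the same join-back idea can be sketched in Python with pandas (the column names here are invented for illustration): normalize only the selected numeric columns, then join the untouched original columns back by row index, analogous to joining with the normalizer's "ori" output.

    ```python
    import pandas as pd

    # Hypothetical example set with a non-unique id and one numeric column.
    df = pd.DataFrame({
        "id_a": [1, 1, 2],           # non-unique id, kept aside untouched
        "value": [10.0, 20.0, 30.0]  # column to be normalized
    })

    # Z-score normalize only the selected numeric column.
    normalized = df[["value"]].copy()
    normalized["value"] = (
        normalized["value"] - normalized["value"].mean()
    ) / normalized["value"].std()

    # Join the untouched original columns back by row index.
    result = df[["id_a"]].join(normalized)
    print(result)
    ```

    Because pandas builds a new frame for the normalized columns, the original `df` is left intact, which is exactly the property the CSV writer needs.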

    By the way: I wonder why there is no de-normalizer operator to improve the readability of
    the output.
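
    Conceptually, such a de-normalizer is just the inverse of the z-transformation: if the mean and standard deviation used by the normalizer are kept, the original values can be recovered via x = z * stdev + mean. A minimal sketch in plain Python (not a RapidMiner operator):

    ```python
    import statistics

    def z_normalize(values):
        """Z-score normalize; return transformed values plus the parameters."""
        mean = statistics.mean(values)
        stdev = statistics.stdev(values)
        return [(v - mean) / stdev for v in values], mean, stdev

    def de_normalize(z_values, mean, stdev):
        """Invert the z-transformation: x = z * stdev + mean."""
        return [z * stdev + mean for z in z_values]

    data = [10.0, 20.0, 30.0]
    z, mean, stdev = z_normalize(data)
    restored = de_normalize(z, mean, stdev)
    print(restored)  # the original values, up to floating-point rounding
    ```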
  • Marco_Boeck New Altair Community Member
    Hi,

    Just a hint: you can assign the ID role to the real ID column by using the Exchange Roles operator. That way, you don't need to remove any columns in your process.

    Regards,
    Marco