"split operator - export data not complete for further use (operators)"

joei

Hello,

The Split operator only offers the first three columns for further use, even if it created more. In the result view I see all split columns (more than three), but in a subsequent operator I can only choose the first three.

Here is a simple table to reproduce it:
bla split
asdf 2345x2134
dsaf 2345x2345x345x456x356x3546
sadf 2435x2345

Marco_Boeck

Hi,

my quick test process worked fine; I could select attributes up to "split_6" in further operators:


<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="7.1.000-SNAPSHOT">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="7.1.000-SNAPSHOT" expanded="true" name="Process">
    <process expanded="true">
      <operator activated="true" class="retrieve" compatibility="7.1.000-SNAPSHOT" expanded="true" height="68" name="Retrieve 123" width="90" x="45" y="34">
        <parameter key="repository_entry" value="//Local Repository/123"/>
      </operator>
      <operator activated="true" class="split" compatibility="7.1.000-SNAPSHOT" expanded="true" height="82" name="Split" width="90" x="179" y="34">
        <parameter key="attribute_filter_type" value="single"/>
        <parameter key="attribute" value="split"/>
        <parameter key="split_pattern" value="x"/>
      </operator>
      <operator activated="true" class="filter_examples" compatibility="7.1.000-SNAPSHOT" expanded="true" height="103" name="Filter Examples" width="90" x="313" y="34">
        <list key="filters_list">
          <parameter key="filters_entry_key" value="split_6.contains.35"/>
        </list>
      </operator>
      <connect from_op="Retrieve 123" from_port="output" to_op="Split" to_port="example set input"/>
      <connect from_op="Split" from_port="example set output" to_op="Filter Examples" to_port="example set input"/>
      <connect from_op="Filter Examples" from_port="example set output" to_port="result 1"/>
      <connect from_op="Filter Examples" from_port="original" to_port="result 2"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
      <portSpacing port="sink_result 3" spacing="0"/>
    </process>
  </operator>
</process>

Can you post the process XML that does not work?

Regards,
Marco

joei

Of course. (My post wasn't complete; I accidentally created two posts.)

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.3.013">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="5.3.013" expanded="true" name="Process">
    <process expanded="true">
      <operator activated="true" class="read_excel" compatibility="5.3.013" expanded="true" height="60" name="Read Excel" width="90" x="45" y="75">
        <parameter key="excel_file" value="rapidminer_split_text.xlsx"/>
        <parameter key="imported_cell_range" value="A1:B4"/>
        <parameter key="first_row_as_names" value="false"/>
        <list key="annotations">
          <parameter key="0" value="Name"/>
        </list>
        <list key="data_set_meta_data_information">
          <parameter key="0" value="bla.true.polynominal.attribute"/>
          <parameter key="1" value="split.true.nominal.attribute"/>
        </list>
      </operator>
      <operator activated="true" class="split" compatibility="5.3.013" expanded="true" height="76" name="Split" width="90" x="180" y="52">
        <parameter key="attribute_filter_type" value="subset"/>
        <parameter key="attributes" value="|split"/>
        <parameter key="include_special_attributes" value="true"/>
        <parameter key="split_pattern" value="x"/>
      </operator>
      <operator activated="true" class="multiply" compatibility="5.3.013" expanded="true" height="94" name="Multiply" width="90" x="315" y="30"/>
      <operator activated="true" class="select_attributes" compatibility="5.3.013" expanded="true" height="76" name="Select Attributes" width="90" x="450" y="30"/>
      <connect from_op="Read Excel" from_port="output" to_op="Split" to_port="example set input"/>
      <connect from_op="Split" from_port="example set output" to_op="Multiply" to_port="input"/>
      <connect from_op="Multiply" from_port="output 1" to_op="Select Attributes" to_port="example set input"/>
      <connect from_op="Multiply" from_port="output 2" to_port="result 2"/>
      <connect from_op="Select Attributes" from_port="example set output" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
      <portSpacing port="sink_result 3" spacing="0"/>
    </process>
  </operator>
</process>

Marco_Boeck

Hi,

1. RapidMiner 5.3 is old. Like, really old. We cannot provide help for it here anymore. Please consider using RapidMiner Studio 7.0 instead.
2. You are using the Split operator after "Read Excel". The output of Read Excel depends on actually reading the Excel file at runtime, so until the process runs we don't know what the result will be. The Split operator therefore creates a dummy output to show an example of what the result could look like.
To use actual data, load it into the repository first, then access it with a "Retrieve" operator. That way, full metadata is available and the Split operator preview will be correct.

Regards,
Marco
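
The repository approach described above could be set up with a small one-off import process along these lines. This is only a sketch under assumptions: the "Store" operator with its "repository_entry" parameter and its "input"/"through" ports is used as I understand it from RapidMiner Studio, the repository path "//Local Repository/split_data" is a hypothetical placeholder, and the version strings should be adjusted to the installed release. The Read Excel parameters are copied from the process posted earlier in the thread.

```xml
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="7.0.000">
  <operator activated="true" class="process" compatibility="7.0.000" expanded="true" name="Process">
    <process expanded="true">
      <!-- Read the spreadsheet once; file name and cell range come from the posted process -->
      <operator activated="true" class="read_excel" compatibility="7.0.000" expanded="true" name="Read Excel">
        <parameter key="excel_file" value="rapidminer_split_text.xlsx"/>
        <parameter key="imported_cell_range" value="A1:B4"/>
      </operator>
      <!-- Write the data to the repository; the entry path is a hypothetical placeholder -->
      <operator activated="true" class="store" compatibility="7.0.000" expanded="true" name="Store">
        <parameter key="repository_entry" value="//Local Repository/split_data"/>
      </operator>
      <connect from_op="Read Excel" from_port="output" to_op="Store" to_port="input"/>
      <connect from_op="Store" from_port="through" to_port="result 1"/>
    </process>
  </operator>
</process>
```

After running this once, the actual process would start with a "Retrieve" operator pointing at the same repository entry, as in the working example process at the top of the thread.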

joei

The Filter Examples operator also works in my example.
But I still can't see the split columns beyond the third in the operators Select Attributes, Rename, and Remove Duplicates (subset).

Marco_Boeck

Hi,

Yes, that is expected due to the "can't know beforehand" problem. You can still set those parameters manually if you know that you will end up with, for example, 6 splits.
But the easiest solution is to read the data into your repository and then use only the repository data in your process. That way the actual information is available at design time.

Regards,
Marco

joei

ok thank you.

joei

How does the manual change work? The data is too big to load into the repository.

Marco_Boeck

Hi,

Your local repository sits on your file system, so data cannot be too big for it.

How to do it manually depends on the parameter. For example, for "Remove Duplicates", you can select 'subset', then type the name, e.g. "split_6", into the upper right text field and press +.

Regards,
Marco
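
In the process XML, such a manual entry would presumably end up as a pipe-separated "attributes" value, in the same way the Split operator's own "attributes" parameter appears in the process posted earlier. A minimal sketch for the "Select Attributes" operator from that process, assuming the split columns are named "split_1" to "split_6" (the naming seen in the working example) and that the data never yields more than six parts:

```xml
<operator activated="true" class="select_attributes" compatibility="5.3.013" expanded="true" height="76" name="Select Attributes" width="90" x="450" y="30">
  <parameter key="attribute_filter_type" value="subset"/>
  <!-- split_4 to split_6 are entered by hand because the dialog only offers the first three -->
  <parameter key="attributes" value="split_1|split_2|split_3|split_4|split_5|split_6"/>
</operator>
```

If a listed attribute does not exist at runtime, the operator's behaviour may differ between versions, so it is worth checking the result view after the first run.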