I want to generate profile Id in sequence

sgnarkhede2016
sgnarkhede2016 New Altair Community Member
edited November 5 in Community Q&A
Hello,

I want to generate profile IDs in the format below.

CBP00001
CBP00002
CBP00003
...

The first time, my ProfileId is CBP00001, but on the next iteration it should be incremented by one to "CBP00002".

How can I do this?

Best Answer

  • YYH
    YYH
    Altair Employee
    edited March 2020 Answer ✓
    Hi @sgnarkhede2016,

    It is very easy to generate the sequence numbers with "Create ExampleSet". You will, however, need some kind of text transformation to concatenate the prefix “CBP” onto each row. Check out this process as one of hundreds of possible ways.

    <?xml version="1.0" encoding="UTF-8"?><process version="9.6.000">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="9.6.000" expanded="true" name="Process">
        <parameter key="logverbosity" value="init"/>
        <parameter key="random_seed" value="2001"/>
        <parameter key="send_mail" value="never"/>
        <parameter key="notification_email" value="yhuang@rapidminer.com"/>
        <parameter key="process_duration_for_mail" value="1"/>
        <parameter key="encoding" value="SYSTEM"/>
        <process expanded="true">
          <operator activated="true" class="utility:create_exampleset" compatibility="9.6.000" expanded="true" height="68" name="Create ExampleSet" width="90" x="112" y="85">
            <parameter key="generator_type" value="numeric series"/>
            <parameter key="number_of_examples" value="1000"/>
            <parameter key="use_stepsize" value="true"/>
            <list key="function_descriptions"/>
            <parameter key="add_id_attribute" value="false"/>
            <list key="numeric_series_configuration">
              <parameter key="ProfileID" value="linear.1\.0.1\.0"/>
            </list>
            <list key="date_series_configuration"/>
            <list key="date_series_configuration (interval)"/>
            <parameter key="date_format" value="yyyy-MM-dd HH:mm:ss"/>
            <parameter key="time_zone" value="SYSTEM"/>
            <parameter key="column_separator" value=","/>
            <parameter key="parse_all_as_nominal" value="false"/>
            <parameter key="decimal_point_character" value="."/>
            <parameter key="trim_attribute_names" value="true"/>
          </operator>
          <operator activated="true" class="generate_attributes" compatibility="9.6.000" expanded="true" height="82" name="Generate Attributes" width="90" x="447" y="85">
            <list key="function_descriptions">
              <parameter key="NEW_ID" value="concat(&quot;CBP&quot;,suffix(concat(&quot;0000&quot;,str(ProfileID)),5))"/>
            </list>
            <parameter key="keep_all" value="true"/>
          </operator>
          <connect from_op="Create ExampleSet" from_port="output" to_op="Generate Attributes" to_port="example set input"/>
          <connect from_op="Generate Attributes" from_port="example set output" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>
    

    Hope it helps.

    YY
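
For readers who want to see what the `NEW_ID` expression in the process above does, here is a minimal Python sketch of the same logic. It is only an illustration outside RapidMiner: the function name `make_profile_id` and its parameters are hypothetical, and it assumes `suffix(text, 5)` keeps the last five characters of the text.

```python
def make_profile_id(n: int, prefix: str = "CBP", width: int = 5) -> str:
    """Mimic concat("CBP", suffix(concat("0000", str(n)), 5)) from the process."""
    # Prepend width-1 zeros, like concat("0000", str(ProfileID)).
    padded = "0" * (width - 1) + str(n)
    # Keep the last `width` characters, like suffix(..., 5), then add the prefix.
    return prefix + padded[-width:]


ids = [make_profile_id(n) for n in range(1, 4)]
print(ids)                  # ['CBP00001', 'CBP00002', 'CBP00003']
print(make_profile_id(10))  # CBP00010
```

Because the zeros are prepended first and only the last five characters are kept, the number of leading zeros shrinks as the number grows.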

Answers

  • sgnarkhede2016
    sgnarkhede2016 New Altair Community Member
    edited March 2020
    The solution is correct, but I don't know how many records are coming; it may be millions or billions. I am facing issues during iteration with how to increment the profile ID.
    E.g., if only five records come in the first iteration, the profile IDs are 1 to 5. If 30 records come in the next iteration, the profile IDs should start from 6. How can I do that?
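
The requirement described here (5 records first, then 30 more starting from 6) amounts to remembering the highest ID already used and starting the next batch one above it. A minimal Python sketch of that idea follows. It is an illustration only, not a RapidMiner process: `assign_ids` and `last_id` are hypothetical names, and how the counter is stored between runs is left out.

```python
def assign_ids(batch_size: int, last_id: int, prefix: str = "CBP", width: int = 5):
    """Return the IDs for one batch plus the updated counter."""
    start = last_id + 1
    # Zero-pad each number to `width` digits and prepend the prefix.
    ids = [f"{prefix}{n:0{width}d}" for n in range(start, start + batch_size)]
    return ids, last_id + batch_size


last_id = 0                                # nothing assigned yet
first, last_id = assign_ids(5, last_id)    # first iteration: 5 records
second, last_id = assign_ids(30, last_id)  # next iteration: 30 records

print(first[0], first[-1])    # CBP00001 CBP00005
print(second[0], second[-1])  # CBP00006 CBP00035
```

The only state carried from one iteration to the next is the single number `last_id`, so the batch sizes do not need to be known in advance.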
  • sgnarkhede2016
    sgnarkhede2016 New Altair Community Member
    I don't want a fixed number of zeros in my sequence (as in CBP00001); for example, when the tenth ID comes, the ID must be CBP00010.

    <?xml version="1.0" encoding="UTF-8"?><process version="9.5.001">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="9.5.001" expanded="true" name="Process">
        <parameter key="logverbosity" value="init"/>
        <parameter key="random_seed" value="2001"/>
        <parameter key="send_mail" value="never"/>
        <parameter key="notification_email" value=""/>
        <parameter key="process_duration_for_mail" value="30"/>
        <parameter key="encoding" value="SYSTEM"/>
        <process expanded="true">
          <operator activated="false" class="read_csv" compatibility="9.5.001" expanded="true" height="68" name="Read CSV" width="90" x="849" y="340">
            <parameter key="csv_file" value="C:\Users\snarkhede\Desktop\test.csv"/>
            <parameter key="column_separators" value=";"/>
            <parameter key="trim_lines" value="false"/>
            <parameter key="use_quotes" value="true"/>
            <parameter key="quotes_character" value="&quot;"/>
            <parameter key="escape_character" value="\"/>
            <parameter key="skip_comments" value="true"/>
            <parameter key="comment_characters" value="#"/>
            <parameter key="starting_row" value="1"/>
            <parameter key="parse_numbers" value="true"/>
            <parameter key="decimal_character" value="."/>
            <parameter key="grouped_digits" value="false"/>
            <parameter key="grouping_character" value=","/>
            <parameter key="infinity_representation" value=""/>
            <parameter key="date_format" value="M/d/yy h:mm a"/>
            <parameter key="first_row_as_names" value="true"/>
            <list key="annotations"/>
            <parameter key="time_zone" value="SYSTEM"/>
            <parameter key="locale" value="English (United States)"/>
            <parameter key="encoding" value="windows-1252"/>
            <parameter key="read_all_values_as_polynominal" value="false"/>
            <list key="data_set_meta_data_information">
              <parameter key="0" value="Institution Id.true.polynominal.attribute"/>
              <parameter key="1" value="Customer Id.true.polynominal.attribute"/>
              <parameter key="2" value="AVG_TRANSACTED_AMOUNT.true.polynominal.attribute"/>
              <parameter key="3" value="MIN_TRANSACTED_AMOUNT.true.polynominal.attribute"/>
              <parameter key="4" value="MAX_TRANSACTED_AMOUNT.true.polynominal.attribute"/>
              <parameter key="5" value="STD_TRANSACTED_AMOUNT.true.polynominal.attribute"/>
              <parameter key="6" value="MEDIAN_TRANSACTED_AMOUNT.true.polynominal.attribute"/>
              <parameter key="7" value="TOTAL_AMOUNT.true.polynominal.attribute"/>
              <parameter key="8" value="Account Number.true.polynominal.attribute"/>
              <parameter key="9" value="Customer.true.polynominal.attribute"/>
              <parameter key="10" value="AVG_NO_OF_TRANSACTIONS.true.polynominal.attribute"/>
              <parameter key="11" value="MIN_NO_OF_TRANSACTIONS.true.polynominal.attribute"/>
              <parameter key="12" value="MAX_NO_OF_TRANSACTIONS.true.polynominal.attribute"/>
              <parameter key="13" value="STD_NO_OF_TRANSACTIONS.true.polynominal.attribute"/>
              <parameter key="14" value="MEDIAN_NO_OF_TRANSACTIONS.true.polynominal.attribute"/>
              <parameter key="15" value="NO_OF_TRANSACTIONS.true.polynominal.attribute"/>
              <parameter key="16" value="AVG_TOTAL_AMOUNT_PER_BIN.true.polynominal.attribute"/>
              <parameter key="17" value="MIN_TOTAL_AMOUNT_PER_BIN.true.polynominal.attribute"/>
              <parameter key="18" value="MAX_TOTAL_AMOUNT_PER_BIN.true.polynominal.attribute"/>
              <parameter key="19" value="STD_TOTAL_AMOUNT_PER_BIN.true.polynominal.attribute"/>
              <parameter key="20" value="LAST_NORMAL_BUILD_DATE.true.date_time.attribute"/>
            </list>
            <parameter key="read_not_matching_values_as_missings" value="true"/>
            <parameter key="datamanagement" value="double_array"/>
            <parameter key="data_management" value="auto"/>
          </operator>
          <operator activated="false" class="utility:create_exampleset" compatibility="9.5.001" expanded="true" height="68" name="Create ExampleSet" width="90" x="45" y="493">
            <parameter key="generator_type" value="numeric series"/>
            <parameter key="number_of_examples" value="1000"/>
            <parameter key="use_stepsize" value="true"/>
            <list key="function_descriptions"/>
            <parameter key="add_id_attribute" value="false"/>
            <list key="numeric_series_configuration">
              <parameter key="ProfileID" value="linear.1\.0.1\.0"/>
            </list>
            <list key="date_series_configuration"/>
            <list key="date_series_configuration (interval)"/>
            <parameter key="date_format" value="yyyy-MM-dd HH:mm:ss"/>
            <parameter key="time_zone" value="SYSTEM"/>
            <parameter key="column_separator" value=","/>
            <parameter key="parse_all_as_nominal" value="false"/>
            <parameter key="decimal_point_character" value="."/>
            <parameter key="trim_attribute_names" value="true"/>
          </operator>
          <operator activated="false" class="generate_attributes" compatibility="9.5.001" expanded="true" height="82" name="Generate Attributes (5)" width="90" x="246" y="493">
            <list key="function_descriptions">
              <parameter key="NEW_ID" value="concat(&quot;CBP&quot;,suffix(concat(&quot;0000&quot;,str(ProfileID)),5))"/>
            </list>
            <parameter key="keep_all" value="true"/>
          </operator>
          <operator activated="true" class="generate_data" compatibility="9.5.001" expanded="true" height="68" name="Generate Data" width="90" x="179" y="238">
            <parameter key="target_function" value="random"/>
            <parameter key="number_examples" value="100"/>
            <parameter key="number_of_attributes" value="5"/>
            <parameter key="attributes_lower_bound" value="-10.0"/>
            <parameter key="attributes_upper_bound" value="10.0"/>
            <parameter key="gaussian_standard_deviation" value="10.0"/>
            <parameter key="largest_radius" value="10.0"/>
            <parameter key="use_local_random_seed" value="false"/>
            <parameter key="local_random_seed" value="1992"/>
            <parameter key="datamanagement" value="double_array"/>
            <parameter key="data_management" value="auto"/>
          </operator>
          <operator activated="true" class="set_macro" compatibility="9.5.001" expanded="true" height="82" name="Set Macro (3)" width="90" x="313" y="238">
            <parameter key="macro" value="myMacro"/>
            <parameter key="value" value="5.0"/>
          </operator>
          <operator activated="true" class="generate_attributes" compatibility="9.5.001" expanded="true" height="82" name="Generate Attributes (6)" width="90" x="447" y="238">
            <list key="function_descriptions">
              <parameter key="id1" value="%{myMacro}"/>
            </list>
            <parameter key="keep_all" value="true"/>
          </operator>
          <operator activated="true" class="replace" compatibility="6.0.003" expanded="true" height="82" name="Replace (6)" width="90" x="581" y="238">
            <parameter key="attribute_filter_type" value="all"/>
            <parameter key="attribute" value=""/>
            <parameter key="attributes" value=""/>
            <parameter key="use_except_expression" value="false"/>
            <parameter key="value_type" value="nominal"/>
            <parameter key="use_value_type_exception" value="false"/>
            <parameter key="except_value_type" value="file_path"/>
            <parameter key="block_type" value="single_value"/>
            <parameter key="use_block_type_exception" value="false"/>
            <parameter key="except_block_type" value="single_value"/>
            <parameter key="invert_selection" value="false"/>
            <parameter key="include_special_attributes" value="false"/>
            <parameter key="replace_what" value=".0"/>
          </operator>
          <operator activated="true" class="parse_numbers" compatibility="6.0.003" expanded="true" height="82" name="Parse Numbers" width="90" x="715" y="238">
            <parameter key="attribute_filter_type" value="single"/>
            <parameter key="attribute" value="id1"/>
            <parameter key="attributes" value=""/>
            <parameter key="use_except_expression" value="false"/>
            <parameter key="value_type" value="nominal"/>
            <parameter key="use_value_type_exception" value="false"/>
            <parameter key="except_value_type" value="file_path"/>
            <parameter key="block_type" value="single_value"/>
            <parameter key="use_block_type_exception" value="false"/>
            <parameter key="except_block_type" value="single_value"/>
            <parameter key="invert_selection" value="false"/>
            <parameter key="include_special_attributes" value="false"/>
            <parameter key="decimal_character" value="."/>
            <parameter key="grouped_digits" value="false"/>
            <parameter key="grouping_character" value=","/>
            <parameter key="infinity_representation" value=""/>
            <parameter key="unparsable_value_handling" value="fail"/>
          </operator>
          <operator activated="true" class="extract_macro" compatibility="9.5.001" expanded="true" height="68" name="Extract Macro (7)" width="90" x="849" y="238">
            <parameter key="macro" value="profile_id"/>
            <parameter key="macro_type" value="data_value"/>
            <parameter key="statistics" value="average"/>
            <parameter key="attribute_name" value="id1"/>
            <parameter key="example_index" value="1"/>
            <list key="additional_macros"/>
          </operator>
          <operator activated="true" class="generate_id" compatibility="9.5.001" expanded="true" height="82" name="Generate ID (9)" width="90" x="983" y="238">
            <parameter key="create_nominal_ids" value="false"/>
            <parameter key="offset" value="%{profile_id}"/>
          </operator>
          <operator activated="true" class="generate_attributes" compatibility="9.5.001" expanded="true" height="82" name="Generate Attributes (11)" width="90" x="1117" y="238">
            <list key="function_descriptions">
              <parameter key="id" value="concat(&quot;CBP&quot;,suffix(concat(&quot;0000&quot;,str(id)),5))"/>
            </list>
            <parameter key="keep_all" value="true"/>
          </operator>
          <connect from_op="Create ExampleSet" from_port="output" to_op="Generate Attributes (5)" to_port="example set input"/>
          <connect from_op="Generate Data" from_port="output" to_op="Set Macro (3)" to_port="through 1"/>
          <connect from_op="Set Macro (3)" from_port="through 1" to_op="Generate Attributes (6)" to_port="example set input"/>
          <connect from_op="Generate Attributes (6)" from_port="example set output" to_op="Replace (6)" to_port="example set input"/>
          <connect from_op="Replace (6)" from_port="example set output" to_op="Parse Numbers" to_port="example set input"/>
          <connect from_op="Parse Numbers" from_port="example set output" to_op="Extract Macro (7)" to_port="example set"/>
          <connect from_op="Extract Macro (7)" from_port="example set" to_op="Generate ID (9)" to_port="example set input"/>
          <connect from_op="Generate ID (9)" from_port="example set output" to_op="Generate Attributes (11)" to_port="example set input"/>
          <connect from_op="Generate Attributes (11)" from_port="example set output" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>
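
One thing to watch with the five-character suffix when record counts may reach millions: keeping only the last five characters drops the leading digits of larger numbers. The Python sketch below illustrates this under the assumption that `suffix(text, 5)` returns the last five characters. The helper names are hypothetical and this is not RapidMiner code.

```python
def suffix_pad(n: int, width: int) -> str:
    """Pad with zeros, then keep only the last `width` characters."""
    return ("0" * (width - 1) + str(n))[-width:]


def zero_pad(n: int, width: int) -> str:
    """Pad to at least `width` digits without ever cutting digits off."""
    return str(n).zfill(width)


print("CBP" + suffix_pad(99_999, 5))   # CBP99999
print("CBP" + suffix_pad(100_000, 5))  # CBP00000  (leading "1" is lost)
print("CBP" + zero_pad(100_000, 5))    # CBP100000
print("CBP" + zero_pad(100_000, 10))   # CBP0000100000
```

If IDs beyond 99999 are expected, a wider padding (more zeros and a larger suffix length) avoids collisions between IDs.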

  • sgnarkhede2016
    sgnarkhede2016 New Altair Community Member
    It is working.