Time Series Gaps for Arima - How to fill them?

pedrodomingosdv
pedrodomingosdv New Altair Community Member
edited November 5 in Community Q&A
Hello,
I'm using auto-arima (via the R Script operator) with some success, but I'm now facing an issue. My data is not always provided for every date. For example, my data is recorded weekly, and to put it in a date format I use the Monday of each week.

Typically I do not have gaps, but every once in a while I do, and it takes a lot of time to create those rows for every run I have to do. So basically I would like to know if there is any way of filling in the missing date points in RapidMiner. It would be helpful because I want to replace those gaps with interpolated or average values.

I see that there are some operators that deal with similar issues. I thought that "Fill Data Gaps" might be the one, but every time I set the step size to 7 the process freezes and no output is delivered at all.

Enclosed are an example of the data source in Excel and a short process file.

Thanks,
Pedro

Answers

  • lionelderkrikor
    lionelderkrikor New Altair Community Member
    Hi @pedrodomingosdv,

    Have you tried the Replace Missing Values (Series) operator of the Time Series module?


    Hope it helps,

    Regards,

    Lionel
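
For readers who want to see what filling gaps by interpolation amounts to, here is a minimal sketch in plain Python. It is not the implementation of the Replace Missing Values (Series) operator; the function name `interpolate_linear` and the sample values are made up for illustration, and missing points are assumed to be marked as `None` in an evenly spaced series.

```python
def interpolate_linear(values):
    """Fill None entries that lie between two known values by linear interpolation.

    Leading and trailing None entries are left untouched, because there is
    no value on one side to interpolate from.
    """
    filled = list(values)
    known = [i for i, v in enumerate(filled) if v is not None]
    # Walk over each pair of neighbouring known points and fill the gap between them.
    for left, right in zip(known, known[1:]):
        step = (filled[right] - filled[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = filled[left] + step * (i - left)
    return filled


# Hypothetical weekly scores with three missing weeks.
series = [10.0, 12.0, None, 16.0, None, None, 22.0]
result = interpolate_linear(series)
print(result)  # [10.0, 12.0, 14.0, 16.0, 18.0, 20.0, 22.0]
```

Note that this only works once the missing dates exist as rows with empty values, which is the part the original question is about.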


  • pedrodomingosdv
    pedrodomingosdv New Altair Community Member
    Hi guys,
    Thanks for your replies.
    @MarcoBarradas I think that your proposal is closer to what I need.

    Though I'm still struggling to make it fit into my process.

    Regards,
    Pedro
  • pedrodomingosdv
    pedrodomingosdv New Altair Community Member
    Answer ✓
    Hi Marco, I'm Portuguese and, judging by your name, I guess that you are too :smile:
    Sorry for the late reply, but I've been away from my desk.

    I tried to adapt the process you supplied, but with no success at all.

    Two questions:
    1) Enclosed is the "adapted" process. What am I doing wrong? I feel that I need a couple of spare hours to understand the whole process, which is why I made only a few adaptations.
    2) Once 1) is correct, how can I apply it to fill my time series?
    Is the output supposed to already be the time series with no gaps?

    The output I'm getting doesn't seem correct to me.

    Obrigado,
    Pedro
  • Marco_Barradas
    Marco_Barradas
    Altair Employee
    edited January 2019 Answer ✓
    Hi @pedrodomingosdv, I'm actually from Mexico, so I guess we need to stick with English then.

    I've made some changes to the process and connected it to the mock file you gave me. In my example the date attribute was named DAY.
    I'm also attaching a picture of the process, and I added a breakpoint on the Days operator. Please check what value is passed to the macro at that point and enter that value as the number of examples in the Create ExampleSet operator.
    sgenzer, do you know how I can set the number of examples of Create ExampleSet through a macro? I've tried with Set Macro (real), but it does not work.
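
As a rough guide to what the process below does, the same idea can be sketched in a few lines of Python: take the minimum and maximum date, count the days between them (the `Time_lapse` macro, difference in days plus one), generate the complete date series, and left-join the original data onto it so that missing dates show up as empty values. This is only an illustration under stated assumptions, not a translation of the process: the function name `fill_date_gaps`, the `step_days` parameter, and the sample data are hypothetical. The process itself generates a daily series; `step_days=7` is an assumed variation for weekly data.

```python
from datetime import date, timedelta


def fill_date_gaps(rows, step_days=1):
    """Return one (date, value) pair per calendar step, with None where data is missing.

    rows is a list of (date, value) pairs; step_days is the spacing of the
    generated calendar (1 for daily, 7 for weekly).
    """
    by_date = dict(rows)
    start, end = min(by_date), max(by_date)
    # Same arithmetic as the Time_lapse macro: days between min and max, plus one.
    n_days = (end - start).days + 1
    n_steps = (n_days - 1) // step_days + 1
    calendar = [start + timedelta(days=i * step_days) for i in range(n_steps)]
    # Left join: keep every calendar date and look up the original value if it exists.
    return [(d, by_date.get(d)) for d in calendar]


# Hypothetical weekly data recorded on Mondays; 21 January 2019 is missing.
observed = [
    (date(2019, 1, 7), 10.0),
    (date(2019, 1, 14), 12.0),
    (date(2019, 1, 28), 16.0),
]
filled = fill_date_gaps(observed, step_days=7)
for d, v in filled:
    print(d.isoformat(), v)
```

With `step_days=1` the same input yields 22 rows, most of them empty, which is what a daily calendar joined with weekly data would look like.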


    <?xml version="1.0" encoding="UTF-8"?><process version="9.1.000">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="9.1.000" expanded="true" name="Process">
        <parameter key="logverbosity" value="init"/>
        <parameter key="random_seed" value="2001"/>
        <parameter key="send_mail" value="never"/>
        <parameter key="notification_email" value=""/>
        <parameter key="process_duration_for_mail" value="30"/>
        <parameter key="encoding" value="SYSTEM"/>
        <process expanded="true">
          <operator activated="true" class="read_excel" compatibility="9.1.000" expanded="true" height="68" name="Read Excel" width="90" x="45" y="136">
            <parameter key="excel_file" value="C:\Users\mbarradas\Downloads\XL Mock File.xlsx"/>
            <parameter key="sheet_selection" value="sheet number"/>
            <parameter key="sheet_number" value="1"/>
            <parameter key="imported_cell_range" value="A1"/>
            <parameter key="encoding" value="SYSTEM"/>
            <parameter key="first_row_as_names" value="true"/>
            <list key="annotations"/>
            <parameter key="date_format" value=""/>
            <parameter key="time_zone" value="SYSTEM"/>
            <parameter key="locale" value="English (United States)"/>
            <parameter key="read_all_values_as_polynominal" value="false"/>
            <list key="data_set_meta_data_information">
              <parameter key="0" value="DATE.true.date.attribute"/>
              <parameter key="1" value="SCORE.true.real.attribute"/>
            </list>
            <parameter key="read_not_matching_values_as_missings" value="false"/>
            <parameter key="datamanagement" value="double_array"/>
            <parameter key="data_management" value="auto"/>
          </operator>
          <operator activated="true" class="generate_id" compatibility="9.1.000" expanded="true" height="82" name="Generate ID (2)" width="90" x="179" y="34">
            <parameter key="create_nominal_ids" value="false"/>
            <parameter key="offset" value="0"/>
          </operator>
          <operator activated="true" class="sort" compatibility="9.1.000" expanded="true" height="82" name="Sort" width="90" x="313" y="34">
            <parameter key="attribute_name" value="DATE"/>
            <parameter key="sorting_direction" value="increasing"/>
          </operator>
          <operator activated="true" class="date_to_nominal" compatibility="9.1.000" expanded="true" height="82" name="Date to Nominal" width="90" x="447" y="34">
            <parameter key="attribute_name" value="DATE"/>
            <parameter key="date_format" value="dd/MM/yyyy"/>
            <parameter key="time_zone" value="SYSTEM"/>
            <parameter key="locale" value="English (United States)"/>
            <parameter key="keep_old_attribute" value="false"/>
          </operator>
          <operator activated="true" class="multiply" compatibility="9.1.000" expanded="true" height="103" name="Multiply" width="90" x="380" y="136"/>
          <operator activated="true" class="extract_macro" compatibility="9.1.000" expanded="true" height="68" name="Min_Day" width="90" x="581" y="34">
            <parameter key="macro" value="min_day"/>
            <parameter key="macro_type" value="data_value"/>
            <parameter key="statistics" value="min"/>
            <parameter key="attribute_name" value="DATE"/>
            <parameter key="example_index" value="1"/>
            <list key="additional_macros"/>
          </operator>
          <operator activated="true" class="sort" compatibility="9.1.000" expanded="true" height="82" name="Sort (2)" width="90" x="715" y="34">
            <parameter key="attribute_name" value="id"/>
            <parameter key="sorting_direction" value="decreasing"/>
          </operator>
          <operator activated="true" class="extract_macro" compatibility="9.1.000" expanded="true" height="68" name="Max_Day" width="90" x="849" y="34">
            <parameter key="macro" value="max_day"/>
            <parameter key="macro_type" value="data_value"/>
            <parameter key="statistics" value="max"/>
            <parameter key="attribute_name" value="DATE"/>
            <parameter key="example_index" value="1"/>
            <list key="additional_macros"/>
          </operator>
          <operator activated="true" breakpoints="after" class="generate_macro" compatibility="9.1.000" expanded="true" height="82" name="Days" width="90" x="983" y="34">
            <list key="function_descriptions">
              <parameter key="Time_lapse" value="(date_diff(date_parse_custom(%{min_day},&quot;dd/MM/yyyy&quot;),date_parse_custom(%{max_day},&quot;dd/MM/yyyy&quot;))/(1000*60*60*24))+1"/>
            </list>
            <description align="center" color="green" colored="true" width="126">Use this value as the number of examples on your Create Example Set</description>
          </operator>
          <operator activated="false" class="subprocess" compatibility="9.1.000" expanded="true" height="82" name="Your Data Set" width="90" x="45" y="34">
            <process expanded="true">
              <operator activated="true" class="operator_toolbox:create_exampleset" compatibility="1.7.000" expanded="true" height="68" name="Create ExampleSet (range)" origin="GENERATED_TUTORIAL" width="90" x="45" y="34">
                <parameter key="generator_type" value="date_series"/>
                <parameter key="number_of_examples" value="10"/>
                <parameter key="use_stepsize" value="true"/>
                <list key="function_descriptions"/>
                <parameter key="add_id_attribute" value="false"/>
                <list key="numeric_series_configuration"/>
                <list key="date_series_configuration">
                  <parameter key="Days" value="2018-01-01 00:00:00.2019-01-01 00:00:00"/>
                </list>
                <list key="date_series_configuration (interval)">
                  <parameter key="DAY" value="2019-01-01 00:00:00.1.day"/>
                </list>
                <parameter key="date_format" value="yyyy-MM-dd HH:mm:ss"/>
                <parameter key="column_separator" value=","/>
                <parameter key="parse_all_as_nominal" value="false"/>
                <parameter key="decimal_point_character" value="."/>
                <parameter key="trim_attribute_names" value="true"/>
              </operator>
              <operator activated="true" class="generate_id" compatibility="9.1.000" expanded="true" height="82" name="Generate ID" width="90" x="179" y="34">
                <parameter key="create_nominal_ids" value="false"/>
                <parameter key="offset" value="0"/>
              </operator>
              <operator activated="true" class="operator_toolbox:create_exampleset" compatibility="1.7.000" expanded="true" height="68" name="Create ExampleSet (2)" width="90" x="45" y="238">
                <parameter key="generator_type" value="numeric_series"/>
                <parameter key="number_of_examples" value="10"/>
                <parameter key="use_stepsize" value="false"/>
                <list key="function_descriptions"/>
                <parameter key="add_id_attribute" value="false"/>
                <list key="numeric_series_configuration">
                  <parameter key="Value" value="linear.0\.0.1\.0"/>
                </list>
                <list key="date_series_configuration"/>
                <list key="date_series_configuration (interval)"/>
                <parameter key="date_format" value="yyyy-MM-dd HH:mm:ss"/>
                <parameter key="column_separator" value=","/>
                <parameter key="parse_all_as_nominal" value="false"/>
                <parameter key="decimal_point_character" value="."/>
                <parameter key="trim_attribute_names" value="true"/>
              </operator>
              <operator activated="true" class="generate_id" compatibility="9.1.000" expanded="true" height="82" name="Generate ID (3)" width="90" x="179" y="238">
                <parameter key="create_nominal_ids" value="false"/>
                <parameter key="offset" value="0"/>
              </operator>
              <operator activated="true" class="concurrency:join" compatibility="9.1.000" expanded="true" height="82" name="Join" width="90" x="313" y="136">
                <parameter key="remove_double_attributes" value="true"/>
                <parameter key="join_type" value="inner"/>
                <parameter key="use_id_attribute_as_key" value="false"/>
                <list key="key_attributes">
                  <parameter key="id" value="id"/>
                </list>
                <parameter key="keep_both_join_attributes" value="false"/>
              </operator>
              <operator activated="true" class="numerical_to_polynominal" compatibility="9.1.000" expanded="true" height="82" name="Numerical to Polynominal" width="90" x="313" y="34">
                <parameter key="attribute_filter_type" value="single"/>
                <parameter key="attribute" value="id"/>
                <parameter key="attributes" value=""/>
                <parameter key="use_except_expression" value="false"/>
                <parameter key="value_type" value="numeric"/>
                <parameter key="use_value_type_exception" value="false"/>
                <parameter key="except_value_type" value="real"/>
                <parameter key="block_type" value="value_series"/>
                <parameter key="use_block_type_exception" value="false"/>
                <parameter key="except_block_type" value="value_series_end"/>
                <parameter key="invert_selection" value="false"/>
                <parameter key="include_special_attributes" value="true"/>
              </operator>
              <operator activated="true" class="filter_examples" compatibility="9.1.000" expanded="true" height="103" name="Filter Examples" width="90" x="447" y="34">
                <parameter key="parameter_expression" value=""/>
                <parameter key="condition_class" value="custom_filters"/>
                <parameter key="invert_filter" value="false"/>
                <list key="filters_list">
                  <parameter key="filters_entry_key" value="id.is_not_in.2;7"/>
                </list>
                <parameter key="filters_logic_and" value="false"/>
                <parameter key="filters_check_metadata" value="true"/>
              </operator>
              <operator activated="true" class="select_attributes" compatibility="9.1.000" expanded="true" height="82" name="Select Attributes" width="90" x="581" y="34">
                <parameter key="attribute_filter_type" value="subset"/>
                <parameter key="attribute" value="DAY"/>
                <parameter key="attributes" value="|Value|DAY"/>
                <parameter key="use_except_expression" value="false"/>
                <parameter key="value_type" value="attribute_value"/>
                <parameter key="use_value_type_exception" value="false"/>
                <parameter key="except_value_type" value="time"/>
                <parameter key="block_type" value="attribute_block"/>
                <parameter key="use_block_type_exception" value="false"/>
                <parameter key="except_block_type" value="value_matrix_row_start"/>
                <parameter key="invert_selection" value="false"/>
                <parameter key="include_special_attributes" value="true"/>
              </operator>
              <operator activated="true" class="remember" compatibility="9.1.000" expanded="true" height="68" name="Remember" width="90" x="715" y="34">
                <parameter key="name" value="DataSet"/>
                <parameter key="io_object" value="ExampleSet"/>
                <parameter key="store_which" value="1"/>
                <parameter key="remove_from_process" value="true"/>
              </operator>
              <connect from_op="Create ExampleSet (range)" from_port="output" to_op="Generate ID" to_port="example set input"/>
              <connect from_op="Generate ID" from_port="example set output" to_op="Join" to_port="left"/>
              <connect from_op="Create ExampleSet (2)" from_port="output" to_op="Generate ID (3)" to_port="example set input"/>
              <connect from_op="Generate ID (3)" from_port="example set output" to_op="Join" to_port="right"/>
              <connect from_op="Join" from_port="join" to_op="Numerical to Polynominal" to_port="example set input"/>
              <connect from_op="Numerical to Polynominal" from_port="example set output" to_op="Filter Examples" to_port="example set input"/>
              <connect from_op="Filter Examples" from_port="example set output" to_op="Select Attributes" to_port="example set input"/>
              <connect from_op="Select Attributes" from_port="example set output" to_op="Remember" to_port="store"/>
              <connect from_op="Remember" from_port="stored" to_port="out 1"/>
              <portSpacing port="source_in 1" spacing="0"/>
              <portSpacing port="sink_out 1" spacing="0"/>
              <portSpacing port="sink_out 2" spacing="0"/>
            </process>
          </operator>
          <operator activated="true" class="generate_data_user_specification" compatibility="9.1.000" expanded="true" height="68" name="Generate Data by User Specification" width="90" x="45" y="289">
            <list key="attribute_values">
              <parameter key="Min_Inicial" value="%{min_day}"/>
            </list>
            <list key="set_additional_roles"/>
          </operator>
          <operator activated="true" class="nominal_to_date" compatibility="9.1.000" expanded="true" height="82" name="Nominal to Date" width="90" x="179" y="289">
            <parameter key="attribute_name" value="Min_Inicial"/>
            <parameter key="date_type" value="date"/>
            <parameter key="date_format" value="dd/MM/yyyy"/>
            <parameter key="time_zone" value="SYSTEM"/>
            <parameter key="locale" value="English (United States)"/>
            <parameter key="keep_old_attribute" value="false"/>
          </operator>
          <operator activated="true" class="adjust_date" compatibility="9.1.000" expanded="true" height="82" name="Adjust Date" width="90" x="313" y="289">
            <parameter key="attribute_name" value="Min_Inicial"/>
            <list key="adjustments">
              <parameter key="1" value="Day"/>
            </list>
            <parameter key="keep_old_attribute" value="false"/>
          </operator>
          <operator activated="true" class="date_to_nominal" compatibility="9.1.000" expanded="true" height="82" name="Date to Nominal (2)" width="90" x="447" y="289">
            <parameter key="attribute_name" value="Min_Inicial"/>
            <parameter key="date_format" value="dd/MM/yyyy"/>
            <parameter key="time_zone" value="SYSTEM"/>
            <parameter key="locale" value="English (United States)"/>
            <parameter key="keep_old_attribute" value="false"/>
          </operator>
          <operator activated="true" class="extract_macro" compatibility="9.1.000" expanded="true" height="68" name="Min_Day (2)" width="90" x="581" y="289">
            <parameter key="macro" value="min_day2"/>
            <parameter key="macro_type" value="data_value"/>
            <parameter key="statistics" value="min"/>
            <parameter key="attribute_name" value="Min_Inicial"/>
            <parameter key="example_index" value="1"/>
            <list key="additional_macros"/>
          </operator>
          <operator activated="true" class="operator_toolbox:create_exampleset" compatibility="1.7.000" expanded="true" height="68" name="Create ExampleSet" width="90" x="45" y="442">
            <parameter key="generator_type" value="date_series"/>
            <parameter key="number_of_examples" value="736"/>
            <parameter key="use_stepsize" value="true"/>
            <list key="function_descriptions"/>
            <parameter key="add_id_attribute" value="false"/>
            <list key="numeric_series_configuration"/>
            <list key="date_series_configuration">
              <parameter key="Series" value="%{min_day} 00:00.%{max_day} 00:00"/>
            </list>
            <list key="date_series_configuration (interval)">
              <parameter key="Day_in_series" value="%{min_day2} 00:00.1.day"/>
            </list>
            <parameter key="date_format" value="dd/MM/yyyy HH:mm"/>
            <parameter key="column_separator" value=","/>
            <parameter key="parse_all_as_nominal" value="false"/>
            <parameter key="decimal_point_character" value="."/>
            <parameter key="trim_attribute_names" value="true"/>
          </operator>
          <operator activated="true" class="date_to_nominal" compatibility="9.1.000" expanded="true" height="82" name="Date to Nominal (4)" width="90" x="246" y="442">
            <parameter key="attribute_name" value="Day_in_series"/>
            <parameter key="date_format" value="dd/MM/yyyy"/>
            <parameter key="time_zone" value="SYSTEM"/>
            <parameter key="locale" value="English (United States)"/>
            <parameter key="keep_old_attribute" value="false"/>
          </operator>
          <operator activated="true" class="concurrency:join" compatibility="9.1.000" expanded="true" height="82" name="Join Info" width="90" x="514" y="442">
            <parameter key="remove_double_attributes" value="true"/>
            <parameter key="join_type" value="left"/>
            <parameter key="use_id_attribute_as_key" value="false"/>
            <list key="key_attributes">
              <parameter key="Day_in_series" value="DATE"/>
            </list>
            <parameter key="keep_both_join_attributes" value="false"/>
            <description align="center" color="orange" colored="true" width="126">Joining the original data with the date series to find missing dates</description>
          </operator>
          <connect from_op="Read Excel" from_port="output" to_op="Generate ID (2)" to_port="example set input"/>
          <connect from_op="Generate ID (2)" from_port="example set output" to_op="Sort" to_port="example set input"/>
          <connect from_op="Sort" from_port="example set output" to_op="Date to Nominal" to_port="example set input"/>
          <connect from_op="Date to Nominal" from_port="example set output" to_op="Multiply" to_port="input"/>
          <connect from_op="Multiply" from_port="output 1" to_op="Min_Day" to_port="example set"/>
          <connect from_op="Multiply" from_port="output 2" to_op="Join Info" to_port="right"/>
          <connect from_op="Min_Day" from_port="example set" to_op="Sort (2)" to_port="example set input"/>
          <connect from_op="Sort (2)" from_port="example set output" to_op="Max_Day" to_port="example set"/>
          <connect from_op="Max_Day" from_port="example set" to_op="Days" to_port="through 1"/>
          <connect from_op="Generate Data by User Specification" from_port="output" to_op="Nominal to Date" to_port="example set input"/>
          <connect from_op="Nominal to Date" from_port="example set output" to_op="Adjust Date" to_port="example set input"/>
          <connect from_op="Adjust Date" from_port="example set output" to_op="Date to Nominal (2)" to_port="example set input"/>
          <connect from_op="Date to Nominal (2)" from_port="example set output" to_op="Min_Day (2)" to_port="example set"/>
          <connect from_op="Create ExampleSet" from_port="output" to_op="Date to Nominal (4)" to_port="example set input"/>
          <connect from_op="Date to Nominal (4)" from_port="example set output" to_op="Join Info" to_port="left"/>
          <connect from_op="Join Info" from_port="join" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
          <description align="center" color="red" colored="true" height="242" resized="true" width="1125" x="20" y="10">Extracting the min and max dates from your data set and calculating the number of days the series needs to cover&lt;br&gt;</description>
          <description align="center" color="yellow" colored="false" height="442" resized="true" width="892" x="31" y="252">Generating a data set that includes all the dates covered by your information and joining it with your original data set</description>
        </process>
      </operator>
    </process>
    


  • pedrodomingosdv
    pedrodomingosdv New Altair Community Member
    Hi Marco,
    Done :smile:
    I made two small changes to your process:
    1) "Adjust Date" was removed. It was adding one day to each row, which in the end caused an incorrect "join".
    2) In the last part of the process I just added "Nominal to Date" so the result has a proper date attribute again.
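
    For anyone reading along outside RapidMiner, the same idea (build the complete date series, left-join the original data onto it, then fill the blanks) can be sketched in plain Python. This is only an illustration, not the process above: the function name fill_gaps, the weekly step and the sample values are made up, and it assumes every observed date sits on the same weekly grid (for example, all Mondays).

```python
from datetime import date, timedelta


def fill_gaps(observed, step_days=7):
    """Return (date, value) pairs for every step between the first and last
    observed date. Missing values are linearly interpolated.

    Assumption: all keys of `observed` lie on the same grid of `step_days`.
    """
    start, end = min(observed), max(observed)

    # Complete calendar: the counterpart of the generated date series.
    calendar = []
    day = start
    while day <= end:
        calendar.append(day)
        day += timedelta(days=step_days)

    # Left join: keep every calendar date, None where there is no observation.
    joined = [(d, observed.get(d)) for d in calendar]

    # Linear interpolation between the nearest known neighbours.
    known = [i for i, (_, v) in enumerate(joined) if v is not None]
    filled = []
    for i, (d, v) in enumerate(joined):
        if v is None:
            lo = max(k for k in known if k < i)
            hi = min(k for k in known if k > i)
            v_lo, v_hi = joined[lo][1], joined[hi][1]
            v = v_lo + (v_hi - v_lo) * (i - lo) / (hi - lo)
        filled.append((d, v))
    return filled


# Hypothetical weekly data (Mondays) with 2019-01-21 missing.
observed = {
    date(2019, 1, 7): 10.0,
    date(2019, 1, 14): 12.0,
    date(2019, 1, 28): 18.0,
}
filled = fill_gaps(observed)
for d, v in filled:
    print(d.isoformat(), v)
```

    The missing week shows up as its own row after the join, which is the point of joining from the full series on the left side; what you fill it with (interpolation here, an average if you prefer) is a separate choice.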



  • sgenzer
    sgenzer
    Altair Employee
    hi @MarcoBarradas I'm checking with @mschmitz on the macro in Create ExampleSet...