How to automatically aggregate the numerical values of every 10/20/30... columns

cindyliu_au
cindyliu_au New Altair Community Member
edited November 5 in Community Q&A
The original data has 300 attributes, from day1 to day300.



I need to create 3 datasets with generated features (each row is still one student (id)):

dataset1: feature generation: aggregate every 10 days, resulting in 30 attributes (day1-10, day11-20...)
dataset2: feature generation: aggregate every 20 days, resulting in 15 attributes (day1-20, day21-40...)
dataset3: feature generation: aggregate every 30 days, resulting in 10 attributes (day1-30, day31-60...)

I know I can use the Generate Attributes operator and manually select day1 to day10, then day11 to day20, and so on,
but I want to know how to generate these aggregated features automatically.

Thank you!

Best Answer

  • BalazsBarany
    BalazsBarany New Altair Community Member
    Answer ✓
    Hi!

    Here's an automatic solution. It transposes a copy of the data, so you have day1-dayN in rows. Then it processes these rows in batches using Loop Batches. You just enter the number of elements per batch in the batch size parameter. I tested with different values; it works with every setting >= 2.

    Inside the batch, the process generates a macro for selecting the dayX attributes, generates a name like day1-dayN, and executes Generate Aggregation with this regular-expression-based attribute filter.

    Regards,
    Balázs
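
For readers who want to see the batching idea outside RapidMiner, here is a minimal Python sketch. It is not the attached process: it assumes each student is a dict with columns named day1, day2, ..., and it assumes the aggregation is a sum (the question does not say which aggregation is wanted). The function name `aggregate_in_batches` and its parameters are made up for illustration.

```python
import re


def aggregate_in_batches(rows, batch_size, prefix="day", agg=sum):
    """Aggregate consecutive <prefix>N columns in groups of batch_size.

    rows: list of dicts, one per student (assumed layout).
    agg:  any function that reduces an iterable of numbers (sum is an assumption).
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    if not rows:
        return []

    # Pick out the day columns and order them by their numeric suffix,
    # so that day10 comes after day9 rather than after day1.
    pattern = re.compile(rf"^{re.escape(prefix)}(\d+)$")
    day_cols = sorted(
        (c for c in rows[0] if pattern.match(c)),
        key=lambda c: int(pattern.match(c).group(1)),
    )
    day_set = set(day_cols)

    # Split the ordered columns into batches; the last one may be shorter.
    batches = [day_cols[i:i + batch_size] for i in range(0, len(day_cols), batch_size)]

    result = []
    for row in rows:
        # Keep non-day columns (such as id) unchanged.
        out = {k: v for k, v in row.items() if k not in day_set}
        for batch in batches:
            # Name each new column after its first and last member, e.g. day1-day10.
            out[f"{batch[0]}-{batch[-1]}"] = agg(row[c] for c in batch)
        result.append(out)
    return result


students = [{"id": 1, "day1": 1, "day2": 2, "day3": 3, "day4": 4, "day5": 5, "day6": 6}]
print(aggregate_in_batches(students, batch_size=2))
```

Changing `batch_size` is the only thing needed to switch between 10-, 20-, or 30-day features, which is the same role the batch size parameter plays in the process described above.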

Answers

  • lionelderkrikor
    lionelderkrikor New Altair Community Member
    Hi @cindyliu_au,

    I know it is not an optimal method, but you can use the Generate Aggregation operator and select a subset of attributes.
    In the attached file, you can find a process...

    Hope this helps,

    Regards,

    Lionel
  • cindyliu_au
    cindyliu_au New Altair Community Member
    lionelderkrikor 

    The way you provided is a manual method, which I have already achieved.

    I am looking for an automatic way because I could have 4800 attributes later on, and I would try every 10/20/30/40 days as well as every 7/14/21/28/35 days. That would be a great workload if I did it manually...

    Still, thank you for your help anyway!

    I'm waiting for someone who could give me some clues about an automatic way.
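
To make the workload concrete, the short Python sketch below lists the day ranges each window size would produce. It is only an illustration: the helper name `window_ranges` is hypothetical, and treating the final, shorter window as its own group is an assumption (window sizes such as 7 do not divide 300 or 4800 evenly).

```python
def window_ranges(n_days, window):
    """Return (first_day, last_day) pairs covering days 1..n_days in steps of `window`.

    The last pair is truncated at n_days when `window` does not divide n_days.
    """
    return [
        (start, min(start + window - 1, n_days))
        for start in range(1, n_days + 1, window)
    ]


# Compare how many aggregated attributes each window size yields for 300 days.
for window in (10, 20, 30, 7):
    ranges = window_ranges(300, window)
    print(window, len(ranges), ranges[0], ranges[-1])
```

With 300 days, windows of 10, 20, and 30 give 30, 15, and 10 ranges, matching the three datasets in the question; a window of 7 leaves a shorter final range that has to be kept or dropped deliberately.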

  • cindyliu_au
    cindyliu_au New Altair Community Member
    edited December 2021
    Hi BalazsBarany,

    This is an awesome solution!!! It works very well!!!

    Thank you so much!!!!

    Btw, in your solution you use the operator "recall" and the operator "remember changes", which look very interesting! I'll have to learn what they are and how they work  :D

    Thanks again BalazsBarany

    Regards,
    Cindy