Is there a way to split a file up into smaller sizes using DPS?

Linda_22165
Linda_22165 Altair Community Member
edited April 25 in Community Q&A

I have a file that i created in DPS that is intended for upload into our core, its quite large and I have to break it up into smaller files so that it will load properly.  Is there anyway to design that into a workspace or export?  Say my file has 65000 rows, i would need to break it down to 4 files of 13000 rows each or about 291KB each and i would like to build that into the ws/export if possible

Answers

  • CPorthouse
    CPorthouse
    Altair Employee
    edited April 25

    You can try creating filters on the Analyze tab to filter your data to return smaller rows.  For instance, say you have a column called "Organization" which lists the name of several organizations.  You could create two filters that would display just the name of the organizations that begin with a number or a letter A through M (upper or lower case).  In this case, I would use the RegexIsMatch function:

    RegexIsMatch([Organization],'^[0-9A-Ma-m]')

    and give it a name "0-9A-M".

    I would then duplicate that filter and change the regular expression to show the letters N-Z

    RegexIsMatch([Organization],'^[N-Zn-z]')

    The above regular expressions simply say match the beginning of the field with any number or letter (upper case) A through M or a through m (lower case), and the second one is simply the same but beginning with the letters N through Z.

    Once the filters are created, you can then go to the Export tab and select "All Filters" under the Filter section:

    image

    You can then select how the exported file should be saved.

    There are other possibilities that do not include regular expressions, but I found this to be pretty quick.  You could also modify your workspace to include a row number and then create a filter to select rows 1-65000, 65001-13000, etc.

    Hope this helps.

    -Chris

  • Mahmoud
    Mahmoud
    Altair Employee
    edited April 25

    You can try creating filters on the Analyze tab to filter your data to return smaller rows.  For instance, say you have a column called "Organization" which lists the name of several organizations.  You could create two filters that would display just the name of the organizations that begin with a number or a letter A through M (upper or lower case).  In this case, I would use the RegexIsMatch function:

    RegexIsMatch([Organization],'^[0-9A-Ma-m]')

    and give it a name "0-9A-M".

    I would then duplicate that filter and change the regular expression to show the letters N-Z

    RegexIsMatch([Organization],'^[N-Zn-z]')

    The above regular expressions simply say match the beginning of the field with any number or letter (upper case) A through M or a through m (lower case), and the second one is simply the same but beginning with the letters N through Z.

    Once the filters are created, you can then go to the Export tab and select "All Filters" under the Filter section:

    image

    You can then select how the exported file should be saved.

    There are other possibilities that do not include regular expressions, but I found this to be pretty quick.  You could also modify your workspace to include a row number and then create a filter to select rows 1-65000, 65001-13000, etc.

    Hope this helps.

    -Chris

    Hi Linda,

    I like the RowNo method as Chris mentioned, because it will give you the exact number of rows per files.  Add a new formula field, call it RowNo, and use rowno() function in the expression. image

    You can then hide this new column (RowNo) so it will not be exported.  Create defined filters on the Analyze tab.  Create formula filters per following expressions:

    filter 1 to 13000:   RowN<=13000

    filter 13001 to 26000:  RowNo>13000 .and. RowNo<=26000

    filter 26001 to 39000:  RowNo>26001.and. RowNo<=39000

    filter 39001 to 52000:  RowNo>39001 .and. RowNo<=52000

    filter >52001:  RowNo>52001

    image

    Create an export and use all filters option.  The file name will be based on the filter name, therefore, the export creates 5 .csv files, each with 13,000 rows, except the last one.

    image

    Mo