Error with resuming EDEM-Fluent simulation

Megan_21536
Megan_21536 Altair Community Member
edited June 2023 in Community Q&A

I am running a coupled EDEM-Fluent simulation and encountered the following error after reaching a flow time of 0.0347 s.

Advancing DPM injections ....
Setting forces and torques on 273 (of 646) EDEM particles...
Done.
terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
==============================================================================
Node 999999: Process 5774: Received signal SIGIOT.
==============================================================================

To resume the simulation, I rolled back EDEM and Fluent to the last common saved timestamp, 0.034 s. However, when I try to resume the coupled simulation, I encounter the following error.

Advancing DPM injections ....
Getting particle data from EDEM...
Host sending injections to compute nodes.
==============================================================================
Node 0: Process 86644: Received signal SIGSEGV.
==============================================================================
===============Message from the Cortex Process================================
Fatal error in one of the compute processes.
==============================================================================

The error persists even when EDEM and Fluent are rolled back to an earlier timestamp (0.033 s). I also tried moving EDEM one time step back before coupling with Fluent, but that did not resolve it either.
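
For anyone scripting the rollback, picking the "last common saved timestamp" can be sketched as below. This is only an illustration: it assumes the save times of both solvers have already been collected into two plain lists (how to obtain them depends on the setup and is not shown), and the function name `last_common_save` and the tolerance are hypothetical, not part of EDEM or Fluent.

```python
import math


def last_common_save(edem_times, fluent_times, tol=1e-9):
    """Return the latest time present in both lists (within tol), or None."""
    common = [
        t for t in edem_times
        # Compare with an absolute tolerance because save times are floats.
        if any(math.isclose(t, f, rel_tol=0.0, abs_tol=tol) for f in fluent_times)
    ]
    return max(common) if common else None


# Illustrative values only, loosely based on the times mentioned in this post.
edem_saves = [0.032, 0.033, 0.034, 0.0347]
fluent_saves = [0.032, 0.033, 0.034]
print(last_common_save(edem_saves, fluent_saves))
```

Returning `None` when there is no shared save point makes the "nothing to roll back to" case explicit instead of silently picking mismatched times.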

I have verified that both programs can resume their simulations independently, but I am still unsure what caused this error. At this point, the only option seems to be re-running the simulations from 0 s. That is undesirable because the simulations took many days to reach their current progress. I would therefore appreciate guidance on how to resolve this error, or on any other way to recover the coupled run.

I have included some details of the system below.

OS: CentOS7
Version of EDEM installed: EDEM 2022.1
Version of Fluent installed: 22.1 and 21.1
Installation Location: Remote HPC server with both EDEM and Fluent installed

Thank you

Best Regards,

Megan

Answers

  • Stephen Cole
    Stephen Cole
    Altair Employee
    edited June 2023

    Hi Megan,

    The e-learning gives an overview of restarting a simulation:

    https://learn.altair.com/course/view.php?id=171

    For DDPM there is an extra step to do:

    If the simulation is using the Discrete Particle Model, it should continue as normal after restarting both EDEM and Fluent. However, doing the same for Dense Discrete Particle Model simulations will result in a “floating point exception” error. To work around this issue, move EDEM one time step back, then couple with Fluent. Match Fluent’s time to the new EDEM time. This is done by going to Models, EDEM Coupling, Connection, and selecting “Synchronize to EDEM Time”.

    You could also try exporting an EDEM input deck at the last save point (Analyst > File > Export > Simulation Deck) and resetting the EDEM time to 0. You could then restart your Fluent case from 0 with the EDEM particles already in the case. This is an intermediate step short of restarting both programs from zero, but you would still need to converge the CFD solution.

    Regards

    Stephen

  • Mahdi_22303
    Mahdi_22303 Altair Community Member
    edited June 2023

    Hi Megan,

    I also need to couple EDEM and Fluent on our cluster but have not succeeded yet. Are you using batch scripts and journal files to run EDEM and Fluent on Linux, or are you coupling the two programs through the GUI? I would appreciate any help with this.

    Thank you. 

    Mahdi.

     

  • Megan_21536
    Megan_21536 Altair Community Member
    edited June 2023

    Hi Stephen,

    I have tried the solutions suggested. However, they did not manage to resolve the error I was facing. 

    To clarify, I encounter the SIGSEGV error without first encountering the "floating point exception", and it occurs almost immediately after starting the calculation.

    I also sometimes encounter the following error while trying to resume the simulations:

    *** Error in `/app1/common/ansys/ansys_v21.1/ansys_inc/v211/fluent/fluent21.1.0/lnamd64/3ddp_node/fluent_mpi.21.1.0': double free or corruption (!prev): 0x0000000005d3ced0 *** 

    Does this mean that the coupled simulation is corrupted beyond recovery? I ask because I am still able to resume both simulations independently.
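
When the runs are launched in batch mode, it can help to record which of the failure messages quoted in this thread actually appears in a given transcript. The sketch below is a hypothetical helper, not an EDEM or Fluent tool: it assumes the solver output has been captured as text, and it only searches for the message strings already mentioned above.

```python
# Failure messages quoted in this thread; extend the tuple as needed.
SIGNATURES = (
    "std::bad_alloc",
    "SIGSEGV",
    "SIGIOT",
    "double free or corruption",
    "floating point exception",
)


def find_crash_signatures(log_text):
    """Return the known signatures that occur in log_text, in listed order."""
    lowered = log_text.lower()
    # Case-insensitive match, since capitalisation may differ between messages.
    return [s for s in SIGNATURES if s.lower() in lowered]


sample = "Node 0: Process 86644: Received signal SIGSEGV."
print(find_crash_signatures(sample))
```

Keeping the list of strings in one place makes it easy to compare several restart attempts and see whether they fail with the same message.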

     

    Best Regards,

    Megan

  • Megan_21536
    Megan_21536 Altair Community Member
    edited June 2023

    Hi Mahdi,

    I am mainly using batch scripts and journal files to run EDEM and Fluent on Linux.

    Regards

    Megan

     

  • Stephen Cole
    Stephen Cole
    Altair Employee
    edited June 2023

    Hi Megan,


    I checked with some of my colleagues and this isn't an error we've seen before. Would it be possible to share the files so we can check whether we see the same error?

    If the files aren't something you can share with the community, you should be able to e-mail them to me directly by clicking on my username.

    Regards

    Stephen

  • Megan_21536
    Megan_21536 Altair Community Member
    edited June 2023

    Hi Stephen, 

    I could not find an email address under your username. Would you be able to share the address here? I cannot post the files here because they contain confidential information.

    Thank you.

    Best Regards,

    Megan

  • Stephen Cole
    Stephen Cole
    Altair Employee
    edited June 2023

    Hi Megan,


    No problem, you can send it to me at scole@altair.com

     

    Regards

    Stephen

  • Stephen Cole
    Stephen Cole
    Altair Employee
    edited June 2023

    Hi Megan,

    One more suggestion we have not yet looked at: try rewinding EDEM to the second-to-last time point, then restart the EDEM simulation from that point and the Fluent case from the last point.

    Fluent runs first in the simulation cycle and then EDEM catches up. When the run is stopped, EDEM still catches up, but that information is not passed back to Fluent because the Fluent simulation has already ended, so Fluent does not see the newly updated particle data.
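
The pairing suggested here (second-to-last EDEM save, last Fluent save) can be sketched as follows. This is an illustration only: it assumes the save times of each solver are available as plain lists, and the helper name `restart_points` is hypothetical rather than anything provided by EDEM or Fluent.

```python
def restart_points(edem_times, fluent_times):
    """Pick the second-to-last EDEM save and the last Fluent save."""
    edem_sorted = sorted(edem_times)
    fluent_sorted = sorted(fluent_times)
    if len(edem_sorted) < 2 or not fluent_sorted:
        raise ValueError("need at least two EDEM saves and one Fluent save")
    # EDEM restarts one save earlier than its latest point; Fluent uses its latest.
    return edem_sorted[-2], fluent_sorted[-1]


# Illustrative values only.
edem_restart, fluent_restart = restart_points([0.032, 0.033, 0.034], [0.033, 0.034])
print(edem_restart, fluent_restart)
```

Sorting both lists first means the helper does not depend on the order in which the save times were collected.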


    Regards

    Stephen

  • Megan_21536
    Megan_21536 Altair Community Member
    edited June 2023

    Hi Stephen, 

    I have sent the files over email and will try out the new suggestion in the meantime.

    Thank you for your time and assistance.

    Best Regards,

    Megan