What is the difference between abort, kill, and stop on Altair Compute Console Solver View

Shengjia Wu New Altair Community Member
edited October 2022 in Community Q&A

Hello. After a simulation job is submitted, the Altair Compute Console Solver View shows three buttons: "Abort", "Kill", and "Stop".  A screenshot is attached.

 

I am not sure what the difference between these three is.  I assume they are all used to stop the current simulation job?

Thank you for any explanation. 

Answers

  • Ben Buchanan
    Altair Employee
    edited October 2022

    I found this help page that describes most of the options except kill:

    https://2022.help.altair.com/2022.1/hwdesktop/altair_help/topics/solvers/acc_solver_view_form_r.htm

    I also found this Radioss help page that describes kill:

    https://2022.help.altair.com/2022.1/hwsolvers/rad/topics/solvers/rad/rad_user_guide_intro_c.htm

    Kill and abort seem similar in that both stop the run without producing the normal output files.  Beyond that, the behavior seems to be solver-specific.

  • Shengjia Wu
    New Altair Community Member
    edited October 2022

    Hi Ben,

     

    Thank you very much for your response and the documents you found.

    I think most users like me want to stop the simulation while still keeping the latest converged result, so "Stop" should be the one we are looking for.

     

    You never know until you try.  I will try it myself and post the results here later.

     

  • PaulAltair
    Altair Employee
    edited October 2022

    Yes. For Radioss:

    STOP sends a /STOP command to the solver to stop the job cleanly. A restart file is written, so the job can be restarted from that point.

    KILL sends a /KILL command to the solver to stop the job without writing a restart file, although temporary files should still be cleaned up.

    Both commands work by writing /STOP or /KILL into a file named jobname_0001.ctl. You can write this file yourself instead, e.g. if you are running on a cluster rather than through Compute Console.

    ABORT kills the solver process directly (nothing is communicated to the solver itself) and can leave temporary files behind. It is usually only used when the solver has stopped responding, e.g. no output has been written for some time and the job has stalled.
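The control-file approach described above can be sketched in Python for cases where Compute Console is not available, such as a cluster run. This is a minimal sketch, not an Altair-provided tool: the function name, the `suffix` parameter, and the trailing newline are assumptions for illustration. Only the file name pattern jobname_0001.ctl and the /STOP and /KILL commands come from the post above.

```python
from pathlib import Path

# Commands named in the post above; anything else is rejected.
VALID_COMMANDS = {"/STOP", "/KILL"}


def write_radioss_control_file(run_dir, jobname, command, suffix="0001"):
    """Write <jobname>_<suffix>.ctl containing a single Radioss command.

    The default suffix "0001" matches the file name given in the post.
    Whether other suffixes are meaningful is an assumption to verify
    against the Radioss documentation.
    """
    command = command.strip().upper()
    if command not in VALID_COMMANDS:
        raise ValueError(f"unsupported command: {command!r}")

    ctl_path = Path(run_dir) / f"{jobname}_{suffix}.ctl"
    # Assumption: a single line holding the command is sufficient.
    ctl_path.write_text(command + "\n")
    return ctl_path
```

Calling `write_radioss_control_file("/path/to/run", "myjob", "/STOP")` would create `myjob_0001.ctl` in that directory. The path and job name here are placeholders.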

  • Shengjia Wu
    New Altair Community Member
    edited October 2022

    Hi Paul and Ben,

     

    I am using OptiStruct (implicit simulation), not Radioss.  I did try "Stop" this evening.  However, the simulation kept running.

    It just showed: "Solver interface received command: Stop".  But nothing happened.

     

    I waited about an hour and it was still running.  So I clicked "Abort", and then my computer crashed.

    I had to restart my computer and got no results.

     

    With this and my previous experience, I have a couple of questions:

    (1) I thought "Stop" terminates the simulation and gives the result at the last converged increment.  Why does the simulation keep running even after I send the stop command?

    (2) Yesterday, I had a simulation that consumed almost 200GB of disk space, and it hit the error "no enough disk space" this morning.  It then just stopped and returned nothing.  That is why I asked whether there is a way to stop a simulation and still get the results .h3d file.

    (2.1) So I guess OptiStruct does not let the user access intermediate results until the simulation is finished?

    (2.2) Because the model is too big, it runs an out-of-core solution (using disk instead of memory?).  Can we run the simulation on the D drive?  I think so, but my D drive is a hard disk, not an SSD, so the speed will possibly be reduced?

    (2.3) I searched the documentation, but I cannot find out what the following files are and what they are used for:

    The .rs file and mumps_x_0_x file.

    I notice that while the simulation is running, it generates several .rs files and many mumps_x_0_x files like mumps_1_0_1, mumps_2_0_2, etc.  What are they?

    I guess one of them stores the stiffness matrix, and later the computer loads all these files into RAM for the CPU to solve?

     

    Sorry guys, this is a long question.. 

    Thank you for your time and answers.

     

  • PaulAltair
    Altair Employee
    edited October 2022

    Sorry, I didn't realise you were talking about a non-linear analysis. For OptiStruct, I think Stop applies only to optimisation iterations rather than non-linear steps.

    If you are running non-linear analyses, you should have an 'NLOUT' card for your non-linear subcase. This should write an h3d as the job progresses, so the results up to the point you have reached should be available even during the ongoing run. You can also use the 'NLMON' output request to get results inside iterations/increments, but it is limited to displacement output.

    Yes, the .rs and mumps_x_0_x files are working scratch files used by the solver. They may be left behind if the job does not exit cleanly.

    Regarding running on a different disk: yes, this will work, but if it is not an SSD it may be much slower.

    You could try using -tmpdir to control the amount of space used per disk (e.g. continue to run on your SSD but set a limit of 150GB and use the platter disk just as 'overflow'), but I don't think this affects the MUMPS processes.
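Since the earlier run failed after filling the disk, it may help to check free space on a candidate run or scratch drive before launching. The following is a minimal sketch using only the Python standard library. The helper names and the 200 GB figure in the usage note are illustrative assumptions (the latter taken from the disk usage reported in the question), not OptiStruct requirements.

```python
import shutil


def free_space_gb(path):
    """Return the free space, in GB (10**9 bytes), on the drive holding path."""
    return shutil.disk_usage(path).free / 1e9


def has_enough_space(path, required_gb):
    """Return True if the drive holding path has at least required_gb free."""
    return free_space_gb(path) >= required_gb
```

For example, `has_enough_space("D:/", 200)` would report whether the D drive currently has about 200 GB free. This only checks space at launch time and says nothing about how much scratch space the solver will actually need.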

  • Shengjia Wu
    New Altair Community Member
    edited October 2022

    Hi Paul,

     

    Thank you for your reply.  I do have NLOUT in the model, but no h3d is available until the job is finished.

    After the job is finished, we can access multiple converged increments in the h3d because of the NLOUT card.

     

    If "Stop" is for optimization, then possibly "Kill" is the one for nonlinear analysis.  I will experiment.

     

    Thank you.

  • PaulAltair
    Altair Employee
    edited October 2022 Answer ✓

    For OptiStruct, as for Radioss, 'Kill' stops the job but doesn't produce output, which I don't think is what you want.

    If you aren't getting intermediate results from your NLOUT, you can request them, see below.

    Tips that may help:

    To get output from your currently running job:

    1: Save a file called 'yourjobname.osquit' in the run directory. This terminates the job right there (even in the middle of an increment) and writes the h3d file up to the point completed so far.

     

    To get an h3d while the job is running (I think you currently have only the minimum default NLOUT request, which writes only at the end?):

    2: In the original deck, add an NLOUT request directly in the results I/O request you want intermediate output for, e.g. for STRESS here. This writes a rolling h3d (for Stress and Disp only) during the run, named 'yourjobname_impl.h3d', containing results 'on the fly'.

    image

    or 3: If you want all the results you can get (Stress, Strain, Contact Forces etc.), the easiest option is to use 'PARAM,IMPLOUT,YES' above the subcase instead. This also writes 'yourjobname_impl.h3d', but with all available results.

    image

    Hope this helps.
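Tip 1 above can be scripted, which may be convenient on a remote machine. This is a minimal sketch under stated assumptions: the function name is hypothetical, and it assumes an empty file is enough, since the post only says to save a file with that name in the run directory. It also assumes the job name is the deck name without its extension.

```python
from pathlib import Path


def request_optistruct_quit(run_dir, jobname):
    """Create <jobname>.osquit in run_dir to ask a running job to terminate.

    Assumption: the solver only checks for the file's presence, so an
    empty file is created. Verify against the OptiStruct documentation.
    """
    run_path = Path(run_dir)
    if not run_path.is_dir():
        raise NotADirectoryError(f"run directory not found: {run_path}")

    quit_file = run_path / f"{jobname}.osquit"
    quit_file.touch()  # creates an empty file if it does not exist yet
    return quit_file
```

Calling `request_optistruct_quit("/path/to/run", "yourjobname")` would create `yourjobname.osquit` in that directory. The path is a placeholder.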

  • Shengjia Wu
    New Altair Community Member
    edited October 2022

    Hi Paul,

     

    I quickly went over the documents, and I think you have saved many lives :).  Many thanks.