Should the CUDA file be changed for multi-GPU?

satri
satri Altair Community Member

Hello

I am trying to use the particle replacement API. When I run my model on a single GPU, the bonds of the replaced particles do not break; when I use multiple GPUs, the Bonding V2 bonds break. So my question is: do we need to change anything in the CUDA API for multiple GPUs?

Answers

  • satri
    satri Altair Community Member

    I see this problem on Linux (RHEL 9). Even on Linux, when I keep the file save interval very small, the bonds do not break. The moment I increase the save interval, I run into issues.
    Is it because of the time step? I am running at 0.1% of the Rayleigh time step.


    Run type 1: when I run the model like this, I don't get any issues and the bonds remain intact.


    #!/bin/bash
    cmd="edem"
    mdl="genFib.dem"
    totT=5.0e-7
    wrt=1.0e-9
    gplFlg=2
    tmStp=2.5e-11
    mshSz=2.22
    prscCt=1
    $cmd -c -i "$mdl" -R -r "$totT" -w "$wrt" -E "$gplFlg" -D 0+1+2+3 -t "$tmStp" -g "$mshSz" -p "$prscCt" --file-logger "log.txt" >out2.txt


    But as you can see, with wrt this small the data is saved very frequently. I can't keep saving this often, as I will run out of space on the hard drive.


    Now when I change this to


    Run type 2


    #!/bin/bash
    cmd="edem"
    mdl="genFib.dem"
    totT=5.0e-7
    wrt=1.0e-7
    gplFlg=2
    tmStp=2.5e-11
    mshSz=2.22
    prscCt=1
    $cmd -c -i "$mdl" -R -r "$totT" -w "$wrt" -E "$gplFlg" -D 0+1+2+3 -t "$tmStp" -g "$mshSz" -p "$prscCt" --file-logger "log.txt" >out2.txt


    The bonds are broken in Run type 2.


    It's the exact same simulation.


    I also notice that the time step is set to 10% when I download the file to post-process it. I don't know why this happens; I thought setting tmStp=2.5e-11 would keep the time step fixed.


    Can anyone please suggest how to proceed so that the time step remains as I specified and the bonds do not break?
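
    For reference, a minimal sketch of the arithmetic behind the two runs above, using only the totT, wrt and tmStp values from the scripts. The helper names saves_per_run and steps_per_save are hypothetical, and this is plain shell/awk arithmetic, not an EDEM command.

    ```shell
    #!/bin/bash
    # Hypothetical helpers (not part of EDEM): plain arithmetic on the script variables.

    # Number of save points in a run: total time / save interval.
    saves_per_run() {
      awk -v tot="$1" -v wrt="$2" 'BEGIN { printf "%.0f\n", tot / wrt }'
    }

    # Number of solver time steps between two save points: save interval / time step.
    steps_per_save() {
      awk -v wrt="$1" -v dt="$2" 'BEGIN { printf "%.0f\n", wrt / dt }'
    }

    echo "Run type 1: $(saves_per_run 5.0e-7 1.0e-9) saves, $(steps_per_save 1.0e-9 2.5e-11) steps per save"
    echo "Run type 2: $(saves_per_run 5.0e-7 1.0e-7) saves, $(steps_per_save 1.0e-7 2.5e-11) steps per save"
    ```

    If the requested time step is honoured, both runs take the same number of solver steps; Run type 1 writes 500 save points (one every 40 steps) and Run type 2 writes 5 (one every 4000 steps), so the two differ only in how often the state is written out.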

  • satri
    satri Altair Community Member

    I even tried specifying double precision:

    #!/bin/bash

    cmd="edem"
    mdl="genFib.dem"
    totT=5.0e-7
    wrt=1.0e-7
    gplFlg=2
    tmStp=2.5e-11
    mshSz=2.22
    prscCt=1

    $cmd -c -i "$mdl" -R -r "$totT" -w "$wrt" -E "$gplFlg" -D 0+1+2+3 -t "$tmStp" -g "$mshSz" -p "$prscCt" --gpu-split "Z" --precision 0 --file-logger "log.txt" >out2.txt

    totT=0.00048
    wrt=1.0e-5

    totT=1.0e-7
    wrt=1.0e-9

    I am also trying to see whether the GPU split is causing the problem, but I get this error:

    Unknown option 'gpu-split'.

    This is the bond status from the top view:

    image.png

    Bonds are not being formed at x=0. Maybe the GPU split is the cause, but I am getting the error above.

    Also, the time step jumps like this:

    image.png

    So can anyone please suggest how to keep the time step fixed at, say, 1%, what the syntax is for specifying the GPU split, and whether anything else in my setup is a mistake?

  • Stephen Cole
    Stephen Cole
    Altair Employee

    Hi,

    You don't need to change anything for single or multi-GPU, other than specifying the number of GPUs to use.

    In the past there were 2 different GPU engines, CUDA and OpenCL. OpenCL is now discontinued and only CUDA is available. The GPU split flag was part of the old OpenCL solver and is not needed.

    Are you sure the bonds are not being broken, or is it just that the data is not saved?

    Bond information is saved as a custom Contact. If the particles are in contact and a bond exists, this information is stored at the save point. If the particles are in contact and the bond is broken, then the information about the broken bond is also stored. However, if the bond has broken and the particles have moved so that they are no longer in contact, then no information about the bond is saved. It's often good to query the total number of existing bonds; if this changes over time you can work out the number of broken bonds from it, rather than trying to query broken bonds, which may not be saved.
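
    A minimal sketch of that bookkeeping, assuming the intact-bond count at each save point has already been exported to a two-column "time,count" text listing. The listing layout and the bonds_lost helper are assumptions for illustration, not an EDEM export format or API, and the numbers in the example are made up.

    ```shell
    #!/bin/bash
    # Hypothetical helper: reads "time,intact_bond_count" lines (assumed layout)
    # from the given file(s) or stdin and prints, per save point, how many bonds
    # have disappeared relative to the highest count seen so far.
    bonds_lost() {
      awk -F, '
        BEGIN { peak = 0 }
        $2 > peak { peak = $2 }               # track the largest intact-bond count so far
        { printf "%s %d\n", $1, peak - $2 }   # bonds lost since that peak
      ' "$@"
    }

    # Made-up numbers, for illustration only:
    printf '1e-8,100\n2e-8,100\n3e-8,96\n' | bonds_lost
    ```

    Using the running peak rather than the first save point as the reference means bonds that form after the first save are still counted.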

    Regards

    Stephen

  • satri
    satri Altair Community Member

    Well OK, I will not use the --gpu-split flag then. I am only using NVIDIA GPUs.

    As far as the bonds go, maybe I should not describe them as broken.

    Bonds are formed in all locations. All particles that are in contact have bonds, and no particle has moved beyond the contact radius. The initial bond formation is correct when the save interval is 1e-9, but say I run the simulation like this:

    Step 1: I run the simulation until 2e-7 with a save interval of 1e-9. Then I set the save interval to 1e-7 and run until 2e-6.

    In the second step, if I check the simulation at 3e-7, the formed bonds have vanished and you see the blue lines shown in the image above (there the bond status is 0). How is this possible? Where did the already formed bonds go? I have seen this behavior only on Linux when I run on multiple GPUs. If I run on one GPU everything is fine. On Windows I don't see this behavior.

    So I don't know what is going on or why the existing bonds vanish in certain locations (I am sure that the particles have not gone beyond the contact radius).

    Please tell me how I can fix this. Is it the jump in time step that is causing this, or what is the cause of such behavior, and what is the solution?

  • satri
    satri Altair Community Member

    As far as bond time goes, I have no problem there.

    I generate particles at 1e-9 s, they get replaced at around 4e-9 s, and the bond formation time is 1e-8 s. The simulation runs until 2e-7 with a save interval of 1e-9, and I see all bonds up to 2e-7. After this I change the save interval to 1e-7.

    Now at 3e-7 there are blue lines.

  • satri
    satri Altair Community Member

    Hello Stephen

    Please explain this

    I have exactly the same simulation settings on Windows 10 (1 NVIDIA GPU) and on the HPC (Linux, 4 NVIDIA GPUs).

    I run the simulation for a duration of 1e-7 s; the bonds are formed at 5e-9 s.

    Linux

    image.png image.png

    There are 4 bonds missing.

    windows

    image.png image.png

    All 4 bonds are present. I even checked the bond status file through a query,

    and there are a few bonds missing:

    image.png

    So I don't know why there is this mismatch; all the settings are the same, except that I am using four GPUs on Linux. Please tell me how this can be fixed on Linux.

  • Stephen Cole
    Stephen Cole
    Altair Employee

    Hi,

    For the missing bonds, are contacts detected at these positions? Is there anything special about the contacts/particles that are missing?

    For example, are they on the exact limit of where the particles may or may not create a bond? If a particle has a contact radius of exactly 5 mm and two particles are exactly 10 mm apart, then a deviation of even a thousandth of a mm could create the bond in one scenario and not in another. This could then lead to slightly different behaviour on different systems or at different precision.

    In your setup, are you running double precision on the GPU?
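
    To illustrate the borderline case, here is a small sketch assuming the bond criterion for two spheres is simply "centre distance < sum of the contact radii". The exact test EDEM applies is not stated in this thread, and contacts_overlap is a hypothetical helper.

    ```shell
    #!/bin/bash
    # Hypothetical helper: prints "bond" if the contact radii of two spheres overlap.
    # Assumed criterion: centre distance < r1 + r2 (all in the same units, here mm).
    contacts_overlap() {
      awk -v d="$1" -v r1="$2" -v r2="$3" \
        'BEGIN { print ((d + 0 < r1 + r2) ? "bond" : "no bond") }'
    }

    contacts_overlap 9.999 5 5    # 1/1000 mm closer than the limit
    contacts_overlap 10.001 5 5   # 1/1000 mm further than the limit
    ```

    A shift of one thousandth of a mm in either direction flips the result, which is the kind of difference that rounding on another system or at another precision could introduce.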

    Regards

    Stephen

  • satri
    satri Altair Community Member

    So I have sphero-cylinder particles with a radius of 8 microns, a contact radius of 10 microns and a length of 48 microns. The next particle is exactly 48 microns away. There is no physical overlap, but the contact radii of the two particles overlap, so technically I should see the bond, as there is a contact overlap of about 4 microns between the two particles.

    As far as precision goes, I am using double precision.

    One thing I notice is that when I set the contact radius to 24 microns, there is a bond, but there are other bonds that I don't want. I want only adjacent particles to be bonded, not other particles. Is there any way to tell Bonding V2 to bond only adjacent particles and not the others? I tried using the API bond 35, but that causes other issues: even if I add a condition on the particle ID (the built-in variable, not a custom property), a bond can still form with another chain of the same material if the particle ID difference is 1.

    Please tell me, can bond time prevent this?

    If I have other particles with an adjacent ID difference of 1, will the bond still be formed by bond35 after the bond time, or not?
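
    As a check on that 4 micron figure, a sketch of the overlap arithmetic, assuming the 48 micron length is the full tip-to-tip length and the particles sit end to end on one line. Both are assumptions, and contact_overlap_um is a hypothetical helper.

    ```shell
    #!/bin/bash
    # Hypothetical helper. Args: physical radius, contact radius, tip-to-tip length,
    # centre-to-centre spacing (all in microns). Assumes collinear, end-to-end particles.
    contact_overlap_um() {
      awk -v r="$1" -v rc="$2" -v len="$3" -v s="$4" 'BEGIN {
        gap = s - len                       # physical gap between the facing tips
        printf "%g\n", 2 * (rc - r) - gap   # overlap of the two contact shells
      }'
    }

    contact_overlap_um 8 10 48 48
    ```

    A positive result means the contact shells overlap even though the physical surfaces only touch; under these assumptions the values above give 4 microns.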

  • Stephen Cole
    Stephen Cole
    Altair Employee

    The current bond models bond based on proximity and time; there isn't a function to bond to neighbouring particles only. However, you could implement this as a custom model. The new updates to the API in 2025 allow for a 'preserve contact' function, which means particles can be considered in contact even if the contact radii no longer overlap.

    There is an example of this function being used here.

  • satri
    satri Altair Community Member

    Sounds good, I will try that. But what do you think of this?

    image.png

    Will line 117 bond only adjacent particles, and will line 122 prevent bond formation more than 1 time step after the bond start time?

    If this works for the time being, my problem is resolved, but for future reference I want to know how custom properties can be accessed in a CUDA file from another API. If you could help me get an answer to this question, it would be helpful.

    (https://community.altair.com/discussion/62905/access-custom-property-from-one-api-to-another)
