NGF Optimisation and Limitations
Good afternoon,
I am seeking some assistance with optimising NGF and understanding its limitations. I am posting here as the answers on the forum are generally more informative than our paid support.
I have conducted three simulations: an Initial Static NGF Run, a Dynamic NGF Run (non-NGF structure modified after the Initial Static NGF Run) and a Normal Run. The Normal Run used the same mesh as the Dynamic NGF Run, with NGF simply turned off. The Normal Run files were based on the Dynamic NGF Run files, saved in new folders to avoid erroneous timings.
Based on those three simulations, I get the timings below. There is only a 2% gain from running the Dynamic NGF Run as opposed to a Normal Run. On that basis, roughly 130 configurations would be required to break even on the cost of the Initial Static NGF Run. The Static components have 40K mesh elements and the Dynamic components have 9K, so roughly 1/5 of the mesh is dynamic. Pictures of the Static and Dynamic configuration below.
Questions:
1) Is this working as intended?
2) I can see obvious gains in 'Calcul. of matrix elements'. How do I improve 'Solution of the system of linear eqns.'?
3) Another major loss in efficiency is in 'other'. What operations are within this line item?
4) We are currently running on 7.2K SAS scratch drives for the 80 GB NGF files. Can you identify which of the line items below involve disk read operations and would benefit from SSD scratch drives?
5) Any other hints?
Summary (runtimes in seconds) | Initial Static Run | Dynamic Run | Normal Run | Dynamic Run Savings rel. to Normal Run (Normal / Dynamic) | |
Reading and constructing the geometry | 12.300 | 8.376 | 8.174 | 98% | |
Checking the geometry | 6.595 | 4.109 | 3.986 | 97% | |
Initialisation of the Green's function | 0.001 | 0.000 | 0.001 | 0% | |
Calcul. of coupling for PO/Fock | 0.000 | 0.000 | 0.000 | 0% | |
Transformation to equivalent sources | 0.000 | 0.000 | 0.000 | 0% | |
Ray launching/tracing phase of RL-GO | 0.000 | 0.000 | 0.000 | 0% | |
Calcul. of matrix elements | 10171.596 | 3992.663 | 7903.813 | 198% | |
Calcul. of right-hand side vector | 0.165 | 0.130 | 0.125 | 96% | |
Preconditioning system of linear eqns. | 385.433 | 149.637 | 271.353 | 181% | |
Solution of the system of linear eqns. | 31734.568 | 8381.499 | 6437.611 | 77% | |
Calcul. of characteristic modes | 0.000 | 0.000 | 0.000 | 0% | |
Determination of surface currents | 0.000 | 0.000 | 0.000 | 0% | |
Calcul. of impedances/powers/losses | 0.109 | 0.097 | 0.093 | 96% | |
Calcul. of averaged SAR values | 0.000 | 0.000 | 0.000 | 0% | |
Calcul. of power receiving antenna | 0.000 | 0.000 | 0.000 | 0% | |
Calcul. of cable coupling | 0.000 | 0.000 | 0.000 | 0% | |
Calcul. of error estimates | 2.957 | 2.499 | 2.459 | 98% | |
Calcul. of electric near field | 7528.046 | 5479.296 | 5062.091 | 92% | |
Calcul. of magnetic near field | 5744.303 | 4215.584 | 4053.894 | 96% | |
Calcul. of far field | 0.000 | 0.000 | 0.000 | 0% | |
other | 11.490 | 1090.488 | 8.218 | 1% | |
total times: | 55597.563 | 23324.378 | 23751.818 | 102% | |
total times (hours): | 15.444 | 6.479 | 6.598 | 102% | |
Memory (avg/ process) | 889MB | 642MB | 626MB | ||
Static Mesh Number | 40632 | 40632 | |||
Dynamic Mesh Number | 8945 | 8945 |
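As a rough check, the break-even figure quoted above can be reproduced from the totals in the table, and the per-phase differences show where the matrix-fill gain is being lost. This is only a minimal Python sketch: the variable and function names are illustrative, and the numbers are copied straight from the table (runtimes in seconds).

```python
import math

# Total runtimes in seconds, copied from the timing table.
initial_static_total = 55597.563
dynamic_total = 23324.378
normal_total = 23751.818

# (Dynamic NGF run, Normal run) runtimes in seconds for the phases
# that differ most between the two runs.
phases = {
    "Calcul. of matrix elements": (3992.663, 7903.813),
    "Preconditioning system of linear eqns.": (149.637, 271.353),
    "Solution of the system of linear eqns.": (8381.499, 6437.611),
    "Calcul. of electric near field": (5479.296, 5062.091),
    "Calcul. of magnetic near field": (4215.584, 4053.894),
    "other": (1090.488, 8.218),
}


def saving_per_run(dynamic, normal):
    """Seconds saved by one Dynamic NGF run relative to one Normal run."""
    return normal - dynamic


def break_even_runs(initial, dynamic, normal):
    """Dynamic runs needed to pay back the one-off Initial Static run."""
    saving = saving_per_run(dynamic, normal)
    if saving <= 0:
        return math.inf  # NGF never pays back if there is no saving per run
    return initial / saving


# Positive values are time saved by NGF, negative values are time lost.
for name, (dyn, norm) in phases.items():
    print(f"{name:40s} {saving_per_run(dyn, norm):+10.1f} s")

runs = break_even_runs(initial_static_total, dynamic_total, normal_total)
print(f"Break-even after about {runs:.0f} dynamic runs")
```

With these numbers the matrix fill saves roughly 3,911 s per run, but the solve phase costs about 1,944 s more and 'other' about 1,082 s more, which is why the net saving is only about 427 s per run.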
Answers
-
Hi E3LT
In general the benefits of the NGF are greatest if the static part is very small. 20% is already quite large.
The disk access part would be evident if you compare CPU time vs Runtime in the OUT file: with heavy disk access, the two would differ substantially. Could you post those times, please?
How many parallel processes are you using? I will try to do a test on my machine, as I have both an SSD and a SAS drive to compare.
0 -
Hi mel,
Thank you for your reply.
You note that 20% of the model being static is quite large. Did you mean the items that are locked with the NGF function? To clarify, 4/5 of our model is Static (locked with NGF) and 1/5 is Dynamic (manipulated between simulations). I.e., per the pictures, the vehicle is static and the crane is dynamic.
Can you identify or provide benchmarks of the expected performance increase for differing Static:Dynamic mixes?
Thanks for explaining the difference between CPU time and Runtime. They are basically identical (a 50 sec difference over a 6 hr simulation) for my simulations over 72 real CPU cores. I am guessing disk access speed is therefore not a factor.
Per my initial questions, do you have any further information on:
2) I can see obvious gains in 'Calcul. of matrix elements'. How do I improve 'Solution of the system of linear eqns.'?
3) Another major loss in efficiency is in 'other'. What operations are within this line item?
0 -
Hi E3LT
Apologies, I meant the dynamic part should be small. A 20% dynamic part is quite large, but there are still overall runtime benefits, which seem to be consistent with what you are seeing.
I used the helicopter model from Example Guide B01 to create a model with a similar number of mesh elements to yours. Memory requirement: 79 GByte.
The whole helicopter was the static part. The dynamic part was 3 monopoles attached in different places.
The initial run (saving the NGF files of the static part, i.e. the helicopter with no monopoles) took 3.3 hours.
The subsequent run, where the NGF files are read back and the monopoles are now connected to the helicopter, took 0.16 hours.
For interest, the runtime with the NGF disabled, solving the helicopter with the 3 monopoles attached, was 2.9 hours.
This is somewhat faster than the initial NGF run, which is expected given the different structure and solution algorithm of the NGF matrices.
But the runtime benefit is large: 0.16 hours vs 2.9 hours.
Every subsequent run where the monopoles would be connected in different places will take 0.16 hours, compared to 2.9 hours if we didn't employ the NGF.
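To put those three numbers side by side, here is a minimal Python sketch of the cumulative runtime with and without the NGF for this helicopter test. The function names are illustrative only, and it assumes every subsequent configuration takes the same 0.16 hours (or 2.9 hours without the NGF), which may not hold for other dynamic parts.

```python
# Runtimes in hours from the helicopter test described above.
initial_ngf_h = 3.3   # one-off run that writes the NGF files for the static part
ngf_rerun_h = 0.16    # each subsequent run that reads the NGF files back
no_ngf_h = 2.9        # each full run with the NGF disabled


def total_with_ngf(n_configs):
    """Cumulative hours for n configurations, including the one-off initial run."""
    return initial_ngf_h + n_configs * ngf_rerun_h


def total_without_ngf(n_configs):
    """Cumulative hours for n configurations solved from scratch each time."""
    return n_configs * no_ngf_h


def first_n_where_ngf_wins():
    """Smallest number of configurations for which the NGF is faster overall."""
    n = 1
    while total_with_ngf(n) >= total_without_ngf(n):
        n += 1
    return n


for n in (1, 2, 5, 10):
    print(f"{n:2d} configs: NGF {total_with_ngf(n):6.2f} h, "
          f"no NGF {total_without_ngf(n):6.2f} h")

print("NGF is faster overall from", first_n_where_ngf_wins(), "configurations")
```

Under that assumption the NGF pays for itself from the second configuration onward.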
I will attach my files shortly.
I was unable to test SAS drives vs SSD drives. The machine I have access to contains an SSD and two SAS drives in a RAID configuration. I did not see any timing differences between the two.
As to where 'other' is going, I don't know. It is difficult to comment further on your model. I assume you cannot share the model?
I will ask around about benchmarks but I can't promise anything.
0 -
My files attached.
0