Using clustering to detect erroneous runs in design explorations
It's no secret that I am an advocate of design exploration. With very little effort you can turn your simulation into a parameterized study with a wealth of data from which you can draw conclusions, predict future analysis results, and optimize. See for instance How you can get started with machine learning for CAE in 1 minute.
But, as with anything, there are pitfalls to look out for. Some of the simulations may fail, encounter numerical issues, or produce unphysical results. Luckily, there are tools in place to help identify those.
In this blog we will use both traditional tools and some newer ones based on unsupervised machine learning to look for errors in the design exploration of a crash beam.
The beam is impacted by a rigid barrier with an initial velocity, which causes it to deform. We are interested in measuring how far the rigid wall moves before stopping, as a function of the beam's mass. The design variation is controlled by 4 thickness variables, one for each of the coloured sections (yellow, blue, green and red). Each thickness is allowed to take values between 0.7 and 1.3 mm. We perform 50 runs, each with a different thickness combination. HyperStudy is used throughout this post, both for data generation and analytics.
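To make the setup concrete, here is a minimal sketch of how 50 thickness combinations could be generated. It assumes a Latin hypercube design, which is my choice for illustration only; the sampling method used in HyperStudy for this study is not stated here. The function name `latin_hypercube` is hypothetical.

```python
import random

def latin_hypercube(n_runs, bounds, seed=0):
    """Return n_runs samples. Each variable's range is split into n_runs
    equal strata, and every stratum is used exactly once per variable."""
    rng = random.Random(seed)
    columns = []
    for low, high in bounds:
        strata = list(range(n_runs))
        rng.shuffle(strata)  # random pairing of strata across variables
        width = (high - low) / n_runs
        columns.append([low + (s + rng.random()) * width for s in strata])
    # Transpose so that each row is one run.
    return [list(row) for row in zip(*columns)]

# Four thickness variables, one per coloured section, each in [0.7, 1.3] mm.
names = ["yellow", "blue", "green", "red"]
design = latin_hypercube(50, [(0.7, 1.3)] * 4)

for run_id, row in enumerate(design[:3], start=1):
    print(run_id, {n: round(t, 3) for n, t in zip(names, row)})
```

Each row is one run. A space-filling design like this spreads the 50 runs evenly over each thickness range, so no region of the design space is left unexplored.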
In the study we have introduced a common mistake in the model. Let's see if we can figure out what it is by exploring our data. Follow along and see how many clues you need to figure out the mistake.
Clue #1 - Scatter plot
Let's look at the rigid wall displacement as a function of the mass. It looks like the heavier the beam is, the further the rigid wall manages to deform it. That doesn't seem right: a heavier beam has thicker sections, so we would expect it to be stiffer, not softer.
Clue #2 - Box plot and parallel coordinate plot
Looking at the box plots, we can see that a number of potential outliers are identified, all of which have high rigid wall displacements. Plotting the ones with the highest displacements (runs 29, 37 and 44) in a parallel coordinate plot shows that all of them have high thicknesses.
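For readers who want to reproduce the box plot check outside HyperStudy, here is a small sketch of the usual whisker rule: a point is flagged if it lies more than 1.5 times the interquartile range beyond the quartiles. That HyperStudy's box plot uses exactly this rule is an assumption on my part, and the displacement values below are made up for illustration.

```python
import statistics

def iqr_outliers(values, whisker=1.5):
    """Return the indices of values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - whisker * iqr, q3 + whisker * iqr
    return [i for i, v in enumerate(values) if v < low or v > high]

# Hypothetical rigid wall displacements in mm, one value per run.
displacements = [100, 102, 98, 101, 99, 103, 97, 100, 150, 101]

print(iqr_outliers(displacements))  # indices of the flagged runs
```

Note that the rule only says a value is unusual relative to the rest of the study. Whether it is actually wrong still needs the follow-up checks in the next clues.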
Clue #3 - Clustering
Clustering is a form of unsupervised machine learning that arranges records (in our case simulations) into groups according to some similarity measure. Using Altair's PhysicsAI library we can run clustering directly on the simulation contours, grouping the runs by how similar their results are.
The clustering results are often displayed in a dendrogram, a hierarchical plot that visualises which points are grouped together and in what order (of similarity). More about dendrograms in How to read a dendrogram.
Below are the clustering results of the displacement field for our crash beam. We can see that there are four distinct deformation types. Three of them are split down the middle in various ways and one is not. The split behaviour does not look physical. We can see that the outliers previously identified using the box plots are the ones in the rightmost two clusters (with the most severely deformed shapes).
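The idea behind this kind of grouping can be sketched in a few lines. The code below is not the PhysicsAI API; it is a plain-Python, average-linkage agglomerative clustering, assuming each run has been reduced to a vector of nodal displacements. The data and the function names are hypothetical.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def agglomerative(points, n_clusters):
    """Average-linkage agglomerative clustering.

    Returns (clusters, merges). clusters is a list of lists of point
    indices; merges records (members_a, members_b, distance) in merge
    order, which is the information a dendrogram draws."""
    clusters = [[i] for i in range(len(points))]
    merges = []
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Mean pairwise distance between the two groups.
                d = sum(euclidean(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
                d /= len(clusters[a]) * len(clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((clusters[a], clusters[b], d))
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return clusters, merges

# Hypothetical displacement vectors, one per run (three nodes each).
runs = [
    [1.0, 1.1, 0.9],
    [1.1, 1.0, 1.0],
    [0.9, 1.0, 1.1],
    [5.0, 0.2, 5.1],
    [5.2, 0.1, 4.9],
    [9.0, 9.1, 0.1],
]

clusters, merges = agglomerative(runs, n_clusters=3)
print(sorted(sorted(c) for c in clusters))
for left, right, dist in merges:
    print(left, "+", right, "at distance", round(dist, 3))
```

The most similar runs merge first at small distances, and a run that deforms differently from everything else stays on its own until late. That late, lonely branch is the pattern to look for in a dendrogram when hunting for erroneous runs.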
Clue #4 - First principles
From first principles we know that the total energy of the system should be constant (equal to the initial kinetic energy). In the chart below we can see that for the identified runs this does not hold true. Plotting the contact energy in the same chart shows that contact issues seem to be the culprit.
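The energy check is easy to automate across a whole study. Here is a minimal sketch, assuming the total energy history of each run has been exported as a list of values. The 5% tolerance, the function names, and the numbers are assumptions for illustration, not values from this study.

```python
def energy_drift(total_energy):
    """Largest relative deviation of total energy from its initial value."""
    e0 = total_energy[0]
    return max(abs(e - e0) / abs(e0) for e in total_energy)

def flag_runs(histories, tolerance=0.05):
    """Return the run IDs whose total energy drifts more than the tolerance."""
    return [run for run, series in histories.items()
            if energy_drift(series) > tolerance]

# Hypothetical total energy histories (arbitrary units), keyed by run ID.
histories = {
    1: [100.0, 99.5, 99.2, 99.0],      # small loss, acceptable
    29: [100.0, 104.0, 118.0, 131.0],  # energy is being created
}

print(flag_runs(histories))
```

A run that gains energy over time is a strong hint that something, such as a contact, is pumping work into the model.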
Revealing the mistake
Finally, comparing the initial run with the run that seems to have the most severe issues, run 29, and visualising the shell thicknesses, we can see that there are intersections for certain thickness combinations. The solver will try to resolve these intersections with large forces pushing the flanges apart, causing the unphysical behaviour observed.
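This kind of mistake can be caught before running anything. For two parallel shells, the volumes overlap when half the sum of the thicknesses exceeds the distance between the midsurfaces. The sketch below applies that check; the 1.0 mm gap is a hypothetical value, not taken from the actual model.

```python
def overlap(gap, t_a, t_b):
    """Overlap depth of two parallel shells whose midsurfaces are `gap`
    apart. A positive value means the shell volumes intersect."""
    return 0.5 * (t_a + t_b) - gap

GAP = 1.0  # mm, hypothetical midsurface distance between the two flanges

for t_a, t_b in [(0.7, 0.7), (1.0, 1.0), (1.3, 1.3)]:
    depth = overlap(GAP, t_a, t_b)
    status = "intersecting" if depth > 1e-9 else "ok"
    print(f"t = {t_a} / {t_b} mm: overlap {depth:+.2f} mm ({status})")
```

Under this assumption, the thin and nominal designs are fine while the thick ones start with an initial penetration, which would match the pattern of high thickness runs misbehaving.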
Clustering has earned its place as one of my standard tools for exploring CAE data, leading to better insights, fewer mistakes, and faster development cycles.
How many clues did you need to figure out what the problem was? Please comment below!