Shape Descriptors to Represent Geometry for Machine Learning

Joseph Pajot
Altair Employee
edited September 2022 in Other Discussion & Knowledge

Representing a geometry with shape descriptors doesn’t have to be hard. But once you have them, what do you do next? Learn how easy it can be to build predictive models with them.

A recent engineering data science blog post touched on the subject of how machine learning algorithms “see” shape-based data. The choice of how to represent geometries can feel a bit overwhelming. Take one option, shape descriptors, for example. What is an example of a shape descriptor? What do you do with them? Do they have predictive power? Each of these questions is a valid reaction to the topic. So let us walk through a simple example of using shape descriptors and then building a predictive machine learning model to demonstrate the overall concept.

Consider the shape of the bolt shown below.  Imagine we want to train a machine learning model to predict the first mass moment of inertia.

[Image: example bolt geometry]

Total surface area and volume are two simple shape descriptors that come to mind. Other simple, if less obvious, quantities are the characteristic lengths, for example the maximum and minimum dimensions of the 3D bounding box that contains the shape: l_max and l_min, respectively. Collecting these values from a shape is a straightforward task in many CAE programs. These data can be extracted for many shapes and placed in a database or file. Congratulations! This seemingly simple task is a perfect example of what data scientists call feature extraction. The quantities serve as inputs to a machine learning model and are generally known as machine learning features.
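To make feature extraction concrete, here is a minimal sketch of how these four descriptors could be computed outside of a CAE program, assuming the shape is available as a closed triangle mesh with consistently oriented faces. The function name shape_descriptors and the 1 x 2 x 3 box used as a stand-in for a real part are hypothetical and chosen only for illustration.

```python
from math import sqrt

def shape_descriptors(vertices, triangles):
    """Surface area, volume, and bounding-box lengths of a closed triangle mesh.

    Hypothetical helper: assumes consistently oriented triangles.
    """
    area = 0.0
    volume = 0.0
    for i, j, k in triangles:
        a, b, c = vertices[i], vertices[j], vertices[k]
        # Edge vectors of the triangle
        u = [b[n] - a[n] for n in range(3)]
        v = [c[n] - a[n] for n in range(3)]
        # Half the length of u x v is the triangle area
        cx = u[1] * v[2] - u[2] * v[1]
        cy = u[2] * v[0] - u[0] * v[2]
        cz = u[0] * v[1] - u[1] * v[0]
        area += 0.5 * sqrt(cx * cx + cy * cy + cz * cz)
        # Signed volume of the tetrahedron (origin, a, b, c): a . (b x c) / 6
        bxc = (b[1] * c[2] - b[2] * c[1],
               b[2] * c[0] - b[0] * c[2],
               b[0] * c[1] - b[1] * c[0])
        volume += (a[0] * bxc[0] + a[1] * bxc[1] + a[2] * bxc[2]) / 6.0
    # Edge lengths of the axis-aligned bounding box
    extents = [max(p[n] for p in vertices) - min(p[n] for p in vertices)
               for n in range(3)]
    return {"area": area, "volume": abs(volume),
            "l_max": max(extents), "l_min": min(extents)}

# Illustrative stand-in geometry: a 1 x 2 x 3 box split into 12 triangles
vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 2.0, 0.0), (0.0, 2.0, 0.0),
            (0.0, 0.0, 3.0), (1.0, 0.0, 3.0), (1.0, 2.0, 3.0), (0.0, 2.0, 3.0)]
triangles = [(0, 2, 1), (0, 3, 2), (4, 5, 6), (4, 6, 7),
             (0, 1, 5), (0, 5, 4), (2, 3, 7), (2, 7, 6),
             (0, 4, 7), (0, 7, 3), (1, 2, 6), (1, 6, 5)]
descriptors = shape_descriptors(vertices, triangles)
print(descriptors)
```

Note that the bounding box here is axis-aligned, so l_max and l_min depend on how the part is oriented. A CAE tool may report an oriented bounding box instead.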

But why only use the area itself? Why not use the square of the area, for example? Or perhaps the ratio of the minimum edge length to the maximum edge length of the shape’s bounding box? Once again, this logical step has a dedicated name in the world of data science: feature engineering.
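As a rough sketch of feature engineering, the snippet below expands the four base descriptors into squares, pairwise products, and the bounding-box ratio mentioned above. The function name engineer_features, the naming scheme for the new features, and the numeric values are assumptions made for illustration only.

```python
from itertools import combinations_with_replacement

def engineer_features(base):
    """Expand base descriptors into linear, second-order, and ratio features."""
    features = dict(base)  # keep the original (linear) terms
    # Squares and pairwise products: every second-order combination
    for a, b in combinations_with_replacement(sorted(base), 2):
        name = f"{a}^2" if a == b else f"{a}*{b}"
        features[name] = base[a] * base[b]
    # Aspect ratio of the bounding box
    features["l_min/l_max"] = base["l_min"] / base["l_max"]
    return features

# Illustrative values only, not data from a real bolt
base = {"area": 22.0, "volume": 6.0, "l_max": 3.0, "l_min": 1.0}
engineered = engineer_features(base)
for name, value in engineered.items():
    print(f"{name:>14s} = {value:.4f}")
```

Four base descriptors become fifteen candidate features this way, which is why some form of feature selection quickly becomes necessary.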

From a list of calculated features, the next step is to identify which features have predictive power and which do not provide discriminative value. This task is known as feature selection. One option for feature selection is manual testing. For those looking to avoid a combination of trial and error and intuition, automatic methods for feature selection do exist. One well-known, albeit somewhat polarizing, algorithm for automatic feature selection is stepwise regression. Stepwise regression retains only the features that measurably improve the prediction and dismisses those that do not.
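The idea behind stepwise regression can be sketched in a few lines of NumPy. This is a simplified forward-selection variant that greedily adds the feature giving the largest gain in R-squared and stops once the gain falls below a threshold. Classical implementations typically use F-tests or information criteria instead, and this is not the algorithm used by any particular product. The function name forward_stepwise, the min_gain threshold, and the synthetic data are all assumptions for illustration.

```python
import numpy as np

def forward_stepwise(X, y, names, min_gain=0.01):
    """Greedy forward selection on R^2 (simplified sketch of stepwise regression)."""
    n = len(y)
    tss = float(np.sum((y - y.mean()) ** 2))  # total sum of squares
    selected, best_r2 = [], 0.0
    while len(selected) < X.shape[1]:
        trial = {}
        for j in range(X.shape[1]):
            if j in selected:
                continue
            # Least-squares fit with an intercept plus the candidate feature set
            A = np.column_stack([np.ones(n), X[:, selected + [j]]])
            coef, *_ = np.linalg.lstsq(A, y, rcond=None)
            rss = float(np.sum((y - A @ coef) ** 2))
            trial[j] = 1.0 - rss / tss
        j_best = max(trial, key=trial.get)
        if trial[j_best] - best_r2 < min_gain:
            break  # no remaining feature helps enough
        selected.append(j_best)
        best_r2 = trial[j_best]
    return [names[j] for j in selected], best_r2

# Synthetic data: only f0 and f2 actually drive the response
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
y = 3.0 * X[:, 0] - 2.0 * X[:, 2] + 0.1 * rng.normal(size=50)
names = ["f0", "f1", "f2", "f3"]
kept, r2 = forward_stepwise(X, y, names)
print(kept, round(r2, 4))
```

On this synthetic data the two informative features are retained and the two noise features are dismissed, which is the behavior described above.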

The FAST data fitting algorithm in Altair HyperStudy combines aspects of feature engineering and feature selection into a simple push-button process; you bring your data and the software does the rest. Coming back to our example bolt problem, I extracted features for 50 bolt geometries and imported the data into HyperStudy. Within seconds, I had a predictive model for my problem, as shown in the image below.

[Image: HyperStudy fit results for the bolt data, with the retained features listed at the bottom]

The goodness-of-fit metric, a cross-validation R-squared, is near 1.0. This indicates excellent predictive power. But more important for this discussion is the list of retained features at the bottom of the image. Although the algorithm could have retained any combination of features up to second order, only 5 features were retained, some linear and some quadratic. The other possible features, such as l_min^2, were dismissed by the automatic feature selection of FAST.
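For readers unfamiliar with the metric, a cross-validation R-squared can be sketched as follows: the data are split into folds, each fold is predicted by a model trained on the remaining folds, and R-squared is computed from those held-out predictions. This is a generic k-fold version for a linear least-squares model. The exact definition used inside HyperStudy may differ, and the function name cross_validated_r2 and the synthetic data are assumptions for illustration.

```python
import numpy as np

def cross_validated_r2(X, y, n_folds=5, seed=0):
    """k-fold cross-validation R^2 for a linear least-squares model with intercept."""
    n = len(y)
    order = np.random.default_rng(seed).permutation(n)
    predictions = np.empty(n)
    for held_out in np.array_split(order, n_folds):
        train = np.setdiff1d(order, held_out)
        # Fit on the training folds only
        A_train = np.column_stack([np.ones(len(train)), X[train]])
        coef, *_ = np.linalg.lstsq(A_train, y[train], rcond=None)
        # Predict the samples the model has not seen
        A_test = np.column_stack([np.ones(len(held_out)), X[held_out]])
        predictions[held_out] = A_test @ coef
    press = np.sum((y - predictions) ** 2)  # held-out prediction error
    tss = np.sum((y - y.mean()) ** 2)
    return float(1.0 - press / tss)

# Synthetic data: only columns 0 and 2 actually drive the response
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
y = 3.0 * X[:, 0] - 2.0 * X[:, 2] + 0.1 * rng.normal(size=50)

cv_r2_informative = cross_validated_r2(X[:, [0, 2]], y)
cv_r2_irrelevant = cross_validated_r2(X[:, [1]], y)
print(round(cv_r2_informative, 4), round(cv_r2_irrelevant, 4))
```

Because every prediction is made on data the model was not trained on, a value near 1.0 is evidence of predictive power rather than of a model that merely memorized its training set.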

This example demonstrates one process to extract, engineer, and select features for machine learning with shapes.   Of course, this is not the only possible process.  Altair shapeAI technology contains automatic feature extraction routines that can be used to train predictive models to recognize shapes.   Imagine what you can create combining shapeAI’s shape descriptors with your own feature extraction!