Data Mining for Decision Makers
Making design decisions from data can sometimes lead to paralysis by analysis. Learn to be more efficient by borrowing some ideas from the world of multi-objective optimization.
Nowadays, it feels as though everyone talks about using analytics to inform decision making. Regardless of whether the data originates from physical testing, virtual simulations, or other sources, the outcomes from many observations are meticulously collected and mined to select the best course of action. In some cases, this can be as simple as identifying an extremum, such as the aerospace case of finding the lightest component among several options. In mathematical terms, this task is an optimization problem with a single objective. Solving these problems is straightforward, whether by a simple convex optimization or by the more involved global search required for non-convex objective functions, as illustrated on the left and right of the following illustration.
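In its simplest form, single-objective selection over a discrete set of options is just an arg-min. A minimal sketch (the component names and weights here are illustrative, not from real data):

```python
# Single-objective optimization over discrete options: pick the
# lightest component. Names and weights are hypothetical examples.
components = {
    "aluminum_rib": 4.2,   # weight in kg
    "steel_rib": 7.9,
    "composite_rib": 2.8,
}

# min() with a key function returns the option with the smallest weight.
lightest = min(components, key=components.get)
print(lightest)  # composite_rib
```

For a continuous, non-convex objective the same idea applies, but a global search method would replace the simple `min()` call.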
In the real world, however, we are rarely confronted with a single-objective problem. Instead, there are often multiple competing objectives. Returning to our aerospace example, the lightest option may be an expensive-to-manufacture composite. A more comprehensive statement of the problem is to minimize both the weight and the manufacturing cost: there are two objectives to consider. Luckily, this challenge can also be understood as an optimization problem.
The key to multi-objective thinking is known as Pareto optimality. Unlike the single-objective functions above, there is generally not a single optimal solution; there are multiple solutions. At first glance this seems counter-intuitive, but the beauty lies in the definition of a Pareto optimal solution: a solution is optimal if no objective can be improved without making another objective worse. In other words, no other solution outperforms a Pareto optimal point on every objective. To make this more concrete, let’s leave aside engineering and think about the price of real estate. In every city I’ve ever seen, there is a geographical feature that is desirable to live near; for this example, let’s take the ocean. Everyone wants to live near the water, and everyone wants to spend the least amount of money. We can cast this as an optimization problem: find a home that minimizes both cost and distance to the ocean. Now, consider the graphic below, which shows three homes and their respective prices.
As expected, the house farthest from the water is the cheapest. But perhaps surprisingly, this same home is also optimal! It is not possible to both get closer to the water and spend less. The home closest to the water is also optimal, but the middle home at $110k is not, because it is possible to be both closer and less expensive. Any buyer of home one or home three can correctly claim they made an optimal purchase.
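The dominance test behind this reasoning is easy to write down in code. A minimal sketch of the homes example, where only the $110k price comes from the article and the other distances and prices are made-up illustrative values chosen to match the story:

```python
# A point dominates another if it is no worse in every objective and
# strictly better in at least one (both objectives are minimized).
def dominates(a, b):
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

# (distance to ocean in km, price in $k). Only home_2's $110k price is
# from the article; the remaining values are hypothetical.
homes = {
    "home_1": (3.0, 90),    # farthest, cheapest
    "home_2": (2.0, 110),   # middle
    "home_3": (0.5, 100),   # closest
}

# A home is Pareto optimal if no other home dominates it.
pareto = sorted(name for name, p in homes.items()
                if not any(dominates(q, p)
                           for other, q in homes.items() if other != name))
print(pareto)  # ['home_1', 'home_3']
```

Home 2 is dominated by home 3, which (in these made-up numbers) is both closer to the water and cheaper, while homes 1 and 3 survive the test.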
Now armed with the idea of Pareto optimality, we can come back to the world of data analytics. The next image shows the outcomes of several computer experiments that trained neural networks.
Each of the 50 points on the scatter plot represents an experimental trial, and the corresponding outcome metrics are plotted along the horizontal and vertical axes. Both metrics are interpreted such that a smaller value is desirable, so the ideal solutions tend toward the bottom-left corner of the plot. Thinking optimally, we want to minimize both the validation loss and the validation mean absolute error. The four red points are all optimal, also known as the “non-dominated” set, while the remaining 46 blue points are referred to as “dominated”. Using this multi-objective assessment, only 4 of the original 50 trials should be considered viable.
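The same dominance filter scales directly to a set of trials like this. A sketch using randomly generated stand-ins for the 50 (validation loss, validation MAE) pairs, since the article's actual data isn't available:

```python
import random

def is_dominated(p, points):
    """True if some other point is no worse in both metrics and
    strictly better in at least one (both metrics minimized)."""
    return any(q is not p
               and q[0] <= p[0] and q[1] <= p[1]
               and (q[0] < p[0] or q[1] < p[1])
               for q in points)

random.seed(0)
# Hypothetical stand-in for 50 (validation loss, validation MAE) trials.
trials = [(random.random(), random.random()) for _ in range(50)]

# Keep only the non-dominated set: the Pareto front of viable trials.
front = [p for p in trials if not is_dominated(p, trials)]
print(f"{len(front)} of {len(trials)} trials are non-dominated")
```

This brute-force check is O(n²), which is perfectly fine at 50 points; faster non-dominated sorting algorithms exist for much larger datasets.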
Borrowing from multi-objective optimization, it is possible to use Pareto optimality to focus data-driven decision making. Data itself will rarely, if ever, lead you to a single unambiguously clear answer, but data can inform the process if you know how to cut through the confusion that can arise with multiple metrics.