New Tutorial Videos: ARIMA, GLM Node, Imbalanced Classes, Simpson's Paradox, and Substitute Missing Values
We have several new tutorial videos available in the Knowledge Studio knowledge base that will help you get the most out of your implementation:
Build Autoregressive Integrated Moving Average (ARIMA) machine learning models
Knowledge Studio supports Autoregressive Integrated Moving Average (ARIMA) models, a simple yet powerful method for forecasting from time series data, including data with seasonal and other semi-regular variations. You can add ARIMA models to your AI workflows through a fully menu-driven user interface. The software’s Auto ARIMA functions automatically estimate values for the ARIMA parameters using a grid search or stepwise algorithm. For example, you can use ARIMA models to forecast electricity and raw materials utilization in a factory, output volumes in an oil refinery, fuel consumption for trucking, rail, and seaborne shipping companies, patient churn and intake volumes in hospitals, and key financial indicators for any type of business.
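To make the idea concrete outside the menu-driven interface, here is a minimal Python sketch of the simplest useful case, an ARIMA(1,1,0) model: the series is differenced once (the "integrated" step), an AR(1) model is fitted to the differences by ordinary least squares, and forecasts are integrated back to the original scale. It is not Knowledge Studio's implementation, it omits the moving-average term and any automatic order search, and the function names and the sample series are hypothetical.

```python
def difference(series):
    """First-order differencing: the 'I' step of ARIMA with d = 1."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


def fit_ar1(values):
    """Least-squares fit of values[t] = c + phi * values[t - 1]."""
    x, y = values[:-1], values[1:]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    phi = sxy / sxx
    return mean_y - phi * mean_x, phi


def forecast_arima_110(series, steps):
    """Forecast `steps` values ahead with an ARIMA(1,1,0) sketch."""
    diffs = difference(series)
    c, phi = fit_ar1(diffs)
    level, last_diff = series[-1], diffs[-1]
    forecasts = []
    for _ in range(steps):
        last_diff = c + phi * last_diff  # predict the next change
        level += last_diff               # undo the differencing
        forecasts.append(level)
    return forecasts


# Hypothetical monthly usage figures, for illustration only.
monthly_usage = [10.0, 13.0, 15.0, 16.5, 17.75, 18.875]
print(forecast_arima_110(monthly_usage, steps=3))
```

An automatic search such as the one described above would repeat a fit like this over many candidate parameter combinations and keep the one with the best score on an information criterion.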
Using the Generalized Linear Model (GLM) Node
Generalized linear models (GLMs) allow the response variable to have an error distribution other than the normal distribution, such as a Poisson distribution for count data or a binomial distribution for binary outcomes. This video shows how to use Knowledge Studio’s GLM node to apply this statistical technique, which can yield more accurate machine learning models when the response is not normally distributed.
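As an illustration of what a GLM does under the hood, the following Python sketch fits a Poisson regression with a log link and a single predictor using iteratively reweighted least squares (IRLS). It is a teaching sketch under those assumptions, not the algorithm used by the GLM node; the function name and the sample data are hypothetical.

```python
import math


def fit_poisson_glm(x, y, iterations=25):
    """Fit log(E[y]) = b0 + b1 * x by iteratively reweighted least squares."""
    b0, b1 = math.log(sum(y) / len(y)), 0.0  # start from the intercept-only model
    for _ in range(iterations):
        eta = [b0 + b1 * xi for xi in x]      # linear predictor
        mu = [math.exp(e) for e in eta]       # mean via the inverse log link
        # Working response and weights for the Poisson family with log link.
        z = [e + (yi - m) / m for e, yi, m in zip(eta, y, mu)]
        w = mu
        total_w = sum(w)
        x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total_w
        z_bar = sum(wi * zi for wi, zi in zip(w, z)) / total_w
        sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
        sxz = sum(wi * (xi - x_bar) * (zi - z_bar) for wi, xi, zi in zip(w, x, z))
        b1 = sxz / sxx                        # weighted least-squares slope
        b0 = z_bar - b1 * x_bar               # weighted least-squares intercept
    return b0, b1


# Hypothetical data: number of support tickets versus weeks since launch.
weeks = [0, 1, 2, 3, 4, 5]
tickets = [2, 2, 3, 4, 6, 7]
intercept, slope = fit_poisson_glm(weeks, tickets)
print(f"expected tickets = exp({intercept:.3f} + {slope:.3f} * weeks)")
```

Because the log link keeps the predicted mean positive, this model cannot forecast a negative count, which an ordinary linear regression on the same data could.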
Working with Imbalanced Classes
Many machine learning algorithms implicitly assume that each class is represented by a roughly equal number of examples in the source data. However, many datasets contain substantially different numbers of records for important classes, which results in an imbalanced class problem. Failure to handle this properly can result in models with poor predictive performance, particularly for the minority class. Knowledge Studio has a node specifically built to handle imbalanced class issues. In this video, you will learn how to identify an imbalanced class problem and use the software’s Handle Class Imbalance node to correct it.
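One common way to identify and correct an imbalance is to compare class counts and then randomly oversample the smaller classes. The Python sketch below shows that approach; it is an assumption for illustration and not necessarily the method applied by the Handle Class Imbalance node. The function names, the `churned` field, and the sample records are hypothetical.

```python
import random
from collections import Counter


def imbalance_ratio(labels):
    """Ratio of the largest class count to the smallest (1.0 means balanced)."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def oversample_minority(records, label_key, seed=0):
    """Randomly duplicate records until every class matches the largest class."""
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    by_class = {}
    for record in records:
        by_class.setdefault(record[label_key], []).append(record)
    target = max(len(group) for group in by_class.values())
    balanced = []
    for group in by_class.values():
        balanced.extend(group)
        balanced.extend(rng.choices(group, k=target - len(group)))
    return balanced


# Hypothetical records: nine customers who stayed and one who churned.
customers = [{"id": i, "churned": 0} for i in range(9)] + [{"id": 9, "churned": 1}]
print(imbalance_ratio([c["churned"] for c in customers]))
balanced = oversample_minority(customers, "churned")
print(Counter(c["churned"] for c in balanced))
```

Oversampling should be applied to the training data only, so that duplicated records do not leak into the data used to evaluate the model.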
Simpson's Paradox
In simple terms, Simpson’s Paradox occurs when a trend appears in subgroups but disappears or is reversed when subgroups are combined into a single dataset. Knowledge Studio supports detection of this statistical phenomenon. In this video, you will see an example of how Simpson’s Paradox can manifest itself and how you can use Knowledge Studio to detect its presence automatically.
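The reversal case can be checked with a few lines of Python: compare two options within each subgroup, then compare them again on the pooled counts, and flag the data if the direction flips. This is a minimal sketch, not Knowledge Studio's detection method, and it covers only a full reversal rather than a trend that merely disappears. The counts are the widely quoted kidney stone treatment figures, given as (successes, patients); the function names are hypothetical.

```python
def sign(value):
    """Return 1, 0, or -1 according to the sign of value."""
    return (value > 0) - (value < 0)


def rate_gap(counts, first, second):
    """Success rate of `first` minus success rate of `second`."""
    s1, n1 = counts[first]
    s2, n2 = counts[second]
    return s1 / n1 - s2 / n2


def has_simpsons_reversal(subgroups, first, second):
    """True if every subgroup favors one option but the pooled data favors the other."""
    subgroup_signs = {sign(rate_gap(c, first, second)) for c in subgroups.values()}
    pooled = {
        option: (
            sum(c[option][0] for c in subgroups.values()),
            sum(c[option][1] for c in subgroups.values()),
        )
        for option in (first, second)
    }
    if len(subgroup_signs) != 1:
        return False  # the subgroups disagree with each other
    (subgroup_sign,) = subgroup_signs
    return subgroup_sign != 0 and sign(rate_gap(pooled, first, second)) == -subgroup_sign


kidney_stones = {
    "small stones": {"A": (81, 87), "B": (234, 270)},
    "large stones": {"A": (192, 263), "B": (55, 80)},
}
print(has_simpsons_reversal(kidney_stones, "A", "B"))
```

Here treatment A has the higher success rate for both small and large stones, yet treatment B looks better once the two groups are pooled, because A was used far more often on the harder large-stone cases.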
Substitute Missing Values
Datasets often have missing values due to file corruption, failure to record data points, or other causes. Handling missing data values correctly is critical to developing accurate predictive models. Knowledge Studio makes it easy to identify datasets containing missing values and generate new substitute values based on a variety of substitution algorithms. This video walks you through a simple example of how the software’s Substitute Missing Values node works.
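The simplest substitution algorithms replace each missing entry with a summary statistic of the observed values. The Python sketch below shows mean, median, and mode substitution, using `None` to mark a missing value. These three strategies are assumptions chosen for illustration and are not a list of the algorithms offered by the Substitute Missing Values node; the function names and sample readings are hypothetical.

```python
from statistics import mean, median, mode

# Hypothetical strategy table mapping a name to a summary function.
STRATEGIES = {"mean": mean, "median": median, "mode": mode}


def count_missing(values):
    """Number of entries marked as missing (None)."""
    return sum(1 for v in values if v is None)


def substitute_missing(values, strategy="mean"):
    """Replace None entries with a statistic computed from the observed values."""
    observed = [v for v in values if v is not None]
    fill = STRATEGIES[strategy](observed)
    return [fill if v is None else v for v in values]


# Hypothetical sensor readings with two gaps.
readings = [20.0, None, 22.0, 30.0, None, 22.0]
print(count_missing(readings))
print(substitute_missing(readings, "mean"))
print(substitute_missing(readings, "median"))
```

The median and mode are less sensitive to outliers than the mean, which is one reason a tool might offer more than one substitution algorithm.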