Messaging With Kafka in Altair Accelerator 2021.1.0

Stuart Taylor, Altair

The latest release of Altair Accelerator™, version 2021.1.0, includes some exciting updates that lay the groundwork for Accelerator users to realize significantly more value than other high-throughput schedulers offer. The addition of a native Kafka interface makes it possible to visualize a broad range of information about your batch scheduling environment, information that is notoriously difficult and costly to capture in competing solutions.

The Challenges of Batch Systems

Monitoring a high-performance batch system is hard. Most techniques end up slowing the batch system down, because while it is reporting its status it can’t be processing jobs or other requests.

The problem is compounded when there are several consumers of the monitored data. One workaround is to generate the status data periodically and serve it from a cache, but then the data isn’t fresh: it’s not generated in real time.

Additionally, we have the problem of multiple batch systems. Some customer configurations have several separate clusters, and the users want to see the aggregate data and then break it down by queue.

Message systems such as Kafka can help with this by allowing many readers of the same data. The data is published once, and every consumer can read the message, so there is only one extra load on the batch system instead of one per consumer. For the multiple-cluster problem, we can have several batch systems publish to a single Kafka installation. However, we still need to be careful about when and how we publish the data. Do we publish a message every time there is a change, and how is that data extracted from the batch system?
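To make the publish-once, read-many model concrete, here is a minimal Python sketch using the open-source kafka-python client. The broker address, topic name, and message fields are illustrative assumptions, not Accelerator’s actual wire format; Accelerator’s real publisher is native code inside the scheduler core.

```python
# Minimal sketch of Kafka's publish-once / read-many model using the
# open-source kafka-python client. Broker, topic, and field names are
# hypothetical.
import json
from kafka import KafkaProducer, KafkaConsumer

BROKER = "localhost:9092"  # assumption: a local Kafka broker
TOPIC = "vov.metrics"      # hypothetical topic name

# The batch system publishes each metrics packet exactly once.
producer = KafkaProducer(
    bootstrap_servers=BROKER,
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)
producer.send(TOPIC, {"queue": "short", "running": 412, "queued": 1680})
producer.flush()

# Any number of consumers can read that same message. Distinct group_ids
# mean each group receives its own copy of the stream, so extra readers
# add no load on the batch system itself.
for group in ("dashboard", "alerting"):
    consumer = KafkaConsumer(
        TOPIC,
        bootstrap_servers=BROKER,
        group_id=group,
        auto_offset_reset="earliest",
        value_deserializer=lambda b: json.loads(b.decode("utf-8")),
        consumer_timeout_ms=5000,  # stop iterating when the stream is idle
    )
    for msg in consumer:
        print(group, "saw:", msg.value)
        break  # one message is enough to demonstrate the fan-out
    consumer.close()
```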

It’s important to understand the rate of change within a batch system. For example, Altair Accelerator can dispatch several hundred jobs per second, and each dispatch may change the state of 20-30 metrics. At, say, 300 dispatches per second and 25 metrics each, that’s roughly 7,500 state changes every second - a huge volume of data for just some basic measures.

Measuring Events in Close to Real Time

Accelerator leverages our internal metrics system, which accumulates measures over a short time window of around 10 seconds. While this loses resolution at the level of individual dispatch-loop iterations, the resulting data is often more useful because of the high variance between iterations. Even with such accumulation, there’s still the overhead of getting the data out of the inner loop. We could have used an external query-and-publish mechanism, but that involves significant overhead, especially for metrics that are more event-based. In Accelerator v2021.1.0 our approach was to code the Kafka publisher routines directly into the batch system core for lower overhead.
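In outline, the accumulate-then-publish pattern might look like the sketch below. This is an illustration only, assuming a hypothetical MetricWindow helper, topic name, and metric names; the real routines are compiled into the Accelerator core rather than written in Python.

```python
# Illustrative sketch of window-accumulated metrics publishing: cheap
# in-memory increments in the hot loop, one Kafka publish per window.
import json
import time
from collections import Counter
from kafka import KafkaProducer

WINDOW_SECONDS = 10  # the accumulation window described in the article

class MetricWindow:
    """Accumulates per-metric counts and flushes one rollup per window."""

    def __init__(self, producer, topic="vov.metrics"):
        self.producer = producer
        self.topic = topic
        self.counts = Counter()
        self.window_start = time.time()

    def bump(self, metric, amount=1):
        """Called from the dispatch loop; an in-memory increment only."""
        self.counts[metric] += amount
        if time.time() - self.window_start >= WINDOW_SECONDS:
            self.flush()

    def flush(self):
        """One Kafka publish per window instead of one per state change."""
        payload = {"ts": self.window_start, "window_s": WINDOW_SECONDS,
                   "metrics": dict(self.counts)}
        self.producer.send(self.topic, payload)
        self.producer.flush()  # once per window, so the cost is negligible
        self.counts.clear()
        self.window_start = time.time()

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)
window = MetricWindow(producer)
# e.g. inside the dispatch loop:
window.bump("jobs.dispatched")
window.bump("slots.busy", 4)
```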

While the accumulation window for the current set of data is around 10 seconds, other data packets can be more event-based, and here the direct-publish architecture should give us the ability to measure events much closer to real time. Some simple benchmarks indicate that it takes about a tenth of a second for a message to go from the batch system into Kafka and out to a consumer.
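One way to reproduce that kind of end-to-end measurement yourself is to stamp each message with the producer’s clock and compare at the consumer. This is a rough sketch, not our benchmark harness; it assumes producer and consumer share a clock (run both on the same host), and the topic name is hypothetical.

```python
# Sketch of an end-to-end latency check: stamp messages at publish time,
# measure the delta when they arrive at a consumer.
import json
import time
from kafka import KafkaProducer, KafkaConsumer

BROKER = "localhost:9092"
TOPIC = "vov.latency-test"  # hypothetical test topic

consumer = KafkaConsumer(
    TOPIC,
    bootstrap_servers=BROKER,
    auto_offset_reset="latest",
    value_deserializer=lambda b: json.loads(b.decode("utf-8")),
    consumer_timeout_ms=10000,
)
consumer.poll(timeout_ms=1000)  # force partition assignment before publishing

producer = KafkaProducer(
    bootstrap_servers=BROKER,
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)
producer.send(TOPIC, {"sent_at": time.time()})
producer.flush()

for msg in consumer:
    latency_ms = (time.time() - msg.value["sent_at"]) * 1000
    print(f"producer -> Kafka -> consumer: {latency_ms:.1f} ms")
    break
```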

Capturing Actionable Information

Another benefit of message systems like Kafka is their ability to store messages for a period of time. This gives us the ability to look back at recent history and interpret trends from it. Real-time data and historical data live in the same system, without any database design!
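Reading back that retained history is just a matter of starting a consumer at the earliest retained offset. A rough sketch, again with a hypothetical topic name:

```python
# Sketch: replaying retained messages to rebuild recent history.
# A consumer with no committed offsets that starts at the earliest
# retained offset walks the stored stream from oldest to newest;
# no separate database is needed.
import json
from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "vov.metrics",                      # hypothetical topic
    bootstrap_servers="localhost:9092",
    group_id=None,                      # no committed offsets: always replay
    auto_offset_reset="earliest",       # begin at the oldest retained message
    value_deserializer=lambda b: json.loads(b.decode("utf-8")),
    consumer_timeout_ms=5000,           # stop once we reach the live edge
)

history = [msg.value for msg in consumer]
print(f"replayed {len(history)} retained metric packets")
```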

Publishing the data efficiently is important, but the real win is being able to capture actionable information. For that, we chose the Altair Panopticon™ visualization solution, which can read from a Kafka data stream natively.

Examples

Here are a few examples of what we’ve been able to visualize so far.

A simple table view of a couple of queues and their aggregate data:

[Image: table view of two queues and their aggregate data]

Here we’re looking at a breakdown of the scheduler phases. We’ve grouped the user workload toward the top and the system workload at the bottom; any idle time is the gap between the two, represented by the ivory-colored region.

[Image: breakdown of scheduler phases, user workload above, system workload below]

And here we’re comparing the time spent scheduling to a policy limit (the red line).

[Image: time spent scheduling compared to a policy limit]

In this screenshot, the faint lines show the measures from the accumulated results. Even averaged over 10s, the data is quite noisy. By adding some further smoothing (a simple moving average over 30 samples) we get a more readable measure - that’s the heavier line overlaying the noisy data.

[Image: noisy 10s measures overlaid with a 30-sample moving average]
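The smoothing here is a standard simple moving average. Panopticon applies it without any code; the snippet below is just to show the arithmetic, with toy data.

```python
# The simple moving average used to smooth the noisy 10s measures:
# each output point is the mean of the most recent `window` samples.
from collections import deque

def moving_average(samples, window=30):
    """Yield the mean of the last `window` samples for each new sample."""
    buf = deque(maxlen=window)
    for s in samples:
        buf.append(s)
        yield sum(buf) / len(buf)

noisy = [0.8, 1.4, 0.6, 2.1, 0.9, 1.7, 1.1]  # toy data
print(list(moving_average(noisy, window=3)))
```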


Combining these charts with a few others, we get a usable, real-time dashboard with support for multiple data sources - without any coding!

[Image: combined real-time dashboard with multiple data sources]