After using the suggested solution for a few weeks, I would like to re-express the need for a dedicated option for this (which is probably feasible since it is already available for other outputs of this operator).
Although the suggested solutions work and are simple enough, I found that they do not scale well to larger datasets. I tried running Explain Predictions on a dataset with ~80k rows and ~70 columns.
The Explain Predictions operator itself ran in a reasonable time, but the filter that keeps only the top 5 most important explaining attributes per example took far too long: after 14 hours it was only 25% done, so I terminated it. After further investigation, the "Append" operator seems to be the one taking the longest to execute under these conditions.
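For comparison, here is a minimal sketch of the same top-k filter done as a single vectorized operation rather than a per-example filter-and-append loop. The column names (`example_id`, `attribute`, `importance`) and the long-format layout are assumptions for illustration, not the operator's actual output schema:

```python
import pandas as pd

# Hypothetical long-format export of the explanation results:
# one row per (example, attribute) pair with an importance score.
df = pd.DataFrame({
    "example_id": [1, 1, 1, 2, 2, 2],
    "attribute":  ["a", "b", "c", "a", "b", "c"],
    "importance": [0.9, 0.1, 0.5, 0.2, 0.8, 0.7],
})

# Keep only the top-2 most important attributes per example in one
# sort + groupby pass, instead of filtering and appending per example.
top = (df.sort_values("importance", ascending=False)
         .groupby("example_id")
         .head(2))

print(top)
```

This kind of single-pass approach scales roughly with the total number of rows, which is why a dedicated option inside the operator would likely avoid the Append bottleneck described above.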
Also, some of our users have a Professional license, which is limited to 100,000 data rows. I assume this limit will apply in this case as well, which is a problem because the data that is actually kept in the end (for example, the top 5 explaining attributes for each prediction) would be below that limit.
This is probably not a situation that would occur with a production model (it will be used on smaller datasets covering at most a few weeks of recent data), but I'm trying to use this feature to investigate cases where the model was wrong in a dataset spanning several years. I could filter to keep only the examples where the model was the most wrong, but it's also useful to keep examples where the model was right, for comparison.
Thanks